Python Scripts for ComicRack

TOPIC: Bonelli (www.sergiobonelli.it) Scraper v4 BETA (Italian publisher)

Bonelli (www.sergiobonelli.it) Scraper v4 BETA (Italian publisher) 5 years 2 months ago #21632

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 445
  • Thank you received: 138
  • Karma: 67
A new release of the script to scrape data from www.sergiobonelli.it is ready!

*** Version updated to 4.0 ***

It scrapes (reads, retrieves and saves) data for eComics/fileless entries from the www.sergiobonellieditore.it website, the reference site for (mainly) monthly Italian adventure comics and home to Tex, Zagor, Nathan Never, Martin Mystère, Dampyr, Mister No and more…

It is (almost) fully configurable and, despite being a beta, should be able to scrape roughly 90% of the available issues.

Give it a try and report any bugs and suggestions; they're always welcome!

!! READ THE MANUAL !!

Enjoy!

M

Scraper v4.0b

Manual
Last Edit: 4 months 2 weeks ago by mizio66.
The administrator has disabled public write access.
The following user(s) said Thank You: kenjio, rmagere, wolverine68, duque, alterego1026

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21642

  • kenjio
  • Platinum Boarder
  • Posts: 597
  • Thank you received: 127
  • Karma: 32
Going to try this right away!
I'm baaaaaaaaaaaaaaack!!

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21646

  • kenjio
Awesome x infinity

I just scraped the first 10 issues of Dylan Dog Collezione Book and am currently scraping another 100; it seems to be working flawlessly.

Thank you, thank you, thank you!!!

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21647

  • mizio66
Be careful, DYD is one of the problem series... the website is malformed in some parts, so be careful around issues 100-106... maybe more...

ciao

M

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21648

  • kenjio
I've noticed some of the covers seem to be missing; I'm going to figure out how to fetch those.

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21649

  • mizio66
Give me the references and I'll see why...

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21651

  • kenjio
The issues whose covers are not scraped are also lacking a cover on the website, so it's their problem.


Issue 32 does not get scraped, so I entered this

www.sergiobonellieditore.it/auto/scheda_...20&numero=32&subnum=

manually, and it still didn't work.


I also can't seem to get "Dylan Dog Super Book" to scrape.
Last Edit: 5 years 2 months ago by kenjio.

Re: Bonelli (www.sergiobonelli.it) Scraper v1.00 BETA (Italian publisher) 5 years 2 months ago #21708

  • mizio66
As I was saying... the webpage code is malformed (</b< & <b>), so I need to find a way to intercept those errors, as HTMLParseError does not seem able to catch them...

Unless someone knows other ways to fix the problem in the page...

M
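
One possible approach, sketched here under the assumption that the only damage is an end tag whose closing ">" is mistyped as "<" or dropped (as in the "</b<" example quoted above): repair the raw page text with a regular expression before handing it to the parser, rather than relying on the parser to raise an error. The function name repair_tags and both patterns are illustrative guesses, not taken from the scraper itself.

```python
import re

# Assumption: an end tag is broken in one of two ways only:
#   "</b< text"  -> '<' was typed instead of '>'
#   "</b<i>"     -> '>' was dropped before the next tag
# Both patterns are hypothetical and based solely on the "</b<" example.
_TAG = r"</([A-Za-z][A-Za-z0-9]*)\s*"
_MISTYPED = re.compile(_TAG + r"<(?![A-Za-z/!])")  # stray '<' not starting a tag
_MISSING = re.compile(_TAG + r"(?=<[A-Za-z/!])")   # next tag starts immediately


def repair_tags(html):
    """Return html with broken end tags such as '</b<' rewritten as '</b>'."""
    html = _MISTYPED.sub(r"</\1>", html)
    return _MISSING.sub(r"</\1>", html)


if __name__ == "__main__":
    print(repair_tags("<b>Tex</b< Willer"))  # prints: <b>Tex</b> Willer
```

Well-formed markup passes through unchanged, so the repair could be applied to every downloaded page. It remains a fragile fix: any other kind of malformed tag would need its own pattern.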

Re: Bonelli (www.sergiobonelli.it) Scraper v1.01 BETA (Italian publisher) 5 years 2 months ago #21783

  • mizio66
Version updated to 1.01b

Some of the Quickscrape and DYD issues are solved, hopefully... as indicated, the problem is the malformed tags on the website, so there is nothing I can really do about it...

Anyhow, a quick fix looks like it solved them... it's a fragile fix though... fingers crossed.

Link on the first post as usual.

Enjoy,

M

Re: Bonelli (www.sergiobonelli.it) Scraper v1.1 BETA (Italian publisher) 5 years 1 month ago #22101

  • mizio66
Version updated to 1.1b

Please uninstall any previous release and reinstall after restarting CR.

The multi-story problem should be solved (try Agenzia Alfa), and there is also a change for the Almanacco del Giallo.

!!! READ THE MANUAL !!! Don't be the usual user...

Link on the first post as usual.

Enjoy,

M
Last Edit: 5 years 1 month ago by mizio66.
The following user(s) said Thank You: kenjio