Python Scripts for ComicRack

TOPIC: Bonelli (www.sergiobonelli.it) Scraper v4 BETA (Italian publisher)

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 3 months ago #36430

  • luke_70it
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
Same here.
I uninstalled all Mizio scripts and installed first Diabolik (works great!), and then Bonelli.
There is something wrong with the date: if I disable the date field, I can scrape all the other fields.

Don't know why Mizio can't reproduce this problem and can scrape correctly... :blink:
The administrator has disabled public write access.

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 3 months ago #36434

  • rmagere
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
You are right: I have just disabled the date field and was able to scrape the other fields.

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 3 months ago #36537

  • duque
  • Fresh Boarder
  • Posts: 2
  • Karma: 0
Hi all,
first of all, thanks to Mizio66 for the great work.
I have the same problem as luke_70it.
I'd also like to ask whether it's normal that the file "Collane_Diabolik.txt" is 340KB while "Collane_Bonelli.txt" is only 6KB.

(It's a struggle to post in English...) :lol:

Ciao
Last Edit: 4 years 3 months ago by duque. Reason: Bad english

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 3 months ago #36543

  • rmagere
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
I think that's normal as the Collane Bonelli are simpler than Diabolik.


On another note: I have noticed that there is a problem with Almanacco Dell'Avventura due to the changes on the website.
Specifically, Bonelli in their wisdom have split the same series into two and numbered both starting from 1 rather than interleaving them. However, for some odd reason Mister No's Almanacco of 2013 is numbered as both 6 and 11.

I have simply created an additional line in Collane_Bonelli to address the issue.
Last Edit: 4 years 3 months ago by rmagere.

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 3 months ago #36603

  • luke_70it
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
Ciao,
I want to report a strange behaviour with the 1st issue in the series "Le storie".
I have the first 2 issues: the 2nd is scraped correctly; the first one is scraped as 11 (the next number to be published).

Luca

Re: Bonelli (www.sergiobonelli.it) Scraper v2 BETA (Italian publisher) 4 years 2 weeks ago #37524

  • rmagere
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
Just wanted to flag:
  • The date issue is still present. For some reason I thought it was working, but it is not: if the published date is selected to be scraped, the script fails; when it is unselected, the script works.
  • The scraper appears to be scraping less information than it did when it was able to get hold of the date. E.g. I had scraped "Napoleone 6" when the scraper was able to get the date, and it had also downloaded Pasquale del Vecchio as penciller; now it scrapes neither the date nor the artist.
  • Whenever I scrape Tex 17 (regular series), it picks up Maxi Tex 17 instead. This only happens with Tex 17; Tex 1 to 16 and 18+ work just fine.
  • Orfani is not picked up by the scraper; however, when I add the series to Collane_Bonelli, it works without a problem.
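
The thread does not say what causes the Tex / Maxi Tex mix-up. Purely as an illustration, one way a scraper could guard against this kind of confusion is to prefer an exact series-title match and fall back to a partial match only when no exact one exists. The function name pick_series and the candidate lists are hypothetical and not taken from the Bonelli scraper.

```python
def pick_series(wanted, candidates):
    """Return the candidate title that best matches `wanted`, or None."""
    def norm(title):
        # Compare case-insensitively, with runs of whitespace collapsed.
        return " ".join(title.lower().split())

    target = norm(wanted)

    # 1. An exact title match always wins.
    for title in candidates:
        if norm(title) == target:
            return title

    # 2. Otherwise fall back to the shortest title that contains the
    #    wanted name, i.e. the one with the least extra text around it.
    partial = [t for t in candidates if target in norm(t)]
    return min(partial, key=len) if partial else None
```

With this ordering, asking for "Tex" when both "Maxi Tex" and "Tex" are on offer returns "Tex", whatever order the site lists them in.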

Thank you so much for all your work on these scripts (I have just started using CoA and I love it :woohoo:)

Re: Bonelli (www.sergiobonelli.it) Scraper v3 BETA (Italian publisher) 4 years 6 days ago #37650

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Updated to v3.0b....

- Logic of scraping changed almost completely to adapt to the new site's structure
- QuickScrape fixed
- Partial generation (refresh) can be manually triggered with shift+click on button

Some comments:
The date issue is still present. For some reason I thought it was working, but it is not: if the published date is selected to be scraped, the script fails; when it is unselected, the script works.
For this I need the debug log. Activate the log in the configuration and start CR in debug mode. A new window will appear that shows what is happening behind the scenes. To start CR in debug mode, run it as "C:\Program Files\ComicRack\ComicRack.exe" -dso -ssc
The scraper appears to be scraping less information than it did when it was able to get hold of the date. E.g. I had scraped "Napoleone 6" when the scraper was able to get the date, and it had also downloaded Pasquale del Vecchio as penciller; now it scrapes neither the date nor the artist.
Should be fixed
Whenever I scrape Tex 17 (regular series), it picks up Maxi Tex 17 instead. This only happens with Tex 17; Tex 1 to 16 and 18+ work just fine.
Should be fixed
Orfani is not picked up by the scraper; however, when I add the series to Collane_Bonelli, it works without a problem.
Should be fixed
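
For anyone who prefers not to type the debug-mode command line by hand, it can be wrapped in a small launcher. This is only a sketch: the path and the two flags are the ones quoted in this post, while the helper names debug_command and launch_debug are made up for the example, and the path assumes a default installation.

```python
import subprocess

# Path and flags exactly as quoted in this post; adjust the path if
# ComicRack is installed somewhere else.
COMICRACK_EXE = r"C:\Program Files\ComicRack\ComicRack.exe"
DEBUG_FLAGS = ["-dso", "-ssc"]

def debug_command(exe=COMICRACK_EXE):
    """Build the argument list for starting ComicRack in debug mode."""
    return [exe] + DEBUG_FLAGS

def launch_debug(exe=COMICRACK_EXE):
    """Start ComicRack in debug mode without waiting for it to exit."""
    return subprocess.Popen(debug_command(exe))
```

Passing the arguments as a list avoids quoting problems with the space in "Program Files".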

The Collane file has been regenerated and improved. There is no need to regenerate it; it is included in the new version.

Please test it thoroughly... especially with Quickscrape and with albums with several stories (like Agenzia Alfa).

Let me know!

Link to DL the plugin and the Manual

Link
The following user(s) said Thank You: rmagere, duque

Re: Bonelli (www.sergiobonelli.it) Scraper v3 BETA (Italian publisher) 4 years 2 days ago #37689

  • rmagere
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
Thanks! :laugh:
I am currently on holiday but once back I will test it properly.

Re: Bonelli (www.sergiobonelli.it) Scraper v3 BETA (Italian publisher) 3 years 11 months ago #37869

  • rmagere
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
I will test the other points over the weekend; in the meantime, attached are the logs requested for the published date issue.
mizio66 wrote:
Some comments:
The date issue is still present. For some reason I thought it was working, but it is not: if the published date is selected to be scraped, the script fails; when it is unselected, the script works.
For this I need the debug log. Activate the log in the configuration and start CR in debug mode. A new window will appear that shows what is happening behind the scenes. To start CR in debug mode, run it as "C:\Program Files\ComicRack\ComicRack.exe" -dso -ssc

Debug log from the scraper (tried to scrape Tex 101-110, Greystorm 10-11, then Tex 101-110 again, with published date on)
Warning: Spoiler! [ Click to expand ]


Below is the output from the Comicrack window
Warning: Spoiler! [ Click to expand ]


Let me know if I need to try anything else to provide further inputs.

Thank you so much for your work on this tracker :)

Re: Bonelli (www.sergiobonelli.it) Scraper v3 BETA (Italian publisher) 3 years 11 months ago #37877

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
I think it is related to the same error as before... For some reason, your date/time format is not recognized. Now I know where, so I can patch it... I hope!
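
A frequent cause of "works on my machine" date failures is parsing a date string with routines that depend on the regional settings of the computer running the script. The sketch below is only an illustration of a settings-independent parser, not the scraper's actual code: it assumes the date arrives either as dd/mm/yyyy or as a day followed by an Italian month name and a year, and the names ITALIAN_MONTHS and parse_italian_date are hypothetical.

```python
import re
from datetime import date

# Hypothetical lookup table: Italian month names mapped to month numbers.
ITALIAN_MONTHS = {
    "gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
    "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
    "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12,
}

def parse_italian_date(text):
    """Return a date for 'dd/mm/yyyy' or 'd <month name> yyyy', else None."""
    text = text.strip().lower()

    # Numeric form, e.g. "07/03/2013" (day first).
    m = re.match(r"^(\d{1,2})/(\d{1,2})/(\d{4})$", text)
    if m:
        day, month, year = (int(g) for g in m.groups())
    else:
        # Textual form, e.g. "7 marzo 2013".
        m = re.match(r"^(\d{1,2})\s+([a-z]+)\s+(\d{4})$", text)
        if not m or m.group(2) not in ITALIAN_MONTHS:
            return None
        day = int(m.group(1))
        month = ITALIAN_MONTHS[m.group(2)]
        year = int(m.group(3))

    try:
        return date(year, month, day)
    except ValueError:
        # Out-of-range values such as 31/02/2013.
        return None
```

Returning None instead of raising lets the caller skip the date field and still fill in the other fields, which matches the workaround people in this thread found by disabling the date manually.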