Python Scripts for ComicRack

TOPIC: Diabolik (www.diabolik.it.it) Scraper v1.6 (Italian comic)

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36292

  • rmagere
  • Offline
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
Thank you!
I'll be trying it out this weekend :)
The administrator has disabled public write access.

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36372

  • luke_70it
  • Offline
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
Ciao,
now I'm pretty sure it's me!!

I've just tried the latest Diabolik script on my "Il Grande Diabolik" comics, and I can only scrape the title, publisher and language.

No other information is scraped...

In the debug log I find:

Caught IndexError: no such group
C:\Users\lazzarini\AppData\Roaming\cYo\ComicRack\Scripts\Diabolik Scraper\Diabolik.py,367,parseAlbumInfo

Arrrrrgh, Sigh, ...sob!

Luca

Edit: Diabolik (when I fill in volume and number) works fine! Only Il Grande Diabolik fails...
Last Edit: 4 years 3 months ago by luke_70it.

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36374

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Are you going through a proxy?

That's the only thing I can think of... In some cases the HTML is read differently through a proxy than without one...

Just a thought... Can you open the page being scraped (the album's page) and send me its source?

Thx

M
Last Edit: 4 years 3 months ago by mizio66.

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36375

  • luke_70it
  • Offline
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
No, I'm not using a proxy. I'm behind a firewall, but I don't think that's the issue...

The regular Diabolik series scrapes OK. Only Il Grande Diabolik fails...
To scrape Il Grande Diabolik, is it enough to fill in the series name and number?
Or do I need to fill in the volume too?

Thank you!!!

Luca

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36376

  • luke_70it
  • Offline
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
Just one more very small issue with the Diabolik scraper:
when I scrape "Anno XII - numero 6", which is a reprint of number 1, the web link is retrieved as "www.diabolik.it/cronologia-diabolik_scheda.php?annata=Prima Serie&ID=13" instead of "www.diabolik.it/cronologia-diabolik_scheda.php?annata=Anno XII&ID=249".

You know, I'm a bit of a maniac!! ;-P

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36377

  • luke_70it
  • Offline
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
mizio66 wrote:
Can you open the page being scraped (the album's page) and send me its source?

I don't understand what you need. My English (and my IQ) is not so good...

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36378

  • luke_70it
  • Offline
  • Senior Boarder
  • Posts: 64
  • Thank you received: 1
  • Karma: 2
I managed to make it work!

If I disable "Leg.deposit" in the scraper configuration, I can scrape Il Grande Diabolik correctly!!!

For Diabolik, I chose only a few fields, because the comics were already scraped using your wonderful scrapetxt, so I enabled only synopsis and notes...

Thank you
Luca

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36391

  • rmagere
  • Offline
  • Gold Boarder
  • Posts: 223
  • Thank you received: 24
  • Karma: 7
Just tested on "Il Grande Diabolik".

I can scrape with everything selected, but it only scrapes the Title and Publisher (which might be the intended behaviour). It also reports that errors were encountered:
Caught IndexError: no such group
C:\Users\Ocram\AppData\Roaming\cYo\ComicRack\Scripts\Diabolik Scraper\Diabolik.py,367,parseAlbumInfo
As per Luke's finding, deselecting Dep. Leg. resolves the issue.
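
For context, `IndexError: no such group` is the message Python's `re` module raises when code asks a match object for a group number or name that the pattern does not define. The sketch below reproduces the message and shows one defensive way to read an optional field. It is written for standard Python 3, and the pattern, the sample text, and the `safe_group` helper are all hypothetical, not taken from Diabolik.py.

```python
import re

# Hypothetical pattern and sample text, for illustration only.
PATTERN = re.compile(r"Numero:\s*(\d+)")


def safe_group(match, index, default=None):
    """Return match.group(index), or default if the match failed,
    the group is not defined in the pattern, or it did not participate."""
    if match is None:
        return default
    try:
        value = match.group(index)
    except IndexError:
        # The pattern has no group with this number or name.
        return default
    return default if value is None else value


m = PATTERN.search("Numero: 12")

# Asking for group 2 of a one-group pattern reproduces the logged message.
try:
    m.group(2)
    error_text = None
except IndexError as exc:
    error_text = str(exc)

number = safe_group(m, 1)               # group 1 exists and matched
missing = safe_group(m, 2, default="")  # group 2 is not defined
```

With a guard like this, a field that one series' page lacks would come back empty instead of aborting the rest of the scrape. Whether that matches what happens at line 367 of the script is an assumption.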


Additionally, I still seem to be unable to scrape the year/month/day correctly (though I am now getting the title, story, etc.).
Last Edit: 4 years 3 months ago by rmagere.

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36397

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
The problems are with the dates... the date fields... So, I need to find out why you have the problem and I don't...

The same seems to happen with the Bonelli script...

We'll see...

Re: Diabolik (www.diabolik.it.it) Scraper v1.30 BETA (Italian comic) 4 years 3 months ago #36403

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Please see the Bonelli topic... there I explained how we can try to solve this...

The same principles apply to this script.

thanks,

M