Python Scripts for ComicRack

TOPIC: Bedetheque Scraper 2 - v4.9

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 3 months ago #22990

  • PouPou
  • Fresh Boarder
  • Posts: 9
  • Karma: 0
mizio66 wrote:
I will introduce a customizable limit on the number of scrapes per session... I don't know if the site sets one or if there is a grace timeout... I will set it to a default of, let's say, 70, and you will be able to change it...

Maybe a delay between two requests is a better way to solve this.

I think that when they receive too many requests from the same IP, it's treated as an attack, so they ban the IP.
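
A minimal sketch of the delay idea, assuming every page fetch goes through one code path. `RequestThrottle`, `min_delay`, and the 2-second value are illustrative only; they are not part of the Bedetheque Scraper script, and the threshold the site actually applies is not known here.

```python
import time


class RequestThrottle(object):
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_delay, clock=time.time, sleep=time.sleep):
        self.min_delay = min_delay  # seconds to leave between two requests
        self._clock = clock         # injectable so the class can be tested
        self._sleep = sleep
        self._last = None           # time of the previous request, if any

    def wait(self):
        """Block until at least min_delay seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed value: 2 seconds between page requests.
throttle = RequestThrottle(min_delay=2.0)
```

Calling `throttle.wait()` right before each page request would space the requests out without capping how many comics can be scraped in one session.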
The administrator has disabled public write access.

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 3 months ago #22991

  • PouPou
  • Fresh Boarder
  • Posts: 9
  • Karma: 0
Hi,

Another little thing: with some series (Canardo, for example) there is a problem with the title.

Example for www.bedetheque.com/album-12152-BD-La-fil...evait-d-horizon.html

The scraped title is 1110. La fille qui rêvait d'horizon

But I think it only happens when there are two numbers for the same album.
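
As a rough illustration of cleaning such a title on the client side, here is a sketch that drops a leading "digits + dot" prefix. `strip_number_prefix` is a hypothetical helper, not a function of the scraper, and it assumes the unwanted prefix always has that shape; a title that legitimately starts with a number followed by a dot would be trimmed too.

```python
# -*- coding: utf-8 -*-
import re

# Leading digits followed by a dot and whitespace, e.g. "1110. "
_NUMBER_PREFIX = re.compile(r"^\s*\d+\.\s+")


def strip_number_prefix(title):
    """Remove a leading 'NNN. ' prefix from a scraped title (hypothetical helper)."""
    return _NUMBER_PREFIX.sub(u"", title, count=1)


print(strip_number_prefix(u"1110. La fille qui rêvait d'horizon"))
# A title that starts with a number but has no dot is left alone:
print(strip_number_prefix(u"4 aventures de Spirou et Fantasio"))
```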

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 3 months ago #22992

  • ninjaw
  • Senior Boarder
  • Posts: 63
  • Thank you received: 7
  • Karma: -2
That's not a scraper issue but a bdgest issue.

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 3 months ago #23027

  • PouPou
  • Fresh Boarder
  • Posts: 9
  • Karma: 0
There's a strange behavior.

If I scrape with the quick scraper, the series is modified, but with the 'normal' scraper it isn't.

Example:
Title before scraping: [b]Le cycle de Cyann[/b]

Quick scrape
URL: http://www.bedetheque.com/album-146-BD-La-sOurce-et-la-sOnde.html
ID: 146

Result: the title is changed to [b]Cycle de Cyann (Le)[/b]

Normal scrape

Result: the title is still [b]Le cycle de Cyann[/b]

Notes:
- The scraper is configured to modify series
- In the case above, the Bedetheque URL is the same after scraping

There's probably some strange behavior with the number when it's set before the scrape.
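
To make the two forms concrete, here is a sketch of the conversion from "Le cycle de Cyann" to the "Cycle de Cyann (Le)" style seen in the quick scrape result. `move_leading_article` and its list of articles are assumptions for illustration; this is not how the scraper itself builds the name.

```python
import re

# A leading French article: "Les ", "Le ", "La " (followed by whitespace) or "L'"
_ARTICLE = re.compile(r"^(Les\s+|Le\s+|La\s+|L')(.+)$", re.IGNORECASE)


def move_leading_article(title):
    """Turn 'Le cycle de Cyann' into 'Cycle de Cyann (Le)' (hypothetical helper)."""
    match = _ARTICLE.match(title)
    if not match:
        return title  # no leading article, leave the title untouched
    article = match.group(1).strip()
    rest = match.group(2)
    # Capitalize the first letter of what is left, then append the article.
    return "%s%s (%s)" % (rest[0].upper(), rest[1:], article)


print(move_leading_article("Le cycle de Cyann"))  # Cycle de Cyann (Le)
```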
Last Edit: 7 years 3 months ago by PouPou.

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23039

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 459
  • Thank you received: 149
  • Karma: 69
Thanks.

I am currently taking a look at the script for the other things mentioned here, so I will add this to the list.

M

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23062

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 459
  • Thank you received: 149
  • Karma: 69
PouPou wrote:
Hi,

Another little thing: with some series (Canardo, for example) there is a problem with the title.

Example for www.bedetheque.com/album-12152-BD-La-fil...evait-d-horizon.html

The scraped title is 1110. La fille qui rêvait d'horizon

But I think it only happens when there are two numbers for the same album.

For this, use the Alternate number (put 10) and leave the Number as it should be, 11. Worked for me...

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23063

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 459
  • Thank you received: 149
  • Karma: 69
PouPou wrote:
There's a strange behavior.

If I scrape with the quick scraper, the series is modified, but with the 'normal' scraper it isn't.

Example:
Title before scraping: [b]Le cycle de Cyann[/b]

Quick scrape
URL: http://www.bedetheque.com/album-146-BD-La-sOurce-et-la-sOnde.html
ID: 146

Result: the title is changed to [b]Cycle de Cyann (Le)[/b]

Normal scrape

Result: the title is still [b]Le cycle de Cyann[/b]

Notes:
- The scraper is configured to modify series
- In the case above, the Bedetheque URL is the same after scraping

There's probably some strange behavior with the number when it's set before the scrape.

Found (I hope) the problem, will be fixed in next release....

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23064

  • mizio66
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 459
  • Thank you received: 149
  • Karma: 69
PouPou wrote:
Hi,

Thanks for the script. I installed it today and it worked fine until now.

Now, each time I try to scrape, the comic is ignored... But in the rename log, I don't see any change...

Thursday 17 May 2012 18:34:38 > [Spirou et Fantasio] 1 - 4 aventures de Spirou et Fantasio ** Renommé **

Thursday 17 May 2012 19:16:28 > [Spirou et Fantasio] 1 - 4 aventures de Spirou et Fantasio ** Ignoré **

Any idea?

Hi, I think this has also been solved... It will be in the next release.

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23066

  • rose44
  • Junior Boarder
  • Posts: 32
  • Thank you received: 5
  • Karma: 5
mizio66 wrote:
It is not the first time I've heard this...

It never happened to me; I don't remember ever scraping so many BDs at once.

Another user asked for it privately... I will introduce a customizable limit on the number of scrapes per session... I don't know if the site sets one or if there is a grace timeout... I will set it to a default of, let's say, 70, and you will be able to change it...

I will release a new version soon...

On the BDTQ site, your IP is banned if you request more than 120 pages per minute.
I've also been banned recently. It happens when the site is not busy at all. And you have to email the site to explain, and you are unbanned a few days later :(

I don't know how many page requests a scrape needs; let's say 10, so the limit should be something like "no more than 12 scrapes per minute" instead of "per session".
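
The arithmetic above (120 pages per minute, about 10 pages per scrape, so 12 scrapes per minute) can be sketched as a sliding-window limiter. The 120 figure is the one reported in this post and the 10 is only a guess, so both are assumptions; `SlidingWindowLimiter` and `max_scrapes_per_minute` are hypothetical names, not part of the scraper.

```python
import collections
import time

PAGES_PER_MINUTE_LIMIT = 120  # figure reported in this post, not verified
PAGES_PER_SCRAPE = 10         # a guess, not a measured value


def max_scrapes_per_minute(page_limit, pages_per_scrape):
    """Whole scrapes that fit inside the per-minute page budget."""
    return page_limit // pages_per_scrape


class SlidingWindowLimiter(object):
    """Allow at most max_events within any window of `window` seconds."""

    def __init__(self, max_events, window=60.0, clock=time.time, sleep=time.sleep):
        self.max_events = max_events
        self.window = window
        self._clock = clock   # injectable so the class can be tested
        self._sleep = sleep
        self._events = collections.deque()  # timestamps of recent events

    def acquire(self):
        """Block until one more event is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget events that have left the window.
            while self._events and now - self._events[0] >= self.window:
                self._events.popleft()
            if len(self._events) < self.max_events:
                self._events.append(now)
                return
            # Window is full: wait until the oldest event expires.
            self._sleep(self.window - (now - self._events[0]))


limiter = SlidingWindowLimiter(
    max_scrapes_per_minute(PAGES_PER_MINUTE_LIMIT, PAGES_PER_SCRAPE))
```

Calling `limiter.acquire()` before each scrape would keep the rate at or below 12 scrapes per minute; a lower number would leave a safety margin if a scrape turns out to need more than 10 pages.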

Thank you very much for that great scraper.

Regards


BTW: Still no result with QuickScrapeBD2, even with v2.01, unless I cut and paste the ID. :(
Last Edit: 7 years 2 months ago by rose44.
The following user(s) said Thank You: PouPou

Re: Bedetheque Scraper 2 - v2.01 Final 7 years 2 months ago #23110

  • rose44
  • Junior Boarder
  • Posts: 32
  • Thank you received: 5
  • Karma: 5
I'm afraid I've got another problem with DBscraper v2.01 final (using CR 0.9.155 64-bit, Win 7).
The "Capitalization" of the text (Formatted Titles checkbox) doesn't do anything: I'm unable to get scraped titles capitalized. I had no problem with the previous version, 2.00b.

Even after uninstalling and reinstalling everything, I get the same result. :unsure:

Am I the only one? Is it a problem with the 64-bit version, or with the .ini file stored in "C:\Users\xxxx\AppData\Roaming\cYo\ComicRack\Scripts\Bedetheque Scraper 2"?

I can't wait for the next release ;)