Python Scripts for ComicRack

TOPIC: ComicVine (Scraper) current limit

ComicVine (Scraper) current limit 2 years 8 months ago #41810

  • Drybonz
  • Gold Boarder
  • Posts: 296
  • Thank you received: 1
  • Karma: 9
I didn't realize I was abusing their service by tagging a run of comics. I have tagged several runs at a time sometimes. Usually, it comes in bunches. It's been a while since I used it at all.

I get that they have to watch the bandwidth, but it's a pain to babysit the process when you are tagging a bunch of books, periodically resetting it so you can finish. Does anyone know whether they offer, or plan to offer, a paid service without the interruptions?

Alternatively, I guess some kind of auto-resume on the scraper would help so you don't have to sit in front of the computer until it's finished? Dunno... just thinking out loud.
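
For illustration only, the auto-resume idea could look roughly like the sketch below. It is plain Python 3 and not part of Comic Vine Scraper: `RateLimited`, `scrape_with_resume`, `scrape_one`, and the 900-second default wait are all hypothetical names and values chosen for the example.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the remote service blocks us."""


def scrape_with_resume(books, scrape_one, wait_seconds=900, max_retries=5,
                       sleep=time.sleep):
    """Scrape every book, pausing and retrying when rate limited.

    `scrape_one` is a placeholder for whatever tags a single book.
    `wait_seconds` is an assumed cool-down, not a documented value.
    Returns the list of books that were scraped successfully.
    """
    done = []
    for book in books:
        for attempt in range(max_retries + 1):
            try:
                scrape_one(book)
                done.append(book)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after too many blocked attempts
                sleep(wait_seconds)  # wait out the block, then retry
    return done
```

The point of the design is that a blocked book is retried in place rather than skipped, so the run can be left unattended and still finish in order.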
The administrator has disabled public write access.

ComicVine (Scraper) current limit 2 years 8 months ago #41811

  • RevQuixo
  • Gold Boarder
  • Posts: 280
  • Thank you received: 26
  • Karma: 12
Just go to the Advanced tab of Comic Vine Scraper and enter:

SCRAPE_DELAY=30

and then walk away.

ComicVine (Scraper) current limit 2 years 8 months ago #41812

  • Drybonz
  • Gold Boarder
  • Posts: 296
  • Thank you received: 1
  • Karma: 9
Thanks guys... I saw that in the previous post too, so I will give it a try and see how long it takes.

ComicVine (Scraper) current limit 2 years 14 hours ago #44131

  • pweasel
  • Expert Boarder
  • Posts: 123
  • Thank you received: 18
  • Karma: 8
I made a run of 16 comics in less than 15 minutes and got shafted :(
CRW 0.9.178 x64 on Win10
CRA 1.80 on Nexus 10

ComicVine (Scraper) current limit 2 years 14 hours ago #44132

  • tglass1976
  • Fresh Boarder
  • Posts: 8
  • Karma: 0
Appears they made changes again: www.comicvine.com/forums/api-developers-...-limiting-1746419/#1
Many of you have noticed that our API rate limiting is stifling to put it mildly. We heard you and we, yet again, changed the way we limit API use. You'll like this one we're sure...

Previously:

There's a limit of 450 requests within a 15 minute window. If you go above that you're temporarily blocked. You can make all those requests within anywhere from 1 second to 15 minutes.

Now:

TL;DR: Space out your requests so AT LEAST one second passes between each and you can make requests all day. Go even a millisecond faster and you'll hit a brick wall REALLY HARD.

There is no limit of the number of requests. You are limited to how often you can make requests. There are no hard numbers in this, it's more of a throttling algorithm that will restrict aggressive apps and reward those that are well behaved. If your app spreads out requests to at most one per second you will not have any problems and can make requests 24/7. If the time between requests is less than 1 second you will be restricted and the more of these requests you make the more likely you will be blocked and proceeding amounts of allowed requests will dramatically drop.

I have SCRAPE_DELAY=5 in my configuration but am still getting stopped after about 8 issues.
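
To make the difference between the two rules in the quoted announcement concrete, here is a rough Python 3 sketch that checks a list of request timestamps (in seconds) against each of them. The function names are hypothetical, and the "new rule" check is a deliberate simplification: the announcement says the real throttle has no hard numbers, so this only tests the one-second spacing it recommends.

```python
def exceeds_window_limit(timestamps, max_requests=450, window=900.0):
    """Old rule: more than `max_requests` within any `window` seconds."""
    times = sorted(timestamps)
    start = 0
    for end, t in enumerate(times):
        # Drop requests from the front until the span is under `window`.
        while t - times[start] >= window:
            start += 1
        if end - start + 1 > max_requests:
            return True
    return False


def violates_min_spacing(timestamps, min_gap=1.0):
    """New rule (simplified): any two requests closer than `min_gap`."""
    times = sorted(timestamps)
    return any(b - a < min_gap for a, b in zip(times, times[1:]))


# 400 requests fired within half a second (illustrative data).
burst = [i * 0.001 for i in range(400)]
# 1000 requests, exactly one per second (illustrative data).
steady = [float(i) for i in range(1000)]

print(exceeds_window_limit(burst), violates_min_spacing(burst))
print(exceeds_window_limit(steady), violates_min_spacing(steady))
```

Under these assumptions the burst passes the old window rule but fails the spacing rule, while the steady one-per-second stream does the opposite, which matches the announcement's claim that well-spaced requests can now run all day.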

ComicVine (Scraper) current limit 2 years 13 hours ago #44133

  • tglass1976
  • Fresh Boarder
  • Posts: 8
  • Karma: 0
SCRAPE_DELAY=15 seems to be working so far.

ComicVine (Scraper) current limit 2 years 11 hours ago #44136

  • ArKay
  • Senior Boarder
  • Posts: 70
  • Thank you received: 2
  • Karma: 0
Does that cause a 15-second delay between each file? The blocking probably has to do with the fact that there is more than one request per file. Maybe the scraper itself should do the delay handling?
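
A small Python 3 simulation can show why a per-file delay might not help. It assumes, without confirmation from this thread, that SCRAPE_DELAY is applied once per file and that the requests for one file run back to back. The function names, the four requests per file, and the 0.25-second request time are made-up values for illustration.

```python
def request_times(num_files, requests_per_file, request_seconds, scrape_delay):
    """Return the start time of every request in a simulated scrape.

    Assumes requests for one file run back to back, each taking
    `request_seconds`, and that `scrape_delay` is applied once per file.
    """
    times = []
    now = 0.0
    for _ in range(num_files):
        for _ in range(requests_per_file):
            times.append(now)
            now += request_seconds
        now += scrape_delay
    return times


def smallest_gap(times):
    """Smallest interval between consecutive requests."""
    return min(b - a for a, b in zip(times, times[1:]))


# Hypothetical scrape: 3 files, 4 requests each, 15 s delay per file.
times = request_times(num_files=3, requests_per_file=4,
                      request_seconds=0.25, scrape_delay=15)
print(smallest_gap(times))  # 0.25
```

In this model the smallest gap between requests stays at 0.25 seconds no matter how large the per-file delay is, so a one-second spacing rule would still be broken inside each file.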
Last Edit: 2 years 11 hours ago by ArKay.

ComicVine (Scraper) current limit 1 year 11 months ago #44149

  • duckpuppy
  • Junior Boarder
  • Posts: 39
  • Thank you received: 3
  • Karma: 1
I have forked the CVScraper code and have added in a rate limiter that limits to one API call per 1.5 seconds (even limiting to 1 call per 1.25 seconds hit the rate limit - the detection code on the other end doesn't seem to be too accurate). I'm testing that right now, but since cbanack is not maintaining CVS anymore (and wasn't accepting pull requests prior to that anyway), I'm not sure how best to distribute this fix once I determine it works on a large scrape. I'm not a Python developer (in fact, I grabbed the rate limiting code from a StackOverflow thread, proper attribution is in the source), so I'm not really looking to take over maintaining this plugin.
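
For readers wondering what a limiter of this kind can look like, here is a minimal Python 3 sketch of a decorator that spaces calls at least 1.5 seconds apart. It is not the code from the fork described above, which is not shown in this thread. `rate_limited` and `call_api` are hypothetical names, and `time.monotonic` assumes Python 3.3 or later rather than the Python that ships with ComicRack.

```python
import time


def rate_limited(min_interval, clock=time.monotonic, sleep=time.sleep):
    """Decorator: run the wrapped function at most once per `min_interval` s."""
    def decorator(func):
        last_call = [None]  # mutable cell holding the previous call time

        def wrapper(*args, **kwargs):
            if last_call[0] is not None:
                remaining = min_interval - (clock() - last_call[0])
                if remaining > 0:
                    sleep(remaining)  # wait until the interval has passed
            last_call[0] = clock()
            return func(*args, **kwargs)
        return wrapper
    return decorator


@rate_limited(1.5)
def call_api(url):
    # Placeholder for the real HTTP request; it just echoes the URL here.
    return url
```

Because the limiter wraps the single function that talks to the API, every request is spaced out regardless of how many requests one file needs. The `clock` and `sleep` parameters exist only so the behaviour can be checked without real waiting.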
Last Edit: 1 year 11 months ago by duckpuppy.

ComicVine (Scraper) current limit 1 year 11 months ago #44150

  • ArKay
  • Senior Boarder
  • Posts: 70
  • Thank you received: 2
  • Karma: 0
Whatever works!

Even with SCRAPE_DELAY set to 15, I hit the limit last night. Of course, raising the delay is not a real solution to the problem, since we are most likely still making more than one call per second at the moment.

ComicVine (Scraper) current limit 1 year 11 months ago #44151

  • boshuda
  • Gold Boarder
  • Posts: 295
  • Thank you received: 64
  • Karma: 8
duckpuppy wrote:
I have forked the CVScraper code and have added in a rate limiter that limits to one API call per 1.5 seconds (even limiting to 1 call per 1.25 seconds hit the rate limit - the detection code on the other end doesn't seem to be too accurate). I'm testing that right now, but since cbanack is not maintaining CVS anymore (and wasn't accepting pull requests prior to that anyway), I'm not sure how best to distribute this fix once I determine it works on a large scrape. I'm not a Python developer (in fact, I grabbed the rate limiting code from a StackOverflow thread, proper attribution is in the source), so I'm not really looking to take over maintaining this plugin.
I would recommend creating a new thread for your replacement scraper in the scripts directory. cbanack added a nice little script to compress the entire thing into a plugin so ComicRack knows what to do with it. Put in there what you wrote here and paste the plugin. Then add a comment into the existing Comic Vine Scraper to point to your new thread.