TOPIC: ComicRack is very mean to webcomic servers

ComicRack is very mean to webcomic servers 5 years 3 weeks ago #39817

  • Helmic
As it is, when CR opens a .cbw it'll go through the entire scraping process every time. That means going to the page 1 URL, requesting the page, searching for the image URL, and then finding the next page URL to repeat the process.

This is incredibly inconvenient for both the reader and whatever server they're getting their webcomic from. While this can be somewhat alleviated by setting an absurdly large cache size, it's not ideal. And trying to update webcomics is a nightmare.

Instead, when a comic is scraped, every page that successfully grabs an image URL should have that image URL saved. Page 1 is simply listed as mywebcomic.com/comics/001.png, for example. The last page should have the URL saved as well so that it can be called up to check for updates. This extra information could be saved directly to the cbw or in ComicRack. Then, when the user comes back to the comic, all CR has to do is look up the last page URL and continue from there.
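To make the idea concrete, here's a rough sketch in Python of what I mean. This is not how ComicRack works internally; `ComicCache`, `check_for_updates` and the `fetch` callable are all made-up names for illustration. `fetch` stands in for whatever the scraper already does for a single page: take a page URL, return the image URL and the next page URL (or None for either if it isn't there).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for the existing per-page scrape step:
# takes a page URL, returns (image_url, next_page_url).
# Either value is None when the page has no image or no "next" link.
Fetcher = Callable[[str], Tuple[Optional[str], Optional[str]]]


@dataclass
class ComicCache:
    """Assumed shape of the extra data saved in the cbw or in ComicRack."""
    first_page_url: str
    image_urls: List[str] = field(default_factory=list)  # index 0 is page 1
    last_page_url: Optional[str] = None  # last page scraped successfully


def check_for_updates(cache: ComicCache, fetch: Fetcher) -> int:
    """Scrape only pages that are not cached yet.

    Returns the number of page requests made, so the saving is visible.
    """
    requests_made = 0
    if cache.last_page_url is None:
        # Never scraped before: start from page 1.
        url: Optional[str] = cache.first_page_url
    else:
        # Re-request only the last known page to see if a "next" link appeared.
        _, url = fetch(cache.last_page_url)
        requests_made += 1

    while url is not None:
        image_url, next_url = fetch(url)
        requests_made += 1
        if image_url is None:
            break  # page exists but has no comic image yet
        cache.image_urls.append(image_url)
        cache.last_page_url = url
        if next_url == url:
            break  # some sites link the last page to itself
        url = next_url
    return requests_made
```

With something like this, opening a comic that was already scraped costs one page request when there's nothing new, instead of one request per page, and the images themselves can be pulled straight from the saved URLs.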

The user should also be able to right-click a webcomic and check for updates for just that one webcomic, instead of trying to download all new pages for all webcomics in the library at once.

The point is that the scraper shouldn't be calling upon each and every webpage and downloading every image unless absolutely necessary. It'd be nice to not accidentally DDoS every site that I have a CBW for.
Request webcomics on IRC: http://widget.mibbit.com/?server=irc.rizon.net&channel=%23comicrack
server: irc.rizon.net
channel: #comicrack
Or in the webcomics subforum: http://comicrack.cyolito.com/forum/22-web-comics