TOPIC: Comic Vine Scraper

Comic Vine Scraper 8 months 1 week ago #49049

  • Xelloss
  • Platinum Boarder
  • Posts: 596
  • Thank you received: 150
  • Karma: 30
Try this new version instead if you want. It is designed for cases that return no results, and it should not affect searches that already worked correctly.

Latest "improved" version of the patch:

File Attachment:

File Name: cvdb-2-3.zip
File Size: 9 KB


I am still tinkering with the filter, but I need people to test it and report their results, as I can only test so many comics from my own collection...

I will make a separate topic so that I don't spam the official scraper topic and we can discuss the patch results :)
Last Edit: 8 months 1 week ago by Xelloss.
The administrator has disabled public write access.

Comic Vine Scraper 8 months 1 week ago #49051

  • cbanack
  • Platinum Boarder
  • Posts: 1367
  • Thank you received: 550
  • Karma: 186
I posted a message on the Giant Bomb website to find out whether they are going to give us the tools we need to get the Scraper's search working the way it used to. If not, perhaps something like Xelloss' patch will need to be added permanently to the scraper.
The following user(s) said Thank You: Xelloss

Comic Vine Scraper 8 months 1 week ago #49052

  • Xelloss
By the way, it now sorts the results "Google style". I have assumed this was done on purpose, to allow "unlimited searches" where you keep going until you find what you are looking for (even if the matches are not exact). The results as given now are much more difficult to produce than the previous ones, so I don't think it is a bug or something done by mistake.

The search box on the site, for example, has always worked that way.

That doesn't mean they cannot include an option for "only exact matches" to make it more robust.

All the same, I am quite happy with how it works now with my patch. I think it still needs some tinkering, but it works quite well in the tests I have run. You can even play a bit with the filters you apply and get results you didn't get before, all without losing anything from the previous way of searching.

Remember that the current API returns the exact matches (using AND), if any, FIRST, and then, once those run out, the ones matched with OR. (Sometimes 1 or 2 of the first results are not exact matches, and I cannot work out why, but from the 3rd or 4th result on THEY ARE ALWAYS exact matches until the exact matches run out.)

As far as I can tell, the new search works this way:

You search for, say, "A and B and C".

The results come back as follows (IN THIS ORDER):

A and B and C (if any)
(A and B) or (A and C) or (B and C) (if any)
A or B or C (if any)

If you want the exact results that were returned before, you just stop when the first result of the second group appears XD (in my patch I didn't do that; I did something a bit more complex so that, if no perfect matches are found, it selects the results nearest to an exact match, while trying to stop downloading more results as soon as possible to make it quicker).
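
A rough sketch of the "stop when the next group appears" idea described above. This is not the code from the patch: it assumes results arrive as plain title strings, one page at a time, and `fetch_page`, `matched_terms`, `filter_results`, and the `patience` threshold are all hypothetical names invented for illustration. It keeps the results that match the most search terms, tolerates a couple of stray weaker results, and stops paging once the best group appears to have ended.

```python
import re

def matched_terms(terms, title):
    """Count how many search terms appear as whole words in the title."""
    words = set(re.findall(r"\w+", title.lower()))
    return sum(1 for t in terms if t.lower() in words)

def filter_results(terms, fetch_page, patience=2):
    """Keep the best-matching titles and stop paging as early as possible.

    fetch_page(n) is a hypothetical callable returning the n-th page of
    titles, or an empty list when there are no more pages.
    """
    kept, best, misses, page = [], 0, 0, 0
    while True:
        titles = fetch_page(page)
        if not titles:
            return kept
        for title in titles:
            score = matched_terms(terms, title)
            if score > best:
                # A better group has started: discard the weaker results.
                kept, best, misses = [title], score, 0
            elif score == best and score > 0:
                kept.append(title)
                misses = 0
            else:
                # Weaker result. A run of them means the best group is over.
                misses += 1
                if kept and misses > patience:
                    return kept
        page += 1
```

When every term is matched by some result, `kept` ends up holding only the exact (AND) matches, which is the old behaviour; when nothing matches every term, it holds the nearest group instead.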
Last Edit: 8 months 1 week ago by Xelloss.

Comic Vine Scraper 8 months 1 week ago #49053

  • krandor
  • Gold Boarder
  • Posts: 313
  • Thank you received: 34
  • Karma: 5
Xelloss wrote:
By the way, it now sorts the results "Google style". I have assumed this was done on purpose, to allow "unlimited searches" where you keep going until you find what you are looking for (even if the matches are not exact). The results as given now are much more difficult to produce than the previous ones, so I don't think it is a bug or something done by mistake.

That is what I've been thinking since this OR change started: this may be the way they want it to work, which is fine and great for the website but a little tougher for the API.

Guess we'll have to see what the devs say. If it is here to stay, you could do a pull request with your patch.

Just loaded your latest version, and I also just got some new comics, so I will run it on those in a bit and let you know how it goes.
The following user(s) said Thank You: Xelloss

Comic Vine Scraper 8 months 1 week ago #49054

  • Xelloss
Thanks! You can post what you find here:

comicrack.cyolito.com/forum/13-scripts/4...r-patch-not-official

So that we don't spam this topic

Comic Vine Scraper 8 months 1 week ago #49055

  • forkicks
  • Platinum Boarder
  • Posts: 873
  • Thank you received: 111
  • Karma: 37
As a sysadmin, it amazes me to no end how they don't seem to care that their servers are being hit much more with the search done this way as opposed to the way it was being done before.

fK

Comic Vine Scraper 8 months 1 week ago #49056

  • krandor
forkicks wrote:
As a sysadmin, it amazes me to no end how they don't seem to care that their servers are being hit much more with the search done this way as opposed to the way it was being done before.

fK


I'm a network engineer and I'm thinking the same thing. They are going to get killed tomorrow.

Comic Vine Scraper 8 months 1 week ago #49057

  • Xelloss
I thought about the same thing, but I assume they are not taking into account scripts that keep loading results until the last one... remember that a single query only returns 100 results...

All the same, if you think about it, most of the processing goes into sorting the results. After all, searching an entire database with many conditions costs almost the same as searching it with one condition. The problem is then returning the results ordered by how many conditions each one matched (you have to store each group in a different list, and then join them again).
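
To illustrate the "store each in a different list, then join them again" step: this is only a guess at the general idea, not Comic Vine's real implementation. The function name `rank_by_matches` and the plain-string records are made up for the example.

```python
import re

def rank_by_matches(terms, titles):
    """Order titles by how many search terms they match, most matches first."""
    wanted = [t.lower() for t in terms]
    # One bucket per possible match count: buckets[k] holds titles matching k terms.
    buckets = [[] for _ in range(len(wanted) + 1)]
    for title in titles:
        words = set(re.findall(r"\w+", title.lower()))
        hits = sum(1 for t in wanted if t in words)
        buckets[hits].append(title)
    # Join the buckets from "all terms" down to "one term"; drop non-matches.
    ranked = []
    for hits in range(len(wanted), 0, -1):
        ranked.extend(buckets[hits])
    return ranked
```

Each record is still scanned only once however many terms there are; the extra cost is the bookkeeping needed to return the groups in order.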

On the other hand, if we are thinking about DATA TRANSFER, this way almost all queries will return a list of 100 results (and the data for them), instead of the 1 or 2 results most queries returned before... THAT would be a big problem.

I think they decided it was better to have one query that gives the needed results than 100 attempts because someone is misspelling a word...

We must remember that Comic Vine surely doesn't even remember that something called Comic Vine Scraper exists :P. They must be thinking of API uses similar to what people do on the site: just load results until you find what you are looking for, and not load them ALL THE WAY TO THE END and then search through the results for the one you want...

I think 99% of the searches they receive must come from their website, which works as usual, and not from the API. The only thing they should worry about with the API is that someone doing some experiment doesn't fire thousands of queries in minutes... which they do XD

As for us, the geeks who use ComicRack (there are not that many of us out there XD) AND the scraper (even fewer), yes, we will download A LOT more data from CV (and use A LOT more server power if we don't apply the patch) in the coming days if they keep this search algorithm... But I don't think they will even notice XD
Last Edit: 8 months 1 week ago by Xelloss.

Comic Vine Scraper 8 months 1 week ago #49058

  • krandor
Xelloss wrote:
We must remember that Comic Vine surely doesn't even remember that something called Comic Vine Scraper exists :P. They must be thinking of API uses similar to what people do on the site: just load results until you find what you are looking for, and not load them ALL THE WAY TO THE END and then search through the results for the one you want...

I'm sure Comic Vine remembers us, but the people writing the search are not at Comic Vine - they are at Giant Bomb, and they don't seem to be thinking much about CV in what they are doing. Those devs may never have heard of ComicRack.

Comic Vine Scraper 8 months 1 week ago #49065

  • Xelloss
OK, new version:

File Attachment:

File Name: cvdb-2-3-4-5.zip
File Size: 9 KB


(this fixes some issues with ignored words)