
TOPIC: Comic Vine Scraper

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34028

  • sykoone
Here's a copy of the log file. The filenames are Thor V1 #???, but I have marked them in CR as "The Mighty Thor #???". It does select the right issue once I choose the volume.

The biggest problem is that, unlike in previous versions, the scraper doesn't remember which volume I choose, so I have to reselect the volume for every book in a stack from the same series, even if it had chosen the correct series to begin with.
The administrator has disabled public write access.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34029

  • cbanack
sykoone wrote:
Here's a copy of the log file. The filenames are Thor V1 #???, but I have marked them in CR as "The Mighty Thor #???" It does select the right issue once I choose the volume.

The biggest problem is that unlike in previous versions, the scraper doesn't remember what volume I choose, making me have to re select the volume on a stack of books from the same series, even if it had chosen the correct series to begin with.
I think the problem is that the comics are named "Thor" but they belong to a series called "The Mighty Thor". That probably means that the scraper is not able to guess the correct series for these comics (you can see what the scraper is guessing if you try running with the automatic series confirm feature turned off...it's the scraper's first suggestion, which in this case I'm betting is NOT "The Mighty Thor".)

If you've marked the comics as "The Mighty Thor", the scraper should be using that series name for its search, though...you're sure the comics themselves are all named "The Mighty Thor" in ComicRack?

If the scraper cannot guess the correct series, then it will not be able to confirm that series automatically--it does not use the cover image to search for the right series, it only uses it to confirm the series that it guessed based on the filename or ComicRack series name.


However, you should only have to confirm each series once--looking at the filenames for these Thor comics, it seems like you should only need to select "The Mighty Thor" once, and it should then scrape all of those "Thor" comics using that choice. It sounds like this isn't happening, so I'll have to look into that, too.
Last Edit: 4 years 6 months ago by cbanack.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34031

  • wojosama
Corey, I think the problem is the opposite. The series is "Thor" on Comic Vine, but he has it tagged as "The Mighty Thor". That still doesn't explain having to keep rechoosing, though. Are they all in the same folder?



EDIT: I was completely wrong. Stupid Marvel numbering making 411-489 their own series.

Anyhow, I tested this exact scenario with 25 of the issues that were giving problems. I copied them from my folder, renamed all of them to match the file names in the log (used a mix of files that were named Vol1 and V1 just to be thorough) and placed them into a test folder. I cleared all the data on them just to be safe, then tagged the series name as "The Mighty Thor."
Then I scraped, and there was a problem scraping it. I'm not sure whether it was CVS or not, though. When the series list popped up, I had trouble finding it in the list because I was looking for a series with 80+ issues, and when I finally found it, it was listed as having 0. But once I hit OK on the series, it scraped them all without another prompt.
Last Edit: 4 years 6 months ago by wojosama.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34034

  • sykoone
The files are listed in CR as The Mighty Thor V1989, which is the proper listing on CV. They are in different folders, but in previous versions that had not been an issue. I moved a few into a single folder, and that fixed the need to confirm each issue, but I didn't have that problem in previous versions. I still needed to reset the choice.

Further testing shows that even after renaming the files to the proper series, the scraper still points to the wrong volume, even though both the filename and the internal metadata are correct. One thing that might be causing the bug is that the particular volume I'm searching for shows 0 issues in the selection box, even though hitting Show Issues brings up all the available issues.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34035

  • wojosama
Edited my previous post.

Yet another edit: Not sure what version you are coming from, but Corey posted this almost 4 months ago:
cbanack wrote:
Karvajalka wrote:
I have bit mixed feelings about the new way the CVS decides which comics belong into the same series. ... So, while waiting the image matching, would it be possible to make the folder part of the series detection logic optional?

Yes, there already is an option in the Advanced Settings for that. Just add the line:

IGNORE_FOLDERS = TRUE

to the Advanced Settings text area in the settings dialog.
Last Edit: 4 years 6 months ago by wojosama.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34045

  • Kirtai
Isn't Thor one of those annoying series that's had different names at different times but is still considered one series on ComicVine?

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34046

  • wojosama
I don't know. I'm 99% sure when I scraped issues 411-490 or whatever about a month ago they were all part of the 1966 volume (I know I didn't change the volume and that's what they are listed as in my library), but now they are listed as a separate volume.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34051

  • cbanack
wojosama wrote:
I cleared all the data on them just to be safe, then tagged the series name as "The Mighty Thor." Now, I scraped and there was a problem scraping it.
Yeah, there's a known issue that happens when you scrape comics immediately after using "Clear Data" on them; you probably ran into that. cYo said he'd look into it.
sykoone wrote:
I moved a few into a single folder, and that fixed the need to confirm each issue, but I didn't have that problem in previous versions. I still needed to reset the choice.
As wojosama mentioned, you can go back to the old behaviour with IGNORE_FOLDERS=TRUE, but be careful, since the scraper might end up thinking that all of your Thor comics from different volumes are all part of the same volume/series (because normally it uses the fact that they are in different folders as a clue that they are NOT part of the same series.) The safest thing to do would be to scrape all the issues for each volume separately.
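To make the trade-off concrete, here is a rough sketch of folder-aware grouping. It is not the scraper's actual code: `series_key`, `group_comics`, and the `ignore_folders` parameter are hypothetical names, and the paths are made up. It only illustrates how treating the folder as part of the series identity splits or merges a stack of books.

```python
import os
from collections import defaultdict

def series_key(path, series_name, ignore_folders=False):
    """Hypothetical grouping key: the series name, plus the folder unless ignored."""
    name = series_name.strip().lower()
    if ignore_folders:
        return (name,)
    return (name, os.path.dirname(path))

def group_comics(comics, ignore_folders=False):
    """Group (path, series_name) pairs into presumed series."""
    groups = defaultdict(list)
    for path, series_name in comics:
        groups[series_key(path, series_name, ignore_folders)].append(path)
    return dict(groups)

# Made-up example paths; the same series name spread across two folders.
comics = [
    ("comics/thor_a/Thor V1 #411.cbz", "The Mighty Thor"),
    ("comics/thor_a/Thor V1 #412.cbz", "The Mighty Thor"),
    ("comics/thor_b/Thor V1 #413.cbz", "The Mighty Thor"),
]

by_folder = group_comics(comics)                   # folders count as a clue
merged = group_comics(comics, ignore_folders=True) # old-style behaviour

print(len(by_folder), "groups when folders are considered")
print(len(merged), "group(s) when folders are ignored")
```

With folders considered, each group would need its own series confirmation; with folders ignored, everything sharing a name is lumped together, which is exactly the risk when different volumes share that name.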

sykoone wrote:
Further testing shows that even after renaming the files to the proper series, the scraper still points to the wrong volume, even though both the filename and the internal metadata are correct. One thing that might be causing the bug is that the particular volume I'm searching for shows 0 issues in the selection box, even though hitting Show Issues brings up all the available issues.
Yup, this is exactly the problem. The scraper uses issue count as one way of narrowing down its guesses. There is some kind of bug in Comic Vine's data here; if you look at the ComicVine page for that series, it even lists it as having zero issues!

Probably someone could fix that by editing it directly or by reporting it as a bug at the ComicVine bug forum.
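As a rough illustration of why a bad issue count throws the guess off, here is a small sketch. It is not the scraper's real ranking code: `rank_candidates` and the dictionary fields are hypothetical, and every number except the zero count on the 1989 volume is invented for the example.

```python
def rank_candidates(candidates, wanted_name):
    """Hypothetical ranking: exact name matches first, then larger issue counts."""
    def score(volume):
        name_match = volume["name"].lower() == wanted_name.lower()
        return (name_match, volume["issue_count"])
    return sorted(candidates, key=score, reverse=True)

# Illustrative values only, not real Comic Vine data. The 1989 volume
# carries the bogus zero count described in this thread.
candidates = [
    {"name": "The Mighty Thor", "start_year": 1989, "issue_count": 0},
    {"name": "The Mighty Thor", "start_year": 2011, "issue_count": 22},
    {"name": "Thor", "start_year": 1966, "issue_count": 300},
]

ranked = rank_candidates(candidates, "The Mighty Thor")
for volume in ranked:
    print(volume["name"], volume["start_year"], volume["issue_count"])
```

Under these assumptions the 1989 volume loses the top spot to another volume with the same name, purely because its reported count is zero.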
Last Edit: 4 years 6 months ago by cbanack.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34053

  • wojosama
Appreciate the response. I've actually never had an issue clearing data and then immediately rescraping, and I do it quite often. Unless I just don't notice the problem (absolutely possible). I was referring to the 0 issue bug. But ty nonetheless, I'll see if I have any problems the next time I do the clear data.


Oh btw, just as an added thank you and nod of appreciation; I love the new image comparison algorithm. Scraped about 1500 comics earlier today and other than weird naming of a handful of files, it worked flawlessly.

Re: Comic Vine Scraper 1.0.65 4 years 6 months ago #34054

  • cbanack
wojosama wrote:
Appreciate the response. I've actually never had an issue clearing data then immediately rescraping? I do it quite often actually. Unless I just don't notice the problem (absolutely possible). I was referring to the 0 issue bug. But ty nonetheless, I'll see if next time I do the clear data I have any problems.
Try changing the pages of the comic in the main comic scraper window while you are scraping it--if you've just done "clear data" on the comic, you'll see the problem. :)
wojosama wrote:
Oh btw, just as an added thank you and nod of appreciation; I love the new image comparison algorithm. Scraped about 1500 comics earlier today and other than weird naming of a handful of files, it worked flawlessly.
That's great to hear!