Python Scripts for ComicRack

TOPIC: Missing Issues using ComicVine (New Version 06-DEC-2014)

Missing Issues using ComicVine (New Version 21-JUN-2014) 3 years 5 months ago #39795

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
It's very basic what I'm doing.

I read in the element from the DB file...
else if (curElement.equals("CustomValuesStore") && !xmlr.getText().trim().equals("")) {
	customValuesStoreTagFound = true;
	customValuesStoreText = xmlr.getText();
}

I check to see if the element has a value, and set it to "empty" if it does not...
if (!(customValuesStoreTagFound)){
	customValuesStoreText = "empty";
}

Then I parse out the volume ID...simple...
if ("empty".equals(customField)) {
	//Error generation if the customField is not found.
	System.out.println(mySeries + " (" + myYearElement + ")" + " #" + myNumberElement + " is missing the required custom fields and needs to be rescraped.");
	badIssueIDs.add(new String(mySeries + " (" + myYearElement + ")" + " #" + myNumberElement + " is missing the required custom fields and needs to be rescraped."));
	continue;
} else {
	try {
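		//Trims off the text before the numeric ID.
		//indexOf returns -1 when the key is absent, so fail here and let the catch block report the issue.
		int keySpot = customField.indexOf("comicvine_volume");
		if (keySpot < 0) {
			throw new IllegalArgumentException("comicvine_volume not found");
		}
		volumeNo = customField.substring(keySpot + 17);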
		//If another tag exists after comicvine_volume, find the location of the comma separator. 
		//indexOf returns -1 if no comma is found
		int commaSpot = volumeNo.indexOf(",");
		if (commaSpot > 0) {
			//Trims off everything after the numeric ID, if a comma is found
			volumeNo = volumeNo.substring(0,commaSpot);
		}
	} catch (Exception e) {
			//Error generation if there are problems parsing out the volume ID.
		System.out.println(mySeries + " (" + myYearElement + ")" + " #" + myNumberElement + " is missing the required custom fields and needs to be rescraped.");
		badIssueIDs.add(new String(mySeries + " (" + myYearElement + ")" + " #" + myNumberElement + " is missing the required custom fields and needs to be rescraped."));
		continue;
	}
}
And that's it. After that, the volume ID is not changed or manipulated in any way. Friggin' bizarre!!
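For anyone following along, here is a minimal sketch of the same volume ID extraction done with a regular expression instead of offsets. It is only an illustration: `VolumeIdParser` and `parse` are hypothetical names, not part of the actual tool, and it assumes the field looks like `,comicvine_issue=35552,comicvine_volume=4375`. One thing it avoids: with the substring approach, `indexOf` returns -1 when the key is absent, so on a long enough string `substring` can quietly return unrelated text instead of throwing.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not taken from the tool's source.
final class VolumeIdParser {
    // Matches the key followed by "=" and one or more digits.
    private static final Pattern VOLUME = Pattern.compile("comicvine_volume=(\\d+)");

    private VolumeIdParser() {}

    // Returns the numeric volume ID, or empty if the key is missing or malformed.
    static Optional<String> parse(String customField) {
        if (customField == null) {
            return Optional.empty();
        }
        Matcher m = VOLUME.matcher(customField);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

An empty result would then take the same "needs to be rescraped" path as the error branches above.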
The administrator has disabled public write access.

Missing Issues using ComicVine (New Version 21-JUN-2014) 3 years 5 months ago #39797

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
Appears to be a bug with the StAX XML library I'm using to read in the DB.

The custom values store reads in as two separate lines for certain issues. It doesn't happen often, it always seems to be the same issues, and I have no explanation as to why that may be.

It reads in as
,comicvine_issue=35552,comi
cvine_volume=4375

Instead of...

,comicvine_issue=35552,comicvine_volume=4375

So a little logic to reassemble it, another run, and we'll see how we do.
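A possible explanation, offered as a guess rather than a diagnosis: a StAX `XMLStreamReader` is allowed to deliver one element's text as several CHARACTERS events unless the factory's `IS_COALESCING` property is set, so a single `getText()` call may return only part of the value. The sketch below assumes that is what is going on. It turns coalescing on and also appends every text chunk it sees, which is the "reassemble it" logic in its simplest form. `CustomValuesReader` and `readCustomValuesStore` are made-up names for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical reader for illustration only; not taken from the tool's source.
final class CustomValuesReader {
    private CustomValuesReader() {}

    // Returns the text of the first CustomValuesStore element, or "empty" if none is found.
    static String readCustomValuesStore(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into a single event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader xmlr = factory.createXMLStreamReader(new StringReader(xml));
            boolean inside = false;
            while (xmlr.hasNext()) {
                int event = xmlr.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "CustomValuesStore".equals(xmlr.getLocalName())) {
                    inside = true;
                } else if (inside && event == XMLStreamConstants.END_ELEMENT) {
                    break; // the element has been read completely
                } else if (inside && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    // Append rather than assign, so split chunks are rejoined.
                    text.append(xmlr.getText());
                }
            }
            xmlr.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the DB XML", e);
        }
        String value = text.toString().trim();
        return value.isEmpty() ? "empty" : value;
    }
}
```

Appending in the CHARACTERS branch is the part that matters; the coalescing flag is a second line of defense.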

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 5 months ago #39798

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
I think that managed to fix it. Try the new build here...

www.sendspace.com/file/80v6ca
Last Edit: 3 years 5 months ago by Samael69.
The following user(s) said Thank You: forkicks, Moonboi

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41264

  • Moonboi
  • Junior Boarder
  • Posts: 24
  • Thank you received: 1
  • Karma: 0
Has anyone been having issues with this script since the last update? I am still getting a lot of issues that show up even though they are already in my library, or, when I rerun the script to look for new missing issues, they don't appear.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41265

  • marcdh1
  • Junior Boarder
  • Posts: 35
  • Thank you received: 1
  • Karma: 1
You have to be sure to check "rebuild local cache" to get it to update for any missing issues. I think it was turned off because it was hitting the ComicVine DB too much. Also be sure to set the delay in the upper left to at least 3 seconds, so you do not hit the ComicVine API rate cap. Then waaaiiiit. It has to be slower now.
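To picture what that delay setting does, here is a rough sketch of a fixed minimum gap between requests. This is not the tool's code: `RequestThrottle` and its methods are invented for illustration, and the 3000 ms value simply mirrors the 3 seconds suggested above.

```java
// Hypothetical throttle for illustration only; not taken from the tool's source.
final class RequestThrottle {
    private final long minIntervalMillis;
    private long lastRequestMillis;
    private boolean sentAny = false;

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // How long the caller still has to wait before the next request is allowed.
    long millisToWait(long nowMillis) {
        if (!sentAny) {
            return 0L; // nothing sent yet, so no wait is needed
        }
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0L, minIntervalMillis - elapsed);
    }

    // Records that a request went out at the given time.
    void markSent(long nowMillis) {
        sentAny = true;
        lastRequestMillis = nowMillis;
    }

    // Blocks until the minimum gap has passed, then records the new request time.
    void awaitTurn() throws InterruptedException {
        long wait = millisToWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markSent(System.currentTimeMillis());
    }
}
```

With `new RequestThrottle(3000)`, calling `awaitTurn()` before each query keeps consecutive requests at least 3 seconds apart.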

It's still helpful though.

I have 2 other issues with it: one is 1/2 issues, and the other is that some of the names are being cut off when they contain special characters like `&`.

But it still is the best for missing issues, even if it forces me to rebuild ~1 a month to check for new missing comics.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41269

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
I've not been keeping up on comics other than the handful of physical ones I buy. Did they break 1/2 issues again? I had that fixed. Does it not still query each volume on every run? I can look at adding this back as an option if not. I know I made some major structural changes when the API limits were implemented, but I didn't think I removed the volume check, as that would kind of defeat the purpose. The "&" issue is one I've had trouble tracking down, and it's really of low importance since it doesn't break anything.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41277

  • Moonboi
  • Junior Boarder
  • Posts: 24
  • Thank you received: 1
  • Karma: 0
Yeah, I have been rebuilding my local cache every time, but I do remember when Samael69 changed things to query for each volume, and at the time it worked very well. It seems like something in the last update or 2 might have broken that function.

My biggest issue right now is getting so many issues in the output that I already have in my library. I would say that half my output is like this. I have gone back to a lot of them to double check that they are scraped correctly, so I know that isn't the problem. It also appears to be random: when rebuilding the cache I don't get the same missing issues as last time. It is a new batch of books I already have.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41278

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
Moonboi wrote:
Yeah, I have been rebuilding my local cache every time, but I do remember when Samael69 changed things to query for each volume, and at the time it worked very well. It seems like something in the last update or 2 might have broken that function.

My biggest issue right now is getting so many issues in the output that I already have in my library. I would say that half my output is like this. I have gone back to a lot of them to double check that they are scraped correctly, so I know that isn't the problem. It also appears to be random: when rebuilding the cache I don't get the same missing issues as last time. It is a new batch of books I already have.
Now I remember. Moonboi, have you gone back and rescraped all your old stuff with the ComicVine Scraper? The reason I ask is that some time ago, about 2 years, Cory added some custom fields to each issue, one of them being the volume ID.

"It will only look at books that have a custom field containing the volume ID. If you have old collections that have not been scraped since custom fields were added you will need to rescrape all these books. I'd suggest creating a smartlist to find books that do not have custom fields defined, but do have tags containing "CVDB". These are what will have to be rescraped. I'd suggest starting this work as soon as possible."

With this data, I don't have to query on every issue for a full rebuild, only every volume, so the first issue encountered in each volume. In fact, the only time I query out to ComicVine is if the volume does not exist in the local cache and only for the first issue in the volume. Of course, the problem is if that custom volume ID field is missing then the issue is ignored and reported as missing. The short of it is that you have to rescrape everything that does not have that volume ID field. I'd suggest building a custom filter to find them.
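Roughly, the lookup described above works like the sketch below: go out to ComicVine only when a volume is not already in the local cache, so each volume costs at most one remote query no matter how many of its issues are in the library. This is a simplified illustration, not the actual implementation. `VolumeCache`, `issuesFor`, and the `remoteLookup` function are invented names, and the real cache is presumably persisted rather than held in memory.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache for illustration only; not taken from the tool's source.
final class VolumeCache {
    private final Map<String, List<String>> issuesByVolume = new HashMap<>();
    // Stand-in for the remote ComicVine call (assumed: volume ID in, issue IDs out).
    private final Function<String, List<String>> remoteLookup;
    private int remoteQueries = 0;

    VolumeCache(Function<String, List<String>> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    // Returns the issue IDs for a volume, querying remotely only on a cache miss.
    List<String> issuesFor(String volumeId) {
        List<String> cached = issuesByVolume.get(volumeId);
        if (cached == null) {
            remoteQueries++;
            cached = remoteLookup.apply(volumeId);
            issuesByVolume.put(volumeId, cached);
        }
        return cached;
    }

    // Number of remote queries made so far.
    int remoteQueryCount() {
        return remoteQueries;
    }
}
```

A book without the volume ID field never reaches `issuesFor` at all, which is why it has to be rescraped first.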

There is also this, "Because it will no longer be updating issue information from ComicVine directly it will no longer be able to detect invalid issue IDs. This is a sad loss, but necessary." Such errors will now most likely manifest as falsely reported missing issues. This happens when issues are removed from a volume. If they are re-added, they will get a different issue ID, which will not match in the Missing Issues databases. Again, an unfortunate, but necessary loss to appease the ComicVine gods.

As marcdh1 said, it is a good idea to rebuild every month, or even weekly if you have the time, since it doesn't do it automatically by default anymore. The ComicVine admins asked me to reduce my query load as much as I could, and so I did in good faith. For an average library, I probably cut the query load down tenfold, but it does REQUIRE that custom Volume ID field supplied by the Scraper.

I have no idea why it seems to be random. Has anyone else experienced such behavior?

FYI, the rebuild local cache is actually less load than the original "Update All Volumes". The original rebuild local cache, before the optimizations required by the query cap, queried for every volume AND every issue, which generated massive load on ComicVine. Due to the cap, the issue queries had to be taken out of the mix. A collection of, say, 45000 comics might generate ~50000 queries to ComicVine between issues and volumes, and that would take at least 42 hours to complete with a 3-second delay. This was simply not acceptable.
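The arithmetic behind that figure, as a quick sketch: 50000 queries at 3 seconds each is 150000 seconds of pure waiting, about 41 hours 40 minutes, before any time spent on the requests themselves. `RebuildEstimate` is just an illustrative name.

```java
import java.time.Duration;

// Hypothetical helper for illustration only.
final class RebuildEstimate {
    private RebuildEstimate() {}

    // Lower bound on rebuild time: the enforced delay multiplied by the number of queries.
    static Duration minimumDuration(long queries, long delaySeconds) {
        return Duration.ofSeconds(Math.multiplyExact(queries, delaySeconds));
    }
}
```

`RebuildEstimate.minimumDuration(50000, 3).toMinutes()` gives 2500 minutes, which lines up with the roughly 42 hours quoted above.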
Last Edit: 3 years 1 week ago by Samael69.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41279

  • marcdh1
  • Junior Boarder
  • Posts: 35
  • Thank you received: 1
  • Karma: 1
For me, the non-numeric ½ is the biggest pain. Is it because ½ is a character and not a numerical value?

I do get the missing issues from time to time when rebuilding the cache. That I can typically resolve by re-re-scraping the issue, or by making sure I am selecting the right issue (sometimes multiple volumes with the same issues/covers pop up). I wonder if the entry has been edited in some way in ComicVine, and yet another re-scrape is needed to keep it up to date. A few rescrapes later, things are back to normal till the next forced full run, then another few issues may pop up. I recognize this is not an issue with the F.M.I. tool.

Missing Issues using ComicVine (New Version 25-JUN-2014) 3 years 1 week ago #41280

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
marcdh1 wrote:
For me, the non-numeric ½ is the biggest pain. Is it because ½ is a character and not a numerical value?
When I get a chance, I'll look into this, as I am dealing with these characters (½, ¼, ¾) and a handful of others explicitly as special cases. I thought it was working, but ComicVine seems to keep changing how they deal with these. Sometimes it's the ½ character, sometimes it's 1/2 (3 characters, as in one slash two), sometimes it's 0.5.
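One way to handle the three spellings mentioned above (the single ½ character, the three-character 1/2, and 0.5) is to map them all to one canonical form before comparing issue numbers. The sketch below only illustrates that idea and is not the tool's actual special-case handling. `IssueNumberNormalizer` is a made-up name, and picking the decimal form as the canonical one is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical normalizer for illustration only; not taken from the tool's source.
final class IssueNumberNormalizer {
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("\u00BD", "0.5");  // single-character one half
        ALIASES.put("1/2", "0.5");     // three characters: one slash two
        ALIASES.put("\u00BC", "0.25"); // single-character one quarter
        ALIASES.put("1/4", "0.25");
        ALIASES.put("\u00BE", "0.75"); // single-character three quarters
        ALIASES.put("3/4", "0.75");
    }

    private IssueNumberNormalizer() {}

    // Maps known fraction spellings to a decimal string; anything else is returned trimmed.
    static String normalize(String issueNumber) {
        if (issueNumber == null) {
            return "";
        }
        String trimmed = issueNumber.trim();
        return ALIASES.getOrDefault(trimmed, trimmed);
    }
}
```

Normalizing both the library's issue number and the ComicVine one before matching means a change of spelling on either side no longer produces a false "missing" entry.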

I tend to ignore 1/2 issues as there are just too many of them not scanned and it clutters up the list.
Last Edit: 3 years 1 week ago by Samael69.
The following user(s) said Thank You: marcdh1