Python Scripts for ComicRack

TOPIC: Missing Issues using ComicVine (New Version 06-DEC-2014)

Re: Missing Issues using ComicVine (Now available for download) 6 years 3 months ago #16134

  • haywire
  • Offline
  • Junior Boarder
  • Posts: 28
  • Thank you received: 2
  • Karma: 0
Samael69 wrote:
I think I managed to solve the memory issues. It "should" now be faster and use far less memory, as I am no longer using DOM to load the database; I switched to StAX. I also implemented an initial, albeit not perfect, sorting algorithm (gotta love Java for being inconsistent) to sort the output by series name.

Are these changes available?
I've been loving this tool, as it's been helping me fill in a lot of holes. As my collection is over 30k books now, it takes about 50 minutes to run, so any improvement would be awesome. (And sorting by series would be my most anticipated addition.)
The administrator has disabled public write access.

Re: Missing Issues using ComicVine (Now available for download) 6 years 3 months ago #16145

  • krwren
  • Offline
  • Junior Boarder
  • Posts: 37
  • Thank you received: 1
  • Karma: 1
No problem on the delay. That is why I waited a couple of weeks before asking. Life does have a way of getting in the way of the work you get no money for.

Re: Missing Issues using ComicVine (Now available for download) 6 years 2 months ago #16774

  • haywire
  • Offline
  • Junior Boarder
  • Posts: 28
  • Thank you received: 2
  • Karma: 0
Samael,
For the last couple of days, I've been having an issue. I've posted the top of the stack trace below. Let me know if I can provide any further info for you.




Beginning comparison phase...Please wait a few moments.

java.lang.NullPointerException
    at CVMissing.XMLReader.compareFiles(XMLReader.java:505)
    at CVMissing.XMLReader.<init>(XMLReader.java:453)
    at CVMissing.JFileChooser.jButton3ActionPerformed(JFileChooser.java:335)
    at CVMissing.JFileChooser.access$300(JFileChooser.java:46)
    at CVMissing.JFileChooser$4.actionPerformed(JFileChooser.java:139)

Re: Missing Issues using ComicVine (Now available for download) 6 years 2 months ago #17103

  • haywire
  • Offline
  • Junior Boarder
  • Posts: 28
  • Thank you received: 2
  • Karma: 0
I think I've found a quick fix to the issue in my last post, but I don't have the tools handy to decompile and rebuild at the moment:

500 Element remoteVolume = returnVolume(remoteDoc, aVolume.getAttribute("xml:id"));

503 NodeList localIssueList = aVolume.getElementsByTagName("issue");

505 NodeList remoteIssueList = remoteVolume.getElementsByTagName("issue");

For whatever reason, the returnVolume method is returning null, but it's not being checked before access on line 505. If you add the following before line 505, it shouldn't blow up:
if (remoteVolume == null) continue;

Now, there might actually be a bigger issue at play, but I haven't explored far enough into the code...I'm just trying to get over this hurdle, since this tool is an integral part of my cataloging.
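
For illustration, here is a minimal sketch of how such a guard might sit inside a DOM-based comparison loop. The tool's real `returnVolume` and `compareFiles` bodies are not shown in this thread, so `findVolume`, `compareVolumes`, the `volume` tag name, and the class name are all hypothetical stand-ins; only the `xml:id` attribute, the `issue` tag, and the null check come from the lines quoted above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NullGuardSketch {

    // Parses an XML string into a DOM document (checked exceptions wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical stand-in for returnVolume(): yields null when no volume has the given ID.
    static Element findVolume(Document doc, String id) {
        NodeList volumes = doc.getElementsByTagName("volume");
        for (int i = 0; i < volumes.getLength(); i++) {
            Element volume = (Element) volumes.item(i);
            if (id.equals(volume.getAttribute("xml:id"))) {
                return volume;
            }
        }
        return null;
    }

    // Returns one summary line per local volume that has a remote counterpart.
    static List<String> compareVolumes(Document localDoc, Document remoteDoc) {
        List<String> compared = new ArrayList<>();
        NodeList localVolumes = localDoc.getElementsByTagName("volume");
        for (int i = 0; i < localVolumes.getLength(); i++) {
            Element aVolume = (Element) localVolumes.item(i);
            Element remoteVolume = findVolume(remoteDoc, aVolume.getAttribute("xml:id"));
            if (remoteVolume == null) {
                continue; // the proposed guard: skip volumes missing from the remote cache
            }
            NodeList localIssueList = aVolume.getElementsByTagName("issue");
            NodeList remoteIssueList = remoteVolume.getElementsByTagName("issue");
            // The real issue-by-issue comparison would go here.
            compared.add(aVolume.getAttribute("xml:id") + ": "
                    + localIssueList.getLength() + "/" + remoteIssueList.getLength());
        }
        return compared;
    }
}
```

With the guard in place, a local volume that has no remote entry is silently skipped rather than triggering the NullPointerException, which is why it only hides whatever made the two files disagree.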
The following user(s) said Thank You: forkicks

Re: Missing Issues using ComicVine (Now available for download) 6 years 2 months ago #17110

  • forkicks
  • Offline
  • Platinum Boarder
  • Posts: 871
  • Thank you received: 109
  • Karma: 37
Thanks for looking into it at all. I'm sure Samael will appreciate the pointers.

fK

Re: Missing Issues using ComicVine (Now available for download) 6 years 1 month ago #17487

  • haywire
  • Offline
  • Junior Boarder
  • Posts: 28
  • Thank you received: 2
  • Karma: 0
Samael, any chance you can patch this bug for me? Or even let me at your source and I can take care of it?

I'm anxious, as your tool has become an integral part of my collection maintenance.

Re: Missing Issues using ComicVine (Now available for download) 6 years 1 month ago #17541

  • Samael69
  • Offline
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
I apologize for the delay in replying. I missed the previous posts because I check in irregularly and large volumes of Android-related stuff quickly push this thread off the recent postings page. The summer has been crazy for me and I haven't had ANY time to look at this app. If you're in a hurry, the source for the current release is already posted in the original post. This is most likely a data problem. Your suggested fix should work fine, but it's a band-aid that hides a larger issue...read below. If you do make such a fix, I would suggest adding a text output so the problem issues/volumes can be identified and potentially fixed. Be warned, though: the source is ugly, with lots of stray code from different experiments I've tried, and it's not commented at all.

I'd be interested in seeing the cache records for the book/volume it fails on, since it makes it through my 22.5k books just fine. I wonder if removing an entire volume is what causes it to blow up, but I haven't had issues with this in the past. Currently it doesn't parse both ways: if you remove a book/volume from your collection, it is not automatically removed from the cache, and you have to rebuild the cache to get rid of it. I may add this as an option in the future, but that would add a huge amount of overhead. More likely, I would add it as a separate function, such as "Clean Cache". This is a known issue that, to me, is much more minor than crashing or memory issues.
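
As a rough illustration of what a separate "Clean Cache" pass could look like, the sketch below drops cache entries whose volume is no longer in the library. No such function exists in the released tool; the `volume` tag name, the `cleanCache` method, and the idea of passing the library's volume IDs in as a set are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CleanCacheSketch {

    // Parses an XML string into a DOM document (checked exceptions wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Removes every cached volume whose ID is not in the library; returns how many were removed.
    static int cleanCache(Document cache, Set<String> libraryVolumeIds) {
        NodeList volumes = cache.getElementsByTagName("volume");
        int removed = 0;
        // The NodeList is live, so walk it backwards while removing nodes.
        for (int i = volumes.getLength() - 1; i >= 0; i--) {
            Element volume = (Element) volumes.item(i);
            if (!libraryVolumeIds.contains(volume.getAttribute("xml:id"))) {
                volume.getParentNode().removeChild(volume);
                removed++;
            }
        }
        return removed;
    }
}
```

Running this as its own step, rather than on every scan, keeps the extra pass over the library out of the normal comparison run.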

If you rebuild the cache does the error go away?

That being said, I think there's something else in play here. There's no error checking at that point because there shouldn't need to be if the cache files are built properly. Basically, it sounds like you have a volume entry in your local cache with no corresponding entry in the remote cache, or vice versa...which, in theory, should never happen...so it was never accounted for. Out of curiosity, have you edited the cache file(s) manually to remove entries? Another thing I'll look at is the behaviour when a book has an out-of-date web page tag, which is the single piece of data the entire process is based on. I don't remember if it will still try to add the book to the cache. It shouldn't, since at that point it would have no volume ID due to a failed query, but I don't remember. Another potential issue could arise if the volume ID for a given issue changes but the issue ID does not; I'm unsure whether this can happen on CV. If it can, the local and remote cache files could get out of sync, which could cause this error. These issues, with the exception of the out-of-date web tag, should be resolved by rebuilding the cache...which I tend to do once a month or so to clean up strays and other bad data. I will try to look into the exception this evening. I'll have to download the code from here so I can look at this particular release, though. B)
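
One way to check for the out-of-sync condition described above is to collect the volume IDs from each cache file and diff the two sets. This is only a sketch: the `volume` tag name, the helper names, and the assumption that both cache files key volumes by `xml:id` are illustrative, not taken from the tool's source.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CacheSyncSketch {

    // Parses an XML string into a DOM document (checked exceptions wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects the xml:id of every volume element in the document, sorted.
    static Set<String> volumeIds(Document doc) {
        Set<String> ids = new TreeSet<>();
        NodeList volumes = doc.getElementsByTagName("volume");
        for (int i = 0; i < volumes.getLength(); i++) {
            ids.add(((Element) volumes.item(i)).getAttribute("xml:id"));
        }
        return ids;
    }

    // Returns the volume IDs present in 'source' that have no entry in 'other'.
    static Set<String> missingFrom(Document source, Document other) {
        Set<String> diff = volumeIds(source);
        diff.removeAll(volumeIds(other));
        return diff;
    }
}
```

Calling `missingFrom(local, remote)` and then `missingFrom(remote, local)` covers both directions, and printing the resulting IDs would give the kind of text output that helps track down the offending volumes.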

My development version, that I was working on before the summer, has been somewhat modified.
  • It will introduce "some" sorting
    • Position, which is the current default sorting. Basically, volumes come out on the list in the same order they were added to the DB.
    • Alphabetic, for sure (a little flaky though...still looking at this).
    • Maybe by modified date, so volumes with the newest entries show at the top/bottom. This could be set at the time new issues are added to a volume in the remote cache.
  • Should increase parsing speed, especially when not hitting CV.
  • Will substantially reduce memory requirements. In my testing, memory usage was well under 100M for about 18,000 files, where it tended to be vastly higher before, in the 600M range. It now uses StAX rather than DOM to load the DB, and that's where most of the memory crunch was. Memory usage will also not grow drastically as the library grows. A library of 50k+ books will likely still use a couple hundred meg of RAM, but that is probably using well over 1.5 gig currently. The rule still is: the more books, the more memory used, since it still uses DOM for loading and comparing the cache files, but they're much smaller than the DB and DOM is much faster in this case. Overall, memory usage will probably be reduced by a factor of 10...give or take.
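
To show why the switch from DOM to StAX saves memory, here is a small sketch of a streaming read that tallies books per series without ever building a tree of the whole database. The `Series` element name is an assumption about the ComicRack database layout, and the class and method names are made up for the example; the development version's actual reader is not shown in this thread.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxSketch {

    // Streams through the XML once and counts how many books belong to each series.
    static Map<String, Integer> countBooksPerSeries(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // Only the current event is held in memory, never the whole document.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Series".equals(reader.getLocalName())) {
                    counts.merge(reader.getElementText().trim(), 1, Integer::sum);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return counts;
    }
}
```

For a real database file the reader would be created from a file stream instead of a string. The working set is then the per-series map rather than one DOM node per element, which fits the observation that memory no longer grows drastically with library size.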
Last Edit: 6 years 1 month ago by Samael69.

Re: Missing Issues using ComicVine (Now available for download) 6 years 1 month ago #17564

  • Samael69
  • Offline
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
Sorting logic is complete as per above. I just need to add some checkboxes to allow user selection. Haywire, any word on whether rebuilding the cache fixed your issues? It looks very much like your cache files got out of sync, though I'm not entirely sure how that can happen. In any case, I'll put the logic in to skip such entries and perhaps display them in the output...depending on the circumstances.
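
A user-selectable sort like the one described could be wired up roughly as follows. This is a sketch only: the `Volume` class, its fields, and the `SortMode` names are hypothetical, and the case-insensitive comparator is one possible way to avoid the "flaky" alphabetic ordering that plain `String` comparison gives when names differ in case.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class VolumeSortSketch {

    enum SortMode { POSITION, ALPHABETIC, NEWEST_FIRST }

    // Hypothetical minimal volume record: a name plus a last-modified timestamp.
    static final class Volume {
        final String name;
        final long lastModified; // epoch milliseconds

        Volume(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    // Returns a sorted copy; POSITION keeps the order in which volumes were added.
    static List<Volume> sorted(List<Volume> volumes, SortMode mode) {
        List<Volume> copy = new ArrayList<>(volumes);
        switch (mode) {
            case ALPHABETIC:
                copy.sort(Comparator.comparing((Volume v) -> v.name,
                        String.CASE_INSENSITIVE_ORDER));
                break;
            case NEWEST_FIRST:
                copy.sort(Comparator.comparingLong((Volume v) -> v.lastModified).reversed());
                break;
            case POSITION:
            default:
                break; // insertion order is already the position order
        }
        return copy;
    }
}
```

Each checkbox or radio button in the UI would then map to one `SortMode` value passed to `sorted`.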

Re: Missing Issues using ComicVine (Now available for download) 6 years 1 month ago #17565

  • Samael69
  • Offline
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
These are still pending and may not make it into this release.
perezmu wrote:
Samael69 wrote:
  • Scan only recently updated files rather than the entire library. This one, I need some input. I plan to use the "FileModifiedTime" field.
I have not yet had the chance to test the script... I am still in the process of finally organizing my 6TB+ of comics (which is why I created the duplicates script!)... but what about using "Date Added", which holds the date when the book was added to the library?

The new architecture using StAX over DOM may eliminate the need for this. It scans my entire 22.5k library in about 5 minutes. Obviously, that's the rescan after the ComicVine details have been initially loaded. The initial load will still likely take several hours for a large library and WILL have to be done for the new version. "Date Added" to the DB, however, might make a nice additional sorting option, i.e. sort the volumes by the date the most recent book was added to the DB (not necessarily the newest book, since you could conceivably add #3 after #5). My current date sort puts the volumes with the newest ComicVine entries at the top/bottom, but this data is lost each time the cache is fully refreshed, since I generate the date stamp myself. I'll change this to use the ComicVine "date_added" field instead at some point, but that's a little more challenging and might wait.

krwren wrote:
I would like to make one request: could you create a complete series file that shows all the series that do not have any missing issues?

I assume you would only need the volume information and not the issue info?
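
The "Date Added" sort idea mentioned earlier in this post (order volumes by when their most recent book was added to the DB) can be sketched as below. The `Book` class, its fields, and the method name are hypothetical; the sketch only illustrates taking the latest added date per volume and sorting on it, newest first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DateAddedSortSketch {

    // Hypothetical minimal book record: the owning volume and when the book was added.
    static final class Book {
        final String volumeId;
        final long addedMillis; // epoch milliseconds

        Book(String volumeId, long addedMillis) {
            this.volumeId = volumeId;
            this.addedMillis = addedMillis;
        }
    }

    // Returns volume IDs ordered by the date their most recently added book arrived.
    static List<String> volumesByLatestAddition(List<Book> books) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (Book book : books) {
            // Keep the largest (most recent) timestamp seen for each volume.
            latest.merge(book.volumeId, book.addedMillis, Long::max);
        }
        List<String> ids = new ArrayList<>(latest.keySet());
        ids.sort((x, y) -> Long.compare(latest.get(y), latest.get(x)));
        return ids;
    }
}
```

Because the key is the date a book entered the library rather than its issue number, adding #3 after #5 still moves that volume to the top.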
Last Edit: 6 years 1 month ago by Samael69.

Re: Missing Issues using ComicVine (Now available for download) 6 years 1 month ago #17571

  • haywire
  • Offline
  • Junior Boarder
  • Posts: 28
  • Thank you received: 2
  • Karma: 0
Samael,
I had tried rebuilding the cache. Unfortunately, it didn't help...the issue remains.

I can send you my XML if you want to try to play with it. Let me know.

Otherwise, I'll try to work through your source and see what I can do. (I'm a .NET developer, so Java shouldn't be a stretch.)