Python Scripts for ComicRack

TOPIC: Idea for a new script

Idea for a new script 7 years 7 months ago #5714

So, while thinking about the current scraper, I had the notion that someone should do a script to update the data in the CVDB based on pre-existing metadata in CR. It was pointed out to me that the CV API doesn't currently have a way to submit changes, but that doesn't mean we couldn't write a script that outputs the differences (if any) between our issues and CVDB's.

I looked at the scraper scripts and it looks like there may be a way to do it since it's mostly compartmentalized now, but the code is a little too dense for my skills. Cory, do you think it's possible you could point out what I would need to do this? My basic idea is this (others, feel free to comment/add to this).

Assumptions: a Tag for the CVDB ID exists in the book

-Open a save dialogue to output the log of what's not up to date as a txt file
-Connect to CVDB API
FOR EACH BOOK
-Go through each of the following fields and see if they differ from CVDB. If so, write existing value and CVDB value to log:
__-Title
__-Year
__-Month
__-Of (not sure if CVDB has this field off the top of my head)
-For each of these fields, split on , and ; and for each element, see if it's in the creator list and in the appropriate role (maybe flag if listed roles differ):
__-Writer
__-Penciller
__-Inker
__-Letterer
__-Colorist
__-Cover Artist
__-Editor
-For each of these fields, split on , and ; and for each element, see if it's in the appropriate field:
__-Genre
__-Characters
NEXT BOOK
-Close file and pop up a "Done" message

This can help us identify what needs to be changed in CVDB. What do you think, sirs?
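The outline above can be sketched roughly as follows. This is only an illustration, not code from the scraper: it assumes each book and each CVDB record has already been loaded into a plain dict keyed by field name, and the names `diff_book`, `split_names`, `SCALAR_FIELDS` and `LIST_FIELDS` are made up for the example. The "Of" field is left out because it is unclear whether CVDB has it, and roles are compared only field by field, without the cross-role flagging suggested above.

```python
import re

# Hypothetical field names; adjust to whatever the real data uses.
SCALAR_FIELDS = ["Title", "Year", "Month"]
LIST_FIELDS = ["Writer", "Penciller", "Inker", "Letterer", "Colorist",
               "CoverArtist", "Editor", "Genre", "Characters"]

def split_names(value):
    """Split a field on ',' and ';' and drop empty pieces."""
    return [part.strip() for part in re.split(r"[,;]", value or "") if part.strip()]

def diff_book(local, remote):
    """Return log lines describing where the local book differs from CVDB."""
    lines = []
    for field in SCALAR_FIELDS:
        ours = str(local.get(field, "")).strip()
        theirs = str(remote.get(field, "")).strip()
        if ours != theirs:
            lines.append("%s: local=%r cvdb=%r" % (field, ours, theirs))
    for field in LIST_FIELDS:
        # Case-insensitive membership test for each local element.
        theirs = set(name.lower() for name in split_names(remote.get(field, "")))
        for name in split_names(local.get(field, "")):
            if name.lower() not in theirs:
                lines.append("%s: %r missing from cvdb" % (field, name))
    return lines
```

Writing the returned lines to the chosen text file, one block per book, would give the log described in the outline.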

Re:Idea for a new script 7 years 7 months ago #5724

  • Stonepaw
I like this idea!

Just glancing through the source, it looks like it would be fairly easy to modify the existing script to accomplish this.
-Open a save dialogue to output the log of what's not up to date as a txt file
You would probably want to add this right after the part:
if books:
in the scraper main method
-Connect to CVDB API
FOR EACH BOOK
-Go through each of the following fields and see if they differ from CVDB. If so, write existing value and CVDB value to log:
__-Title
__-Year
__-Month
__-Of (not sure if CVDB has this field off the top of my head)
-For each of these fields, split on , and ; and for each element, see if it's in the creator list and in the appropriate role (maybe flag if listed roles differ):
__-Writer
__-Penciller
__-Inker
__-Letterer
__-Colorist
__-Cover Artist
__-Editor
-For each of these fields, split on , and ; and for each element, see if it's in the appropriate field:
__-Genre
__-Characters
You would want to put this in the main for loop in the scrape method. Basically, change the part that adds the found data to the comics into a comparison against the existing comic data, then write the differences to a text file.

You could probably remove the part where it searches for a book when there is no CVDB tag since there would be no point in comparing the data.
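As a rough sketch of that last point, books without a CVDB tag could be set aside before any comparison starts. The code below is an assumption-laden illustration, not part of the scraper: it assumes each book is a dict whose "Tags" value is a comma-separated string and that the tag has the form CVDB followed by digits (for example CVDB191277). The names `has_cvdb_tag` and `split_books` are hypothetical.

```python
def has_cvdb_tag(tags):
    """True if a comma-separated Tags value contains an entry like 'CVDB191277'."""
    for tag in (tags or "").split(","):
        tag = tag.strip()
        if tag.startswith("CVDB") and tag[4:].isdigit():
            return True
    return False

def split_books(books):
    """Separate books that can be compared from those that must be skipped."""
    comparable = [b for b in books if has_cvdb_tag(b.get("Tags"))]
    skipped = [b for b in books if not has_cvdb_tag(b.get("Tags"))]
    return comparable, skipped
```

The skipped list could also be written to the log so it is clear which books were never compared.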

The config window should be changed to select which fields should be compared.

I can't think of anything else at the moment, but that's my 2¢.

Re:Idea for a new script 7 years 7 months ago #5913

  • Shinrai
Yes, please. There are tons of holes and I'd love to fill them! Make it so!

Re:Idea for a new script 7 years 7 months ago #5932

+1 This sounds like a great idea!

Re:Idea for a new script 7 years 7 months ago #5956

  • cbanack
Hi, DBT, sorry for the late reply on your question. I'm pretty focused on the Comic Vine Scraper thread, so I often don't read the other threads in this forum. The best way to make sure I don't miss something you want me to see is to send a PM!

I think the simplest way for you to accomplish what you're trying to do is to collate the debug output from Comic Vine Scraper, since it already does the diff that you're looking for.

When you scrape your comics, there's a way to get the python output in a little console window on the side...I can't remember where the link is, but I'm sure someone around here does.

Anyway, if you scrape one or more comics while that console window is open, you'll see debug output that looks like this for each book:
scraping next eComic book: Mass Effect: Redemption (V2010), #1
no CVDB tag found in book, beginning search...
querying comicvine for all series that match: 'mass effect redemption'
comicvine provided 1 results for the search
... chose series 30921 ('Mass Effect: Redemption')
searching for issue in series: 30921 ('Mass Effect: Redemption')
querying comicvine for all available issues...
...got 0 issue IDs from the comicvine database, 3 from the local cache
trying to find issue ID for issue number 1
...chose issue ID 191277 for this book
querying comicvine for issue details...
setting values for this comic book ('*' = changed):

--> Series : Mass Effect: Redemption
--> Issue Number : 1
--> Title : Issue #1
--> Alt/Arc :
--> Summary : Commander Shepard's companion, Dr. Liara T'Soni, undertakes...
--> Year : 2010
--> Month : 1
--> Volume : 2010
--> Publisher : Dark Horse
--> Imprint :
--> Characters : Miranda Lawson, Feron, Dr. Liara T'Soni
--> Writers : Mac Walters, John Jackson Miller
--> Pencillers : Omar Francia
--> *Inkers : Omar Francia
--> Colorists : Michael Atiyeh
--> Letterers : Michael Heisler
--> CoverArtists : Daryl Mandryk
--> Editors : Brendan Wright, Dave Marshall
--> *Tags : CVDB191277
--> *Notes : Scraped metadata [CVDB191277] on 2010.03.04 22:47:49.

Scrape completed normally

As you can see, the last part of the output lists every field in your comic book, preceded by a "-->" if that field WAS NOT changed by the scrape, and a "--> *" if that field was changed.

So you could make a copy of the books you're interested in finding the diff for (so as not to damage the original files), then scrape them with all update settings turned on (including "replace with blank values"), and then copy and paste the output from that console window into a text file. You could even write a little script to go through that text file and throw away all of the lines that don't start with "-->", if you want to clean it up a bit.
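A minimal sketch of such a cleanup script might look like this. It assumes the console output has been pasted into a string (or read from a text file) and that every field occupies a single line in the "--> Name : value" layout shown above. The function names `keep_field_lines` and `changed_fields` are made up for the example.

```python
def keep_field_lines(text):
    """Return only the lines of scraper console output that start with '-->'."""
    return [line.strip() for line in text.splitlines()
            if line.strip().startswith("-->")]

def changed_fields(text):
    """Return (field, value) pairs for lines marked with '*', i.e. changed by the scrape."""
    changed = []
    for line in keep_field_lines(text):
        body = line[len("-->"):].strip()
        if body.startswith("*"):
            # Split on the first ':' only, so values containing ':' stay intact.
            name, _, value = body[1:].partition(":")
            changed.append((name.strip(), value.strip()))
    return changed
```

To use it on a saved log, read the file into a string with `open(path).read()` and pass that string to either function.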

And then, voila! You have a list of your changes compared to what's in comic vine. The "-->" lines should contain everything you need to know to efficiently move your changes into comic vine. And this is certainly going to be far easier than wading your way into the scraper source code...

Also, the CVDB tag contains a number, call it XXXXX. To go directly to the page where you can make your updates, use the following URL, replacing XXXXX with the number from the CVDB tag:
http://comicvine.com/issue/37-XXXXX
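If you end up with many tags, the URL can be built in a couple of lines. This is just a sketch: the helper name `issue_url` is hypothetical, and it assumes the tag text looks like the CVDB191277 example from the debug output above.

```python
import re

def issue_url(cvdb_tag):
    """Build the Comic Vine issue URL from a tag such as 'CVDB191277'."""
    match = re.search(r"CVDB(\d+)", cvdb_tag)
    if match is None:
        return None  # no CVDB number found in the text
    return "http://comicvine.com/issue/37-%s" % match.group(1)
```

Because it searches rather than matches the whole string, the same helper also works on the Notes line, which carries the tag in square brackets.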
Last Edit: 7 years 7 months ago by cbanack.

Re:Idea for a new script 7 years 7 months ago #5970

Exactly the advice I was looking for! No worries on a late reply! I hope to have something by the end of the weekend