TOPIC: Comic Vine Scraper 1.0.48-52

Re: Comic Vine Scraper 1.0.48-52 4 years 11 months ago #27475

  • Casublett
  • Gold Boarder
  • Posts: 168
  • Thank you received: 19
  • Karma: 3
LOVE this script, keep up the amazing work... I try to help by adding as much as I can to the CV DB (I've added TONS, and I keep doing it). :)

Anyway, couple comments, feature requests and other tidbits, not sure if this is the place, but here goes...

REQUEST: A feature to auto remap main publisher names to something else? Example... CV has "EC Comics" as "Ec" and having a feature to auto remap from "Ec" to "EC Comics" would be amazing. Maybe an editable file just like the imprints.py one? Anyway, I've talked to various CV mods and changing publishers is no small feat with how the site works. There are MANY other examples of publisher errors at CV but that one works as a valid example.
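To make the request concrete, here is a minimal sketch of what such an editable remap file might contain. The names `PUBLISHER_REMAP` and `remap_publisher` are hypothetical; nothing like this exists in the scraper, and the real imprints.py may be laid out differently.

```python
# Hypothetical remap table: Comic Vine's publisher name -> the name you prefer.
# Neither this name nor this format comes from the scraper itself.
PUBLISHER_REMAP = {
    "Ec": "EC Comics",
}


def remap_publisher(name, table=PUBLISHER_REMAP):
    """Return the preferred publisher name, or the original if no remap is defined."""
    return table.get(name.strip(), name)


print(remap_publisher("Ec"))
print(remap_publisher("Marvel"))
```

Publishers that are not in the table pass through unchanged, so the file would only need to list the names that are wrong.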

REQUEST: Once a volume is selected, no information is displayed about that volume, just the issues contained within. Sometimes with titles that have TONS of volumes (Punisher for example), I find myself needing to back out and reselect to verify I'm on the proper volume. Maybe a header displaying basic volume information once INSIDE the selected volume? So, instead of just the descending list of issue #'s and issue titles, a constant top header showing the volume year and volume title?


A few comments for the current imprints.py file.

Star Comics, Curtis Magazines & Marvel Soleil are missing as Marvel imprints. MAYBE Atlas and Timely should be added as Marvel Imprints. I personally have them this way, but there might be a reason why this wasn't done.

Soleil is improperly mapped as a Marvel imprint instead of its own main publisher. Marvel Soleil and Soleil are different entities.

Maverick and Dark Horse Books are missing as Dark Horse Comics imprints.

Marvel bought Malibu, so shouldn't all Malibu and its imprints be set to Marvel?

There are others, but after upgrading to 1.52 I forgot to back up my custom imprints.py file. I'd be happy to suggest the ones I forgot as I rediscover them again. But only if it's helpful, I don't wanna be a pest about this. :)


Once again, thx for this amazing script!
The topic has been locked.

Re: Comic Vine Scraper 1.0.48-52 4 years 11 months ago #27486

  • cbanack
  • Platinum Boarder
  • Posts: 1327
  • Thank you received: 508
  • Karma: 182
Hi Casublett, thanks for the kind words and suggestions, see my specific response below!
Casublett wrote:
REQUEST: A feature to auto remap main publisher names to something else?
I get a lot of requests to give the scraper the ability to edit or modify ComicVine data after it's been scraped. My usual response is something like "Sorry, no, the scraper's purpose is to copy the data as it exists on Comic Vine. ComicRack has many excellent bulk editing/filtering tools that make it easy to change specific fields after scraping, so just use those!" :)

However, your specific request has come up more than once, so I've created a feature request in my issue tracker to remind me to look at this again as I'm building the next release.
REQUEST: Once a volume is selected, no information is displayed about that volume, just the issues contained within. Sometimes with titles that have TONS of volumes (Punisher for example), I find myself needing to back out and reselect to verify I'm on the proper volume. Maybe a header displaying basic volume information once INSIDE the selected volume?
Actually, the title of the series IS there in the title bar of the dialog. But for some reason most people don't see the text in that bar--it's like it's invisible! So it's probably not a bad idea to put that information (and maybe volume and publisher, too) inside the dialog itself. I've added that to my issue tracker, too.
A few comments for the current imprints.py file.
...
I'd be happy to suggest the ones I forgot as I rediscover them again. But only if it's helpful, I don't wanna be a pest about this. :)
Oh yeah, this sort of information is very helpful. In fact, I rely on people like you pointing out missing/incorrect imprints, so fire away! Better yet, add them to this issue in the tracker.

One rule, though: I'll only add/change/remove imprints that are listed in Comic Vine's publisher page for that imprint (for example, Maverick is listed on Comic Vine as an imprint of Dark Horse, so I'll definitely add that one to the scraper's list). This way, I'm not letting any one person totally dictate imprints for everyone, and I'm sticking with my goal of scraping only what Comic Vine says.

Of course, this doesn't mean you can't add imprint information to Comic Vine--it's nice for me, because if you make a mistake or a controversial choice and I don't notice, the Comic Vine editors will likely catch it...
Last Edit: 4 years 11 months ago by cbanack.
The following user(s) said Thank You: 600WPMPO

Re: Comic Vine Scraper 1.0.48-52 4 years 11 months ago #27490

  • Casublett
  • Gold Boarder
  • Posts: 168
  • Thank you received: 19
  • Karma: 3
Ok, great and thx!

I've actually added lots of publishers, volumes and titles to CV (same user name on both sites) and will continue to do so. I'm not sure I've added the actual words "Marvel Imprint" to many, though, so I'll make sure to do that from now on.

I'll be sure to add imprint info to the issue you created in the tracker. I'm really retentive about CV info being EXACTLY right per the indicia, so I find stuff all the time. I'll pass along what is relevant. :)

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28480

  • misakitchi
  • Senior Boarder
  • Posts: 43
  • Karma: -1
Thanks for this scraper! :)

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28521

  • santiagodraco
  • Junior Boarder
  • Posts: 24
  • Thank you received: 1
  • Karma: 0
If this has been asked and answered before, my apologies. I didn't find anything that seemed to be an exact match.

My question is this. If a comic has more than one volume and CV chooses the wrong volume for the comic, what is the best way to correct the comics in CR (or metadata) so that CV can rescan them correctly?

For example: Green Lantern Corps, 2006 vs 2011 (I think those dates are right!). For some reason CVS tagged them all as 2006 even though I had selected one as the 2011 volume. None of the files with 2011/2012 were flagged properly. I later went in and changed the volume to read 2011 in CR and figured I could run a new scan and it would be correct, but nope, it was still pulling the 2006 data for those comics. What's the best strategy to fix this issue if it occurs?

Secondly, a suggestion. If a comic appears to be an exact match for two volumes (i.e. same name and volume number but different dates detected in the file name or some such), is there an option available, or could one be added, to force the user to choose the correct volume for each comic during the scan? Not as a global option (meaning I don't want it to affect unrelated comics I might have in the same scan).

Thanks :)

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28523

  • Madmatx
  • Platinum Boarder
  • Posts: 457
  • Thank you received: 63
  • Karma: 19
santiagodraco wrote:
My question is this. If a comic has more than one volume and CV chooses the wrong volume for the comic, what is the best way to correct the comics in CR (or metadata) so that CV can rescan them correctly?

Go to configure and select the behavior tab, then uncheck the option that says "Use previous choice when 'rescraping' comics"

You might have to hit the search again or go back buttons, also.

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28524

  • cbanack
  • Platinum Boarder
  • Posts: 1327
  • Thank you received: 508
  • Karma: 182
@misakitchi: You're welcome! :cheer:

@santiagodraco:
My question is this. If a comic has more than one volume and CV chooses the wrong volume for the comic, what is the best way to correct the comics in CR (or metadata) so that CV can rescan them correctly?
If you've got some comics that got scraped incorrectly, there are a few ways you can fix them:

1) Select the bad comics in ComicRack, right-click on them, and choose "Clear Data", then scrape them again.

-or-

2) Manually go into the details for the bad comics, and remove the CVDBXXXXX value from the Notes and the Tags fields, then scrape them again.

-or-

3) Go into the Comic Vine Scraper settings dialog and turn off the option called "Use previous choice when 'rescraping' comics". Then scrape the bad comics again.
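
For anyone who would rather script option 2 than edit fields by hand, here is a minimal sketch of the text cleanup involved. It assumes the tag looks like "CVDB" followed by digits, optionally wrapped in square brackets, and that the Tags field is a comma-separated list; both are assumptions, and the function names are hypothetical. It only works on plain strings and does not touch ComicRack itself.

```python
import re

# Assumed tag shape: "CVDB" followed by digits, optionally in square brackets.
CVDB_TAG = re.compile(r"\[?CVDB\d+\]?")


def strip_cvdb_from_notes(notes):
    """Remove any CVDB tag from free-form Notes text and tidy leftover spaces."""
    cleaned = CVDB_TAG.sub("", notes)
    return re.sub(r" {2,}", " ", cleaned).strip()


def strip_cvdb_from_tags(tags):
    """Drop CVDB entries from a comma-separated Tags string, keeping the rest."""
    entries = [t.strip() for t in tags.split(",")]
    kept = [t for t in entries if t and not CVDB_TAG.fullmatch(t)]
    return ", ".join(kept)


print(strip_cvdb_from_tags("favorites, CVDB123456, read"))
```

Once the tag is gone from both fields, the scraper has no previous choice to fall back on and should ask for the series again.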


There is another option in the settings dialog, called "When several comics appear to be from the same series, only confirm the series for the first one." This can cause problems when you scrape comics that appear (based on their filenames) to be in the same series, but they actually are not, like in your Green Lantern Corps example. Comic Vine Scraper sees them all as a group, and when you choose a series for the first one, it gets applied to all of them.

You can turn that option off, in which case the scraper will ask you to choose the correct series for each and every comic book. That will *really* slow you down if you are scraping a lot of comics from the same series. A better solution is to glance through your comics before you scrape them, and if you see two series that have exactly the same name, just run the scraper on each series separately, that way you can pick the right one each time.
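
If glancing through a big batch by eye is tedious, the check can be roughed out in a few lines. This is only a sketch and not how the scraper groups comics: it assumes filenames shaped like "Series Name 001 (2011).cbz", and it flags a series name when the same issue number shows up with more than one year, which suggests two volumes sharing a name. `ambiguous_series` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed filename shape: "Series Name 001 (2011).cbz"; real collections vary.
FILENAME = re.compile(r"^(?P<series>.+?)\s+#?(?P<issue>\d+)\s*\((?P<year>\d{4})\)")


def ambiguous_series(filenames):
    """Return series names where one issue number appears with different years."""
    years_by_issue = defaultdict(set)
    for name in filenames:
        match = FILENAME.match(name)
        if match:  # filenames that don't fit the assumed shape are skipped
            key = (match.group("series"), int(match.group("issue")))
            years_by_issue[key].add(int(match.group("year")))
    flagged = {series for (series, _), years in years_by_issue.items() if len(years) > 1}
    return sorted(flagged)


files = [
    "Green Lantern Corps 001 (2006).cbz",
    "Green Lantern Corps 002 (2006).cbz",
    "Green Lantern Corps 001 (2011).cbz",
    "Punisher 001 (2004).cbz",
]
print(ambiguous_series(files))
```

Any series it flags is a candidate for scraping in separate runs, one per volume.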

Once you've got them scraped correctly once, you won't need to worry about it again because the scraper will always 'remember' which series each comic belongs to (because of the CVDBXXXXX tags that it adds to each comic).
Last Edit: 4 years 10 months ago by cbanack.

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28526

  • 600WPMPO
  • Moderator
  • Posts: 3788
  • Thank you received: 557
  • Karma: 232
Madmatx wrote:
santiagodraco wrote:
My question is this. If a comic has more than one volume and CV chooses the wrong volume for the comic, what is the best way to correct the comics in CR (or metadata) so that CV can rescan them correctly?

Go to configure and select the behavior tab, then uncheck the option that says "Use previous choice when 'rescraping' comics"

You might have to hit the search again or go back buttons, also.
...or right-click the comic & hit 'Clear Data'.

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28531

  • santiagodraco
  • Junior Boarder
  • Posts: 24
  • Thank you received: 1
  • Karma: 0
Thanks for the tips guys :)

Re: Comic Vine Scraper 1.0.48-52 4 years 10 months ago #28718

  • oraclexview
  • Moderator
  • aka SoundWave
  • Posts: 906
  • Thank you received: 182
  • Karma: 37
@cbanack : Again, I just want to say that this is one solid piece of code! Thanks again for all the work you've put into it over the years and the things you continue to add going forward.

There are a couple things I'm curious about. The first thing is in the Index of Scripts (Updated July 15, 2012) thread, the ComicVine Scraper "Forum Topic" link points to the old original Comic Vine Scraper 1.0.39-43 thread instead of the current Comic Vine Scraper 1.0.48-52 thread. Is this intentional or should I go ahead and update this to point to the new current thread?

The other thing is that I was wondering how you get the CVDB tags to work, as far as looking up the corresponding webpage from just the last numerical characters in the CVDB tag ID string. Here are two URL examples showing what confused me about how your script looks up the pages: The section of the URLs after the main site name "www.comicvine.com" and before the issue ID "37-xxxxx" left me in the dark. Then, I got the idea to try this: To my surprise, this standard universal URL string worked. So, is this the string format that your script uses in order to re-scrape books that contain your CVDB tags? B)
Last Edit: 4 years 10 months ago by oraclexview.