TOPIC: Comic Vine Scraper

Comic Vine Scraper 1 month 5 days ago #48847

  • RevQuixo
  • Gold Boarder
  • Posts: 282
  • Thank you received: 27
  • Karma: 12
Xelloss wrote:
It is funny, because I scrape most of my comics by comicvine volume id and only new number 1s by search, so most of the problems and temporary bugs people usually talk about in this topic don't affect me at all...

You should use my Autocomplete script for scraping comics (especially new issues of already scraped volumes); it will make it a lot easier and avoid most comicvine server problems... Scraping by comicvine volume id is a lot quicker and less problematic than doing a search for EVERY comic you scrape...

(For those of you who want to autopopulate the comicvine volume id without using all the other autopopulate features of my script, I am planning to release a "light" version that only autopopulates this field and not the rest.)

It's sad that Comicvine messing up is what it took for me to find this script... it's great... so much quicker to scrape using it. Thanks!
The administrator has disabled public write access.
The following user(s) said Thank You: Xelloss

Comic Vine Scraper 1 month 5 days ago #48848

  • solidus0079
  • Senior Boarder
  • Posts: 78
  • Thank you received: 3
  • Karma: 1
Holy crap what are they doing over there? X_X

Comic Vine Scraper 1 month 4 days ago #48854

  • Xelloss
  • Platinum Boarder
  • Posts: 574
  • Thank you received: 139
  • Karma: 29
RevQuixo wrote:
Xelloss wrote:
It is funny, because I scrape most of my comics by comicvine volume id and only new number 1s by search, so most of the problems and temporary bugs people usually talk about in this topic don't affect me at all...

You should use my Autocomplete script for scraping comics (especially new issues of already scraped volumes); it will make it a lot easier and avoid most comicvine server problems... Scraping by comicvine volume id is a lot quicker and less problematic than doing a search for EVERY comic you scrape...

(For those of you who want to autopopulate the comicvine volume id without using all the other autopopulate features of my script, I am planning to release a "light" version that only autopopulates this field and not the rest.)

It's sad that Comicvine messing up is what it took for me to find this script... it's great... so much quicker to scrape using it. Thanks!

Remember to check what this script does... I am constantly improving it, but it is still not 100% reliable in finding the correct comic to copy the data from...

Comic Vine Scraper 1 month 4 days ago #48855

  • Xelloss
  • Platinum Boarder
  • Posts: 574
  • Thank you received: 139
  • Karma: 29
krandor wrote:
krandor wrote:
Xelloss wrote:
It is funny, because I scrape most of my comics by comicvine volume id and only new number 1s by search, so most of the problems and temporary bugs people usually talk about in this topic don't affect me at all...

You should use my Autocomplete script for scraping comics (especially new issues of already scraped volumes); it will make it a lot easier and avoid most comicvine server problems... Scraping by comicvine volume id is a lot quicker and less problematic than doing a search for EVERY comic you scrape...

(For those of you who want to autopopulate the comicvine volume id without using all the other autopopulate features of my script, I am planning to release a "light" version that only autopopulates this field and not the rest.)

Can you post a link to said autocomplete script?

Never mind. I found it. In fact I already had it loaded since I run that before I run duplicate manager so read status and so forth gets moved over if I get a better quality copy. Had no clue it would help prior to scraping as well.

Thanks to jkthemac for that... it was his idea; my original idea was only for scraped comics... and it is funny that the way I use it most of the time now is not what I made this script for XD

Note: The autocomplete method for unscraped comics is not 100% reliable (the usual problem of not working with ids), so keep that in mind when you use it... I tried to make it fail to recognise a comic rather than recognise it wrongly... but it still produces false positives now and then... (The more flexible I make the rules for finding "same volumes", the more comics it recognises, but the more false positives it produces... the less flexible, the more comics it fails to recognise :P. So I have to strike a balance somewhere.)
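
To make the trade-off above concrete, here is a minimal sketch. It is not the Autocomplete script's actual matching logic, which is not shown in this thread; the function names, the use of `difflib`, and the threshold values are all assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())

def similarity(a, b):
    """Similarity score between 0.0 and 1.0 for two series names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def find_volume(series, known_volumes, threshold):
    """Return the best-matching known series name, or None if no
    candidate reaches the threshold (hypothetical helper)."""
    best_name, best_score = None, 0.0
    for name in known_volumes:
        score = similarity(series, name)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# A loose threshold recognises more comics but accepts a wrong volume.
print(find_volume("Batgirl", ["Batman"], threshold=0.4))
# A strict threshold rejects the wrong volume, but would also reject
# a harmless misspelling such as "Batmen".
print(find_volume("Batgirl", ["Batman"], threshold=0.8))
```

With a loose threshold "Batgirl" is matched to "Batman" (a false positive); with a strict one it is left unrecognised, and so is a simple typo like "Batmen". Wherever the threshold sits, one kind of error is traded for the other.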

Also, the script works differently with scraped and unscraped comics... my advice is to run it AGAIN after a correct scrape, to load more data that may have been skipped before scraping... (this is because the more sure the script is that the volume recognition is reliable, the more data it copies)

The same happens with duplicates... If the comic_id is the same as that of another comic with more data, it will copy ALMOST EVERYTHING from it, and not only volume-related fields...
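
As a rough illustration of "copy more data the more sure the match is", here is a small sketch. The field names, the two tiers, and the fill-only-empty-fields rule are assumptions made for the example, not the script's real behaviour.

```python
# Hypothetical field groups; the real script's field lists are not given here.
VOLUME_FIELDS = ("series", "volume", "publisher")
ALL_FIELDS = VOLUME_FIELDS + ("title", "summary", "writer")

def copy_fields(source, target, same_issue_id):
    """Return a copy of target with empty fields filled from source.

    If both comics share the same issue id, the match is considered
    certain and almost everything is copied; otherwise only the
    volume-related fields are copied.
    """
    fields = ALL_FIELDS if same_issue_id else VOLUME_FIELDS
    result = dict(target)
    for field in fields:
        # Assumption: never overwrite data the target already has.
        if not result.get(field) and source.get(field):
            result[field] = source[field]
    return result

source = {"series": "Batman", "volume": "2011", "publisher": "DC",
          "title": "Knife Trick"}
target = {"series": "", "title": ""}

print(copy_fields(source, target, same_issue_id=False))
print(copy_fields(source, target, same_issue_id=True))
```

In the first call only the volume-level fields are filled in; in the second, where the issue id is assumed to match, the title is copied as well.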
Last Edit: 1 month 4 days ago by Xelloss.

Comic Vine Scraper 1 month 2 days ago #48884

I'm having an issue that looks like it started in November. The Published date column after scraping has gone from DD/MM/YYYY to MM/YYYY. I have about 200 dupes that are not showing in Duped View because they have a DD in the Published column and ComicRack is not recognizing them as dupes.

Is this a setting I can tweak or do I have to clear the data from those issues and re-scrape?
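
For anyone comparing such entries by hand or in a script, a minimal sketch of the mismatch: two published dates for the same issue only compare equal once the day is dropped. This does not change anything in ComicRack and is not a known setting; the function name and the slash-separated input format are assumptions taken from the description above.

```python
def published_key(text):
    """Reduce 'DD/MM/YYYY' or 'MM/YYYY' to a comparable (year, month) pair."""
    parts = [int(p) for p in text.strip().split("/")]
    if len(parts) == 3:
        _day, month, year = parts  # old format: the day is ignored
    elif len(parts) == 2:
        month, year = parts        # new format: no day present
    else:
        raise ValueError("unexpected published date: %r" % text)
    return (year, month)

old_style = "15/11/2013"
new_style = "11/2013"

print(old_style == new_style)                                # False
print(published_key(old_style) == published_key(new_style))  # True
```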

Comic Vine Scraper 1 month 2 days ago #48885

  • kino13
  • Senior Boarder
  • Posts: 57
  • Thank you received: 6
  • Karma: 0
This was posted in the comicvine forum:

"@arkay74: CV has a new search engine, it has relevancy problems that are being worked on."


Now, I would really like to know who is the retarded, fucking, useless piece of shit that went on to experiment on the production server without testing first what the hell he was doing...

Sorry, I am angry.
with no power comes no responsibility. except that wasn't true

Comic Vine Scraper 1 month 1 day ago #48888

  • kino13
  • Senior Boarder
  • Posts: 57
  • Thank you received: 6
  • Karma: 0
The issue has been moved to "GiantBomb", whatever that is (the owners of CV, from what I can see)

www.giantbomb.com/forums/bug-reporting-3...ing-live-1817994/#38

Seems they have applied a new search engine to all the sites.
with no power comes no responsibility. except that wasn't true

Comic Vine Scraper 1 month 1 day ago #48890

  • boshuda
  • Gold Boarder
  • Posts: 318
  • Thank you received: 76
  • Karma: 9
kino13 wrote:
This was posted in the comicvine forum:

"@arkay74: CV has a new search engine, it has relevancy problems that are being worked on."


Now, I would really like to know who is the retarded, fucking, useless piece of shit that went on to experiment on the production server without testing first what the hell he was doing...

Sorry, I am angry.
That's how they roll: every change is made on the live servers. I suspect they keep the CV API available because we become the testers for their other products.

Comic Vine Scraper 1 month 1 day ago #48891

  • Xelloss
  • Platinum Boarder
  • Posts: 574
  • Thank you received: 139
  • Karma: 29
As a professional tester, I can assure you that that kind of thing is quite common... and I work for a bank XD

Comic Vine Scraper 1 month 1 day ago #48894

  • kino13
  • Senior Boarder
  • Posts: 57
  • Thank you received: 6
  • Karma: 0
Xelloss wrote:
As a professional tester, I can assure you that that kind of thing is quite common... and I work for a bank XD

I am a sysadmin. I have worked for three different international service providers (among other companies), and I can assure you we were very, very careful before making any change in production. Everything had to be documented, prepared and scheduled and, more importantly, had to have a rollback plan.

I have seen some disasters at banks as well...

Regards
with no power comes no responsibility. except that wasn't true