Python Scripts for ComicRack

TOPIC: Bedetheque Scraper 2 - v4.9

Re: Bedetheque Scraper 2 - v1.5 6 years 4 months ago #15677

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Released v1.5 !!

Errors reported and fixed:
- speed of scrape
- CR 0143 works ok

You will lose your defined parameters... sorry, you will have to redefine them... maybe take a screenshot first...

Enjoy

M
Last Edit: 6 years 2 months ago by mizio66.
The administrator has disabled public write access.
The following user(s) said Thank You: 600WPMPO

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16146

  • pcjco
  • Offline
  • Fresh Boarder
  • Posts: 6
  • Thank you received: 2
  • Karma: 0
Hi,

Thanks Mizio66 for the complete cleaning of this plugin.

1 remark :
- the QuickScrape doesn't seem to scrape the "Genre" field (and probably more)

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16147

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Yes, this info (and some other fields) is in the SERIES part of the site, not in the ALBUM part... I will need to revise the QuickScrape...

thanks !

M

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16148

  • Joentjuh
  • Offline
  • Fresh Boarder
  • Posts: 17
  • Thank you received: 1
  • Karma: 1
It might be just me, but the plugin doesn't work that well for me. (or maybe I'm just having the wrong expectations?)
Automatic scraping always misses unless the series name is an exact match to the one on Bedetheque (and that includes capitalisation and accents... something I'm not used to, being primarily Dutch and English).
To get information, I thus have to go to the website, copy the 'correct' name of the series and manually rename all issues. Tome numbers are assumed to be 1 if not entered correctly (even if this information is available in the filename, e.g. T02)... This can be quite annoying.

The manual scraper remains an enigma to me, can't get it to do anything no matter what I enter in the text field (tried pretty much every link I could find on the series page).

I suggest: similar system to that of the ComicVine scraper, if a series is not found (miss) then simply ask the user for help (textbox pop-up) and/or provide a list of possible matches.
If the tome number is not known (or unsure) don't just assume it's the first, ask the user for confirmation.
Even a simple dialogue showing the scraped information and asking for confirmation would be much appreciated.
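
To illustrate the tome number suggestion, here is a minimal sketch in plain Python 3. It is not the plugin's code: the function name `guess_tome` and the "T02" / "Tome 2" pattern are assumptions made for the example. It returns `None` when nothing is found, so the caller can ask the user instead of assuming 1.

```python
import re

# Hypothetical pattern: a standalone "T" or "Tome" followed by digits, e.g. "T02".
TOME_RE = re.compile(
    r"(?<![A-Za-z0-9])T(?:ome)?\s*(\d+)(?![A-Za-z0-9])",
    re.IGNORECASE,
)


def guess_tome(filename):
    """Return the tome number found in filename, or None if there is none."""
    match = TOME_RE.search(filename)
    return int(match.group(1)) if match else None


for name in ("Lanfeust des Étoiles - T02.cbz", "Aquablue.cbz"):
    tome = guess_tome(name)
    if tome is None:
        # No number in the filename: ask for confirmation rather than assume 1.
        print(name, "-> unknown, ask the user")
    else:
        print(name, "-> tome", tome)
```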

Don't get me wrong, I'm grateful for all the time you've put into this and it's much better than having to do it all manually... But now I feel I'm missing half of the plugin.
Last Edit: 6 years 2 months ago by Joentjuh.

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16149

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Actually... this is the way to manage accents and such: always check the correct spelling on BDTQ, you know, French :-) Jokes apart, running the search on a not-so-exact series title is an option I was thinking about... I will have to work on it... after the temperature drops a bit from the current 38 °C... too hot to think!!!

Numbers are not assumed unless the Number field is empty... even if it is grey (not confirmed) it should work. Do you have some examples or screenshots? Or maybe you expect the scraper to use the filename? That is not the case: a filename parser would have to be in place for that, and there are other scripts able to parse the filename and get metadata... I cannot ask for the number, as it is sometimes empty or not numeric... too many exceptions.

Direct scrape works if you paste the link to an album... browse the BDTQ site, get to an album link (meaning the address at the top of the browser has "album" in it) and copy and paste it... it should work, still with the limitation someone reported earlier, which I'll work on.

Look, i scraped some 4K albums (and rescraped those) quite easily, even with the previous version... now i think it's a bit more complete, still something to do... i'll give it a try...

ciao, happy reading

M

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16150

  • Joentjuh
  • Offline
  • Fresh Boarder
  • Posts: 17
  • Thank you received: 1
  • Karma: 1
Okay, if it's something you're already working on I won't mention it any further (thumbs up!).
The number issue is not all that important (just a minor annoyance sometimes when I forget to set the fields)... I don't get/buy new comics that often so it's only a little bit of work (after the initial import of my current comics).


I still have yet to encounter any 'album' link on the BDTQ site, only "Série, Cotes, # Ventes, # Avis, # ParaBD and Galerie"... Maybe this only shows if you're a registered user?

Current series I'm trying: Lanfeust des Étoiles.

Another thing I've only just noticed: the volume number 'of' field seems way off. I'm assuming you're using the total number of issues, thus including the special/other releases. Could there be an option to use the 'real' total number of albums instead (counting the rows in the "Les albums" box)?

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16151

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Try this: www.bedetheque.com/album-7299-BD-Un-deux-Troy.html

you have to follow the links until you will see the list of albums, then click on one album... that is the album (!) link to copy/paste.
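
As a rough illustration of what counts as an album link, here is a minimal sketch in plain Python 3. It is not the plugin's code: the function name `album_id` is hypothetical, and the URL shape (`album-<number>-...` versus `serie-<number>-...`) is assumed only from the links posted in this thread.

```python
import re

# Assumed URL shape, based only on the links quoted in this thread.
ALBUM_RE = re.compile(
    r"(?:https?://)?(?:www\.)?bedetheque\.com/album-(\d+)-",
    re.IGNORECASE,
)


def album_id(url):
    """Return the numeric id of an album link, or None for any other link."""
    match = ALBUM_RE.match(url.strip())
    return int(match.group(1)) if match else None


print(album_id("www.bedetheque.com/album-7299-BD-Un-deux-Troy.html"))  # 7299
print(album_id("www.bedetheque.com/serie-48-BD-Aquablue.html"))        # None
```

A series link returns `None`, which is the case where the direct scrape has nothing to work with.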

About the 'of' number: it is taken from the site directly, with no counting on the scraper side... and yes, I suppose it counts everything. Maybe this should be a request to the site... they should be able to correct that info first... I'll give this a try too, though.

ciao

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16174

  • mizio66
  • Offline
  • Platinum Boarder
  • Started reading comics at 4... and still counting!
  • Posts: 451
  • Thank you received: 143
  • Karma: 67
Released v1.6 !!

Some minor bugs fixed, plus:
- Partial names or non-accented words in the Series name are now recognized, and a list of possible candidates is offered so you can choose the correct Series (e.g. Liberte and Liberté are treated as the same, it is just one more click away).
- The "real number" of issues in the Series can be used instead of the Total number (see parametrization form).
- Words treated as articles (Les, Le, etc.) can be freely defined by the user (see parametrization form).
- Now the direct scrape should scrape all fields correctly...
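
For readers curious how this kind of matching can work, here is a minimal sketch in plain Python 3. It is not the plugin's actual code: the names `strip_accents`, `normalize`, `candidates` and the `ARTICLES` list are assumptions made for the example, and the real list of articles comes from the parametrization form.

```python
import unicodedata

# Hypothetical default; in the plugin the user defines these words.
ARTICLES = ("les", "le", "la", "un", "une", "des")


def strip_accents(text):
    """Drop combining marks so that 'Liberté' becomes 'Liberte'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize(title, articles=ARTICLES):
    """Lowercase, remove accents and drop a leading article."""
    words = strip_accents(title).lower().split()
    if len(words) > 1 and words[0] in articles:
        words = words[1:]
    return " ".join(words)


def candidates(query, known_series, articles=ARTICLES):
    """Return every series whose normalized name contains the normalized query."""
    key = normalize(query, articles)
    if not key:
        return []
    return [name for name in known_series if key in normalize(name, articles)]


series = ["Lanfeust des Étoiles", "Lanfeust de Troy", "Aquablue"]
print(candidates("lanfeust", series))          # both Lanfeust series
print(candidates("lanfeust des etoiles", series))
```

With more than one candidate the user picks from the list; with exactly one, the scrape could go ahead without asking.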

Read the updated manual as soon as 600 (thanks!) uploads it.

Get new version in the first post as usual... report any bugs...

Enjoy

M
Last Edit: 6 years 2 months ago by mizio66.
The following user(s) said Thank You: Joentjuh

Re: Bedetheque Scraper 2 - v1.5 6 years 2 months ago #16177

  • Joentjuh
  • Offline
  • Fresh Boarder
  • Posts: 17
  • Thank you received: 1
  • Karma: 1
Hi, only encountered one bug so far:

Real '... of' numbers
www.bedetheque.com/serie-48-BD-Aquablue.html
Script reports total of 9 volumes, website lists 11.

Re: Bedetheque Scraper 2 - v1.6 6 years 2 months ago #16184

  • pcjco
  • Offline
  • Fresh Boarder
  • Posts: 6
  • Thank you received: 2
  • Karma: 0
Hi, some other remarks :

- there is a button labelled "Annuller" instead of "Annuler"
- when trying to scrape a missing volume (one not referenced by Bedetheque), the script freezes instead of skipping it (in fact, it does not seem to freeze indefinitely, but for several minutes at least),
e.g. "Jérôme - T89 - Le géant des neiges"
Last Edit: 6 years 2 months ago by pcjco.