Python Scripts for ComicRack

TOPIC: Import tags from filename with regular expressions

Re:Import tags from filename with regular expressi 6 years 11 months ago #11460

  • Flaser
  • Fresh Boarder
  • Posts: 18
  • Thank you received: 3
  • Karma: 2
I too would like to use this plugin, as I have about 5000 files I'd like to tag with it.
I use this strict naming scheme:

[00000001] Franchise {Artist} - Title [Translator]

It's like that because I'm cataloging hentai doujinshi/manga. The first field is a unique ID for the file, so I can quickly locate it.
...and if the new "standard" by LBW is accepted, it may get even more complicated, i.e.:
sadpanda.us/images/306032-5EJ1O2W.png

dir: (C##) [Artist (Doujinshi Circle)] Title (Franchise) (English) [Translator]
zip/rar: (C##)_[Artist_(Doujinshi_Circle)]_Title_(Franchise)_(English)_[Translator]

...where C## is the Comiket number. (The latest was C78)

I've cooked up this regex:

\[(?<AlternateNumber>\p{N}{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

Except for the literals, it's pretty straightforward. Could it be that the plugin can't handle metacharacters as literals (square brackets in this case)?

Couple of filenames that should be matched:

[00000001] #Various Fighting Games# {Shinnihon Pepsitou} - Kagayake WP Championship [LWB].zip
[00000002] 009-01 {Tokie Hirohito} - Next Mission [SaHa].zip
[00004154] 3x3 Eyes {Kurosawa pict} - Seima Kyuuin [LWB].zip
[00000003] 7th Dragon {ReDrop} - Flore Magique [4dawgz+A-S].zip
[00003585] After Sweet Kiss {T2 Art Works} - After... [RT].zip
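
For anyone who wants to test a pattern like this outside ComicRack, here is a minimal sketch using Python's standard re module. It is not the plugin's code, and the plugin's regex engine may behave differently. Assumptions: named groups are spelled (?P<name>...) because the re module rejects the (?<name>...) spelling, \d stands in for \p{N}, the curly braces are escaped for clarity, and the helper name parse_filename is made up for illustration.

```python
import re

# Assumption: standard-library re, not the engine used by the plugin.
PATTERN = re.compile(
    r"\[(?P<AlternateNumber>\d{8})\] (?P<Series>.*)\{(?P<Writer>.*)\}"
    r" - (?P<Title>.*) \[(?P<Publisher>.*)\]"
)

def parse_filename(filename):
    """Return the captured tags as a dict, or None if the name does not match."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    # The Series group also captures the space before "{", so strip each value.
    return {name: value.strip() for name, value in match.groupdict().items()}

tags = parse_filename("[00004154] 3x3 Eyes {Kurosawa pict} - Seima Kyuuin [LWB].zip")
print(tags)
```

Stripping the values is a choice made here, not something the original pattern does: as written, the Series capture keeps the trailing space that precedes the opening brace.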
Last Edit: 6 years 11 months ago by Flaser.
The administrator has disabled public write access.

Re:Import tags from filename with regular expressi 6 years 11 months ago #11480

  • Stonepaw
  • Moderator
  • Posts: 921
  • Thank you received: 268
  • Karma: 173
Flaser wrote:
I've cooked up this regex:

\[(?<AlternateNumber>\p{N}{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

That regex seems to work perfectly for me with the example files you gave. What problem have you been having?

Re:Import tags from filename with regular expressi 6 years 11 months ago #11484

  • Yellowbox
  • Junior Boarder
  • Posts: 25
  • Thank you received: 10
  • Karma: 3
Stonepaw wrote:
Flaser wrote:
I've cooked up this regex:

\[(?<AlternateNumber>\p{N}{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

That regex seems to work perfectly for me with the example files you gave. What problem have you been having?

Yeah, ditto on that. Flaser, don't forget to give Expresso a try (www.ultrapico.com/Expresso.htm). That tool works wonders for debugging regexes.

Re:Import tags from filename with regular expressi 6 years 11 months ago #11529

  • Flaser
  • Fresh Boarder
  • Posts: 18
  • Thank you received: 3
  • Karma: 2
Oh, I *made* this in Expresso. There, the regex works just fine.
However, the ComicRack regex plugin can't match even the less complicated names with it.
Last Edit: 6 years 11 months ago by Flaser.

Re:Import tags from filename with regular expressi 6 years 11 months ago #11530

  • Yellowbox
  • Junior Boarder
  • Posts: 25
  • Thank you received: 10
  • Karma: 3
Flaser wrote:
I've cooked up this regex:

\[(?<AlternateNumber>\p{N}{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

Except for the literals, it's pretty straightforward. Could it be that the plugin can't handle metacharacters as literals (square brackets in this case)?
No, but close. I'm pretty sure Python's regex implementation doesn't support Unicode property classes (what Expresso gave you as \p{N}). If you're looking for digits, use \d instead, presuming you don't need to match non-Arabic numerals.

So try this:
\[(?<AlternateNumber>\d{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

Hope it helps. Much luck!
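
The claim about \p{N} can be checked against Python's standard re module with a small sketch like the one below. It assumes a recent Python 3, where an unknown escape such as \p in a pattern raises re.error; older versions may treat it differently, and none of this says what the engine behind the ComicRack plugin accepts. The helper name compiles is made up for illustration.

```python
import re

def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# \p{N} is a Unicode property class; \d is the digit shorthand.
print(compiles(r"\p{N}{8}"))
print(compiles(r"\d{8}"))

# \d{8} matches the eight-digit ID used in the file names above.
id_match = re.fullmatch(r"\d{8}", "00000001")
```

Note that for ordinary (str) patterns in Python 3, \d also matches decimal digits from other scripts unless the re.ASCII flag is passed.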

Re:Import tags from filename with regular expressi 6 years 11 months ago #11531

  • Flaser
  • Fresh Boarder
  • Posts: 18
  • Thank you received: 3
  • Karma: 2
Yellowbox wrote:
Flaser wrote:
I've cooked up this regex:

\[(?<AlternateNumber>\p{N}{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

Except for the literals, it's pretty straightforward. Could it be that the plugin can't handle metacharacters as literals (square brackets in this case)?
No, but close. I'm pretty sure Python's regex implementation doesn't support Unicode property classes (what Expresso gave you as \p{N}). If you're looking for digits, use \d instead, presuming you don't need to match non-Arabic numerals.

So try this:
\[(?<AlternateNumber>\d{8})\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

Hope it helps. Much luck!

Unfortunately, it still doesn't work. Still, thanks for the help.
Next I tried this, hoping that the "]" would be enough to delimit the tag:
\[(?<AlternateNumber>.*)\] (?<Series>.*){(?<Writer>.*)} - (?<Title>.*) \[(?<Publisher>.*)\]

This finally DID work.
Last Edit: 6 years 11 months ago by Flaser.

Re: Import tags from filename with regular expressions 6 years 10 months ago #11816

  • pcvii
  • Expert Boarder
  • Posts: 80
  • Thank you received: 1
  • Karma: 0
I'm posting this because I fixed a bug a while ago and noticed there haven't been any updates to this in about 2 years. The bug that was fixed: loading a file containing the regular expression would change ComicRack's working directory. oraclexview pointed out how to fix this, but I never posted the fixed version.

One bug I've noticed: when you go to load a file, hit Cancel, and then close the window, ComicRack throws an error. I'm not sure what causes that.
Attachments:
The following user(s) said Thank You: Yellowbox

Re: Import tags from filename with regular expressions 6 years 10 months ago #11855

  • jorgev
  • Fresh Boarder
  • Posts: 12
  • Karma: 0
Hi all,
First of all, fantastic script; this one saves me hours of monkey work :)

But I've got a little problem with my regex. I'll explain.

I've got 2 possible strings:

1st)
@S@ Storm @RN@ 0005 @AS@ De kronieken van de Diepe Wereld @ASRN@ 0004 @T@ De groene hel @Y@ 1980

REGEX: (?:@S@ )(?<Series>.*)(?: @RN@ )(?:0{1,3})(?<Number>.*)(?: @AS@ )(?<AlternateSeries>.*)(?: @ASRN@ )(?:0{1,3})(?<AlternateNumber>.*)(?: @T@ )(?<Title>.*)(?: @Y@ )(?<Year>.*)

2nd)
@S@ 666 @RN@ 0001 @T@ Ante demonium
REGEX: (?:@S@ )(?<Series>.*)(?: @RN@ )(?:0{1,3})(?<Number>.*)(?: @T@ )(?<Title>.*)

Both regexes work fine.

Now I want to combine those expressions with an alternation construct => (?( expression ) yes | no )

Here's my regex:
(?(?:@S@ .* @RN@ .* @AS@ .*)(?:@S@ )(?<Series>.*)(?: @RN@ )(?:0{1,3})(?<Number>.*)(?: @AS@ )(?<AlternateSeries>.*)(?: @ASRN@ )(?:0{1,3})(?<AlternateNumber>.*)(?: @T@ )(?<Title>.*)(?: @Y@ )(?<Year>.*)|(?:@S@ )(?<Series>.*)(?: @RN@ )(?:0{1,3})(?<Number>.*)(?: @T@ )(?<Title>.*))

=> This works perfectly in Expresso,
but in CR it doesn't work, I guess due to the Python regex implementation.

Does anyone have an idea how to adapt the regex so it works for Python and CR?

Re: Import tags from filename with regular expressions 6 years 10 months ago #11856

  • Yellowbox
  • Junior Boarder
  • Posts: 25
  • Thank you received: 10
  • Karma: 3
Wow, jorgev, those regexes hurt my eyes and broke my brain! But I still think I can help.
Python supports conditionals using a numbered or named capturing group. Python does not support conditionals using lookaround, even though Python does support lookaround outside conditionals. Instead of a conditional like (?(?=regex)then|else), you can alternate two opposite lookarounds: (?=regex)then|(?!regex)else.
Got that from http://www.regular-expressions.info/conditional.html. Hope it helps!
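
As a rough illustration of the opposite-lookaround idea applied to jorgev's two name formats, here is a sketch for Python's standard re module. It is an assumption-laden test harness, not the plugin's code. The re module refuses to reuse a group name within one pattern, so the two branches use made-up suffixes (_a and _b) that are stripped afterwards; the (?:...) wrappers around the literal markers are dropped because they do not change what matches; and the helper name parse_name is hypothetical.

```python
import re

COMBINED = re.compile(
    # Branch a: taken only when an " @AS@ " marker lies somewhere ahead.
    r"(?=.* @AS@ )"
    r"@S@ (?P<Series_a>.*) @RN@ 0{1,3}(?P<Number_a>.*)"
    r" @AS@ (?P<AlternateSeries_a>.*) @ASRN@ 0{1,3}(?P<AlternateNumber_a>.*)"
    r" @T@ (?P<Title_a>.*) @Y@ (?P<Year_a>.*)"
    r"|"
    # Branch b: taken only when no " @AS@ " marker lies ahead.
    r"(?!.* @AS@ )"
    r"@S@ (?P<Series_b>.*) @RN@ 0{1,3}(?P<Number_b>.*) @T@ (?P<Title_b>.*)"
)

def parse_name(name):
    """Return tags from whichever branch matched, or None if neither did."""
    match = COMBINED.match(name)
    if match is None:
        return None
    # Groups from the branch that did not take part are None; drop them
    # and strip the two-character "_a" / "_b" suffix from the rest.
    return {key[:-2]: value for key, value in match.groupdict().items()
            if value is not None}

long_name = ("@S@ Storm @RN@ 0005 @AS@ De kronieken van de Diepe Wereld "
             "@ASRN@ 0004 @T@ De groene hel @Y@ 1980")
short_name = "@S@ 666 @RN@ 0001 @T@ Ante demonium"
print(parse_name(long_name))
print(parse_name(short_name))
```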

Re: Import tags from filename with regular expressions 6 years 10 months ago #11891

  • jorgev
  • Fresh Boarder
  • Posts: 12
  • Karma: 0
Thanks for the info, Yellowbox.

I've been trying for the past 2 days to get (?=regex)then|(?!regex)else or (?(?=regex)then|else) working, but neither works.

Both work in Expresso but not in CR. It's not a big deal; I'll just use 2 separate expressions, chosen according to the file naming of each series.
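
The two-expression fallback can also be tried outside ComicRack with a sketch along these lines. It uses Python's standard re module with (?P<name>...) groups, which is an assumption about the test setup rather than a description of how the plugin applies its patterns; the names FULL, SHORT and parse_tags are made up for illustration.

```python
import re

# The longer format, with alternate series and year.
FULL = re.compile(
    r"@S@ (?P<Series>.*) @RN@ 0{1,3}(?P<Number>.*)"
    r" @AS@ (?P<AlternateSeries>.*) @ASRN@ 0{1,3}(?P<AlternateNumber>.*)"
    r" @T@ (?P<Title>.*) @Y@ (?P<Year>.*)"
)
# The shorter format: series, number and title only.
SHORT = re.compile(
    r"@S@ (?P<Series>.*) @RN@ 0{1,3}(?P<Number>.*) @T@ (?P<Title>.*)"
)

def parse_tags(name):
    """Try the more specific pattern first, then fall back to the short one."""
    for pattern in (FULL, SHORT):
        match = pattern.match(name)
        if match:
            return match.groupdict()
    return None

print(parse_tags("@S@ 666 @RN@ 0001 @T@ Ante demonium"))
```

The order matters: the short pattern also matches the long names, but its Number group would then swallow everything up to the @T@ marker, so the more specific pattern has to be tried first.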