Python Scripts for ComicRack

TOPIC: Duplicates Manager (v Alpha - 0.6)

Duplicates Manager (v Alpha - 0.6) 6 years 10 months ago #12076

  • perezmu
  • Platinum Boarder
  • Posts: 1114
  • Thank you received: 64
  • Karma: 51




PESCUMA HAS OFFERED US A NEW version: Alpha 0.7 - WIKI UPDATED

Courtesy of PESCUMA! Thanks Ricardo!!!!!

NEW RULES - SEE WIKI


Changelog:
New version: 0.7

Added: - New rules: 
                   - pagecount remove largest
                   - pagecount remove smallest
                   - filesize remove largest
                   - filesize remove smallest
       - Now it copies some comic information from deleted comics. It is disabled by default.
               Enable in constants.py : UPDATEINFO
       - Fix for series with multiple volumes


The main feature is copying comic info from deleted files. Some fields (like alternate series number) are not handled in cvdb, so if you added this info to a dup comic that gets deleted, it is now copied to the remaining files (only if the remaining file does not already have it). The copied fields are: AlternateCount, AlternateNumber, AlternateSeries, Count, Title.
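
A minimal sketch of that "copy only if missing" behaviour, not the plugin's actual code. Comics are modelled here as plain dicts, the function name copy_missing_info is made up for illustration, and treating None and the empty string as "does not have it" is an assumption:

```python
# Fields named in the post; everything else in this sketch is hypothetical.
COPIED_FIELDS = ("AlternateCount", "AlternateNumber", "AlternateSeries",
                 "Count", "Title")


def is_empty(value):
    # Assumption: a field counts as "missing" when it is None or "".
    return value is None or value == ""


def copy_missing_info(deleted, kept, fields=COPIED_FIELDS):
    """Copy fields from the deleted comic into the kept one.

    A field is copied only when the kept comic lacks it and the deleted
    comic has it. Returns the names of the fields that were copied.
    """
    copied = []
    for name in fields:
        if is_empty(kept.get(name)) and not is_empty(deleted.get(name)):
            kept[name] = deleted[name]
            copied.append(name)
    return copied
```

In this sketch a value already present on the kept comic is never overwritten, which matches the "only if the remaining file does not have it" note above.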


One more side note: since the last version, the pagecount keep and filesize keep commands accept a percentage as an optional last argument. This is useful in a case like this: if you have the same comic compressed as rar and as zip, the two files will have slightly different sizes. Suppose you prefer to keep the zip files (I do); then you can write in the rules:

filesize keep largest 10%
filesize keep zip
filesize keep largest

What that will do is:
1. Keep the largest file and all files whose size is greater than 90% of that size. Since the zip and rar will have more or less the same size, both will be kept. Smaller files will be removed.
2. From the result, keep only zip files (if there are any zip files; otherwise this rule is ignored).
3. From the result, keep only the largest file. If you had more than one zip with almost the same size, both would have remained until this step.
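
The three steps above can be sketched in Python roughly as follows. This is only an illustration of the filtering logic, not the plugin's parser or rule engine: the Comic tuple and the function names are invented, and using ">=" for the threshold (so the largest file itself always survives) is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for a comic in a group of duplicates.
Comic = namedtuple("Comic", ["path", "size"])


def keep_largest(group, percent=0):
    """Keep the largest file plus any file within `percent` of its size."""
    largest = max(c.size for c in group)
    threshold = largest * (100 - percent) / 100.0
    return [c for c in group if c.size >= threshold]


def keep_extension(group, ext):
    """Keep only files with the given extension; ignore the rule if none match."""
    matches = [c for c in group if c.path.lower().endswith("." + ext)]
    return matches if matches else group


def apply_rules(group):
    group = keep_largest(group, 10)       # filesize keep largest 10%
    group = keep_extension(group, "zip")  # filesize keep zip
    return keep_largest(group)            # filesize keep largest


dupes = [
    Comic("a.rar", 1000),
    Comic("a.zip", 950),
    Comic("b.zip", 940),
    Comic("c.zip", 500),
]
survivors = apply_rules(dupes)  # only a.zip is left
```

Each rule only narrows the group left by the previous one, which is why the order of the rules matters.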

Please be advised: this is in a very rough state and you are very likely to encounter errors... please report them here or on the Google Code page.

=== IMPORTANT NOTICE ===

Since I do not want to mess with your files & library before we are sure this thing works right, the script out of the box will not move or remove any comic; it just logs what it would do in the logfile. To enable the actual processing of files you need to set the variables "MOVEFILES" and "REMOVEFROMLIB" to True - DO SO IN THE dmrules.dat FILE.

See the Wiki page for details.
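
For illustration, the "log only unless enabled" behaviour could look something like the sketch below. It is not taken from the plugin: process_dupe and its arguments are hypothetical, the log is just a list of strings, and only the MOVEFILES side is shown because removing a book from the library goes through ComicRack itself.

```python
import os
import shutil


def process_dupe(path, dump_dir, log, movefiles=False):
    """Move a duplicate to the dump directory, or only log the intent.

    `movefiles` mirrors the MOVEFILES option: while it is False nothing
    on disk is touched and the log records what would have happened.
    """
    target = os.path.join(dump_dir, os.path.basename(path))
    if movefiles:
        shutil.move(path, target)
        log.append("MOVED %s -> %s" % (path, target))
    else:
        log.append("WOULD MOVE %s -> %s" % (path, target))
    return target
```

Running with the default first and reading the log before switching the option on is the whole point of the safe default.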

TO SEE WHAT THE SCRIPT WOULD ACTUALLY DO TO YOUR COMICS, OPEN THE TEXT FILE "logfile.log" LOCATED BY DEFAULT AT "C:\__DUPES__"




This script is an addon to ComicRack that identifies duplicate ecomics and follows a set of user-defined rules to remove unwanted dupes. It is designed with the 0-days in mind, but should prove useful in other scenarios.
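
As a rough idea of what "identifying duplicates" can mean, here is a sketch that groups comics by series, volume and issue number. The key used here is an assumption for illustration (the changelog only says that leading zeros are removed from the number and that series with multiple volumes are handled); the function names and the tuple layout are made up.

```python
from collections import defaultdict


def normalize_number(number):
    # Strip leading zeros so "001" and "1" compare equal; keep "0" for "000".
    stripped = number.strip().lstrip("0")
    return stripped or "0"


def find_dupes(comics):
    """Group (series, volume, number, path) tuples that share a key.

    Returns a dict mapping each key to its list of paths, keeping only
    keys that have more than one comic.
    """
    groups = defaultdict(list)
    for series, volume, number, path in comics:
        key = (series, volume, normalize_number(number))
        groups[key].append(path)
    return dict((k, v) for k, v in groups.items() if len(v) > 1)
```

Including the volume in the key keeps issue 1 of two different volumes of the same series from being treated as dupes.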

The script reads a file (dmrules.dat) from the directory where it is installed; the file contains a series of rules to manage the duplicate files. Duplicate files that meet the criteria expressed in the rules are moved to a dump directory (not deleted) and removed from the ComicRack library (default dump directory is C:\_dupes_). This directory also holds a logfile (logfile.log) that details what was done to your comics.

So, the first thing you want to do is read the rules (see wiki) and edit your custom dmrules.dat file.


Wiki Index:

----

FOR COMPLETE INFORMATION AND DOWNLOADS, SEE THE GOOGLE CODE PAGE. THE INFO IN THIS POST SEEMS SHALLOW, BUT I HAVE MADE QUITE AN EFFORT TO DOCUMENT EVERYTHING IN THE WIKI...

Download: Duplicates Manager 0.7.crplugin

----

Changelog:
v0.6 -> New Features Release

   Added:    - New parser (texts with more than one word can be surrounded by ")
             - Percentage option to filesize keep/remove
             - Added percentage option to pagecount keep/remove largest/smallest
             - Added keep first (to remove remaining identical files)
             - Allow filtering on multiple words (matching any of them) in texts

v0.5 -> 

   Fixed:   - Major bug found in the 'text' rules!!!!!

   Changed: - 'pagecount keep noads' now skips comics with COVERPAGES or less pages
	    - Added "@ OPTION VALUE" to the rules.dat file
	    - Added new options:
		    - COVERPAGES (int)
		    - SIZEMARGIN (int)     (part of issue 9, still not operative)
            - Allow more than one word as part of the text based rules (issue 8)	


v0.4 -> Bug fix Release

   Fixed:   - Issue 7 (fileless removal) finally solved (I hope!)
	    - Issue 6 (series info panel corruption) finally solved (I hope!)
	    - Correctly changed version number in Package.ini

v0.3 -> Bug fix Release
	
   Fixed:   - Doesn't break Series info Panel (issue 6) anymore

	    - Threw an exception when there were no dupes (issue 4)


   Changed: - Remove leading 0's in comic number to improve duplicate discovery

            - 'pagecount remove fileless' will remove all fileless dupes but one when
		a group of only fileless is found. One with thumbnail will be kept
		(issue 7)


v0.2 -> Bug fix Release

	Fixed: 	- Rules '[text] remove word' is now correctly parsed.

v0.1 -> Initial Release

Cheers!!!!! :laugh: :laugh: :laugh: :laugh:
Last Edit: 6 years 9 months ago by perezmu.
The administrator has disabled public write access.
The following user(s) said Thank You: doolittle

Re: Duplicates Manager 6 years 10 months ago #12077

  • forkicks
  • Platinum Boarder
  • Posts: 871
  • Thank you received: 109
  • Karma: 37
Yes please! Especially if it's smart enough that you can, say, give it some rules like "Always pick the largest size file to keep" (which is what I end up doing manually anyway :-)).

Neat :)
fK

Re: Duplicates Manager 6 years 10 months ago #12078

  • perezmu
  • Platinum Boarder
  • Posts: 1114
  • Thank you received: 64
  • Karma: 51
Granted!

Re: Duplicates Manager 6 years 10 months ago #12080

  • 600WPMPO
  • Moderator
  • Posts: 3788
  • Thank you received: 557
  • Karma: 233
perezmu wrote:


coming sooner than you expect...
:woohoo: :woohoo: can't wait..!!

p.s. the icon is looking good!
Now Playing: The ComicRack Manual (Online)

See my new comics & gadgets on: Tumblr!

Re: Duplicates Manager 6 years 10 months ago #12082

  • cbanack
  • Platinum Boarder
  • Posts: 1328
  • Thank you received: 508
  • Karma: 182
Looking forward to it!

Re: Duplicates Manager 6 years 10 months ago #12084

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
If it's the only file in a folder, can it remove the empty folder as well?

Re: Duplicates Manager 6 years 10 months ago #12085

  • perezmu
  • Platinum Boarder
  • Posts: 1114
  • Thank you received: 64
  • Karma: 51
Samael69 wrote:
If it's the only file in a folder, can it remove the empty folder as well?

If I cannot add this in this first version, it will surely come soon!

Re: Duplicates Manager 6 years 10 months ago #12094

  • Stonepaw
  • Moderator
  • Posts: 921
  • Thank you received: 268
  • Karma: 173
Looking forward to it!

Re: Duplicates Manager 6 years 10 months ago #12096

  • 600WPMPO
  • Moderator
  • Posts: 3788
  • Thank you received: 557
  • Karma: 233
I am really curious to see how this script handles different scans of a book.

e.g. Suppose we have 3 different scans of a book.. like Comic 1 (Scanner X), Comic 1 (Scanner Y), and Comic 1 (Scanner Z). Now, I always prefer Scanner X more than Y more than Z. So, I would like to keep the Scanner X book here. A simple scenario.

The next step is a bit complicated. Here we have Comic 1 (Scanner X), and Comic 1 (two covers) (Scanner Y). Obviously, I prefer comics with the maximum number of covers. But Scanner X is the preferred one. So, what do we do here?

The last and the most difficult one. Here we again have 3 scans: Comic 1 (c2c) (Scanner X), Comic 1 (two covers) (noads) (Scanner Y), and Comic 1 (two covers) (c2c) (Scanner Z). The preference is c2c. Now, add this to the previous 2 preferences (X>Y>Z and number of covers), and we have a headache on our hands. :P

Really looking forward to this script..!!! :evil:
Last Edit: 6 years 10 months ago by 600WPMPO.

Re: Duplicates Manager 6 years 10 months ago #12097

  • Samael69
  • Platinum Boarder
  • Posts: 381
  • Thank you received: 47
  • Karma: 21
I had assumed it was used only after the ComicVine scrape.