DupeKill

smaragdus
Posts: 2120
Joined: Sat Jun 22, 2013 3:24 am
Location: Aeaea

DupeKill

#1 Post by smaragdus »

DupeKill is a tiny and simple duplicate file finder.

Synopsis
Incredibly, I found room on the internet for yet another duplicate file remover. What sets this one apart is that it will make guesses about which files you want to keep based on the filename. The app tries to select the shortest, most descriptive name as the one to keep. As an example: a file with a name containing "copy of", or ".1.txt" would be considered less descriptive than one without; and a file named "lkePic.jpg" is considered less descriptive than "Lake Pictures.jpg". This way the time you spend picking the right file to keep is minimized.
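To make the idea concrete, here is a minimal Python sketch of a name-based "keeper" guess. The author's actual scoring is not published in this thread, so everything here is an assumption for illustration: the helper names `descriptiveness` and `pick_keeper`, the list of copy markers, and the use of letter count as a stand-in for "descriptive".

```python
import re
from pathlib import PurePath

# Hypothetical markers of an auto-generated copy: "copy of", "(2)", or a
# trailing numeric suffix such as the ".1" in "notes.1.txt".
COPY_MARKERS = re.compile(r"copy of|\(\d+\)|\.\d+$", re.IGNORECASE)


def descriptiveness(filename: str) -> int:
    """Score a name: more letters is better, copy markers are penalized."""
    stem = PurePath(filename).stem  # name without the final extension
    penalty = 10 * len(COPY_MARKERS.findall(stem))
    letters = sum(ch.isalpha() for ch in stem)
    return letters - penalty


def pick_keeper(names):
    """Return the name to keep: highest score, shorter name wins a tie."""
    return max(names, key=lambda n: (descriptiveness(n), -len(n)))
```

With the examples from the synopsis, `pick_keeper(["lkePic.jpg", "Lake Pictures.jpg"])` prefers "Lake Pictures.jpg", and a plain "notes.txt" beats both "Copy of notes.txt" and "notes.1.txt".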

The app also makes use of a speed improvement I haven't seen anywhere else: It makes an extra pass of the file list to create a 'fasthash'. The fasthash uses small samples of the file (16 kilobytes), taken from the beginning, end, and three places in the middle; then does a duplicate check based on the hash of the samples. This is very quick for large files, and it eliminates the vast majority of potential duplicates, as most files will have different samples. Most other duplicate finders omit this step, but it really speeds things up.
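A rough Python sketch of the sampling idea described above, not the author's code. The synopsis does not say whether 16 KB is per sample or in total, nor where the three middle samples sit, so this sketch assumes 16 KiB per sample and places the middle samples at the quarter points. The function names `fast_hash` and `candidate_groups` are made up for the example.

```python
import hashlib
import os

SAMPLE_SIZE = 16 * 1024  # assumption: 16 KiB per sample


def fast_hash(path, sample_size=SAMPLE_SIZE):
    """Hash five small samples of a file instead of its full contents."""
    size = os.path.getsize(path)
    digest = hashlib.md5()
    digest.update(str(size).encode())  # files of different size never match
    with open(path, "rb") as f:
        if size <= 5 * sample_size:
            digest.update(f.read())  # small file: just hash everything
        else:
            last = size - sample_size  # offset of the final sample
            # beginning, three middle positions (assumed), and end
            for offset in (0, last // 4, last // 2, 3 * last // 4, last):
                f.seek(offset)
                digest.update(f.read(sample_size))
    return digest.hexdigest()


def candidate_groups(paths):
    """Group files whose fast hashes match; only these need a full hash."""
    groups = {}
    for path in paths:
        groups.setdefault(fast_hash(path), []).append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Matching fast hashes only mean "possible duplicate": two files can differ in a region that was never sampled, so each candidate group still needs a full-content hash before anything is deleted.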
Commands
* [Enter]: Open the file.
* [Delete]: Mark the file for deletion
* [Space]: Mark the file to keep.
* [Ctrl+A]: Select all
* [Ctrl+C]: Copy filenames to clipboard
* [Ctrl+X]: Exchange the selected items' actions. (Swaps Keep and Delete)
Links
https://cresstone.com/apps/DupeKill/ - web-page
https://cresstone.com/ - publisher
http://www.softpedia.com/get/System/Fil ... Kill.shtml - DupeKill at Softpedia

Downloads
Direct Download Link for the latest version of DupeKill (as of 2016.08.22): 0.6 Beta.

Images

DupeKill - program window:

Image

DupeKill - settings window:

Image

Portability
DupeKill is natively portable: download the archive (DupeKill_v0.6.zip), extract it to a folder of your choice, and run dupeKill.exe. Settings are saved in an INI file (dupeKill_settings.ini) inside the program folder:
No installation; just unpack and run. A settings file and ancillary files may be created in the program folder.
Requirements
DupeKill requires .NET Framework:
Version 4 or better of the .NET Framework is recommended.
License
DupeKill is closed-source freeware:
This software is distributed as-is, without any representations or warranties of any kind.
The author of this software imposes no additional license terms or limits upon its use or redistribution.
Notes
DupeKill comes with a help file (README.TXT).
DupeKill supports integration with the Windows shell.
DupeKill supports several hash algorithms: MD5, SHA1, SHA256, SHA384, SHA512.
DupeKill can be minimized to system tray (Close to Tray in Settings).
Wish list: support for Drag&Drop of folders; context menu command "Open Containing Folder".

File Information
Name: DupeKill_v0.6.zip
File Size: 63849 Byte(s) (62.35 KB)
Modified Date: 2016-03-18 04:49
MD5: b4497cc37ce24c18dc9d200901e89787
SHA1: 5b1acf70e0abf97b7611c122405db98fbba4b669
SHA256: 04fc6c1919650a388ac4900244ef2c4512310e63224667f9016dee2b56e25523
CRC32: 43b36bd0
VirusTotal analysis - 0 / 55
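To check a downloaded archive against the values listed above, something like the following Python sketch can be used. It relies only on the standard `hashlib` and `zlib` modules; the function name `file_digests` is just an illustrative choice.

```python
import hashlib
import zlib


def file_digests(path, chunk_size=1 << 20):
    """Compute MD5, SHA1, SHA256 and CRC32 of a file in a single pass."""
    md5, sha1, sha256 = hashlib.md5(), hashlib.sha1(), hashlib.sha256()
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):  # read in 1 MiB pieces
            md5.update(chunk)
            sha1.update(chunk)
            sha256.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC32
    return {
        "MD5": md5.hexdigest(),
        "SHA1": sha1.hexdigest(),
        "SHA256": sha256.hexdigest(),
        "CRC32": f"{crc & 0xFFFFFFFF:08x}",
    }
```

Calling `file_digests("DupeKill_v0.6.zip")` and comparing each entry with the corresponding line in File Information confirms that the download is the same file that was described here.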

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: DupeKill

#2 Post by webfork »

What sets this one apart is that it will make guesses about which files you want to keep based on the filename. The app tries to select the shortest, most descriptive name as the one to keep.


I have a ton of old files from different sources about essentially the same thing and no way to sort through it all clearly. This could be huge.

Thanks for posting on this.

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: DupeKill

#3 Post by webfork »

I got some testing in and found this a really impressive tool. We have programs on the site that will do some of what this program does, but they aren't open source, nor do they have the certainty of higher SHA values. The excellent DoubleKiller and Bytessence DuplicateFinder both come close, but both use the weaker CRC32 analysis, which might find "duplicates" that really aren't.

The really critical thing I found when using this was to grab multiple files and select Keep, Delete, or Invert as a group. That way when a whole series is important, useless, or the opposite of what you want, you can quickly shift its status. Also check the readme file for some nice hotkeys that speed things up.

Wishlist
  • Available checkboxes for keep/delete as well as the current functionality so you can quickly add and remove visually or use the hotkeys or right-click menu
  • Available "Open file folder" in the right-click menu (e.g. c:\temp\a.txt would open to c:\temp)
  • Skip 0 byte files (they all have the same hash)
  • Ability to:
    • Delete to recycle bin
    • Skip files above or below a certain set
    • Sort items according to size
    • Ignore items underneath a certain size
    • Set newer/older or longer/shorter files as the default delete item. For example, the duplicates on my system were there because I had a simple file dump and then an organized area. The organized files had newer created/modified values. They also tend to be more descriptive and longer.
  • Some kind of note in the settings menu that SHA1 or MD5 should be more than adequate for almost any check.

    Massive databases with billions of files might need SHA256 values to check for duplicates, but the chance of an accidental hash collision on a home computer is very remote. Plus, the speed difference is dramatic, as the program's own benchmarking tool notes:

    MD5 ... Took 0.4110147 seconds @ 486.601 MB/s
    SHA1 ... Took 0.4994599 seconds @ 400.433 MB/s
    SHA256 ... Took 2.6020576 seconds @ 76.862 MB/s
    SHA384 ... Took 10.687029 seconds @ 18.714 MB/s
    SHA512 ... Took 11.2029351 seconds @ 17.852 MB/s

    This reminds me of all the file wipe programs that will erase a file 7 times even though there's no real security improvement. MD5 or SHA1 should be more than adequate.
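For anyone who wants to run a comparable measurement outside the program, here is a small Python sketch using the standard `hashlib` module. It is not DupeKill's benchmark (which uses the .NET implementations), so absolute numbers will differ. The 200 MB default is an inference: each listed time multiplied by its MB/s rate comes out to roughly 200 MB. The function name `benchmark` is made up for the example.

```python
import hashlib
import time


def benchmark(algorithms=("md5", "sha1", "sha256", "sha384", "sha512"),
              size_mb=200):
    """Time each algorithm over size_mb megabytes of zero bytes."""
    block = bytes(1024 * 1024)  # 1 MiB of zeros, reused for every update
    results = {}
    for name in algorithms:
        hasher = hashlib.new(name)
        start = time.perf_counter()
        for _ in range(size_mb):
            hasher.update(block)
        hasher.digest()
        elapsed = time.perf_counter() - start
        results[name] = (elapsed, size_mb / elapsed)  # (seconds, MB/s)
    return results


if __name__ == "__main__":
    for name, (seconds, rate) in benchmark().items():
        print(f"{name.upper()} ... Took {seconds:.4f} seconds @ {rate:.3f} MB/s")
```

The relative ordering depends heavily on the runtime and CPU, so a result on one machine should not be read as a general ranking.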

Happy.Camper2
Posts: 19
Joined: Tue Dec 23, 2014 4:48 pm

Re: DupeKill

#4 Post by Happy.Camper2 »

DupeKill now at 0.7 Beta
DL Link: https://cresstone.com/apps/DupeKill/DupeKill_v0.7.zip

HomePage: https://cresstone.com/apps/DupeKill/#appInfo

Apps Directory: https://cresstone.com/apps/

This author has some nifty utilities with a twist that other apps don't have.

Website is easy to navigate & all info is at hand.

Enjoy :)

smaragdus
Posts: 2120
Joined: Sat Jun 22, 2013 3:24 am
Location: Aeaea

Re: DupeKill 0.8

#5 Post by smaragdus »

DupeKill at version 0.8 (released on 2017-07-09), latest changes:

DupeKill 0.7
* Beta 7; 2016-12-29
- Added: Symbolic link code finished and activated.
- Added: Ctrl+mousewheel zooming.
- Added: Some keyboard shortcuts for buttons, check the tooltips.
- Changed: Settings for window position now stored in a sane way.
- Fixed: Turned off single-instance limit.
- Changed: Start with windows removed (another setting inherited from the template that didn't make a lot of sense, since we're mostly running on-demand.)
DupeKill 0.8
* Beta 8; 2017-07-09
- Added: Major feature: advanced criteria scans can be configured via the 'address bar' drop-down.
- Added: "Include subfolders" is now a remembered setting.
- Added: New option in settings to remember recent scan paths.
- Added: New option in settings to sort scan results by size or # of dupes. (On-demand sorts may be next, but are not easy due to the potential size of scans)
- Added: "Copy all info" command in the file list context menu (keyboard Ctrl+Alt+C). The following information about the selected items is copied to the clipboard (tab separated): full path, file size, creation date, modification date, file signature, action.
- Fixed: list no longer flickers on action change.

cresstone
Posts: 5
Joined: Wed Jul 12, 2017 9:56 am

Re: DupeKill

#6 Post by cresstone »

Hi, I'm the author of this tool, just saw this page... responding to some of the feedback:


*Drag folder to app to check - now on my todo list.

*Available check-boxes for keep/delete
webfork, can you elaborate on this? I'm not quite sure what you mean...

There are two 'remove' options now (soon to be three, as I have a "move to folder x" slated to be a potential action), so I dunno if a check-box makes sense... The fastest way to change a file's 'action' is just to double click the file in the action column; that will toggle between keep and the default remove action.

*"Open file folder" - slated for the next release.

*0-size files - of course, we don't actually try to hash these, but I feel it's useful to show them just cause I usually want to get rid of empty files. You can skip them in version 0.8 using an advanced criteria scan with a 1 byte minimum size.

*Delete to recycle bin - I'll consider adding this as an option

*Sorting and filtering options - got covered in versions .7 & .8

*Different automatic assignment logic - I'm working on code for 'keep oldest/newest', I'll consider 'keep longest/shortest filename'. I'm also working on criteria where you can specify 'never delete from this folder'.

*Hash explanation - The only reason I include SHA256-512 is that .NET has them right there in the framework, and MD5 (and SHA1 now too I guess, with SHAttered) can be gamed to create false positives.
Accidental collisions just won't happen... even for MD5, the number of files you'd need to have any sort of chance would crash the app anyway.

Anyway, I think sane defaults might be enough here, I'm not trying to give a math lesson or anything; though maybe I'll stick a wikipedia link in the benchmark form...
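For a sense of scale, the chance of an accidental collision can be estimated with the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)), for n files and a b-bit hash. This is a back-of-the-envelope Python sketch that assumes hash outputs behave like uniformly random values; it says nothing about deliberately crafted collisions. The function name is illustrative.

```python
import math


def collision_probability(n_files: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one accidental collision."""
    exponent = n_files * (n_files - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    return -math.expm1(-exponent)


if __name__ == "__main__":
    print("MD5, one billion files:", collision_probability(10**9, 128))
    print("CRC32, 100,000 files:  ", collision_probability(100_000, 32))
```

Under these assumptions a billion files hashed with a 128-bit digest such as MD5 give a probability far below one in a quintillion, while a 32-bit checksum such as CRC32 is more likely than not to collide somewhere among a hundred thousand files.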



PS: sort of stumbled on this board, if feedback is sent to the address on my page, I'll see it /considerably faster/.

Finally: the direct download links to the .zip files will get redirected to the app's main page; this is just to prevent out-of-date versions from spreading.

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: DupeKill

#7 Post by webfork »

cresstone wrote: Wed Jul 12, 2017 10:36 am
Hi, I'm the author of this tool, just saw this page... responding to some of the feedback
I delayed responding to this just to see if some of the suggestions made it into the next version but it's been a little while so I guess I'll just reply to what's there ...
cresstone wrote: Wed Jul 12, 2017 10:36 am
*Available check-boxes for keep/delete
webfork, can you elaborate on this? I'm not quite sure what you mean...
I was thinking that if you wanted to keep two duplicates but delete a third, you could check those you want to delete and uncheck all the others (keep). I was thinking of the functionality of a similar program called DoubleKiller:

Image
cresstone wrote: Wed Jul 12, 2017 10:36 am
*0-size files - of course, we don't actually try to hash these, but I feel it's useful to show them just cause I usually want to get rid of empty files. You can skip them in version 0.8 using an advanced criteria scan with a 1 byte minimum size.
Makes sense
cresstone wrote: Wed Jul 12, 2017 10:36 am
*Hash explanation - The only reason I include SHA256-512 is that .NET has them right there in the framework, and MD5 (and SHA1 now too I guess, with SHAttered) can be gamed to create false positives.
Yeah I know more about the topic than when I first posted. Feel free to ignore my notes on that.

cresstone
Posts: 5
Joined: Wed Jul 12, 2017 9:56 am

Re: DupeKill

#8 Post by cresstone »

1) "Default Remove Action:" [ Delete File / Recycle / Replace with SL / Move to... ] I think Recycle is broken, or am I missing something? Files do not get put in the Recycle Bin, which is what I was expecting; instead they just get deleted with no way to recover them. So to me the two options "Delete File" and "Recycle" seem to do the same thing.

2) "Move to..." This option just seems buggy in general, like it will delete files and not move them. You should look into this to make sure it works correctly...

3) Can we drag and drop multiple folders with files in them to compare against? Like drop a folder, then while holding down "Ctrl" drop another, and another, etc.? I find being limited to one folder, well, limiting.

Can you hit the 'create log file' box and check what it says for these files it's getting rid of? I want to double check that it's saying: "Moving: C:\someFile.txt" or "Recycling: c:\somefile.txt" just to ensure it's following the correct code path...

I believe the recycle logic will fall back to deletion if there's no recycle bin for the drive; but in the move logic it should not be possible for it to delete anything, so something’s wrong...

For multiple folders: you can do this via the 'advanced criteria' editor. Hit the main 'address bar' drop-down and go to the bottom, then click "Advanced Criteria...". The editor should pop up and there you can then enter multiple folders; it supports drag&drop too.
