Hashing use cases (why would I want to hash my files?)

webfork
Posts: 9640
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Hashing use cases (why would I want to hash my files?)

#1 Post by webfork » Tue May 23, 2017 2:43 pm

Background

Something that's been bothering me, though I couldn't quite name it, is the ongoing submission of a TON of different hashing tools. Normally I like variety, but this is a really crowded area and there are really only 3 reasons to use hashes:
  • Check VirusTotal - basically find out a given file's reputation. Algorithm required: I understand the service will also look up SHA1 hashes, but for the most part all the site cares about is SHA256.
  • File verification - making sure the file or files you downloaded are really the ones you were looking for. Algorithm required: SHA1 and MD5 seem to be the only ones anyone uses despite security concerns with both. EDIT: I've seen more and more posts with the far more secure SHA256 format, largely due to support from GitHub.
  • Change analysis - seeing if a file or files have changed, for example checking whether a CD is corrupt. Algorithm required: CRC32, Blake2, or SHA1 is all that's necessary for this kind of analysis, especially if you're also checking against file size or even just file type. Anything else is probably overkill unless you're dealing with billions of files.
If developers come up with a hashing program that doesn't improve on the existing tools for one of the above tasks, they are wasting their time. Still, I have little doubt I'll see at least 10 more MD5 file analyzers before the end of the year.

For reference, here are my programs of choice for the above requirements: 1. SigCheckGUI, 2. fHash, and 3. RapidCRC Unicode.
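
For anyone who would rather script the file verification case than use a GUI tool, here is a minimal sketch using only Python's standard library. The function names (`file_hash`, `matches_published`) are made up for illustration and are not taken from any of the tools mentioned above; the default of SHA256 is an assumption based on the trend noted in the list.

```python
import hashlib


def file_hash(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare a file's digest with a hash copied from a download page."""
    # Published hashes are often upper-case or padded with whitespace.
    return file_hash(path, algorithm) == published_hex.strip().lower()
```

Passing `"md5"` or `"sha1"` as the algorithm covers the older formats many download pages still publish. A match only proves the file equals whatever the page describes, so the page itself still has to be trusted.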


Periphery (not quite hashing but in the neighborhood):
  • Duplicate check - this is usually done behind the scenes by a duplicate-checking program (like DoubleKiller). You rarely see or care about the actual hash of a given file. Algorithm required: usually CRC32 unless looking through millions of files, and then MD5 or SHA1.
  • Search for a file by the hash - this is one of the reasons I use SigCheckGUI on files I highlight here on the site: that makes it possible to find a file when you only have an old hash. Algorithm required: due to popularity, MD5 or SHA1, but I'm increasingly finding files by their SHA256 code.
  • Torrents - a torrent carries a series of SHA1 hashes, one per file segment, so it can definitely detect file changes and do verification, assuming you trust the torrent. However, it's very inefficient compared to other methods. On the upside, the hashes are portable and everyone has a torrent client, so it could certainly do the trick.
  • Data redundancy - e.g. Multipar - like torrents, not really hashing, but it can verify and fix broken files. For file verification alone, it's hugely inefficient in terms of space and processor usage compared to dedicated hashing tools.
Did I miss any?
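
To make the duplicate check item concrete, here is a rough sketch of the behind-the-scenes idea: group files by size first, then hash only the files whose sizes collide. This is not how DoubleKiller or any specific tool is known to work; the function names and the choice of SHA1 are assumptions for illustration.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root that share both size and SHA1."""
    by_size = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        for path in paths:
            by_hash[(size, sha1_of(path))].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The size pass is cheap and usually eliminates most files before any hashing happens, which is why the algorithm choice matters less here than in the verification case.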

---

Last edited by webfork on Wed Dec 16, 2020 11:47 am, edited 1 time in total.
Reason: (lots of minor updates)

SYSTEM
Posts: 1957
Joined: Sat Jul 31, 2010 1:19 am
Location: Helsinki, Finland

Re: Hashing use cases

#2 Post by SYSTEM » Tue May 23, 2017 10:41 pm

webfork wrote:File verification - making sure the file or files you downloaded from somewhere are really the file you were looking for. Algorithm required: SHA1 and MD5 seem to be the only ones anyone uses despite security concerns with both.
I use a hash calculator to verify that files I download haven't been corrupted during the download. For that purpose, SHA1 and MD5 are perfectly sufficient.

Midas
Posts: 5627
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Hashing use cases

#3 Post by Midas » Wed May 24, 2017 5:00 am

SYSTEM wrote:I use a hash calculator to verify that files I download haven't been corrupted during the download.
  • +1 :)

JohnTHaller
Posts: 646
Joined: Wed Feb 10, 2010 4:44 pm
Location: New York, NY

Re: Hashing use cases

#4 Post by JohnTHaller » Wed May 24, 2017 9:23 am

The PortableApps.com Platform uses MD5 hashes for download verification on all apps in the updater/app store. The PortableApps.com Installer uses MD5 hashes for download verification in live installers where the app can't legally be repackaged as per publisher request or EULA. We also publish the MD5 sums on each app download page for end users who download manually. This is able to catch infected mirrors, incomplete downloads, corrupted mirroring, and a typical malicious actor inserting themselves into the network with a substituted fake file. The hash check serves as a nice additional security measure on top of the installer's built-in CRC self-check and code signing of the installer itself.

We're considering switching to SHA2 in a future release for even better security, as that would also protect against a more advanced malicious actor custom-creating a badware file that matched everything including the MD5 sum, which is difficult but not impossible.

Midas
Posts: 5627
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Hashing use cases

#5 Post by Midas » Mon Jun 26, 2017 4:45 pm

Quick note to link an extensive list of related tools posted to TPFC: viewtopic.php?t=6358 ...

For its sheer uniqueness, let me add mention of another, sadly but expectably non-portable, shell extension with hashing and VirusTotal capabilities.
http://kvsoft.at.ua/peid_tab/peid_tab_en.html wrote:PEiD Tab, a free utility that extends the possibilities of Windows Explorer by adding a function studies of PE files, which allows the compiler to know, and, consequently, the programming language used for writing the program, packer or kriptora.

webfork
Posts: 9640
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Hashing use cases

#6 Post by webfork » Tue Jun 27, 2017 7:25 pm

Midas wrote:a free utility that extends the possibilities of Windows Explorer by adding a function studies of PE files, which allows the compiler to know, and, consequently, the programming language used for writing the program, packer or kriptora.
Yeah, that's what I was looking for when I came up with this. Great add.
Midas wrote:Quick note to link an extensive list of related tools posted to TPFC: viewtopic.php?t=6358
Good related post, thanks.

webfork
Posts: 9640
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Hashing use cases

#7 Post by webfork » Tue Nov 13, 2018 7:25 pm

Two other possible uses of file hashing:

1. Unique password generator/manager (doesn't work on sites with case or special character requirements) - create or generate a random file, run a hash on it, anything from CRC32 to MD5 (which creates an 8- or 32-character password respectively), and use the value as the password. The caveat here is that if the file is ever accidentally corrupted or modified, the password of course cannot be recovered.

2. Unmanaged file server trick - A combination of verification and change detection, I use this on a file server (Windows shared folder) whose file permissions are very open. It's often unclear who did what. To address this, I've been adding a hash code to the filename (filename [A3JS0N].txt) via RapidCRC Unicode so I can confirm that the file on the server is still the copy I posted. Also, the odd text seems to throw people off and they leave the file alone unless specifically pointed to it.
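
The filename check in the second trick can also be scripted. The sketch below is only an illustration, not how RapidCRC Unicode does it: it assumes the tag is an eight-digit hexadecimal CRC32 in square brackets, and the names `crc32_of` and `tag_is_valid` are invented here.

```python
import re
import zlib
from pathlib import Path

# Assumed tag format: eight hex digits in square brackets, e.g. "[CBF43926]".
TAG = re.compile(r"\[([0-9A-Fa-f]{8})\]")


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as eight upper-case hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def tag_is_valid(path):
    """True if the tag in the filename matches the content, None if no tag is found."""
    match = TAG.search(Path(path).name)
    if match is None:
        return None
    return match.group(1).upper() == crc32_of(path)
```

A `False` result means the content changed after the file was tagged. CRC32 catches accidental edits well, but it is not a defense against someone who deliberately crafts a matching file.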

webfork
Posts: 9640
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

Re: Hashing use cases

#8 Post by webfork » Sun Oct 11, 2020 7:26 pm

Found another great trick this weekend ...

* Duplicate photo names - I frequently pull images down from my camera with sequential or semi-original filenames, e.g. IMG038.jpg, but when I try to restore from backup or combine different groups of photos, there's often overlap. I want to add those pictures to the same collection and avoid both overwrites and duplicate images. By integrating a hash into the filename, such as the CRC32 code in IMG_038 [A97131F8].jpg, I can safely avoid both issues. The rename also makes it possible to quickly check the files for errors and, if needed, restore the correct file from backup.

As with the "Unmanaged File Server" case above, I used RapidCRC Unicode to do this in batch.
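
For readers without RapidCRC Unicode, the same rename-and-merge idea can be sketched in Python. The helper names (`tagged_name`, `merge_into`) are hypothetical, and the `name [CRC32].ext` layout simply mirrors the filename format shown in this post.

```python
import shutil
import zlib
from pathlib import Path


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as eight upper-case hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def tagged_name(path):
    """Build 'stem [CRC32].ext' for a file without renaming anything."""
    path = Path(path)
    return f"{path.stem} [{crc32_of(path)}]{path.suffix}"


def merge_into(source, collection):
    """Copy source into collection under its tagged name.

    Returns the new path, or None when a file with the same name and
    CRC32 is already there (treated as a duplicate and skipped).
    """
    target = Path(collection) / tagged_name(source)
    if target.exists():
        return None
    shutil.copy2(source, target)
    return target
```

Two different photos that are both called IMG038.jpg end up side by side under different tags, while a second copy of the same photo is skipped. Two different files with the same name and the same CRC32 would be mistaken for duplicates, which is unlikely for a photo collection but worth knowing.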

Midas
Posts: 5627
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

Re: Hashing use cases (why would I want to hash my files?)

#9 Post by Midas » Mon Oct 12, 2020 7:32 am

A great tip to better manage my ever growing digital image collection. Thanks. :sunglasses:
