TLDR: The go-to program is WinMerge. You can configure it to do most of the following, so if you're not sure, start there.
What this post is not about
- Search tools - Comparison software is about letting the computer suggest what's significant based on the criteria you give it.
- Hash software - These programs only care about whether or not a file is identical. To a hash tool, a War and Peace ebook with a single extra space is just as different from the original as Shakespeare is. Also, I already wrote up something on why you'd use hash tools.
- Developer Tools - I'm not excluding these, but the vast majority of tools in this space are focused on log files and source code. It's a well-covered topic.
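To make the hash-versus-comparison point above concrete, here's a minimal Python sketch using only the standard library. The function names and sample strings are made up for illustration, and the similarity score is difflib's own ratio, not what any of the tools in this post use.

```python
import hashlib
from difflib import SequenceMatcher


def sha256_of(text: str) -> str:
    """Return the SHA-256 hex digest of a string encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def similarity(a: str, b: str) -> float:
    """Return difflib's similarity ratio: 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()


# Illustrative sample text (an assumption, not from any real file).
original = "The quick brown fox jumps over the lazy dog."
extra_space = "The quick  brown fox jumps over the lazy dog."  # one extra space
unrelated = "Shall I compare thee to a summer's day?"

# A hash only answers "identical or not".
hashes_match = sha256_of(original) == sha256_of(extra_space)

# A comparison tool can say how close two texts are.
near_score = similarity(original, extra_space)
far_score = similarity(original, unrelated)

print("hashes match:", hashes_match)
print("similarity with extra space:", round(near_score, 3))
print("similarity with unrelated text:", round(far_score, 3))
```

The hash says "different" for both pairs, while the ratio shows the extra-space copy is nearly the same text. That gap is what the rest of this post is about.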
Differences
Line changes
This is more the domain of log file and source code analysis tools, where line differences are the most crucial part. Recommended: WinMerge, ExamDiff, or Diffinity.
Word changes
Important content choices are almost always word choices. If you want to see how something was edited to change the meaning, you probably care most about the words. Recommended: WinMerge because you can choose to ignore whitespace, punctuation, and much more.
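If you're curious what a word-level comparison that ignores whitespace and punctuation looks like under the hood, here's a rough Python sketch. It is an assumption-laden illustration built on the standard difflib module, not how WinMerge does it, and the function names and sample sentences are hypothetical.

```python
import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Split text into words, dropping punctuation and all whitespace."""
    return re.findall(r"\w+", text)


def word_changes(before: str, after: str) -> list[tuple[str, str, str]]:
    """Return (operation, old words, new words) for each span that differs."""
    a, b = words(before), words(after)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only inserts, deletes, and replacements
    ]


# Hypothetical example: punctuation changes too, but only the word swap matters.
before = "The tenant shall pay rent monthly, in advance."
after = "The tenant may pay rent monthly in advance"
changes = word_changes(before, after)

for operation, old, new in changes:
    print(operation, repr(old), "->", repr(new))
```

Because the comma and the final period are stripped before comparing, the only change reported is "shall" becoming "may", which is the edit that actually changes the meaning.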
All changes
Catch any difference between two files, usually between edits or updates. Sometimes this means demonstrating that you've done the work, AKA a "redline," or showing how small changes have a big impact in legal documents and config files.
* WinMerge is probably the leader again because it can generate reports, but LibreOffice is also worth a look because it will also catch format changes and includes a filter to show different types of changes.
Duplication / similarities
* Repeating phrases in the same document. This helps either catch bad writing or confirm that words and phrases are used consistently.
MatnPardaz - You can see repeated words and phrases of varying length, along with how many times each one shows up.
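For a sense of what counting repeated phrases involves, here's a small Python sketch that counts every run of 2 to 4 words and keeps the ones that show up more than once. It's my own simplified illustration, not MatnPardaz's method, and the function name, parameters, and sample text are all assumptions.

```python
import re
from collections import Counter


def repeated_phrases(
    text: str, min_words: int = 2, max_words: int = 4, min_count: int = 2
) -> dict[str, int]:
    """Count every phrase of min_words..max_words words; keep the repeated ones."""
    tokens = re.findall(r"\w+", text.lower())  # case-insensitive, no punctuation
    counts: Counter[str] = Counter()
    for n in range(min_words, max_words + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Hypothetical sample text.
sample = "Policy number 4471 applies here. See policy number 4471 for details."
repeats = repeated_phrases(sample)

for phrase, count in sorted(repeats.items(), key=lambda item: -len(item[0])):
    print(count, "x", phrase)
```

Note that shorter repeats nested inside a longer one ("policy number" inside "policy number 4471") are reported too. A real tool would probably want to collapse those.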
* Similarities in multiple files, e.g. if you see similar phrasing or numbers across many documents, that might be something important.
Still looking for something here, but right now I'd start with TextComparer. The similarly named TextCompare also covers this at least partly with its "Same line both sides" and/or "Moved or Copied line" views.
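Since I haven't found a dedicated tool for the multi-file case, here's a rough Python sketch of the idea: normalize each line, then report which lines appear in two or more documents. The function name, the normalization rule (lowercase, collapsed whitespace), and the sample documents are assumptions of mine, not features of TextComparer or TextCompare.

```python
from collections import defaultdict


def shared_lines(documents: dict[str, str], min_docs: int = 2) -> dict[str, list[str]]:
    """Map each normalized line to the sorted names of the documents containing it."""
    seen: defaultdict[str, set[str]] = defaultdict(set)
    for name, text in documents.items():
        for line in text.splitlines():
            key = " ".join(line.lower().split())  # ignore case and extra spaces
            if key:  # skip blank lines
                seen[key].add(name)
    return {
        line: sorted(names) for line, names in seen.items() if len(names) >= min_docs
    }


# Hypothetical documents; in practice you would read these from files.
docs = {
    "a.txt": "Invoice 1001\nTotal due: 250.00\nThank you",
    "b.txt": "Invoice 1002\nTotal  due: 250.00\nRegards",
    "c.txt": "Memo\ntotal due: 250.00\nThank you",
}
common = shared_lines(docs)

for line, names in common.items():
    print(repr(line), "appears in", ", ".join(names))
```

This only catches whole lines that match after normalization. Catching similar phrasing inside otherwise different lines would need the phrase-counting or fuzzy approaches.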
* Similarities inside of the same document. For example, if a policy number keeps coming up in a file, it would be nice to see that highlighted.
See the thread on Concordance tools.
* Fuzzy matching - usually for catching misspellings and words that differ only by a prefix or suffix.
WinMerge
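As a rough picture of what fuzzy matching means in practice, here's a short Python sketch using difflib.get_close_matches from the standard library. This is not how WinMerge works internally. The function name, the 0.75 cutoff, and the word list are assumptions I picked for illustration.

```python
from difflib import get_close_matches


def fuzzy_variants(term: str, vocabulary: list[str], cutoff: float = 0.75) -> list[str]:
    """Return words resembling `term`, best match first, excluding exact matches."""
    matches = get_close_matches(term, vocabulary, n=10, cutoff=cutoff)
    return [word for word in matches if word != term]


# Hypothetical word list, e.g. the distinct words found in a document.
vocab = ["recieve", "receive", "received", "receipt", "review"]
variants = fuzzy_variants("receive", vocab)

print(variants)
```

Raising the cutoff toward 1.0 keeps only near-identical spellings, while lowering it pulls in looser relatives along with more false positives, so expect to tune it per document.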
---
Other
Related Fuzzy Search
Unlike the above tools, with these programs you have to enter a word or phrase yourself and see whether anything similar turns up.
LibreOffice - find this in the "Find & Replace" window under "Similarity search." It only works with one term at a time, so you can't view variations on a list of words.
DocFetcher - you can use the ~ symbol for similar searches. Multiple terms can be combined with OR written between them, e.g. see~ OR porta~.
Additional feedback, suggestions, programs or use cases all welcome.