Is there a program that can remove duplicate lines in text files?
Re: Is there a program that can remove duplicate lines in text files?
Great question. I've been looking for something like this myself. My main requirement was that it not just remove them automatically; I needed to have them identified first. I have a trick that works with LibreOffice Calc and Excel, but that's fairly complex. So far it's RJ TextEd that will bookmark duplicate lines.
To just zap them Notepad3 has something in the menu bar: Edit - Lines - Remove Duplicate lines. Interestingly, DocPad (not portable) says it will discard duplicate *paragraphs* which I haven't tested but sounds cool.
EDIT: if you need something quick, there's https://www.textevo.com/
Related thread: Finding duplicate phrases
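For the "identify first, remove later" case described above, here is a minimal Python sketch. It is not how RJ TextEd or any other editor mentioned here works internally; the function name find_duplicate_lines and the sample data are made up for illustration. It only reports where repeated lines occur and leaves the input untouched.

```python
from collections import defaultdict

def find_duplicate_lines(lines):
    """Map each repeated line to the 1-based line numbers where it occurs."""
    positions = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # Strip line endings so "apple\n" and "apple" compare equal.
        positions[line.rstrip("\r\n")].append(number)
    # Keep only lines seen more than once; nothing is removed from the input.
    return {text: nums for text, nums in positions.items() if len(nums) > 1}

# Hypothetical sample data, just to show the shape of the result.
sample = ["apple\n", "pear\n", "apple\n", "plum\n", "pear\n", "apple\n"]
for text, nums in find_duplicate_lines(sample).items():
    print(f"{text!r} appears on lines {nums}")
```

To run it on a real file, passing an open file object (for example open("test-in.txt", encoding="utf-8"), with the encoding being an assumption) instead of sample should work, since the function only iterates over lines.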
- __philippe
- Posts: 687
- Joined: Wed Jun 26, 2013 2:09 am
Re: Is there a program that can remove duplicate lines in text files?
Quick and dirty solution(s), ...provided you don't mind a bit of CLI wrestling...
Uniq
(part of unxutils suite)
Code:
c:\mytools>uniq --help
Usage: uniq [OPTION]... [INPUT [OUTPUT]]
Discard all but one of successive identical lines from INPUT (or
standard input), writing to OUTPUT (or standard output).
-c, --count prefix lines by the number of occurrences
-d, --repeated only print duplicate lines
-D, --all-repeated print all duplicate lines
-f, --skip-fields=N avoid comparing the first N fields
-i, --ignore-case ignore differences in case when comparing
-s, --skip-chars=N avoid comparing the first N characters
-u, --unique only print unique lines
-w, --check-chars=N compare no more than N characters in lines
-N same as -f N
+N same as -s N
--help display this help and exit
--version output version information and exit
A field is a run of whitespace, then non-whitespace characters.
Fields are skipped before chars.
Report bugs to <bug-textutils@gnu.org>.
Uniq
(part of BusyBox)
Code:
c:\mytools>busybox uniq --help
BusyBox v1.27.0-FRP-1035-g74163a5 (2017-02-09 08:42:39 GMT) multi-call binary.
Usage: uniq [-cdu][-f,s,w N] [INPUT [OUTPUT]]
Discard duplicate lines
-c Prefix lines by the number of occurrences
-d Only print duplicate lines
-u Only print unique lines
-f N Skip first N fields
-s N Skip first N chars (after any skipped fields)
-w N Compare N characters in line
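One thing worth noticing in both help texts: uniq only compares successive lines, so a duplicate that is not directly next to its twin survives. The Python sketch below imitates that behaviour with itertools.groupby, which also groups adjacent equal items only. It is an illustration of the semantics, not the actual uniq implementation, and the helper name uniq_runs is invented here.

```python
from itertools import groupby

def uniq_runs(lines):
    """Collapse each run of successive identical lines to (count, line)."""
    # groupby starts a new group whenever the value changes, so only
    # adjacent duplicates end up in the same run.
    return [(sum(1 for _ in run), line) for line, run in groupby(lines)]

lines = ["a", "a", "b", "a", "c", "c", "c"]
runs = uniq_runs(lines)                          # roughly what uniq -c counts

plain = [line for _, line in runs]               # like plain uniq
repeated = [line for n, line in runs if n > 1]   # like uniq -d
unique = [line for n, line in runs if n == 1]    # like uniq -u

print(plain)
print(repeated)
print(unique)
```

Note that "a" shows up twice in plain, because its second occurrence is not adjacent to the first run. Sorting the input first is the usual way around that, at the cost of losing the original line order.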
Re: Is there a program that can remove duplicate lines in text files?
EverEdit also has a command for this - Edit - Delete - Delete Duplicated Lines
https://www.portablefreeware.com/index.php?id=2538
Re: Is there a program that can remove duplicate lines in text files?
"As a result of a quick googling, I found this: 7 Ways To Remove Duplicate Lines in Text Files."
Many thanks, yes, I had found that page, too. Some of those suggestions do not work properly or are not available anymore, and I do not want to do it online.
"So far it's RJ TextEd that will bookmark duplicate lines"
Thank you, will have a look at it.
"To just zap them Notepad3 has something in the menu bar: Edit - Lines - Remove Duplicate lines."
Thank you, that works well, just tried it.
"EDIT: if you need something quick, there's https://www.textevo.com/"
Many thanks, but I have some concerns about such online services.
"From where I stand what you're looking for is a specialized kind of software generally called concordancers (https://en.wikipedia.org/wiki/Concordancer). A decade back I would have some ready suggestions for you but too much time has passed since."
Thank you for the link.
Thank you very much, philippe. That looks a bit complicated to me somehow.
"EverEdit also has a command for this - Edit - Delete - Delete Duplicated Lines https://www.portablefreeware.com/index.php?id=2538"
Thank you. "Released on 29 Nov 2013", but it will work anyway, I guess.
- __philippe
Re: Is there a program that can remove duplicate lines in text files?
@uwotm8
Fear not, uniq basic usage is simple as pie...
Consider, if you will:
Input file: test-in.txt
Code:
C:\mytools>cat test-in.txt
EenyMeenyMinyMoe-0
EenyMeenyMinyMoe-1
EenyMeenyMinyMoe-2
EenyMeenyMinyMoe-2
EenyMeenyMinyMoe-3
EenyMeenyMinyMoe-3
EenyMeenyMinyMoe-3
EenyMeenyMinyMoe-4
EenyMeenyMinyMoe-4
EenyMeenyMinyMoe-4
EenyMeenyMinyMoe-4
EenyMeenyMinyMoe-A
EenyMeenyMinyMoe-A
EenyMeenyMinyMoe-A
EenyMeenyMinyMoe-Z
EenyMeenyMinyMoe-Z
Code:
C:\mytools>uniq test-in.txt
EenyMeenyMinyMoe-0
EenyMeenyMinyMoe-1
EenyMeenyMinyMoe-2
EenyMeenyMinyMoe-3
EenyMeenyMinyMoe-4
EenyMeenyMinyMoe-A
EenyMeenyMinyMoe-Z
Code:
C:\mytools>uniq test-in.txt > test-out.txt
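The test file above happens to have all its duplicates next to each other, which is the case uniq handles. For a file where duplicates are scattered, here is a small Python sketch that drops every repeated line while keeping the first occurrences in their original order. This is a different technique from uniq (it remembers all lines seen so far), the function names dedupe_lines and dedupe_file are hypothetical, and UTF-8 is an assumed encoding.

```python
def dedupe_lines(lines):
    """Drop every repeated line, adjacent or not, keeping first occurrences in order."""
    # dict keeps insertion order (Python 3.7+), so the keys come back
    # in the order each line was first seen.
    return list(dict.fromkeys(lines))

def dedupe_file(src, dst, encoding="utf-8"):
    """Read src, write its de-duplicated lines to dst (encoding is an assumption)."""
    with open(src, encoding=encoding) as fin:
        lines = fin.read().splitlines()
    with open(dst, "w", encoding=encoding) as fout:
        fout.writelines(line + "\n" for line in dedupe_lines(lines))

# Scattered duplicates that plain uniq would leave in place.
print(dedupe_lines(["Moe-2", "Moe-1", "Moe-2", "Moe-3", "Moe-1"]))
```

Something like dedupe_file("test-in.txt", "test-out.txt") would then mirror the redirect shown above. The whole file is held in memory, so this is meant for ordinary text files rather than huge logs.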
Re: Is there a program that can remove duplicate lines in text files?
Okay, yes, I understand, very good, works great, many thanks!
- __philippe
Re: Is there a program that can remove duplicate lines in text files?
Attaboy ! Chalk up another recruit to the CLI clique...