txtproc - A CLI tool for text processing

TP109
Posts: 571
Joined: Sat Apr 08, 2006 7:12 pm
Location: Midwestern US

#1 Post by TP109 »

txtproc is a command line tool for text transformations. You call it with some text and the name of a transformation function, and it returns the transformed text. Some transformation functions require additional parameters.

Input text can be supplied:
  • on the command line
  • via a file (using the --file parameter)
  • from stdin when txtproc is used in a pipeline
  • via the clipboard (built-in on Windows or using a tool like xclip on Linux)
Output text can go:
  • to stdout, i.e. the console window or another tool in a pipeline (which again can be txtproc)
  • back to the input file (i.e. modifying the input file)
  • to a new file (using redirection, i.e. the > symbol)
  • back to the clipboard (built-in on Windows or using a tool like xclip on Linux)
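
For anyone scripting this, the input and output routes above map onto a handful of options. Here is a minimal Python sketch, assuming the flags shown in txtproc's own help output quoted later in this thread (-e, -f, -m, -p, -v, -x) and that txtproc is on the PATH. The helper names build_txtproc_args and run_txtproc are made up for illustration and are not part of txtproc.

```python
import subprocess


def build_txtproc_args(function, text=None, file=None, params=(),
                       from_clipboard=False, to_clipboard=False, modify=False):
    """Assemble a txtproc argument list (hypothetical helper, not part of txtproc)."""
    args = ["txtproc", "-e", function]
    for param in params:            # -p may be supplied multiple times
        args += ["-p", param]
    if file is not None:            # read input from a file
        args += ["-f", file]
    if modify:                      # write the result back to the input file
        args.append("-m")
    if from_clipboard:              # read input from the clipboard
        args.append("-v")
    if to_clipboard:                # write the result to the clipboard
        args.append("-x")
    if text is not None:            # input text goes last, after the options
        args.append(text)
    return args


def run_txtproc(function, stdin_text=None, **options):
    """Run txtproc, optionally piping text in via stdin, and return its stdout."""
    done = subprocess.run(build_txtproc_args(function, **options),
                          input=stdin_text, capture_output=True,
                          text=True, check=True)
    return done.stdout
```

Something like run_txtproc("Upper", stdin_text="hello") would then cover the pipeline case, and run_txtproc("SortLines", file="notes.txt", modify=True) the in-place case (notes.txt is just a placeholder name). I have not checked whether txtproc cares about the order of its options.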

[Screenshots: Help; Functions page 1; Functions page 2]

txtproc Homepage - GitHub

txtproc Releases and Download page

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

#2 Post by webfork »

Some caveats about this one: it's early stage (alpha?) and light on documentation but I'll definitely be digging into this. Great find.

vevy
Posts: 795
Joined: Tue Sep 10, 2019 11:17 am

#3 Post by vevy »

webfork wrote: Sun Apr 12, 2020 8:53 pm it's early stage (alpha?)
Why? :? Because it is v0.4.0 or did you find problems during usage?


@TP109 Is this your screenshot? Why does it say clink in the title?

Otherwise, this looks very interesting!

TP109
Posts: 571
Joined: Sat Apr 08, 2006 7:12 pm
Location: Midwestern US

#4 Post by TP109 »

@vevy

Yes, that is Clink. It's already listed on cli.portablefreeware.com as a CMD extension. I use it frequently but also use PyCmd, ConEmu, and Console2 at times. I use Clink more often because it's very fast, can be injected into a CMD window, keeps command history, can be customized, and can open paths as a parameter on launch. I probably would use Greg's Dos Shell instead because it has a lot of good features, but since it can't open paths as a parameter, that is a serious limitation.

Yes, I made the screenshot. The developer doesn't include a screenshot on the program's homepage.

I've known about txtproc since it was first released and have used it in various scripts without a problem.

vevy
Posts: 795
Joined: Tue Sep 10, 2019 11:17 am

#5 Post by vevy »

TP109 wrote: Tue Apr 14, 2020 7:25 am @vevy

Yes, that is Clink. It's already listed on cli.portablefreeware.com as a CMD extension. I use it frequently but also use PyCmd, ConEmu, and Console2 at times. I use Clink more often because it's very fast, can be injected into a CMD window, keeps command history, can be customized, and can open paths as a parameter on launch. I probably would use Greg's Dos Shell instead because it has a lot of good features, but since it can't open paths as a parameter, that is a serious limitation.

Yes, I made the screenshot. The developer doesn't include a screenshot on the program's homepage.

I've known about txtproc since it was first released and have used it in various scripts without a problem.
Thanks. I know about clink. It is a great tool. I just saw the window title and thought that there was a clink replacement shell or something I didn't know about :D. I looked into the release zip file and found the batch file with the window title! I got excited for nothing!

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

#6 Post by webfork »

vevy wrote: Tue Apr 14, 2020 3:01 am
webfork wrote: Sun Apr 12, 2020 8:53 pm it's early stage (alpha?)
Because it is v0.4.0 or did you find problems during usage?
It looks like alpha status from the version number, but I don't know. Hence the question mark.

I haven't been able to test this yet.

vevy
Posts: 795
Joined: Tue Sep 10, 2019 11:17 am

#7 Post by vevy »

webfork wrote: Wed Apr 29, 2020 5:35 pm It looks like alpha status from the version number, but I don't know. Hence the question mark.
If I go by my experience, version numbers are next to useless in most cases.
  1. Subjectivity in interpreting quality: what "alpha/beta quality" means to the developer:
    1. Some accept that bugs are facts of life so they are more likely to "upgrade" the designation:
      • Crashes all the time = alpha.
      • Buggy, but mostly works = beta.
      • Occasional non-fatal bugs = release quality.
    2. Some are more strict or are used to working in a strict professional environment. I think they may see things differently:
      • Can lose your work under any circumstances/Not basic-function-complete = alpha/no go.
      • Works without trouble, but may gracefully act weird/require workarounds by the client/user in some circumstances = beta. No crashes or lost work.
      • Water-tight and covers all the edge cases the developer can possibly think of = release quality. (Critical-bugs-in-production jokes aside.)
  2. Type of software:
    • Data-reading tools: what if your music player kept crashing? You'd curse it and switch to another one.
    • Data-writing tools: what if your text editor crashed, or your audio tagger saved malformed tags? You've lost work or have to spend some (or a lot of) time correcting the issue.
    • And then, what if a low-level disk utility had a nasty bug and horribly mangled all your data!
  3. Tempering expectations: even if the developer thinks the product is good, they may not want others to be too confident when dealing with it, or to send them angry emails.
  4. How important a developer considers an addition to be before it justifies a major version bump: "Now you can export to CSV! Version increased to 9.0 from 8.2."
  5. Some ignore the whole model and use versions as they please: Chrome v80, using dates as versions, etc. Some start at version 1.0 no matter how polished the product is.
---------------------------------------------------
Some examples:
  • Cherrytree is still on version 0.39.
  • uutils coreutils (GNU coreutils in one Windows executable, stay tuned :wink:) is working alright at version 0.0.1.
  • Everything with a ton of features is still at version 1.4.
  • foobar2000 is similarly at 1.5.
  • FBReader (that I recommended to you here) left at 0.12 has more features than many other EPUB readers.
  • Swiss File Knife is at 1.9.
  • etc

TP109
Posts: 571
Joined: Sat Apr 08, 2006 7:12 pm
Location: Midwestern US

#8 Post by TP109 »

vevy wrote: Sun May 03, 2020 7:39 am If I go by my experience, version numbers are next to useless in most cases.
Good points that are difficult to disagree with. Agree that version numbers are basically meaningless in the majority of cases.

Probably should rephrase the above to:
Good points that are easy to agree with. Agree that version numbers can be meaningless in many cases.
Last edited by TP109 on Mon May 04, 2020 1:59 am, edited 1 time in total.

Midas
Posts: 6725
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

#9 Post by Midas »

IMHO, version numbering is just like the speedometer in your car -- it's good to tell you how fast you're going (or even if you're moving at all, when it's dark all around :upside_down:) but little help if all you want is to know when you'll be at your destination...

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

#10 Post by webfork »

I tested this out a bit and am really liking it. I can definitely see a lot of uses in my daily work. A comprehensive list of actions:

Upper - Change input text to UPPER case.
Lower - Change input text to lower case.
Toggle - Toggle case of input text.
Capital - Change Input Text To Capital Case.
Sentence - Change input text to sentence case.
Snake - Change input text to snake_case.
Camel - Change input text to CamelCase.
CRC32 - Calculate CRC of input text.
MD5 - Calculate MD5 of input text.
RIPEMD160 - Calculate RIPEMD160 of input text.
SHA1 - Calculate SHA1 of input text.
SHA256 - Calculate SHA-256 of input text.
SHA512 - Calculate SHA-512 of input text.
Checksum - Calculate all available checksums of input text.
Count - Count number of characters, words and lines of input text.
CountCharacters - Count number of characters of input text.
CountWords - Count number of words of input text.
CountWordOccurence - Count how many times each word occurs in the input text.
CountSentences - Count number of sentences of input text.
CountLines - Count number of lines of input text.
CountMore - Count number of characters, words, sentences and lines of input text.
CountCharacterOccurence - Count how many times each character occurs in the input text.
CountRegex - Count how many times the specified regular expression matches.
RemoveEmptyLines - Removes empty lines from input text.
RemoveExtraEmptyLines - Reduces consecutive empty lines to one empty line.
RemoveDuplicateLines - Removes duplicate lines from input text.
RemoveLinesContaining - Removes lines containing a specified sub-text from input text.
RemoveLinesContainingRegex - Removes lines containing a specified regular expression from input text.
SplitIntoLines - Split input text into lines using the specified separator string.
JoinLines - Join lines of input text into one single line.
AppendToLines - Append some text to each line of the input text.
PrependToLines - Prepend (prefix) some text to each line of the input text.
AddLineNumbers - Add line numbers to each line of input text.
TrimLine - Removes a specified number of characters from the beginning and/or end of each line.
RemoveWords - Removes a specified number of words from the beginning and/or end of each line.
Strip - Removes whitespace from the beginning and/or end of each line.
StripNonWordCharacters - Removes non-word characters from the beginning and/or end of each line.
RemoveTo - Removes everything before and/or after the sub-text specified (optionally including the sub-text).
ExtractColumn - Extracts the specified column delimited by the specified text out of each line.
ReverseLines - Reverse order of lines within input text.
ReverseSentences - Reverse order of sentences within input text.
ReverseWords - Reverse order of words within input text.
ReverseCharacters - Reverse order of characters within input text.
ReverseCharactersWithinWords - Reverse order of characters within words of input text.
Shuffle - Shuffle order of characters within input text.
ShuffleWords - Shuffle order of words within input text.
ShuffleLines - Shuffle order of lines within input text.
ShuffleSentences - Shuffle order of sentences within input text.
ShuffleWordsWithinSentence - Shuffle order of words within sentences of the input text.
ShuffleWithinWords - Shuffle order of characters within words of input text.
Search - Search sub-text in input text.
Replace - Replace sub-text in input text by a replacement text.
SearchRegex - Search sub-text in input text using a regular expression.
ReplaceRegex - Replace sub-text in input text by a replacement text using a regular expression.
SearchNonAscii - Search for non ASCII characters in input text.
SearchDuplicateWords - Search the input text for consecutive words which have been duplicated.
TabsToSpaces - Replace tabs by spaces such that characters following a tab align at their respective tab stops.
SpacesToTabs - Replace spaces with the optimal number of tabs (spaces and tabs at the end of a line are removed).
RemoveCharacters - Removes the specified set of characters from the input text.
FlipUpsideDown - Flips the input text upside down (works only for supported characters).
SortLines - Sort lines of input text alphabetically.
SortLinesByLength - Sort lines of input text by line length.
SortLinesByNumber - Sort lines of input text by first number found on each line.
SortSentences - Sort sentences of input text alphabetically.
SortWords - Sort words of input text alphabetically.
Tweet - Break up input text into tweets.
RemoveTags - Removes all tags from the input text.

webfork
Posts: 10821
Joined: Wed Apr 11, 2007 8:06 pm
Location: US, Texas

#11 Post by webfork »

Regex tools on the command line:

txtproc.exe -v -e SearchRegex -p 'INSERTREGEXHERE'

https://github.com/phitsc/txtproc/issues/2

If you have a regex operation that you run frequently, I can see putting this inside a batch file saving a lot of time and energy.
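
The same idea works from a script in any language. Below is a rough Python sketch that keeps a few named patterns and builds the SearchRegex call shown above for whichever one is needed. The pattern names and expressions are hypothetical placeholders, and which regex flavor txtproc accepts isn't stated anywhere in this thread, so treat the expressions as examples only.

```python
import subprocess

# Hypothetical saved patterns; replace these with the expressions you actually use.
SAVED_PATTERNS = {
    "year": r"[0-9]{4}",
    "ipv4": r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+",
}


def search_clipboard_args(name):
    """Build the clipboard SearchRegex command for a saved pattern."""
    return ["txtproc.exe", "-v", "-e", "SearchRegex", "-p", SAVED_PATTERNS[name]]


def search_clipboard(name):
    """Run the search on the clipboard text (needs txtproc.exe on the PATH)."""
    done = subprocess.run(search_clipboard_args(name),
                          capture_output=True, text=True, check=True)
    return done.stdout
```

Passing the arguments as a list avoids the shell quoting questions around the pattern, which is one reason to prefer this over a raw command string.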

---

Related: Balthazar – Text processing in the shell https://blog.balthazar-rouberol.com/tex ... -the-shell

Midas
Posts: 6725
Joined: Mon Dec 07, 2009 7:09 am
Location: Sol3

#12 Post by Midas »

Topic update: txtproc v0.4.0, last released 2016-01-16 (changes and downloads at https://github.com/phitsc/txtproc/releases).
[txtproc is] A command line tool for text processing.

FTR, here's a self-made txtproc man page generated from the executable output including version number, basic usage and complete function list:

txtproc version 0.4.0

Usage: txtproc [options] [input text]

     --changes         Print change history
  -e --execute         Function to process the supplied text with
  -f --file            Input file containing text to process
  -i --ignore-case     Ignore case
  -l --list            List available text processing functions
  -m --modify          Modify the input file in-place
  -p --parameter       Parameter to pass to processing function.
                       Supply multiple times if necessary.
  -v --from-clipboard  Read text to process from clipboard
  -x --to-clipboard    Write processed text to clipboard
     --version         Print version information
  -d --debug           Print debug output
  -h --help            This help information.


[TEXT PROCESSING FUNCTIONS (use after -e):]

Upper                      - Change input text to UPPER case.
Lower                      - Change input text to lower case.
Toggle                     - Toggle case of input text.
Capital                    - Change Input Text To Capital Case.
Sentence                   - Change input text to sentence case.
Snake                      - Change input text to snake_case.
Camel                      - Change input text to CamelCase.
CRC32                      - Calculate CRC of input text.
MD5                        - Calculate MD5 of input text.
RIPEMD160                  - Calculate RIPEMD160 of input text.
SHA1                       - Calculate SHA1 of input text.
SHA256                     - Calculate SHA-256 of input text.
SHA512                     - Calculate SHA-512 of input text.
Checksum                   - Calculate all available checksums of input text.
Count                      - Count number of characters, words and lines of
                             input text.
CountCharacters            - Count number of characters of input text.
CountWords                 - Count number of words of input text.
CountWordOccurence         - Count how many times each word occurs in the
                             input text.
CountSentences             - Count number of sentences of input text.
CountLines                 - Count number of sentences of input text.
CountMore                  - Count number of characters, words, sentences
                             and lines of input text.
CountCharacterOccurence    - Count how many times each character occurs in
                             the input text.
CountRegex                 - Count how many times the specified regular
                             expression matches.
RemoveEmptyLines           - Removes empty lines from input text.
RemoveExtraEmptyLines      - Reduces consecutive empty lines to one empty
                             line.
RemoveDuplicateLines       - Removes duplicate lines from input text.
RemoveLinesContaining      - Removes lines containing a specified sub-text
                             from input text.
RemoveLinesContainingRegex - Removes lines containing a specified regular
                             expression from input text.
SplitIntoLines             - Split input text into lines using the specified
                             separator string.
JoinLines                  - Join lines of input text into one single line.
AppendToLines              - Append some text to each line of the input text.
PrependToLines             - Prepend (prefix) some text to each line of the
                             input text.
AddLineNumbers             - Add line numbers to each line of input text.
TrimLine                   - Removes a specified number of characters from
                             the beginning and/or end of each line.
RemoveWords                - Removes a specified number of words from the
                             beginning and/or end of each line.
Strip                      - Removes whitespace from the beginning and/or
                             end of each line.
StripNonWordCharacters     - Removes non-word characters from the beginning
                             and/or end of each line.
RemoveTo                   - Removes everything before and/or after the
                             sub-text specified (optionally including it).
ExtractColumn              - Extracts the specified column delimited by the
                             specified text out of each line.
ReverseLines               - Reverse order of lines within input text.
ReverseSentences           - Reverse order of sentences within input text.
ReverseWords               - Reverse order of words within input text.
ReverseCharacters          - Reverse order of characters within input text.
ReverseCharactersWithinWords - Reverse order of characters within words of
                             input text.
Shuffle                    - Shuffle order of characters within input text.
ShuffleWords               - Shuffle order of words within input text.
ShuffleLines               - Shuffle order of lines within input text.
ShuffleSentences           - Shuffle order of sentences within input text.
ShuffleWordsWithinSentence - Shuffle order of words within sentences of the
                             input text.
ShuffleWithinWords         - Shuffle order of characters within words of
                             the input text.
Search                     - Search sub-text in input text.
Replace                    - Replace sub-text in input text by a replacement.
SearchRegex                - Search sub-text in input text using a
                             regular expression.
ReplaceRegex               - Replace sub-text in input text by a replacement
                             text using a regular expression.
SearchNonAscii             - Search for non ASCII characters in input text.
SearchDuplicateWords       - Search the input text for consecutive words
                             which have been duplicated.
TabsToSpaces               - Replace tabs by spaces such that characters
                             following a tab align at respective tab stops.
SpacesToTabs               - Replace spaces with the optimal number of tabs
                             (spaces and tabs at end of a line are removed).
RemoveCharacters           - Removes the specified set of characters from
                             the input text.
FlipUpsideDown             - Flips the input text upside down
                             (works only for supported characters).
SortLines                  - Sort lines of input text alphabetically.
SortLinesByLength          - Sort lines of input text by line length.
SortLinesByNumber          - Sort lines of input text by first number
                             found on each line.
SortSentences              - Sort sentences of input text alphabetically.
SortWords                  - Sort words of input text alphabetically.
Tweet                      - Break up input text into tweets.
RemoveTags                 - Removes all tags from the input text.
FYI, txtproc is able to generate CRC32 checksums for strings -- a hard to find function in Windows environments, it seems (not counting online tools)...
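
For comparison, if Python happens to be available, the standard library's zlib module can do the same for a string. This is only a sketch: the helper name crc32_hex is made up here, and I'm assuming UTF-8 encoding and the common CRC-32 variant that zlib implements. Whether txtproc uses the same encoding, variant, or output format isn't something I've verified.

```python
import zlib


def crc32_hex(text, encoding="utf-8"):
    """Return the CRC-32 of a string as 8 lowercase hex digits."""
    # The checksum is computed over bytes, so the chosen encoding matters.
    return format(zlib.crc32(text.encode(encoding)) & 0xFFFFFFFF, "08x")


print(crc32_hex("123456789"))  # cbf43926, the standard CRC-32 check value
```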
