q - Text as Data
Overview
q is a command line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files).
q treats ordinary files as database tables, and supports all SQL constructs, such as WHERE, GROUP BY, JOINs etc.
It supports automatic column name and column type detection, and provides full support for multiple encodings.
It can be used for execute queries on web Servers log files https://en.wikipedia.org/wiki/Server_log (standardized text file format used by web servers)
It works on CSV https://en.wikipedia.org/wiki/Comma-separated_values or TSV https://en.wikipedia.org/wiki/Tab-separated_values files.
Operative Systems supported:
• OSX https://github.com/harelba/q/releases/d ... _64-Darwin
• RPM Package: https://github.com/harelba/q/releases/d ... x86_64.rpm
• DEB Package: https://github.com/harelba/q/releases/d ... _amd64.deb
• Windows Installer: https://github.com/harelba/q/releases/d ... taller.exe
• Windows CLI: https://github.com/harelba/q/releases/d ... indows.exe
Please note that q.exe runs only on 64 bit Systems.
About Windows packages, author writes:
q.exe details:• Run the installer executable and hit next next next... q.exe will be added to the PATH so you can access it everywhere.
• Windows doesn't update the PATH retroactively for open windows, so you'll need to open a new cmd window after the installation is done.
Windows version current: 2.0.19
Windows programs sizes: 6,5/6,52 MB
Program links:
• q program page: https://harelba.github.io/q/#installation
• GitHub program page: https://github.com/harelba/q
• Softpedia: https://www.softpedia.com/get/PORTABLE- ... /q-H.shtml
q program page contains everything you need to download and use q immediately:
• Overview https://harelba.github.io/q/#overview
• Installation: https://harelba.github.io/q/#installation
• Requirements: https://harelba.github.io/q/#requirements
• Usage: https://harelba.github.io/q/#usage
• Query: https://harelba.github.io/q/#query
• Flags: https://harelba.github.io/q/#flags
• Additional functions: https://harelba.github.io/q/#additional-functions
• Examples: https://harelba.github.io/q/#examples
• Implementation: https://harelba.github.io/q/#implementation
• Development: https://harelba.github.io/q/#development
• Limitations: https://harelba.github.io/q/#limitations
• Rationale: https://harelba.github.io/q/#rationale
• Benchmark: https://harelba.github.io/q/#benchmark
• Future: https://harelba.github.io/q/#future
Abtain Help file:
From command line run:
Code: Select all
q -h
q --help
Code: Select all
Options:
-h, --help show this help message and exit
-v, --version Print version
-V, --verbose Print debug info in case of problems
-S SAVE_DB_TO_DISK_FILENAME, --save-db-to-disk=SAVE_DB_TO_DISK_FILENAME
Save database to an sqlite database file
--save-db-to-disk-method=SAVE_DB_TO_DISK_METHOD
Method to use to save db to disk. 'standard' does not
require any deps, 'fast' currenty requires manually
running `pip install sqlitebck` on your python
installation. Once packing issues are solved, the fast
method will be the default.
Input Data Options:
-H, --skip-header Skip header row. This has been changed from earlier
version - Only one header row is supported, and the
header row is used for column naming
-d DELIMITER, --delimiter=DELIMITER
Field delimiter. If none specified, then space is used
as the delimiter.
-p, --pipe-delimited
Same as -d '|'. Added for convenience and readability
-t, --tab-delimited
Same as -d <tab>. Just a shorthand for handling
standard tab delimited file You can use $'\t' if you
want (this is how Linux expects to provide tabs in the
command line
-e ENCODING, --encoding=ENCODING
Input file encoding. Defaults to UTF-8. set to none
for not setting any encoding - faster, but at your own
risk...
-z, --gzipped Data is gzipped. Useful for reading from stdin. For
files, .gz means automatic gunzipping
-A, --analyze-only Analyze sample input and provide information about
data types
-m MODE, --mode=MODE
Data parsing mode. fluffy, relaxed and strict. In
strict mode, the -c column-count parameter must be
supplied as well
-c COLUMN_COUNT, --column-count=COLUMN_COUNT
Specific column count when using relaxed or strict
mode
-k, --keep-leading-whitespace
Keep leading whitespace in values. Default behavior
strips leading whitespace off values, in order to
provide out-of-the-box usability for simple use cases.
If you need to preserve whitespace, use this flag.
--disable-double-double-quoting
Disable support for double double-quoting for escaping
the double quote character. By default, you can use ""
inside double quoted fields to escape double quotes.
Mainly for backward compatibility.
--disable-escaped-double-quoting
Disable support for escaped double-quoting for
escaping the double quote character. By default, you
can use \" inside double quoted fields to escape
double quotes. Mainly for backward compatibility.
--as-text Don't detect column types - All columns will be
treated as text columns
-w INPUT_QUOTING_MODE, --input-quoting-mode=INPUT_QUOTING_MODE
Input quoting mode. Possible values are all, minimal
and none. Note the slightly misleading parameter name,
and see the matching -W parameter for output quoting.
-M MAX_COLUMN_LENGTH_LIMIT, --max-column-length-limit=MAX_COLUMN_LENGTH_LIMIT
Sets the maximum column length.
-U, --with-universal-newlines
Expect universal newlines in the data. Limitation: -U
works only with regular files for now, stdin or .gz
files are not supported yet.
Output Options:
-D OUTPUT_DELIMITER, --output-delimiter=OUTPUT_DELIMITER
Field delimiter for output. If none specified, then
the -d delimiter is used if present, or space if no
delimiter is specified
-P, --pipe-delimited-output
Same as -D '|'. Added for convenience and readability.
-T, --tab-delimited-output
Same as -D <tab>. Just a shorthand for outputting tab
delimited output. You can use -D $'\t' if you want.
-O, --output-header
Output header line. Output column-names are determined
from the query itself. Use column aliases in order to
set your column names in the query. For example,
'select name FirstName,value1/value2 MyCalculation
from ...'. This can be used even if there was no
header in the input.
-b, --beautify Beautify output according to actual values. Might be
slow...
-f FORMATTING, --formatting=FORMATTING
Output-level formatting, in the format X=fmt,Y=fmt
etc, where X,Y are output column numbers (e.g. 1 for
first SELECT column etc.
-E OUTPUT_ENCODING, --output-encoding=OUTPUT_ENCODING
Output encoding. Defaults to 'none', leading to
selecting the system/terminal encoding
-W OUTPUT_QUOTING_MODE, --output-quoting-mode=OUTPUT_QUOTING_MODE
Output quoting mode. Possible values are all, minimal,
nonnumeric and none. Note the slightly misleading
parameter name, and see the matching -w parameter for
input quoting.
-L, --list-user-functions
List all user functions
Query Related Options:
-q QUERY_FILENAME, --query-filename=QUERY_FILENAME
Read query from the provided filename instead of the
command line, possibly using the provided query
encoding (using -Q).
-Q QUERY_ENCODING, --query-encoding=QUERY_ENCODING
query text encoding. Experimental. Please send your
feedback on this