NAME

termsql — Convert text into SQL table and query it

SYNOPSIS

termsql

[-h] [-1] [-0] [-d DELIMITER] [-p SEPARATOR] [-l LINE_AS_COLUMN] [-k KEY_COLUMNS] [-c COLUMNS] [--calc CALC] [-w] [-H OFFSET_HEAD] [-T OFFSET_TAIL] [-M MAX_ROWS] [-C] [-D] [-Q] [-m MODE] [-r MERGE] [-x] [-a] [-i INFILE] [-o OUTFILE] [-f FILE] [-e] [-q QUERYFILE] [-t TABLE] [-s SQLITE] [-v] [query query ...]

DESCRIPTION

Convert text into SQL table and query it

OPTIONS

-h, --help: show this help message and exit
-1, --head: use first line/head(er) for column names
-0, --show-header: show column names as header in output (not shown by default)
-d [DELIMITER], --delimiter [DELIMITER]: custom delimiter (default is whitespace)
-p [SEPARATOR], --separator [SEPARATOR]: set sqlite output field separator (default depends on output mode. ie. for list '|' is the default and for csv ',' is the default)
-l [LINE_AS_COLUMN], --line-as-column [LINE_AS_COLUMN]: each line of input is a whole column, next row begins after n lines
-k [KEY_COLUMNS], --key-columns [KEY_COLUMNS]: one or more columns can make up the primary key (i.e -k COL0 or -c one,two,three -k one,two)
-c [COLUMNS], --columns [COLUMNS]: set custom column names (ie. -c 'name,street,age')
--calc [CALC]: use sqlite as simple calculator, you can do multiple calculations separated by commata (i.e. "(5*109.05)/2 , 345+789") In this mode stdin input is ignored
-w, --whitespace: use whitespace as field separator (default is |). equal to mode column
-H [OFFSET_HEAD], --offset-head [OFFSET_HEAD]: ignore first n lines of input
-T [OFFSET_TAIL], --offset-tail [OFFSET_TAIL]: ignore last n lines of input
-M [MAX_ROWS], --max-rows [MAX_ROWS]: don't insert more than n rows into database
-C, --dump-create-table: dump create table SQL
-D, --dump: dump complete SQL to create table and content
-Q, --sql: outputs the full query, like it would be fed to SQlite
-m [MODE], --mode [MODE]: set sqlite output mode i.e.: csv, column, html, insert, line, list, tabs, tcl (default is list)
-r [MERGE], --merge [MERGE]: merges all columns from column n to the last one into one. This is useful when you have data like filenames with whitespaces in it, to prevent it from getting split by the delimiter. Note that counting starts from 0, therefore n=0 is the first column ...
-x, --select-all: add final SELECT * FROM to user defined query
-a, --append: don't DROP TABLE instead just append data to existing table. If the existing table doesn't have enough columns it will be ALTERed accordingly, if possible
-i [INFILE], --infile [INFILE]: use file as input instead of stdin
-o [OUTFILE], --outfile [OUTFILE]: location/filename to use for sql database (by default a tempfile is used)
-f [FILE], --file [FILE]: write output to file instead of stdout. It's overwriting the whole file, and not just appending, so please be cautious
-e, --edit: writes to the input file instead of stdout. For this to work the input file has to be specified with the -i/--infile option. It's overwriting the whole file, and not just appending, so please be cautious
-q [QUERYFILE], --queryfile [QUERYFILE]: load complex query from file, queries on the commandline get executed last
-t [TABLE], --table [TABLE]: tablename (must be a valid sqlite tablename)
-s [SQLITE], --sqlite [SQLITE]: path to sqlite executable
-v, --version: display version information

USAGE

By default termsql reads from stdin and converts each incoming line into a row it stores inside a sqlite database in a temporary file.

It separates columns by whitespace, because on the command line most programs usually present information in the output in that way. You can supply option -d to change that behavior and split columns by any character or string of characters of your choosing. Option -l/--line-as-columns on the other hand splits columns by newlines, and expects a number by the user to know how many columns the table has.

The most powerful part of termsql is that it allows users to query these newly created databases with the full power of sqlite. When no user query is supplied as last argument it defaults to "select * from tbl", as you can see from this the default name of the table is tbl. You can change that table name with the -t option.

All sqlite SQL syntax like select/update/delete/group by/where/order by/limit and all sqlite functions such as date('now'), SUM(*) or ABS(X) can be used to modify the table and the output of termsql.

The default output mode is the default output mode of sqlite, which is called list and separates columns by |. A nice output mode in a terminal is column, which you can activate with -m column or its synonymous option -w. -m csv and -m html output in the according formats. -m line lists one line per column in a key/value pair kind of fashion. Option -0/--show-header activates the header with column names in the output.

Often the first line of input contains the column names and option -1/--head ensures that these names are used as the actual column names in the table. If you want to specify your own custom column names, you can use the -c/--columns option followed by a comma separated list of column names. By default columns are named by their position starting with COL0. User defined column names take precedence over column names automatically found using -1/--head. If you're using the -1 option and -c at the same time and the header only contains 3 names, while you define 2 names with the -c option and the total number of columns in the table is 5 the actual column names of the table would look something like this: "coption1","coption2","headername3","COL3","COL4".

If you end up having column names with special characters in them it's good to know the sqlite syntax to escape column names like this "select [%CPU],[1k-blocks] from tbl". As you see wrapping column names in brackets can save the day.

You may want to convert input into a permanent database, for that you simply use the -o option followed by the filename of the database.

Instead of reading from stdin termsql can read from a file supplied with the -i option. If you make modifications to such input you may want to save it in the same file again. Option -e only works in conjunction with the -i option and automatically overwrites the input file with the new output. Make sure to check that the output is actually satisfying before you use this method. With the -f option you can supply a different file to write the output to.

For looking at the internals of the SQL the options -m insert, -D and -C may be interesting. Mode insert outputs the result of select as a series of SQL insert statements. -D dumps the complete table and data SQL, while -C only dumps the SQL the table was created with.

When creating the database termsql detects for each column automatically the best datatype, so numbers will get stored as integer or float if possible.

Option --calc X is a special mode that emulates "select X from tbl" and doesn't create a table or read input at all. You can do calculations like this termsql --calc "2+10,100/25" or you can use sqlite functions like this termsql --calc date('now').

EXAMPLES

-$ ps aux | termsql "select * from tbl where COL0='root' limit 5": outputs 5 processes that are running as root
-$ df | termsql -1 'select Filesystem from tbl order by [1K-blocks] desc limit 1': returns the largest device
-$ cat /proc/cpuinfo | termsql "select COL2 from tbl where COL0='bogomips'": how many bogomips does your system have?
-$ termsql --infile /etc/group -d ':' "select COL0 from tbl": read from file, use : as delimiter, shows all groups in /etc/group
-$ termsql -i MyCSVFile.CSV -t fancytable -d ',' -1 -o mysqlite.db: creates a new sqlite database from a CSV file. -d ',' tells termsql to use commata as delimiter. -1 tells termsql that the first line of the CSV file is not data and instead the column names and termsql creates database columns with those names accordingly. -t fancytable sets the tablename to fancytable
-$ sensors | termsql "select COL2 from tbl where COL0='Core'": extract the temperature from all your CPU cores
-$ export LC_ALL=en_US; top -b | head | termsql -1 -H 6 "select [PID],[USER],[COMMAND],[%CPU] from tbl where [%CPU]>=25": termsql doesn't recognize numbers like "25,3" as numbers, but as strings. export LC_ALL=en_US ensures that top outputs numbers that are easy for termsql/sqlite to digest (ie. "25.3"). -H 6 makes termsql disregard the first 6 lines. We select only the processes with more than 25% cpu usage and output their PID,USER,COMMAND and %CPU.
-$ export DISPLAY=$(ps aux | termsql "select COL11 from tbl where COL10 like '%Xorg.bin%' limit 1"): set DISPLAY environment variable to what display X is running on right now, assuming that the X binary is called "Xorg.bin")
-$ ls -lha /usr/bin/* | termsql -w -r 8 "select * from tbl order by COL8 desc": -r 8 merges the filenames into the 8th column. Then "order by COL8 desc" sorts them in reverse order. Due to -w the output looks nice on the command line
-$ termsql -m line --calc 2+2,23423./123: To use termsql as simple a calculator with the power of sqlite behind it, use the --calc option and a commata separated list of calculations you want to do. Sometimes you may need to append a dot . to a number to make sqlite aware that it's supposed to do floating point calculation. If you want to use paranthesis you may also need to use quotes like this --calc "(2+2,23423./123)". The -m line option beautifies the output.
-$ cat /etc/fstab | termsql "select COUNT(*) from tbl where COL2='ext4'": how many devices do you have in your fstab that are using the ext4 filesystem? COUNT(*) simply outputs the total number of rows that the current select/where statement would select.
-$ ps aux | termsql -m line -1 "select USER,COUNT(*) from tbl group by USER": counts the total number of processes that each user has running. -1 gets the column names from the first line, therefore we can use USER instead of COL0 in the SQL statement. "group by USER" groups the rows of with identical USER together and for that reason COUNT(*) returns the total number of rows (in this case processes) for each USER. -m line beautifies the output.
-$ termsql -ei .config/Bitcoin/Bitcoin-Qt.conf -c key,value -d = -p = -x "update tbl set value='true' where key='fMinimizeToTray'": Demonstrates how you can use termsql to edit simple config files with key/value pairs. -i loads the config file as input and -e makes sure this same file is written to instead of stdout. Simply speaking: enabling edit mode. -d = sets the delimiter for splitting the input to = and -p = sets the separator for the output to = again, so that the output format matches the input format. -x appends a ";select * from tbl" to the user defined query, so that we get everything back that we put in. The user query is an SQL update statement which sets the value part of the key value pair to true, where the key is fMinimizeToTray. -c key,value allows us to use these convenient names instead of COL0 (key) and COL1 (value). Note that this approach may not always be perfect, for example in .config/Bitcoin/Bitcoin-Qt.conf there's a section line "[General]" which gets changed to "[General]=" after termsql is done with editing. Which may or may not be a problem depending on the program that uses the config file. It could easily be fixed with a tool like sed though.
-$ termsql -ei my.config -c key,value -d = -p = -x "delete from tbl where key IN ('entry1','entry2','entry3','entry4')": Analogous to the previous example this one would find and delete the key/value pair entries with the keys entry1, entry2, entry3 and entry4. Using IN in a where clause you can specifiy a list to match against instead of just a single value. Everything else from the previous example also applies in this one.

December 1, 2014

Linux 3.18.0-rc5+