csvjoin

Type:command
Package:prepDATA/1.1 — Conversion, Transformation and Plotting of Basic Data Files
Namespace:&type1

Description

Execute a SQL-like join to merge CSV files on a specified column or columns.
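
As a rough illustration of what a "SQL-like join" on one column means, the following Python sketch performs an inner join (the default join type) with the standard library only. It is not the csvjoin implementation: the function name `inner_join`, the sample data and the column names are hypothetical, keys are compared as plain strings without type inference, columns that share a name would collide, and the column layout of the real tool's output may differ.

```python
import csv
import io


def inner_join(left_text, right_text, left_col, right_col):
    """Join two CSV texts, keeping only rows whose key occurs in both."""
    left_rows = list(csv.DictReader(io.StringIO(left_text)))
    right_rows = list(csv.DictReader(io.StringIO(right_text)))

    # Index the right-hand rows by key so each left row is matched quickly.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_col], []).append(row)

    joined = []
    for row in left_rows:
        # A left row with no match contributes nothing to an inner join.
        for match in index.get(row[left_col], []):
            merged = dict(row)
            # Skip the right key column; it repeats the left key value.
            merged.update({k: v for k, v in match.items() if k != right_col})
            joined.append(merged)
    return joined


# Hypothetical sample data: two small CSV files held in strings.
people = "id,name\n1,Ada\n2,Grace\n3,Linus\n"
cities = "person_id,city\n1,London\n3,Helsinki\n4,Oslo\n"

result = inner_join(people, cities, "id", "person_id")
for row in result:
    print(row)
```

Only ids 1 and 3 appear in both inputs, so only those two rows survive. Both inputs are loaded completely before any matching starts, which is also why memory use grows with file size.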

Usage

csvjoin {options}
Bash equivalent: csvjoin {options}
usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
[-S] [--blanks] [--date-format DATE_FORMAT]
[--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
[-l] [--zero] [-V] [-c COLUMNS] [--outer] [--left] [--right]
[-y SNIFF_LIMIT] [-I]
[FILE [FILE ...]]
.
Execute a SQL-like join to merge CSV files on a specified column or columns.
.
positional arguments:
FILE The CSV files to operate on. If only one is specified,
it will be copied to STDOUT.
.
optional arguments:
-h, --help show this help message and exit
-d DELIMITER, --delimiter DELIMITER
Delimiting character of the input CSV file.
-t, --tabs Specify that the input CSV file is delimited with
tabs. Overrides "-d".
-q QUOTECHAR, --quotechar QUOTECHAR
Character used to quote strings in the input CSV file.
-u {0,1,2,3}, --quoting {0,1,2,3}
Quoting style used in the input CSV file. 0 = Quote
Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 =
Quote None.
-b, --no-doublequote Whether or not double quotes are doubled in the input
CSV file.
-p ESCAPECHAR, --escapechar ESCAPECHAR
Character used to escape the delimiter if --quoting 3
("Quote None") is specified and to escape the
QUOTECHAR if --no-doublequote is specified.
-z FIELD_SIZE_LIMIT, --maxfieldsize FIELD_SIZE_LIMIT
Maximum length of a single field in the input CSV
file.
-e ENCODING, --encoding ENCODING
Specify the encoding of the input CSV file.
-L LOCALE, --locale LOCALE
Specify the locale (en_US) of any formatted numbers.
-S, --skipinitialspace
Ignore whitespace immediately following the delimiter.
--blanks Do not convert "", "na", "n/a", "none", "null", "." to
NULL.
--date-format DATE_FORMAT
Specify a strptime date format string like "%m/%d/%Y".
--datetime-format DATETIME_FORMAT
Specify a strptime datetime format string like
"%m/%d/%Y %I:%M %p".
-H, --no-header-row Specify that the input CSV file has no header row.
Will create default headers (a,b,c,...).
-K SKIP_LINES, --skip-lines SKIP_LINES
Specify the number of initial lines to skip (e.g.
comments, copyright notices, empty rows).
-v, --verbose Print detailed tracebacks when errors occur.
-l, --linenumbers Insert a column of line numbers at the front of the
output. Useful when piping to grep or as a simple
primary key.
--zero When interpreting or displaying column numbers, use
zero-based numbering instead of the default 1-based
numbering.
-V, --version Display version information and exit.
-c COLUMNS, --columns COLUMNS
The column name(s) on which to join. Should be either
one name (or index) or a comma-separated list with one
name (or index) for each file, in the same order that
the files were specified. May also be left
unspecified, in which case the two files will be
joined sequentially without performing any matching.
--outer Perform a full outer join, rather than the default
inner join.
--left Perform a left outer join, rather than the default
inner join. If more than two files are provided this
will be executed as a sequence of left outer joins,
starting at the left.
--right Perform a right outer join, rather than the default
inner join. If more than two files are provided this
will be executed as a sequence of right outer joins,
starting at the right.
-y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
Limit CSV dialect sniffing to the specified number of
bytes. Specify "0" to disable sniffing entirely.
-I, --no-inference Disable type inference when parsing CSV input.
.
Note that the join operation requires reading all files into memory. Don’t try
this on very large files.
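
The --outer, --left and --right options differ from the default inner join only in which unmatched rows are kept. The Python sketch below makes that difference concrete for two inputs that share one key column. It is a minimal illustration under stated assumptions, not the tool's own code: `join_rows`, its `how` parameter and the sample rows are hypothetical, missing values are represented as `None`, and the row and column order of real csvjoin output may differ.

```python
def join_rows(left, right, key, how="inner"):
    """Join two non-empty lists of dicts on a shared column `key`.

    how: "inner", "left", "right" or "outer".
    """
    left_cols = list(left[0].keys())
    right_cols = [c for c in right[0].keys() if c != key]

    # Index the right-hand rows by key value.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    out = []
    matched = set()
    for row in left:
        matches = index.get(row[key], [])
        if matches:
            matched.add(row[key])
            for m in matches:
                out.append({**row, **{c: m[c] for c in right_cols}})
        elif how in ("left", "outer"):
            # Unmatched left row: keep it and pad the right-hand columns.
            out.append({**row, **{c: None for c in right_cols}})

    if how in ("right", "outer"):
        for row in right:
            if row[key] not in matched:
                # Unmatched right row: keep it and pad the left-hand columns.
                padded = {c: None for c in left_cols}
                padded[key] = row[key]
                padded.update({c: row[c] for c in right_cols})
                out.append(padded)
    return out


# Hypothetical sample rows, as csv.DictReader would produce them.
left = [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Grace"}]
right = [{"id": "2", "city": "Arlington"}, {"id": "3", "city": "Helsinki"}]

for how in ("inner", "left", "right", "outer"):
    print(how, join_rows(left, right, "id", how))
```

With this data the inner join yields one row (id 2), the left and right joins yield two rows each, and the full outer join yields three.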