Skip to content

Instantly share code, notes, and snippets.

@JoeGermuska
Forked from bycoffe/csvcut
Created September 1, 2010 20:51
Show Gist options
  • Save JoeGermuska/561347 to your computer and use it in GitHub Desktop.
Save JoeGermuska/561347 to your computer and use it in GitHub Desktop.
#!/usr/bin/env python
"""
Like cut, but for CSVs. To be used from a shell command line.
Note that fields are zero-based, as opposed to 'cut' where they are 1-based.
Should use something better than getopt, but this works...
Usage:
csvcut foobar.csv
(prints the first column of each row of foobar.csv)
head -10 foobar.csv | csvcut -f 0,2
(prints the first and third columns of the first ten lines of foobar.csv)
csvcut -f 0,2 -d "|" foobar.csv
(prints the first and third columns of the pipe-delimited foobar.csv)
csvcut -f 0,2 -t foobar.csv
(prints the first and third columns of the tab-delimited foobar.csv
if present, the -d option will be ignored.)
csvcut -h foobar.csv
(prints the values of the first line of foobar.csv, preceded by the field index which would
be used to display that column. If present, the -f option will be ignored.)
csvcut -f 0,1,2 -d "|" -o , foobar.csv
(prints the first three columns of the pipe-delimited foobar.csv; output
will be comma-delimited.)
csvcut -f 0,1,2 -o "|" foobar.csv
(prints the first three columns of the comma-delimited foobar.csv; output
will be pipe-delimited.)
csvcut -f : -o "|" foobar.csv
(prints all the columns of the comma-delimited foobar.csv; output will be
pipe-delimited.)
csvcut -f 0,1 -d "," -q "|" foobar.csv
(prints the first two columns of the comma-delimited, pipe-quoted foorbar.csv.)
"""
import sys, csv, getopt
opts, args = getopt.getopt(sys.argv[1:], "f:d:o:q:ht", [])
if args:
i = open(args[0],"U")
else:
i = sys.stdin
delimiter = ','
output_delimiter = ' '
cols = [0, ]
show_headers = False
if opts:
opts = dict(opts)
show_headers = '-h' in opts
if '-f' in opts:
cols = opts['-f'].split(",")
if '-t' in opts:
delimiter = "\t"
elif '-d' in opts:
delimiter = opts['-d']
if '-o' in opts:
output_delimiter = opts['-o']
if '-q' in opts:
quotechar = opts['-q']
else:
quotechar = None
for row in csv.reader(i, delimiter=delimiter, quotechar=quotechar):
if show_headers:
for i,c in enumerate(row):
print "%3i: %s" % (i,c)
break
writer = csv.writer(sys.stdout, delimiter=output_delimiter)
if cols == [':']:
cols = range(len(row))
writer.writerow([row[int(c)] for c in cols])
@JoeGermuska
Copy link
Author

Added "U" to the open call to handle common case of Mac OS Excel CSV exports, which have Mac line feeds instead of UNIX.

@jamesbroadhead
Copy link

If you're using this, there's a newer version merged into csvkit -- https://github.com/wireservice/csvkit :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment