All the CSV-to-TSV tutorials suggest a simple:

tr ',' '\t'

though some CSVs look like this:

1,310,"IntAct,PINA"

in which case I would like to keep "IntAct,PINA":

1   310 "IntAct,PINA"

How could I parameterize the tr command (or sed, etc.) in order to do that?

I appreciate any suggestions.

Eliah Kagan

3 Answers

Use csvformat from csvkit:

csvformat -d, -D$'\t' file

or shorter:

csvformat -T file

-d input delimiter (not needed here, as , is the default input delimiter)

-D output delimiter

-T set tabs as output delimiter

This removes the quotes, as they are not needed in a TSV file.
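
To illustrate what such a conversion does, here is a minimal sketch using Python's standard csv module rather than csvkit itself. The helper name csv_to_tsv is made up for this example, and it assumes the fields contain no tabs or newlines.

```python
import csv
import io


def csv_to_tsv(text: str) -> str:
    """Convert comma-separated text to tab-separated text (illustrative helper)."""
    out = io.StringIO()
    # The reader honors double quotes, so "IntAct,PINA" is parsed as one field.
    reader = csv.reader(io.StringIO(text, newline=""))
    # QUOTE_MINIMAL quotes a field only if it contains a tab, a quote, or a newline,
    # so a field that merely contains a comma is written without quotes.
    writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(csv_to_tsv('1,310,"IntAct,PINA"\n'), end="")
```

The key point is that the input is parsed as CSV first instead of blindly replacing every comma, which is why the embedded comma survives.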


You should be able to install csvkit via pip:

sudo apt install python3-pip
pip3 install csvkit
pLumo

If csvkit (which I recommend) is not available, then you could use the perl Text::CSV module:

perl -MText::CSV -lne '
  BEGIN{$p = Text::CSV->new} print join "\t", $p->fields() if $p->parse($_)
' file

If you insist on retaining the quoting (which is unnecessary, since the embedded , is no longer a separator), then you could do something like

print join "\t", map { $_ =~ s/.*,.*/"$&"/r } $p->fields() if $p->parse($_)
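
For comparison, the same re-quoting idea can be sketched in Python with the standard csv module. This is only an illustration, not a replacement for the Perl one-liner: the function name tsv_line_keep_quotes is hypothetical, and it assumes fields contain no tabs or embedded double quotes.

```python
import csv


def tsv_line_keep_quotes(line: str) -> str:
    """Join parsed CSV fields with tabs, re-quoting any field that contains a comma."""
    # Parse a single CSV line; quoted fields come back with the quotes stripped.
    fields = next(csv.reader([line]))
    # Wrap only comma-bearing fields in double quotes, leaving the rest untouched.
    return "\t".join(f'"{f}"' if "," in f else f for f in fields)


if __name__ == "__main__":
    print(tsv_line_keep_quotes('1,310,"IntAct,PINA"'))
```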
steeldriver

Using your CSV, which has no header line,

1,310,"IntAct,PINA"

and Miller (https://github.com/johnkerl/miller)

mlr --icsv --otsv --implicit-csv-header --headerless-csv-output cat input.csv

gives you back

1       310     IntAct,PINA
aborruso