bash - Parse CSV, group all rows by the string in the 5th field, export each group of rows to a file named <group>_someconstant.csv


I need to do this in bash.

In a Linux directory, I have a CSV file. For the sake of example, say the file has 6 rows.

main_export.csv

1,2,3,4,8100_group1,6,7,8
1,2,3,4,8100_group1,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,5400_group3,6,7,8
1,2,3,4,5400_group3,6,7,8

  • I need to parse the file's 5th field (first 4 characters only), take each row with 8100 (for example), and put those rows in a new file. The same goes for every other group that exists, across the entire file.

  • Each new file can only contain rows from one group (one file for rows with 8100, one file for rows with 3100, etc.).

  • Each filename needs to have the group number prepended to it.

The first 4 characters can be any numeric value, so I can't check them against a list. There are 50 groups, and maintenance can't be done on this if a group number changes.

When parsing the fifth field, I only care about the first 4 characters.

So we'd start with main_export.csv and end up with 4 files:

  • main_export_$date.csv (unchanged)
  • 8100_filenameconstant_$date.csv
  • 3100_filenameconstant_$date.csv
  • 5400_filenameconstant_$date.csv

I'm not sure about the rules of this site, and whether I have to try it myself first and then post this. I'll come back once I have some idea; I'm at a total loss. Reading up on awk right now.

If I have understood the problem, it's easy...

You can just:

$ awk -F, '{fifth=substr($5, 1, 4) ; print > (fifth "_mysuffix.csv")}' file.csv

or just:

$ awk -F, '{print > (substr($5, 1, 4) "_mysuffix.csv")}' file.csv

and you will get several files like:

$ cat 3100_mysuffix.csv
1,2,3,4,3100_group2,6,7,8
1,2,3,4,3100_group2,6,7,8

or...

$ cat 5400_mysuffix.csv
1,2,3,4,5400_group3,6,7,8
1,2,3,4,5400_group3,6,7,8
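
To get the exact names from the question (group, constant, date), the same awk idea can be wrapped in a small shell function. This is only a sketch: the function name split_by_group, the default suffix filenameconstant and the %Y%m%d date format are assumptions, since the question does not say what $date looks like. It appends to and closes the output file on every row, so an awk with a low limit on open files should still cope with 50 or so groups.

```shell
# Sketch: split a CSV into one file per group, where the group is the
# first 4 characters of the 5th comma-separated field.
# split_by_group, the default suffix and the date format are assumptions
# made for illustration; adjust them to your naming scheme.
split_by_group() {
    input=$1                        # e.g. main_export.csv
    suffix=${2:-filenameconstant}   # constant part of each output name
    stamp=${3:-$(date +%Y%m%d)}     # date stamp used in every name

    awk -F, -v suffix="$suffix" -v stamp="$stamp" '
        {
            # Build <group>_<suffix>_<stamp>.csv from the 5th field.
            out = substr($5, 1, 4) "_" suffix "_" stamp ".csv"
            # Append and close per row so that many groups do not
            # hit the open-file limit of some awk implementations.
            print >> out
            close(out)
        }
    ' "$input"

    # Keep an unchanged, date-stamped copy of the original export.
    cp "$input" "${input%.csv}_${stamp}.csv"
}
```

Called as split_by_group main_export.csv, it writes the group files into the current directory and copies the input to main_export_<date>.csv. Because it appends, remove old output files before re-running it with the same date.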
