bash - parse CSV, group all rows containing a string in the 5th field, export each group of rows to a file named <group>_someconstant.csv
I need to do this in bash.
In a Linux directory I have a CSV file. For the sake of example, the file has 6 rows.
main_export.csv
1,2,3,4,8100_group1,6,7,8
1,2,3,4,8100_group1,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,5400_group3,6,7,8
1,2,3,4,5400_group3,6,7,8
I need to parse the file's 5th field (first 4 characters only), take each row containing 8100 (for example), and put those rows in a new file. The same goes for every other group that exists across the entire file.
Each new file can contain only the rows of its group (one file with the 8100 rows, one file with the 3100 rows, etc.).
Each filename needs to have the group number prepended to it.
The first 4 characters can be any numeric value, so I can't check them against a list: there are 50 groups, and maintenance can't be done every time a group number changes.
When parsing the fifth field, I only care about the first 4 characters.
So we'd start with: main_export.csv
and end up with 4 files:
main_export_$date.csv (unchanged)
8100_filenameconstant_$date.csv
3100_filenameconstant_$date.csv
5400_filenameconstant_$date.csv
I'm not sure of the rules of this site, or whether I have to try this myself first and then post. I'll come back once I have an idea; I'm at a total loss. Reading up on awk right now.
If I have understood the problem, it's easy...
You can just:
$ awk -F, '{fifth=substr($5, 1, 4) ; print > (fifth "_mysuffix.csv")}' file.csv
or just:
$ awk -F, '{print > (substr($5, 1, 4) "_mysuffix.csv")}' file.csv
and you get several files like:
$ cat 3100_mysuffix.csv
1,2,3,4,3100_group2,6,7,8
1,2,3,4,3100_group2,6,7,8
or...
$ cat 5400_mysuffix.csv
1,2,3,4,5400_group3,6,7,8
1,2,3,4,5400_group3,6,7,8
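If you also want the date in the filenames, as in the question, something like the sketch below might do. The "filenameconstant" text, the YYYYMMDD date format, and the temporary working directory are my assumptions, not anything the question specifies. It also closes each output file after writing, since some awk implementations limit how many files can be open at once and the question mentions around 50 groups.

```shell
#!/usr/bin/env bash
# Sketch only: split a CSV by the first 4 characters of field 5.
# Assumed for illustration: the "filenameconstant" suffix, the
# YYYYMMDD date format, and working in a fresh temporary directory.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Sample input taken from the question.
cat > main_export.csv <<'EOF'
1,2,3,4,8100_group1,6,7,8
1,2,3,4,8100_group1,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,3100_group2,6,7,8
1,2,3,4,5400_group3,6,7,8
1,2,3,4,5400_group3,6,7,8
EOF

stamp=$(date +%Y%m%d)   # assumed date format

# -F, sets the field separator to a comma (capital F; lowercase -f
# would tell awk to read its program from a file instead).
awk -F, -v suffix="_filenameconstant_${stamp}.csv" '
{
    out = substr($5, 1, 4) suffix   # group number + constant suffix
    print >> out                    # append, because the file is reopened per row
    close(out)                      # keep the number of open files at one
}' main_export.csv

# Keep an unchanged, dated copy of the original file.
cp main_export.csv "main_export_${stamp}.csv"
```

Because this appends with >>, running it twice in the same directory would duplicate rows in the group files, so start from a clean directory or remove the old output first. If your awk handles many open files fine, the shorter one-liners above are enough.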