I have a file in which the data for data_0 to data_4 is repeated in rows. I need to convert it into columns, with each value under its respective dataset. Is there a way to put a blank/null value in a column when the data for that category is missing? For example:

TimeStamp,Block,No_of_requests
04:19:12,data_0,4
04:19:12,data_1,6
04:19:12,date_2,8
04:19:12,date_3,10
04:19:12,data_4,12
04:19:14,data_0,5
04:19:14,data_1,6
04:19:14,date_3,7
04:19:14,data_4,8

The expected output is:

TimeStamp,data_0,data_1,data_2,data_3,data_4
04:19:12,4,6,8,10,12
04:19:14,5,6,,7,8

and so on. The output should contain an empty field wherever the value for the respective data_x is not available.

2 Answers

GNU awk solution:

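awk 'BEGIN{
         FS = OFS = ",";
         # gawk only: traverse array indices in ascending order
         PROCINFO["sorted_in"] = "@ind_num_asc";
         print "TimeStamp,data_0,data_1,data_2,data_3,data_4"
     }
     # skip the header; key by timestamp, then by block number + 1
     NR > 1{ a[$1][substr($2, 6) + 1] = $3 }
     END{
         for (i in a) {
             printf "%s,", i;
             # a missing block prints as an empty field
             for (j=0; j<=4; j++) printf "%s%s", a[i][j+1], (j == 4? ORS:OFS)
         }
     }' file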

The output:

TimeStamp,data_0,data_1,data_2,data_3,data_4
04:19:12,4,6,8,10,12
04:19:14,5,6,,7,8
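
The script above relies on GNU awk features (arrays of arrays and PROCINFO["sorted_in"]). Where only a POSIX awk is available, the same idea can be sketched with SUBSEP-joined keys. This is a minimal sketch, not a drop-in replacement: the function name pivot is hypothetical, the five columns are still hardcoded, and rows are printed in first-seen timestamp order rather than sorted.

```shell
# Sample input from the question, saved under the name used in the answers.
cat > file <<'EOF'
TimeStamp,Block,No_of_requests
04:19:12,data_0,4
04:19:12,data_1,6
04:19:12,date_2,8
04:19:12,date_3,10
04:19:12,data_4,12
04:19:14,data_0,5
04:19:14,data_1,6
04:19:14,date_3,7
04:19:14,data_4,8
EOF

# pivot is a hypothetical helper name; it takes the input file as its argument.
pivot() {
    awk -F, -v OFS=, '
        NR == 1 { next }                    # skip the header line
        {
            if (!($1 in seen)) {            # remember timestamps in first-seen order
                seen[$1] = 1
                order[++n] = $1
            }
            # numeric suffix after the 5-character prefix ("data_" or "date_")
            val[$1, substr($2, 6) + 0] = $3
        }
        END {
            print "TimeStamp,data_0,data_1,data_2,data_3,data_4"
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (j = 0; j <= 4; j++) {
                    key = order[i] SUBSEP j
                    # a missing (timestamp, block) pair becomes an empty field
                    line = line OFS ((key in val) ? val[key] : "")
                }
                print line
            }
        }
    ' "$@"
}

pivot file
```

Testing with (key in val) rather than reading val[key] directly avoids creating empty array elements as a side effect of the lookup.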

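Similar to Roman's answer (and also GNU awk only), but this hardcodes less about the contents of the file: the column names are collected from the Block field instead of being written into the script.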

awk -F, -v OFS=, '
    NR > 1 {data[$1][$2] = $3; blocks[$2]}
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc"

        # header
        printf "TimeStamp"
        for (block in blocks) {
            printf "%s%s", OFS, block
        }
        print ""

        # data
        for (ts in data) {
            printf "%s", ts
            for (block in blocks) {
                printf "%s%s", OFS, data[ts][block]
            }
            print ""
        }
    }
' file

The output:

TimeStamp,data_0,data_1,data_4,date_2,date_3
04:19:12,4,6,12,8,10
04:19:14,5,6,8,,7

Note that your sample data uses both "data" and "date" as the block prefix, which is why date_2 and date_3 sort after data_4 in the output above.
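
If the "date" spelling is a typo rather than a genuine block name (an assumption; only the asker can confirm), the Block field could be normalized before pivoting. A minimal sketch, where normalize_blocks is a hypothetical helper name and the two data lines are taken from the question:

```shell
# normalize_blocks is a hypothetical helper; it reads the named files, or stdin if none are given.
normalize_blocks() {
    awk -F, -v OFS=, '
        NR > 1 { sub(/^date_/, "data_", $2) }   # rewrite the prefix in the Block field only
        { print }
    ' "$@"
}

printf '%s\n' 'TimeStamp,Block,No_of_requests' \
              '04:19:14,data_1,6' \
              '04:19:14,date_3,7' | normalize_blocks
```

The cleaned stream can then be piped into the awk script above by dropping its file argument, since awk reads standard input when no file is named.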
