
Below is a snippet of a for loop in which I sort .txt file names. I then try to save the results to a JSON file, but the output is not in the format I want. How can I convert the values from the for loop to the desired JSON format?

dir="myfiles/test/"

prefix=""
echo "[" >> test.json
for dir in "${array[@]}"; do
        #reverse the result of comparisons
        file=$(find "$dir" -maxdepth 1 -type f -iname '*.txt' | awk "NR==$i")
        [[ -n $file ]] && 
                printf '%b{ "filepath": "%s" }' "$prefix" "$file" >> test.json
        prefix=",\n"
done
echo
echo "]" >> test.json

Current output

[
    { "filepath" : "myfiles/test/sdfsd.txt" },
    { "filepath" : "myfiles/test/piids.txt" },
    { "filepath" : "myfiles/test/saaad.txt" },
    { "filepath" : "myfiles/test/smmnu.txt" },
]

Desired output

[
    [
        { "filepath" : "myfiles/test/sdfsd.txt" }
    ],
    [
        { "filepath" : "myfiles/test/piids.txt" }
    ],
    [
        { "filepath" : "myfiles/test/saaad.txt" }
    ],
    [
        { "filepath" : "myfiles/test/smmnu.txt" }
    ]
]

Also allow

[
    [
        { "filepath" : "myfiles/test/sdfsd.txt" },
        { "filepath" : "myfiles/test/sdfsd2.txt" }
    ],
    [
        { "filepath" : "myfiles/test/piids.txt" },
        { "filepath" : "myfiles/test/piids2.txt" }
    ],
    [
        { "filepath" : "myfiles/test/saaad.txt" }
    ],
    [
        { "filepath" : "myfiles/test/smmnu.txt" }
    ]
]
  • I am not sure if you can use node.js but this could be easily done with a simple Node.js script. Commented Jul 5, 2015 at 1:29
  • I think the point of Ma'moon's comment is largely that this is a lot easier in a language that has built-in support for JSON and supports hashes and arrays. In particular, the second allowed output requires a lot of logic in your bash script: you'll essentially want to build up the list of lists of filenames in one loop, and write them out in another loop. Commented Jul 5, 2015 at 1:53
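
The two-loop idea from the comment above can be sketched with awk, which has the associative arrays that plain sh lacks. This is only a minimal sketch, not part of the original question: the function name group_files_to_json is hypothetical, paths are assumed to contain no double quotes, backslashes or newlines, and files are grouped when their names differ only in a trailing number before the extension.

```shell
# Hypothetical helper: takes file paths as arguments and prints a JSON
# array of arrays, one inner array per group of similarly named files.
group_files_to_json() {
    printf '%s\n' "$@" | awk '
    length($0) == 0 { next }            # ignore empty input lines
    {
        # Loop 1: build the list of lists, keyed by the bare file name.
        key = $0
        sub(/.*\//, "", key)            # drop the directory
        sub(/\.[^.]*$/, "", key)        # drop the extension
        sub(/[0-9]+$/, "", key)         # drop a trailing number
        if (key in members) {
            members[key] = members[key] "\n" $0
        } else {
            order[++n] = key            # remember first-seen order
            members[key] = $0
        }
    }
    END {
        # Loop 2: write the groups, adding commas only between items.
        print "["
        for (i = 1; i <= n; i++) {
            print "    ["
            count = split(members[order[i]], paths, "\n")
            for (j = 1; j <= count; j++)
                printf "        { \"filepath\": \"%s\" }%s\n", paths[j], (j < count ? "," : "")
            print "    ]" (i < n ? "," : "")
        }
        print "]"
    }'
}
```

Called as `group_files_to_json myfiles/test/*.txt > test.json`, this writes each group as its own inner array and never emits a trailing comma. Paths are not JSON-escaped, so unusual file names would still need a real JSON tool.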

2 Answers


Use awk combined with jq to achieve your goal. First we use awk to transform the undesired output into the correct structure; then we format it with jq.

We use the following awk script:

{
    # extract names of files (to see if they are equal
    # besides a numerical suffix).
    name1 = line
    name2 = $0
    sub(/"[^"]*$/, "", name1)
    sub(/"[^"]*$/, "", name2)
    sub(/.*\//, "", name1)
    sub(/.*\//, "", name2)
    sub(/\....$/, "", name1)
    sub(/\....$/, "", name2)
    sub(/[0-9]*$/, "", name1)
    sub(/[0-9]*$/, "", name2)
    # add array symbols to the line
    # if last item was closed by a ']' add '[' to beginning
    if (closed)
        sub(/{/, "[{", line)
    # if names are equal, same array
    if (name1 != name2) {
        sub(/},/, "}],", line)
        closed = 1
    } else
        closed = ""
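    # if last line, consisting of simply a ']'
    if ($0 ~ /^]$/)
        # remove extra comma at end of line
        sub(/,$/, "", line)
    # if line is set, print line
    if (line)
        print line
    # set current line to line variable
    line = $0
}
# print the final buffered line (the closing ']'), which the main
# block only prints if the input ends with an extra blank line
END {
    if (line)
        print line
}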

This yields valid JSON that is not yet formatted as desired:

$ cat file 
[
    { "filepath" : "myfiles/test/sdfsd.txt" },
    { "filepath" : "myfiles/test/piids.txt" },
    { "filepath" : "myfiles/test/saaad.txt" },
    { "filepath" : "myfiles/test/smmnu.txt" },
]
$ awk -f script.awk file
[
    [{ "filepath" : "myfiles/test/sdfsd.txt" }],
    [{ "filepath" : "myfiles/test/piids.txt" }],
    [{ "filepath" : "myfiles/test/saaad.txt" }],
    [{ "filepath" : "myfiles/test/smmnu.txt" }]
]

Which we can now format using jq:

$ awk -f script.awk file | jq .
[
  [
    {
      "filepath": "myfiles/test/sdfsd.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/piids.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/saaad.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/smmnu.txt"
    }
  ]
]

Note that this also handles files that are almost identical, in the sense that their names differ only in a numerical suffix: such files end up in the same inner array. Example:

$ cat file 
[
    { "filepath" : "myfiles/test/sdfsd.txt" },
    { "filepath" : "myfiles/test/sdfsd2.txt" },
    { "filepath" : "myfiles/test/piids.txt" },
    { "filepath" : "myfiles/test/saaad.txt" },
    { "filepath" : "myfiles/test/smmnu.txt" },
]
$ awk -f script.awk file | jq .
[
  [
    {
      "filepath": "myfiles/test/sdfsd.txt"
    },
    {
      "filepath": "myfiles/test/sdfsd2.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/piids.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/saaad.txt"
    }
  ],
  [
    {
      "filepath": "myfiles/test/smmnu.txt"
    }
  ]
]
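
As an alternative sketch, if the list of file paths is still available, jq can build the grouped structure itself and take care of JSON escaping, so the intermediate invalid file is not needed. The function name paths_to_grouped_json is hypothetical, and the sketch assumes a jq version (1.5 or later) with regex support and one path per input line.

```shell
# Hypothetical helper: reads one file path per line on stdin and prints
# a JSON array of arrays, grouping paths whose file names differ only
# in a trailing number before the extension.
paths_to_grouped_json() {
    jq -Rn '
        # collect every non-empty input line as a JSON string
        [inputs | select(length > 0)]
        # group by file name without directory, trailing number, extension
        | group_by(sub(".*/"; "") | sub("[0-9]*\\.[^.]*$"; ""))
        # wrap each path in an object with a "filepath" key
        | map(map({filepath: .}))
    '
}

# Usage (assumed layout from the question):
#   find myfiles/test -maxdepth 1 -type f -iname "*.txt" | paths_to_grouped_json
```

Note that group_by sorts the groups by their key, so the inner arrays come out in alphabetical order of the base name rather than in input order.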

For the first part, save the current output to test.json and just do:

cat test.json | sed 's,{,[\n{,g;s;},;}\n],;g'  > tmp ; mv tmp test.json

A shorter way would be to:

sed -i 's,{,[\n{,g;s;},;}\n],;g' test.json

Note that this still leaves a comma after the last entry and doesn't format the output, so the result is still invalid JSON.

1 Comment

This fails for several reasons: 1. the last element is ended with a comma which is invalid json. 2. The last expected output of the OP will not be achieved using this method. 3. The format is not structured as desired. This can be improved in the following way: 1. use sed -i so you don't need to > tmp && mv .... 2. Don't pipe from cat but set the file as argument for sed. 3. Add the next line to the pattern space, if the pattern space matches \n]$, then don't add a comma before the newline.
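
The third improvement suggested in the comment above (look at the next line before deciding whether a comma belongs) could be sketched as follows. This is an illustration under assumptions, not the commenter's exact command: the function name wrap_objects is hypothetical, each object is assumed to sit on its own line, and file names are assumed to contain no braces. It only produces the first desired output (one object per inner array), without grouping.

```shell
# Hypothetical helper: takes the JSON file as its argument and prints
# a corrected version on stdout.
wrap_objects() {
    # Pass 1: keep two lines in the pattern space ($!N ... P ... D) and
    # drop the comma when the following line is the closing "]".
    sed -e '$!N' -e 's/,\(\n]\)$/\1/' -e 'P' -e 'D' "$1" |
    # Pass 2: wrap every object in its own array.
    sed -e 's/{/[{/' -e 's/}/}]/'
}
```

For example, `wrap_objects test.json > tmp && mv tmp test.json` rewrites the file. The result keeps each inner array on a single line, so it can still be piped through `jq .` for pretty printing.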
