
I am reading the file names in a directory, extracting part of each name, and adding the result to an array. Several files share the same extracted part, so I end up extracting duplicate names as well. The original file names in the directory are:

100_abc strategy-42005_04May2020_0000-04May2020_first_file.csv
100_abc strategy-42005_04May2020_0000-04May2020_second_file.csv
101_xyz statistics strategy_04May2020_first_file.csv

Script used:


#!/bin/bash

c=0

for filename in /home/vikrant_singh_rana/testing/*; do
        #stripping a file name 
        GroupName=$(basename "$filename" ".csv" | awk -F "_" '{print $2}' | awk -F "-" '{print $1}')
        echo "$GroupName"

        var["$c"]="$GroupName"
        c=$(($c+1))
done
echo "print my array"
echo "${var[*]}"

The names it extracts from the file names contain spaces, for example:

abc strategy
abc strategy
xyz statistics strategy

So when I print my array, it prints:

abc strategy abc strategy xyz statistics strategy

The code above adds the same name to the array again whenever it encounters that name again while reading.

So I added an if statement to prevent that, but it does not work as expected. I expected the array to contain each name only once.

for filename in /home/vikrant_singh_rana/testing/*; do
        GroupName=$(basename "$filename" ".csv" | awk -F "_" '{print $2}' | awk -F "-" '{print $1}')

        if [[  "${var[@]}"  =~ "$GroupName" ]]; then
                echo "I am here "
                c=$(($c+1))
                var["$c"]="$GroupName"
        fi

done
  • What do you mean by "duplicate filenames"? You cannot have two different files with the same name in a directory. Commented Jun 15, 2020 at 6:17
  • I mean the files themselves are unique because of the date and time in their names, but the names I extract from them are duplicates. Commented Jun 15, 2020 at 6:18
  • Could you possibly show us the original filenames? Commented Jun 15, 2020 at 6:26
  • I have added the file names as well. Commented Jun 15, 2020 at 7:03

1 Answer


It might be easier to sort in a pipeline:

readarray -t var < <(
    cd "$HOME/testing"
    printf "%s\n" * | cut -d"_" -f2 | cut -d"-" -f1 | sort -u
)

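readarray -t reads each line of stdin into its own array element and strips the trailing newline, while sort -u drops the duplicate lines before they reach the array.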
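
To see how readarray treats lines that contain spaces, here is a minimal sketch that feeds it the three extracted names from the question directly instead of reading a directory. The array name demo is only an illustration, and readarray needs bash 4 or later.

```shell
#!/bin/bash
# Three extracted names, two of them identical; sort -u removes the duplicate.
readarray -t demo < <(
    printf '%s\n' 'abc strategy' 'abc strategy' 'xyz statistics strategy' | sort -u
)

# Each line becomes exactly one element, spaces included.
echo "elements: ${#demo[@]}"
printf '[%s]\n' "${demo[@]}"
```

Because the split happens on newlines only, a name such as "abc strategy" stays a single element rather than being broken at the space.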

You can inspect the array with declare -p var
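
If you would rather keep the loop, it may help to see why the if statement in the question never fires: inside [[ ]], "${var[@]}" is joined into one string and =~ with a quoted right-hand side does a substring match, so the branch runs only when the name is already present, and with an empty array it never runs at all. A possible alternative, sketched here under the assumption of bash 4 or later, tracks the names already stored in an associative array. The temporary directory, the seen array, and the parameter expansions replacing awk are all illustrative choices, not part of the original script.

```shell
#!/bin/bash
# Hypothetical sample directory standing in for the testing directory.
dir=$(mktemp -d)
touch "$dir/100_abc strategy-42005_04May2020_0000-04May2020_first_file.csv" \
      "$dir/100_abc strategy-42005_04May2020_0000-04May2020_second_file.csv" \
      "$dir/101_xyz statistics strategy_04May2020_first_file.csv"

declare -A seen   # extracted name -> 1 once it has been stored
var=()            # unique names, in first-seen order

for filename in "$dir"/*; do
    name=$(basename "$filename" .csv)
    name=${name#*_}     # drop everything up to and including the first "_"
    name=${name%%_*}    # keep only what precedes the next "_"
    name=${name%%-*}    # drop everything from the first "-" onwards
    [[ -n $name ]] || continue          # skip names with no second field
    if [[ -z ${seen[$name]} ]]; then    # add only names not stored yet
        seen[$name]=1
        var+=("$name")
    fi
done

declare -p var
rm -rf "$dir"
```

The lookup in seen is an exact match on the whole name, so a name that happens to be a substring of another one is still treated as distinct.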
