I have a .csv file that contains:
Data1|Data2|10/24/2017 8:10:00 AM
I want to change the date and time format of column 3 as follows:
From 10/24/2017 8:10:00 AM (12-hour) to 20171024 08:10:00 (24-hour),
without using date -d.
A pure awk solution (that doesn’t fork off a date command):
awk -F'|' -v OFS='|' '
function fail() {
    printf "Bad data at line %d: ", NR
    print
}
{
    # split "MM/DD/YYYY H:MM:SS AM|PM" into date, time, and AM/PM
    # ("next" stays in the rule: many awks reject it inside a function)
    if (split($3, date_time, " ") != 3) { fail(); next }
    if (split(date_time[1], date, "/") != 3) { fail(); next }
    if (split(date_time[2], time, ":") != 3) { fail(); next }
    if (time[1] == 12) time[1] = 0              # 12 AM is hour 00
    if (date_time[3] == "PM") time[1] += 12     # 12 PM becomes 0 + 12
    $3 = sprintf("%.4d%.2d%.2d %.2d:%.2d:%.2d", date[3], date[1], date[2], time[1], time[2], time[3])
    print
}'
-F'|' breaks the input line apart at vertical bars
into $1, $2, $3, etc.
split($3, date_time, " ") breaks the date/time field into three pieces:
the date, the time, and the AM/PM indicator.
If there aren’t three pieces, issue an error message and skip the line.
split(date_time[1], date, "/") splits the date
into the month, the day, and the year.
split(date_time[2], time, ":") splits the time
into the hour, the minutes, and the seconds.
The two tests on time[1] convert the hour to 24-hour form:
an hour of 12 becomes 0, and 12 is added if the indicator is PM.
sprintf reassembles the year, month, day,
hour, minutes, and seconds, with leading zeroes, if necessary.
Assigning this to $3 rebuilds the input line
with the reformatted date/time; we then print that.
Feature: If the input has more than three fields; e.g.,
Data1|Data2|10/24/2017 8:10:00 AM|Data4|Data5
this script will preserve those extra field(s).
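The hour arithmetic is the only non-obvious step, so here is a minimal sketch for trying it on single timestamps. The function name to24 is made up for illustration; it applies the same two tests as the script above, without the error checking.

```shell
# Hypothetical helper (not part of the answer above): convert one
# "MM/DD/YYYY H:MM:SS AM|PM" string with the same hour arithmetic.
to24() {
    printf '%s\n' "$1" | awk '{
        split($1, date, "/")                 # month, day, year
        split($2, time, ":")                 # hour, minutes, seconds
        if (time[1] == 12) time[1] = 0       # 12 AM -> 00, 12 PM -> 0 + 12
        if ($3 == "PM") time[1] += 12
        printf "%.4d%.2d%.2d %.2d:%.2d:%.2d\n", date[3], date[1], date[2], time[1], time[2], time[3]
    }'
}

to24 '10/24/2017 12:05:00 AM'   # -> 20171024 00:05:00
to24 '10/24/2017 12:05:00 PM'   # -> 20171024 12:05:00
to24 '10/24/2017 8:10:00 PM'    # -> 20171024 20:10:00
```

Resetting 12 to 0 before adding 12 for PM is what keeps noon at 12 and sends midnight to 00.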
Usage: A few minor variations:
On the command line, right after the closing }', put the name(s) of file(s) you want to process.
You can (of course) use wildcards (e.g., *.csv) here,
in addition to or instead of filename(s).
Or, after the closing }', say < and a filename.
(You can process only one file at a time this way.)
To make a script instead, start the file with #!/bin/sh.
(Or, if you prefer, you can use #!/bin/bash or #!/usr/bin/env bash.
A discussion of the differences between these different “she-bang” lines,
and their relative merits and counter-indications,
is beyond the scope of this question,
but you can find plenty of discourse on the topic if you search.)
Then put the awk command and, right after the closing }',
put "$@" (including the quotes).
Save the file under a name of your choice; say, gman.
Run chmod +x gman.
Then run ./gman followed by either a list of filenames and/or wildcards,
or by < and a single filename.
Here is one way of doing it assuming infile is your CSV file:
#!/bin/bash
# IFS applies only to read; -r keeps backslashes in the data literal
while IFS='|' read -r data1 data2 datestr
do
    newdatestr=$(date -d "$datestr" +"%Y%m%d %T")
    printf "%s|%s|%s\n" "$data1" "$data2" "$newdatestr"
done < infile
With awk:
Save this as a.awk:
BEGIN {
    FS = "|"
    OFS = FS
}
{
    cmd = "date -d '" $3 "' +'%Y%m%d %T'"
    cmd | getline l
    # close the pipe: otherwise a repeated timestamp reads from an
    # exhausted pipe and l keeps the value from an earlier line
    close(cmd)
    $3 = l
    print
}
and run it on your CSV file:
awk -f a.awk file.csv
For example, the output is:
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 20:10:00
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 20:14:00
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 20:11:00
Data1|Data2|20171024 20:10:06
Data1|Data2|20171024 20:10:00
Data1|Data2|20171024 08:10:50
for this input:
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:10:00 PM
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:14:00 PM
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:11:00 PM
Data1|Data2|10/24/2017 8:10:06 PM
Data1|Data2|10/24/2017 8:10:00 PM
Data1|Data2|10/24/2017 8:10:50 AM
I'd use perl or any language with an interface to strptime() and strftime():
perl -MTime::Piece -F'[|]' -lape '
$F[2] = Time::Piece->strptime($F[2], "%m/%d/%Y %I:%M:%S %p")->
strftime("%Y%m%d %T");
$_ = join "|", @F' < file.csv
Same with zsh:
zmodload zsh/datetime
while IFS='|' read -rA F; do
strftime -rs t '%m/%d/%Y %I:%M:%S %p' $F[3] &&
strftime -s 'F[3]' '%Y%m%d %T' $t
printf '%s\n' "${(j:|:)F}"
done < file.csv
Using GNU date (but not date -d) and a shell like bash that understands process substitutions:
$ cat file
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:10:00 AM
Data1|Data2|10/24/2017 8:10:00 AM
$ paste -d '|' <( cut -d '|' -f -2 file ) <( date -f <( cut -d '|' -f 3 file ) +'%Y%m%d %T' )
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 08:10:00
Data1|Data2|20171024 08:10:00
The call to date reads the dates from the cut command, which extracts the third |-delimited column from the given file. It outputs one reformatted date per line of input.
This is then pasted together with the first two columns using paste.
This has the downside that it reads the file twice, but it only calls date once (and without -d).
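One thing to watch with this approach is alignment: if date cannot parse a line, it prints fewer lines than cut does, and paste then pairs rows with the wrong timestamps. The following is a minimal sketch of a guard, assuming GNU date (which, as far as I know, reports unparseable lines on stderr and exits non-zero). The function name convert_col3 is hypothetical, and a temporary file replaces the process substitutions so that it also runs under plain sh.

```shell
# Hypothetical wrapper (names are illustrative): same cut/date/paste idea,
# but print nothing unless every timestamp in column 3 was parsed.
convert_col3() {
    tmp=$(mktemp) || return 1
    # "date -f -" reads one date per line from standard input (GNU date)
    if cut -d '|' -f 3 "$1" | date -f - +'%Y%m%d %T' > "$tmp"; then
        cut -d '|' -f -2 "$1" | paste -d '|' - "$tmp"
        status=0
    else
        echo "unparseable date in $1; nothing printed" >&2
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage would be convert_col3 file; like the original pipeline it still reads the file twice and calls date once.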
You could also do this with dateutils, e.g. with the following input:
10/24/2017 8:10:00 AM
10/24/2017 8:10:00 PM
10/24/2017 8:10:00 AM
10/24/2017 8:14:00 PM
10/24/2017 8:10:00 AM
10/24/2017 8:11:00 PM
10/24/2017 8:10:06 PM
10/24/2017 8:10:00 PM
10/24/2017 8:10:50 AM
and the dateconv or dateutils.dconv program:
dateconv -i '%m/%d/%Y %I:%M:%S %p' -f '%Y%m%d %T' < infile
Output:
20171024 08:10:00
20171024 20:10:00
20171024 08:10:00
20171024 20:14:00
20171024 08:10:00
20171024 20:11:00
20171024 20:10:06
20171024 20:10:00
20171024 08:10:50
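Part of this can be done with sed's extended regular expressions;
I am amazed that no one has given an answer using sed.
A GNU sed one-liner:
sed -r 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})/\3\1\2/' file_name
Here the extended regex captures the month, day, and year as groups and rewrites them as YYYYMMDD. Note that it only reorders the date; the time stays in 12-hour form with its AM/PM suffix.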
date -d?