#!/bin/bash
# CSV to JSON converter using BASH
# original script from http://blog.secaserver.com/2013/12/convert-csv-json-bash/
# thanks SecaGuy!
# Usage ./csv2json.sh input.csv > output.json
input=$1
[ -z $1 ] && echo "No CSV input file specified" && exit 1
[ ! -e $input ] && echo "Unable to locate $1" && exit 1
read first_line < $input
a=0
headings=`echo $first_line | awk -F, {'print NF'}`
lines=`cat $input | wc -l`
while [ $a -lt $headings ]
do
    head_array[$a]=$(echo $first_line | awk -v x=$(($a + 1)) -F"," '{print $x}')
    a=$(($a+1))
done
c=0
echo "{"
while [ $c -lt $lines ]
do
    read each_line
    if [ $c -ne 0 ]; then
        d=0
        echo -n "{"
        while [ $d -lt $headings ]
        do
            each_element=$(echo $each_line | awk -v y=$(($d + 1)) -F"," '{print $y}')
            if [ $d -ne $(($headings-1)) ]; then
                echo -n ${head_array[$d]}":"$each_element","
            else
                echo -n ${head_array[$d]}":"$each_element
            fi
            d=$(($d+1))
        done
        if [ $c -eq $(($lines-1)) ]; then
            echo "}"
        else
            echo "},"
        fi
    fi
    c=$(($c+1))
done < $input
echo "}"
@linosteenkamp's version does not work with CSV that contains a quoted comma (","). For example,
printf "head1,head2,head3\n1,\"foo, bar, baz\",\"foo bar baz\"" | ./csv2json.sh
results in:
[
{
"head1": 1,
"head2": ""foo",
"head3": " bar"
"": " baz""
"": "foo bar baz"
}
]
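The five entries in that output follow from splitting on every comma, including the ones inside the quotes. A minimal check, assuming a POSIX awk, counts the fields a naive -F, split sees in the data row; the variable names are illustrative only.

```shell
#!/bin/sh
# Illustration only: count the fields a naive comma split produces.
line='1,"foo, bar, baz","foo bar baz"'

# awk -F, splits on every comma, including the two inside the quoted field.
naive_count=$(printf '%s\n' "$line" | awk -F, '{ print NF }')

echo "naive split sees $naive_count fields for 3 headings"
```

The row has three logical fields but two extra commas inside the quoted one, so any splitter that ignores quoting will misalign values and headings.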
Quick fix for @outwitevil's script (https://gist.github.com/dsliberty/3de707bc656cf757a0cb#gistcomment-2103308) is to replace the \r in the sed regex with $(printf '\r'). The script will still struggle with empty lines, so you have to delete them beforehand. A simple one-liner:
printf "head1,head2,head3\n\n\n1,\"foo, bar, baz\",\"foo bar baz\"\n\n" | sed '/^[[:space:]]*$/d' | ./csv2json.sh
[
{
"head1": 1,
"head2": "foo, bar, baz",
"head3": "foo bar baz"
}
]
I haven't checked if there are any side effects on Linux now.
#!/bin/bash
# CSV to JSON converter using BASH
# original script from https://gist.github.com/dsliberty/3de707bc656cf757a0cb
# Usage ./csv2json.sh input.csv > output.json
# cat <input.csv> | csv2json > output.json
#set -x
shopt -s extglob
input="${1:-/dev/stdin}"
SEP=","
[ -z "${input}" ] && echo "No CSV input file specified" && exit 1
[ ! -e "${input}" ] && echo "Unable to locate ${input}" && exit 1
csv_nextField()
{
local line="$(echo "${1}" | sed "s/$(printf '\r')//g")"
local start=0
local stop=0
if [[ -z "${line}" ]]; then
return 0
fi
local offset=0
local inQuotes=0
while [[ -n "${line}" ]]; do
local char="${line:0:1}"
line="${line:1}"
if [[ "${char}" == "${SEP}" && ${inQuotes} -eq 0 ]]; then
inQuotes=0
break
elif [[ "${char}" == '"' ]]; then
if [[ ${inQuotes} -eq 1 ]]; then
inQuotes=0
else
inQuotes=1
fi
else
echo -n "${char}"
fi
offset=$(( ${offset} + 1 ))
done
echo ""
return $(( ${offset} + 1 ))
}
read -r first_line < "${input}"
a=0
headings=$(echo "${first_line}" | awk -F"${SEP}" {'print NF'})
if [ "${input}" = "/dev/stdin" ]; then
while read -r line
do
lines_str+="$line"$'\n'
c=1
done < "${input}"
else
lines_str="$(cat "${input}")"
c=0
fi
lines_num=$(echo "${lines_str}" | wc -l)
while [[ ${a} -lt ${headings} ]]; do
field="$(csv_nextField "${first_line}")"
first_line="${first_line:${?}}"
head_array[${a}]="${field}"
a=$(( ${a} + 1 ))
done
#c=0
echo "["
while [ ${c} -lt ${lines_num} ]
do
read -r each_line
each_line="$(echo "${each_line}" | sed "s/$(printf '\r')//g")"
if [[ ${c} -eq 0 ]]; then
c=$(( ${c} + 1 ))
else
d=0
echo " {"
while [[ ${d} -lt ${headings} ]]; do
item="$(csv_nextField "${each_line}")"
each_line="${each_line:${?}}"
echo -n " \"${head_array[${d}]}\": "
case "${item}" in
"")
echo -n "null"
;;
null|true|false|\"*\"|+([0123456789]))
echo -n ${item}
;;
*)
echo -n "\"${item}\""
;;
esac
d=$(( ${d} + 1 ))
[[ ${d} -lt ${headings} ]] && echo "," || echo ""
done
echo -n " }"
c=$(( ${c} + 1 ))
[[ ${c} -lt ${lines_num} ]] && echo "," || echo ""
fi
done <<< "${lines_str}"
echo "]"
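The carriage-return stripping in the script above goes through sed, and support for \r inside a sed regex differs between implementations. A hedged alternative sketch uses tr -d '\r', which POSIX specifies. strip_cr is a hypothetical helper name, not part of the script.

```shell
#!/bin/sh
# Hypothetical helper: remove every carriage return from stdin.
# tr understands the \r escape per POSIX, so no printf trick is needed.
strip_cr() {
    tr -d '\r'
}

# A CRLF-terminated line comes out with only the trailing LF,
# which the command substitution then drops.
cleaned=$(printf 'head1,head2\r\n' | strip_cr)
echo "$cleaned"
```

Piping the whole input through such a helper once, before parsing, would avoid spawning sed for every line, at the cost of also removing any carriage returns that sit inside quoted fields.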
#!/bin/bash
# CSV to JSON converter using BASH
# Usage ./csv2json input.csv > output.json
input=$1
[ -z $1 ] && echo "No CSV input file specified" && exit 1
[ ! -e $input ] && echo "Unable to locate $1" && exit 1
read first_line < $input
a=0
headings=`echo $first_line | awk -F, {'print NF'}`
lines=`cat $input | wc -l`
while [ $a -lt $headings ]
do
    head_array[$a]=$(echo $first_line | awk -v x=$(($a + 1)) -F"," '{print $x}')
    a=$(($a+1))
done
c=0
echo "["
while [ $c -lt $lines ]
do
    read each_line
    if [ $c -ne 0 ]; then
        d=0
        echo -n "{"
        while [ $d -lt $headings ]
        do
            each_element=$(echo $each_line | awk -v y=$(($d + 1)) -F"," '{print $y}')
            if [ $d -ne $(($headings-1)) ]; then
                echo -n "\"${head_array[$d]}\":\"$each_element\","
            else
                echo -n "\"${head_array[$d]}\":\"$each_element\""
            fi
            d=$(($d+1))
        done
        if [ $c -eq $(($lines-1)) ]; then
            echo "}"
        else
            echo "},"
        fi
    fi
    c=$(($c+1))
done < $input
echo "]"
This should output quoted keys and values as an array of JSON objects.
Question: the script runs fine, but it does not output a JSON file?
There is a problem if a field value has more than 254 characters: after that field, every other field will be the same.
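A likely cause, assuming this refers to the csv_nextField version above: that function reports the consumed length through return, and a shell exit status only holds 0 to 255, so a longer field wraps around and the line is never advanced. A minimal sketch of one workaround hands the length back in a variable instead. next_field, FIELD and FIELD_LEN are hypothetical names, and the splitting here is deliberately naive (no quote handling).

```shell
#!/bin/sh
# Hypothetical sketch: report the consumed length in a variable rather than
# through `return`, because an exit status is limited to the range 0-255.
# Naive split on the first comma; quoted fields are NOT handled here.
# Call it directly, not inside $(...), or the variables are lost in a subshell.
next_field() {
    FIELD=${1%%,*}                  # text before the first comma
    FIELD_LEN=$(( ${#FIELD} + 1 ))  # field length plus the separator
}

# Build a 300-character field followed by a second field.
long=$(printf '%0300d' 0)
row="$long,tail"

next_field "$row"
# Drop the consumed prefix: keep everything after character FIELD_LEN.
rest=$(printf '%s' "$row" | cut -c "$(( FIELD_LEN + 1 ))-")
echo "$FIELD_LEN $rest"
```

With a 300-character first field this prints 301 tail, whereas a length passed through return would have been truncated.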
I have a field that has the following value
"doc":0000000000000000000000000000000000000000000000000000000000000000,
What is interesting is that the all-zeros value fails to parse in JSON tools;
they want either a 0 or a "0000000000000000000000000000000000000000000000000000000000000000".
Is there a way to put quotes around all values, even if they are numbers? Or is that outside the accepted formatting of JSON?
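JSON does not allow numbers with leading zeros, which is why the all-zeros value is rejected; quoting it as a string is valid JSON. A minimal sketch, assuming non-negative integer detection is enough: emit a value bare only if it is 0 or a digit string with no leading zero, and quote everything else. json_value is a hypothetical helper, not part of any script above, and it does not escape quotes or backslashes inside values.

```shell
#!/bin/sh
# Hypothetical helper: print one CSV value as a JSON value.
# Assumption: only non-negative integers are emitted as bare numbers;
# negatives, decimals and anything else are quoted as strings.
json_value() {
    case "$1" in
        "")               printf 'null' ;;
        null|true|false)  printf '%s' "$1" ;;
        0)                printf '0' ;;
        *[!0-9]*|0*)      printf '"%s"' "$1" ;;  # non-digit present, or leading zero
        *)                printf '%s' "$1" ;;    # digits only, no leading zero
    esac
}

json_value 0000
echo
json_value 42
echo
```

To quote every value unconditionally instead, the last two branches could both use the quoted form; that is also valid JSON, but consumers then receive strings rather than numbers.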
Here's another version that works with Busybox and does not rely on eval, although using cut to find the corresponding headers has a small performance impact.
The function should be highly portable.
Use with
jsonOutput="$(CSV2JSON2 "inputFile")"