Created November 12, 2019 16:36
raypereda/718573bde62d1d07200e54ede1df967b
escapes double-quotes in the last field of a CSV line
line = '1,2,3,4,5,"He said "thanks" to the cashier"'
p line
# "1,2,3,4,5,\"He said \"thanks\" to the cashier\""
parts = line.split(",", 6)
p parts
# ["1", "2", "3", "4", "5", "\"He said \"thanks\" to the cashier\""]
text = parts[5]
p text
# "\"He said \"thanks\" to the cashier\""
text = text[1...-1]
p text
# "He said \"thanks\" to the cashier"
text.gsub!('"', '""')
p text
# "He said \"\"thanks\"\" to the cashier"
parts[5] = '"' + text + '"'
p parts
# ["1", "2", "3", "4", "5", "\"He said \"\"thanks\"\" to the cashier\""]
line = parts.join(",")
p line
# "1,2,3,4,5,\"He said \"\"thanks\"\" to the cashier\""
# Escapes double quotes in the last field of a CSV line.
# Assumes only the last field is a string;
# all other fields are non-string values.
# n is the number of fields.
def fix(line, n)
  parts = line.split(",", n)
  text = parts[-1]
  text = text[1...-1]           # strip the outer double quotes
  text = text.gsub('"', '""')   # escape the inner double quotes
  parts[-1] = '"' + text + '"'  # restore the outer double quotes on the last field only
  parts.join(",")
end
line = '1,2,3,4,5,"He said "thanks" to the cashier"'
p fix(line, 6)
# "1,2,3,4,5,\"He said \"\"thanks\"\" to the cashier\""
In general, fixing CSV files involves too much guesswork. This approach might work in the special case where the only string field is the last one.
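One way to reduce the guesswork is to repair only the lines that a real CSV parser rejects. The sketch below assumes Ruby's `csv` library is available; `malformed?` is a hypothetical helper name, not something from the gist.

```ruby
require "csv"

# Hypothetical helper: true when the standard CSV parser rejects the line.
def malformed?(line)
  CSV.parse_line(line)
  false
rescue CSV::MalformedCSVError
  true
end

broken = '1,2,3,4,5,"He said "thanks" to the cashier"'
fixed  = '1,2,3,4,5,"He said ""thanks"" to the cashier"'

p malformed?(broken)         # true: the inner quotes are not escaped
p malformed?(fixed)          # false: the inner quotes are doubled
p CSV.parse_line(fixed).last # "He said \"thanks\" to the cashier"
```

Lines that already parse can be left alone, so the repair only touches input that is known to be broken.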
We should explore how the erroneous CSV is created. There may be a flag or switch to turn on double-quote escaping.
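If the producer happens to be a Ruby script (an assumption; the thread does not say what generates the file), writing each row through the `csv` library instead of joining strings by hand would escape the quotes at the source. A minimal sketch with made-up field values:

```ruby
require "csv"

# Illustrative row: five non-string values and one string with inner quotes.
fields = [1, 2, 3, 4, 5, 'He said "thanks" to the cashier']

# generate_line quotes the field that needs it and doubles its inner quotes.
line = CSV.generate_line(fields)
print line
# 1,2,3,4,5,"He said ""thanks"" to the cashier"

# The generated line parses back to the original text.
p CSV.parse_line(line).last
# "He said \"thanks\" to the cashier"
```

Other languages and export tools usually have an equivalent writer or option, which is worth checking before patching the output after the fact.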