dturkel

DataLoader: Escaping Quotation Marks within a CSV (that contains commas and quotation marks in data)

I need to use Data Loader to load data that can contain commas (,) and double quotation marks (") within any number of fields.

Short of mangling the data by removing the quotation marks, or substituting another character for them, do I have any other option?

The quotation mark is an important piece of data in the field(s) in question.  I can't leave the field value unquoted because of the embedded commas.  Using single quotes (') to surround the field data also does not appear to work.

The CSVFileReader.java class appears to be the culprit: its tokenization does not seem to accommodate this situation.  Surely this has come up in the 6 years this class has existed...

/**
 * Parse a CSV or tab delimited file into lines of fields. One line is returned in each call to getNextLine. Each line
 * is returned as an ArrayList of String fields. This parser auto-detects comma or tab delimiters based on the first
 * line. This file correctly handles embedded quotes, delimiters, and newlines, based on the way MS Excel and other
 * apps do CSV format. Note that this is different from the way StreamTokenizer handles things itself, which is why I
 * needed to add the getNextToken () method that wraps the StreamTokenizer.nextToken () method. StreamTokenizer doesn't
 * handle embedded newlines in a quoted string. Because CSV format allows embedded newlines in quoted strings, record
 * index values in the array won't necessarily agree with line numbers in a text editor, although they will agree with
 * row numbers when the file is viewed in Excel. We should probably give some accessor to ask the text-editor
 * appropriate line number for the current record.
 */

dturkel

My mistake.  Apparently the answer is to keep the field enclosed in double quotes and write each embedded double quote as two double quotes (""), which the parser reads as one literal double-quote character.  DataLoader's parser does handle this.
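
For anyone generating the file programmatically, here is a minimal sketch of that escaping rule in Java. The class and method names (CsvQuoting, escapeField, toCsvLine) are made up for illustration and are not part of Data Loader; the sketch assumes a comma-delimited file and the usual Excel-style convention described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Data Loader.
final class CsvQuoting {
    private CsvQuoting() {}

    /**
     * Returns the field ready to be written to a comma-delimited file.
     * A field is quoted only when it contains a comma, a double quote,
     * or a line break; embedded double quotes are doubled.
     */
    static String escapeField(String value) {
        if (value == null) {
            return ""; // assumption: write nulls as empty fields
        }
        boolean needsQuoting = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuoting) {
            return value;
        }
        // Double every embedded quote, then wrap the whole field in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins already-unescaped field values into one CSV record. */
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvQuoting::escapeField)
                .collect(Collectors.joining(","));
    }
}
```

With this sketch, a value such as 6" pipe, steel would be written as "6"" pipe, steel", so the comma stays inside the field and the quotation mark survives the load.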