GlennW

Multi-byte character support

Does the 3.0 API fully support multi-byte characters?

Regards,
GlennW

Rick Banister
Yes, it works if you observe a couple of important rules:
1. If you're writing to a database, the database must use the same character set as your Salesforce instance.
2. If you're writing to a file system, use OutputStreamWriter with an explicit encoding instead of FileWriter.

To quote from the API Doc:

"The salesforce.com server supports either full Unicode characters or ISO-8859-1 characters, depending on the instance. You can determine the encoding ahead of time using the describeGlobal call. The encoding specified by the describeGlobal call is the character set that is supported by that sforce instance.

"The response from the server will be in UTF-8 or ISO-8859-1 encoding, depending on the character set supported by the instance. This is usually handled for you by the SOAP client. All servers accept either encoding, but the ISO-8859-1 server cannot support characters outside of the ISO-8859-1 range. Data sent to that server outside of the valid ISO-8859-1 range may either be truncated or cause an error."

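Since the quoted doc warns that data outside the ISO-8859-1 range may be truncated or rejected, it can help to check a value before sending it. Here is a minimal sketch using only the standard Java charset classes; the class and method names (Latin1Check, fitsLatin1) are made up for illustration and are not part of the Salesforce API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any Salesforce library.
final class Latin1Check {
    /** Returns true if every character of s can be encoded as ISO-8859-1. */
    static boolean fitsLatin1(String s) {
        // Create a fresh encoder per call: CharsetEncoder is not thread-safe.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // "cafe" with e-acute (U+00E9), which is inside ISO-8859-1
        System.out.println(fitsLatin1("caf\u00e9"));          // prints true
        // Three Japanese characters, which are outside ISO-8859-1
        System.out.println(fitsLatin1("\u65e5\u672c\u8a9e")); // prints false
    }
}
```

A check like this only tells you whether the text would survive on an ISO-8859-1 instance; what to do with values that fail is up to your integration.
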
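You cannot treat UTF-8 and ISO-8859-1 data as interchangeable. The two encodings differ for every character above 0x7F, and characters outside the ISO-8859-1 range cannot be represented in ISO-8859-1 at all. You can check which character set an Oracle database uses with the following query: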

select * from NLS_DATABASE_PARAMETERS
where parameter in ('NLS_CHARACTERSET','NLS_LANGUAGE','NLS_TERRITORY');
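
To illustrate why the two character sets above do not mix, here is a small sketch in plain Java. It shows what happens when UTF-8 bytes are read as ISO-8859-1, and when text is forced into ISO-8859-1. The class and method names are hypothetical, chosen only for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions for illustration.
final class EncodingMismatch {
    /** Encodes s as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String misreadUtf8AsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Round-trips s through ISO-8859-1; unmappable characters become '?'. */
    static String forceIntoLatin1(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // e-acute is one byte (0xE9) in ISO-8859-1 but two bytes (0xC3 0xA9) in UTF-8,
        // so misreading it yields two characters instead of one.
        System.out.println(misreadUtf8AsLatin1("\u00e9").length()); // prints 2
        // The euro sign (U+20AC) has no ISO-8859-1 equivalent and is replaced.
        System.out.println(forceIntoLatin1("\u20ac"));              // prints ?
    }
}
```

The second case is the one that loses data for good: once a character has been replaced with '?', the original cannot be recovered from the stored value.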

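Also, if you write to the file system, wrap a FileOutputStream in an OutputStreamWriter with an explicit encoding instead of using FileWriter. FileWriter always writes in the platform default encoding, which may be neither UTF-8 nor ISO-8859-1. (Java 11 and later add FileWriter constructors that accept a charset.)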

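// Requires java.io.FileWriter, java.io.FileOutputStream, java.io.OutputStreamWriter

// NOT GUARANTEED UTF-8 OR ISO-8859-1: FileWriter uses the platform default encoding
FileWriter fw = new FileWriter(updateFilename, false);
fw.write(buf.toString() + "\n");
fw.close();

// UTF-8 COMPLIANT: the encoding is specified explicitly
FileOutputStream os = new FileOutputStream(updateFilename, false);
OutputStreamWriter osw = new OutputStreamWriter(os, "UTF-8");
osw.write(buf.toString() + "\n");
osw.close(); // flushes the writer and closes the underlying stream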
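
For anyone on a newer JDK, the same idea can be written with try-with-resources and StandardCharsets, so the writer is always closed and there is no encoding name to mistype. This is only a sketch: the class and method names (Utf8FileDemo, writeUtf8Line) and the temp file are assumptions for the example, not anything from the Salesforce toolkit.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are assumptions for illustration.
final class Utf8FileDemo {
    /** Overwrites the file at path with line plus a newline, encoded as UTF-8. */
    static void writeUtf8Line(Path path, String line) throws IOException {
        // false = overwrite rather than append, matching the snippet above
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(path.toFile(), false), StandardCharsets.UTF_8)) {
            w.write(line + "\n");
        } // the writer is flushed and closed here, even if write() throws
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        String text = "caf\u00e9 \u65e5\u672c\u8a9e";
        writeUtf8Line(tmp, text);
        // Read the raw bytes back and decode them explicitly as UTF-8.
        String back = new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
        System.out.println(back.equals(text + "\n")); // prints true
        Files.delete(tmp);
    }
}
```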

Once again, you CANNOT reliably convert between the two after the fact. My current client is going to have to reload their entire reporting database because it was created as ISO-8859-1, and their Salesforce instance is on UTF-8.