Saran

Converting Blob to String

Hi All,

I have a problem converting the body of a PDF file to a string.

I tried something like this.
Say, for example, the PDF file is stored in a static resource.

My code looks like:
Blob b = [SELECT Body FROM StaticResource WHERE Name = 'testDoc'].Body;
String s = b.toString(); // here I get an error like "BLOB is not a valid UTF-8 string"

I found what the cause is, but I am not able to overcome it.
The issue is that the body of the PDF contains some special characters like "ê, ç, ã".

I think that is why it shows this error. Can anyone help me out?

Thanks in advance.
All Answers

Darshan Farswan
Hi Saran,

The Blob variable b contains data in binary form, which is why you are getting the UTF-8 error. You can encode it before reading the data. Please use the following:
 
String s = EncodingUtil.base64Encode(b);
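
To see why toString() fails and what base64Encode returns instead, here is a minimal sketch for Execute Anonymous. The single byte 0xFF is an assumption used for illustration (it is not taken from the file in the question); it stands in for the binary parts of a PDF because it can never appear in valid UTF-8.

```apex
// A one-byte blob holding 0xFF, which is never valid in UTF-8.
// '/w==' is the Base64 form of that single byte.
Blob binaryData = EncodingUtil.base64Decode('/w==');

Boolean failed = false;
try {
    // toString() tries to read the bytes as UTF-8 text.
    String decoded = binaryData.toString();
} catch (Exception e) {
    // Same kind of error as in the question.
    failed = true;
}

// Base64 and hex always succeed, but they only describe the bytes;
// they do not extract readable text from the file.
String asBase64 = EncodingUtil.base64Encode(binaryData);
String asHex = EncodingUtil.convertToHex(binaryData);
System.debug(failed + ' ' + asBase64 + ' ' + asHex);
```

In other words, the Base64 string is a lossless text wrapper around the same bytes, which is useful for storing or sending the file but not for reading its contents.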

Thanks
Saran
Hi Darshan,

But the problem here is that all of the text is converted to binary data as well.

When I used the debug log to check the value, I got something like this:

JVBERi0xLjQKJeLjz9MKMyAwIG9iaiA8PC9MZW5ndGggNTc5L0ZpbHRlci9GbGF0ZURlY29kZT4+c3RyZWFtCnicjZRNU6QwEIbv+RV91AsmQPjQU2TaIVNAkI/RmnUP7A7rjjUOCupW+es3gF+zpcUWB5ok3W/nqbe5J6cFoeD6JhRrggU5J/eEGtRyOfwhJiz05g1hFGLy7TuFNbEccL

I think everything is getting converted to binary form.

Is there any alternative way?
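
The debug output above is the Base64 form of the file's bytes, so nothing has been lost. As a quick check, decoding only its first 12 characters gives back the PDF header. The sketch below is meant for Execute Anonymous, and the variable names are illustrative.

```apex
// First 12 characters of the Base64 output shown in the debug log.
String prefix = 'JVBERi0xLjQK';

// 12 Base64 characters decode to 9 bytes. These bytes are plain ASCII,
// so toString() succeeds here even though it fails on the full file.
Blob headerBytes = EncodingUtil.base64Decode(prefix);
String header = headerBytes.toString();
System.debug(header.trim());
```

The next characters in the log, JeLjz9MK, decode to a "%" followed by the bytes E2 E3 CF D3, which are not a valid UTF-8 sequence. That alone is enough for toString() to reject the whole blob, before any compressed stream data is reached.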

Thanks
Darshan Farswan
Hi Saran,

Again, I cannot confirm what data is present in your file. But in order to get the text out of a blob, you can try the various EncodingUtil methods. Here is what Salesforce suggests: https://help.salesforce.com/apex/HTViewSolution?urlname=How-do-you-convert-a-Blob-to-string-1327108626373&language=en_US

Also, it is not always possible to convert a PDF file to a String. Chances are very high that you will not get the expected result, because the encoding used when the PDF was created might not be supported in Salesforce.
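
One way to act on this in code is to attempt the UTF-8 conversion and fall back to Base64 when the blob is not text. The class and method names below (BlobText, toTextOrBase64) are hypothetical, and this is only a sketch: it does not extract the readable text of a PDF, it just avoids the exception.

```apex
public class BlobText {
    // Returns the blob as a String when it holds valid UTF-8;
    // otherwise returns its Base64 representation.
    public static String toTextOrBase64(Blob data) {
        try {
            return data.toString();
        } catch (Exception e) {
            // toString() rejects byte sequences that are not valid UTF-8.
            return EncodingUtil.base64Encode(data);
        }
    }
}
```

For example, BlobText.toTextOrBase64(Blob.valueOf('héllo')) returns the text unchanged, because Blob.valueOf stores the string as UTF-8. Accented characters such as ê, ç, or ã are therefore not a problem in themselves; the error appears only when the bytes are not valid UTF-8.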

Thanks
Darshan
This was selected as the best answer
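Jayesh W.
Hello All,

This thread is pretty old and already has a best answer, but the link in the best answer is dead, so I am providing a solution in case it helps someone.

The code below should work when the blob holds valid UTF-8 text.
// Note: encoding to Base64 and decoding again yields the same bytes,
// so toString() still throws if the blob is not valid UTF-8 (for example, a binary PDF).
String s = EncodingUtil.base64Decode(EncodingUtil.base64Encode(b)).toString();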

Thanks,
Jayesh W.
Joep Mutsaerts
Hello All,
Just to help with some actual working code: to show/display/embed a PDF on a Salesforce page, from a Base64 string or from an external PDF URL:

1) Create a Visualforce page called HelloWorld:
<apex:page controller="DownloadPDF" sidebar="false" showHeader="false" contentType="application/pdf">
    <script>
        window.location.href = "data:application/pdf;base64,{!match1}"; // switch between match1,match2
    </script>
</apex:page>
2) Create the controller DownloadPDF:
public class DownloadPDF {
// Option 1: use an example PDF document base64 string 
    public String getMatch1() {
        string s = 'JVBERi0xLjcKCjEgMCBvYmogICUgZW50cnkgcG9pbnQKPDwKICAvVHlwZSAvQ2F0YWxvZwog' +
  'IC9QYWdlcyAyIDAgUgo+PgplbmRvYmoKCjIgMCBvYmoKPDwKICAvVHlwZSAvUGFnZXMKICAv' +
  'TWVkaWFCb3ggWyAwIDAgMjAwIDIwMCBdCiAgL0NvdW50IDEKICAvS2lkcyBbIDMgMCBSIF0K' +
  'Pj4KZW5kb2JqCgozIDAgb2JqCjw8CiAgL1R5cGUgL1BhZ2UKICAvUGFyZW50IDIgMCBSCiAg' +
  'L1Jlc291cmNlcyA8PAogICAgL0ZvbnQgPDwKICAgICAgL0YxIDQgMCBSIAogICAgPj4KICA+' +
  'PgogIC9Db250ZW50cyA1IDAgUgo+PgplbmRvYmoKCjQgMCBvYmoKPDwKICAvVHlwZSAvRm9u' +
  'dAogIC9TdWJ0eXBlIC9UeXBlMQogIC9CYXNlRm9udCAvVGltZXMtUm9tYW4KPj4KZW5kb2Jq' +
  'Cgo1IDAgb2JqICAlIHBhZ2UgY29udGVudAo8PAogIC9MZW5ndGggNDQKPj4Kc3RyZWFtCkJU' +
  'CjcwIDUwIFRECi9GMSAxMiBUZgooSGVsbG8sIHdvcmxkISkgVGoKRVQKZW5kc3RyZWFtCmVu' +
  'ZG9iagoKeHJlZgowIDYKMDAwMDAwMDAwMCA2NTUzNSBmIAowMDAwMDAwMDEwIDAwMDAwIG4g' +
  'CjAwMDAwMDAwNzkgMDAwMDAgbiAKMDAwMDAwMDE3MyAwMDAwMCBuIAowMDAwMDAwMzAxIDAw' +
  'MDAwIG4gCjAwMDAwMDAzODAgMDAwMDAgbiAKdHJhaWxlcgo8PAogIC9TaXplIDYKICAvUm9v' +
  'dCAxIDAgUgo+PgpzdGFydHhyZWYKNDkyCiUlRU9G';
        return EncodingUtil.base64Decode(s).toString();
    }

// Option 2: use an example PDF document from a URL,              
// in this case: http://www.africau.edu/images/default/sample.pdf 
// In Setup, add the endpoint to Remote Site Settings             

// @future(callout=true) (I didn't need this line to get it to work)
    public String getMatch2() {
        Http httpProtocol = new Http();
        HttpRequest request = new HttpRequest();
        String endpoint = 'http://www.africau.edu/images/default/sample.pdf';
        request.setEndPoint(endpoint);
        request.setMethod('GET');
        HttpResponse response = httpProtocol.send(request);
        return response.getBody();
    }
}
This will give you a Salesforce page with an embedded PDF from a Base64 source, showing a PDF that says "Hello, world!".
If you copy and paste this code, make sure you fix the indentation (under Debug >> Fix indentation).
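
If the goal is a data: URL with ;base64, the value placed after the comma needs to be Base64 text, so a variant of Option 2 could read the response as a Blob and encode it rather than reading it as a String. This is a sketch under assumptions: PdfDataUrl and toDataUrl are hypothetical names, and the callout is shown only in a comment because it needs a Remote Site Setting.

```apex
public class PdfDataUrl {
    // Builds a data URL from raw PDF bytes.
    // Possible use inside a controller getter, after sending the request:
    //   return PdfDataUrl.toDataUrl(response.getBodyAsBlob());
    public static String toDataUrl(Blob pdf) {
        return 'data:application/pdf;base64,' + EncodingUtil.base64Encode(pdf);
    }
}
```

Reading the body as a Blob avoids treating binary PDF bytes as text, which is the step that goes wrong in the original question.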