Read uploaded PDF or DOC file content in Apex

I need to read the content of an uploaded PDF or DOC file in Apex. Is there any way to do this? The code below doesn't seem to work for me:

public static String blobToString(Blob input, String inCharset){
    // Hex-encode the blob: two hex characters per byte.
    String hex = EncodingUtil.convertToHex(input);
    System.assertEquals(0, hex.length() & 1);
    final Integer bytesCount = hex.length() >> 1;
    String[] bytes = new String[bytesCount];
    for(Integer i = 0; i < bytesCount; ++i) {
        bytes[i] = hex.mid(i << 1, 2);
    }
    // Percent-encode every byte (%XX) and let urlDecode interpret the bytes in the given charset.
    return EncodingUtil.urlDecode('%' + String.join(bytes, '%'), inCharset);
}

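You can't do this in Apex as far as I know. Your blobToString method only reinterprets the raw bytes as characters in the given charset, which works for plain text files. PDF and DOC are binary formats in which the text is wrapped in structure and often compressed, so decoding the bytes just gives you an unreadable chunk of binary data. You'd need something that knows how to parse the PDF/DOC file into something useful. The usual advice is to do this off platform, for example with a Java service hosted on Heroku, and have it return the raw text of the document to the Apex method.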
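
For the Apex side of that off-platform approach, something along these lines might work. It is only a minimal sketch under several assumptions: the file is stored as a ContentVersion, a Named Credential called Text_Extractor (hypothetical) points at your own extraction service, that service accepts the raw file bytes in a POST body and answers with plain text, and the X-File-Extension header is a convention made up for this example. The class name and the ExtractionException type are also just illustrative.

```apex
public with sharing class DocumentTextExtractor {
    // Hypothetical custom exception for failed extraction calls.
    public class ExtractionException extends Exception {}

    // Assumption: a Named Credential named Text_Extractor points at an
    // off-platform service (e.g. a Java app on Heroku) exposing /extract.
    private static final String ENDPOINT = 'callout:Text_Extractor/extract';

    public static String extractText(Id contentVersionId) {
        // Assumption: the uploaded file is stored as a ContentVersion.
        ContentVersion cv = [
            SELECT VersionData, FileExtension
            FROM ContentVersion
            WHERE Id = :contentVersionId
            LIMIT 1
        ];

        HttpRequest req = new HttpRequest();
        req.setEndpoint(ENDPOINT);
        req.setMethod('POST');
        req.setHeader('Content-Type', 'application/octet-stream');
        // Made-up header so the service knows whether it got a PDF or a DOC.
        req.setHeader('X-File-Extension', cv.FileExtension);
        // Send the binary file untouched; the parsing happens off platform.
        req.setBodyAsBlob(cv.VersionData);
        req.setTimeout(120000);

        HttpResponse res = new Http().send(req);
        if (res.getStatusCode() != 200) {
            throw new ExtractionException('Extraction failed: ' + res.getStatus());
        }
        // Assumption: the service responds with the extracted text as the body.
        return res.getBody();
    }
}
```

You would then call DocumentTextExtractor.extractText(someContentVersionId) from your own code. Large files can run into the Apex heap and callout size limits, so for big documents it may be better to run this asynchronously or have the service fetch the file itself.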