
How can I use a PDF Parser in Apex?

Are there any good examples of using a PDF Parser in Apex?
Vasani Parth
If you mean that you want to inspect the bytes that make up the file in Apex code, you can't do that directly. You can turn the file into a base64 string with EncodingUtil.base64Encode, but each base64 character carries only 6 bits, so the characters don't align with byte boundaries. That makes it very hard work to do anything useful (and you are likely to run into the CPU and heap governor limits).
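
To make the alignment problem concrete, here is a minimal sketch. It is written in plain Java rather than Apex so it can be run anywhere, on the assumption that EncodingUtil.base64Encode emits standard base64. The class and method names (Base64Alignment, firstCharOfByte, bitOffsetInChar) are made up for illustration and are not part of any Salesforce API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Alignment {

    // Standard base64 encoding (assumed to match what Apex's
    // EncodingUtil.base64Encode returns for the same bytes).
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Index of the base64 character in which byte number byteIndex begins.
    // Each byte is 8 bits wide, each base64 character holds 6 bits.
    static int firstCharOfByte(int byteIndex) {
        return (byteIndex * 8) / 6;
    }

    // Bit offset inside that character where the byte begins (0, 2 or 4).
    static int bitOffsetInChar(int byteIndex) {
        return (byteIndex * 8) % 6;
    }

    public static void main(String[] args) {
        // The first bytes of a typical PDF file.
        byte[] header = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encode(header));

        // Only every third byte starts exactly on a character boundary.
        for (int i = 0; i < header.length; i++) {
            System.out.printf("byte %d starts in char %d at bit offset %d%n",
                    i, firstCharOfByte(i), bitOffsetInChar(i));
        }
    }
}
```

Only bytes 0, 3, 6, ... start at the beginning of a base64 character. Every other byte is split across two characters, so reading a single byte back out of the string means decoding and bit-shifting two characters, which is where the CPU time goes.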

AFAIK, Salesforce does not include a PDF parsing library, so as of now it is not possible to read through a PDF in Apex.

Please mark this as the best answer if this helps
Moritz Dausinger
Extracting data from PDFs can be tricky, and I don't think Apex offers a way to read PDF documents. PDF parsing becomes especially difficult if you want to extract specific data fields and not just the whole text. Unlike HTML, PDF content is typically not marked up with structural tags like <h1> or <table>, which makes the data extraction process more difficult.

Our app Docparser (https://docparser.com/blog/pdf-salesforce-integration/), however, comes with a Salesforce integration. For example, you can post PDF files from Salesforce to Docparser, extract certain data fields, and then post the data back to Salesforce. Happy to answer your questions!