AJ Mayerhofer

Salesforce pdf parser?

Hi all! Any help is appreciated.

I'm attempting to create a Visualforce page where the user can upload a PDF of an invoice; the PDF will then be parsed to extract item information, which will be stored in Salesforce.

I have set up a Visualforce page where a user can upload a PDF. However, I have run into two questions:

1. Is there a way to get this PDF into the Salesforce backend? I haven't found anything in Object Manager for storing PDFs. Right now the user can select a file, but I don't do anything with it.

2 (Main question). Is there a Salesforce tool for converting PDFs to XML or JSON? I've found third-party applications but would prefer a Salesforce-backed one so users are assured their data is secure.

Is this a potential lead for what I'm trying to do?
https://salesforce.stackexchange.com/questions/108054/parsing-pdf-file-attachments
https://developer.salesforce.com/docs/atlas.en-us.pages.meta/pages/pages_javascript_intro.htm
Abhishek (Salesforce Developers)
Hi,

I have a suggestion that might guide you.

Quotes have a powerful built-in tool in Salesforce to generate PDF files based on templates. In this post, I will explain how you can take advantage of this feature to generate Invoices. In most cases, the template used for a Quote is different from the one used for an Invoice. Salesforce only gives you the option to create Quote PDFs within the Quote object, and it doesn’t allow you to change the name of the file. Below I will explain how you can generate Invoice PDF files based on Quote templates from any object in Salesforce by just clicking a custom button.

Requirements:

Quotes must be enabled in your Salesforce organization. See https://help.salesforce.com/HTViewHelpDoc?id=quotes_enable.htm&language=en_US for more information.


Features:

Hack Quote Templates to dynamically generate PDFs.
Easily generate Invoice PDF files based on the synced or latest created Quote.
Save PDF as an Attachment, allowing you to change the name of the file.
No need to use third-party libraries.


How it works:
I’m using the PageReference.getContent() method to retrieve the PDF blob. Ideally, this logic would run in a trigger or in a background process. However, getContent() cannot be used in Triggers, Scheduled Apex, or Batch jobs, which leaves an Apex controller class as the only option.
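A minimal sketch of that restriction in practice. The class name InvoicePdfSketch, the method name, and its parameters are hypothetical and not part of the original post. The Test.isRunningTest() guard assumes, as far as I know from the Apex documentation, that getContent() is also unavailable in test methods.

```apex
// Hypothetical helper; class, method, and parameter names are illustrative only.
public with sharing class InvoicePdfSketch {
    // Renders the page at relativeUrl and stores the result as an Attachment
    // on parentId. Intended to be called from a controller or webService
    // method, not from a trigger, Scheduled Apex, or Batch job.
    public static Id savePageAsAttachment(String relativeUrl, Id parentId, String fileName) {
        PageReference pageRef = new PageReference(relativeUrl);

        // Assumption: getContent() is not supported in test methods,
        // so a placeholder body is substituted there.
        Blob body = Test.isRunningTest()
            ? Blob.valueOf('placeholder')
            : pageRef.getContent();

        Attachment att = new Attachment(
            Name = fileName,
            Body = body,
            ParentId = parentId,
            ContentType = 'application/pdf'
        );
        insert att;
        return att.Id;
    }
}
```

The same Attachment pattern would also work for a file a user uploads through a Visualforce page: the uploaded Blob simply takes the place of the getContent() result.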

The Quote Template

Create a new Quote Template:
Go to App Setup > Customize > Quotes > Templates.
Click New.
For “Template Name”, use “Invoice”.
Click Save and customize your template.
Take a look at the URL and copy the template Id. The parameter is named “summlid”.
Once you are done, click Save and on the next screen remember to Activate the new template.


The Custom Setting:

We will need a Custom Setting to manage the Invoice PDF generation. I named it “Application Properties” and it has one single field named “value” (String [255]). These are the required records:
Invoice_Footer_Height: 100 by default. You will have to experiment with this value, as it depends on the height of your template's footer.
Invoice_Header_Height: 100 by default. You will have to experiment with this value, as it depends on the height of your template's header.
Invoice_Template_Id: Salesforce Id of the template. It is the same value as the “summlid” parameter from the previous step.
Quote_Template_Data_Viewer_URL: URL that Salesforce uses to generate Quote PDF files. The value is:
/quote/quoteTemplateDataViewer.apexp?id={!QuoteId}&headerHeight={!InvoiceHeaderHeight}&footerHeight={!InvoiceFooterHeight}&summlid={!InvoiceTemplateId}
In the Code Resources section you will find a sample CSV file.
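If you prefer not to load the records from a CSV file, they could also be created from Execute Anonymous. This is only a sketch: it assumes a List custom setting with the API name Application_Properties__c and a text field value__c (the names the controller below uses), and YOUR_SUMMLID_VALUE is a placeholder you must replace with your own template Id.

```apex
// Run in Execute Anonymous. Assumes the List custom setting
// Application_Properties__c with a text field value__c already exists.
List<Application_Properties__c> props = new List<Application_Properties__c>{
    new Application_Properties__c(Name = 'Invoice_Header_Height', value__c = '100'),
    new Application_Properties__c(Name = 'Invoice_Footer_Height', value__c = '100'),
    // Placeholder: paste the "summlid" value copied from your template URL.
    new Application_Properties__c(Name = 'Invoice_Template_Id', value__c = 'YOUR_SUMMLID_VALUE'),
    new Application_Properties__c(
        Name = 'Quote_Template_Data_Viewer_URL',
        value__c = '/quote/quoteTemplateDataViewer.apexp?id={!QuoteId}'
            + '&headerHeight={!InvoiceHeaderHeight}'
            + '&footerHeight={!InvoiceFooterHeight}'
            + '&summlid={!InvoiceTemplateId}'
    )
};
insert props;
```

The {!...} tokens are stored literally; the controller replaces them at run time with the Quote Id and the other setting values.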


The Controller:

I’ve named the controller ‘InvoicePdfWsSample’. It is a global class because it has a webService method. This method receives a list of Opportunity Ids and generates an Attachment with the Invoice PDF file for each one.



/**
 * @author      Valnavjo <valnavjo_at_gmail.com>
 * @version     1.0.0
 * @since       10/08/2014
 */
global with sharing class InvoicePdfWsSample {
    /**
     * Default header height for invoice
     */
    private static final String DEFAULT_INVOICE_HEADER_HEIGHT = '100';
     
    /**
     * Default footer height for invoice
     */
    private static final String DEFAULT_INVOICE_FOOTER_HEIGHT = '100';
     
    /**
     * Webservice method that is called from a custom button to generate
     * an invoice PDF file using quote templates feature.
     * It generates the invoice based on:
     *      - The synced Quote, or
     *      - The latest Quote
     * If the Opportunity doesn't have any Quotes, this method doesn't do
     * anything.
     * 
     * This method uses PageReference.getContent().
     *
     * @param oppsIdList {List<Id>} list of Opportunity Ids from where the method
     *                   will generate the Invoice PDF.
     * @return {String} with an error message, if any. Blank otherwise.
     */
    webService static String generateInvoicePdf(List<Id> oppsIdList) {
        try {
            //From list to set
            final Set<Id> oppsId = new Set<Id>(oppsIdList);
 
            //Get template Id for Invoice and url to hack pdf generation
            final String invoiceTemplateId = Application_Properties__c.getAll().get('Invoice_Template_Id').value__c;
            String invoiceHeaderHeight = Application_Properties__c.getAll().get('Invoice_Header_Height').value__c;
            String invoiceFooterHeight = Application_Properties__c.getAll().get('Invoice_Footer_Height').value__c;
            final String quoteTemplateDataViewerUrl = Application_Properties__c.getAll().get('Quote_Template_Data_Viewer_URL').value__c;
             
            //Pre-validations
            //Invoice_Template_Id and Quote_Template_Data_Viewer_URL are mandatory 
            if (String.isBlank(invoiceTemplateId) || String.isBlank(quoteTemplateDataViewerUrl)) {
                String errorMsg = 'Invoice Template Id or Quote Template Data Viewer URL is blank, please review their values in the Application Properties custom setting.';
 
                return errorMsg;
            }
             
            //Default values for invoice header/footer height
            if (String.isBlank(invoiceHeaderHeight)) invoiceHeaderHeight = DEFAULT_INVOICE_HEADER_HEIGHT;
            if (String.isBlank(invoiceFooterHeight)) invoiceFooterHeight = DEFAULT_INVOICE_FOOTER_HEIGHT; 
 
            //Iterate over Opps and generate Attachments list
            final List<Attachment> attList = new List<Attachment>();
            for (Opportunity opp : [select Id,
                                           (select Id, Invoice_number__c, IsSyncing, CreatedDate
                                            from Quotes
                                            order by CreatedDate DESC)
                                    from Opportunity
                                    where Id IN :oppsId]) {
                //No Quotes, no party
                if (opp.Quotes.isEmpty()) continue;
 
                //Synced quote
                Quote theQuote = null;
 
                //Try to get the synced one
                for (Quote quoteAux : opp.Quotes) {
                    if (quoteAux.IsSyncing) {
                        theQuote = quoteAux;
                        break;
                    }
                }
 
                //No synced Quote, get the last one
                if (theQuote == null) theQuote = opp.Quotes.get(0);
 
                PageReference pageRef = new PageReference(
                    quoteTemplateDataViewerUrl.replace('{!QuoteId}', theQuote.Id)
                                              .replace('{!InvoiceHeaderHeight}', invoiceHeaderHeight)
                                              .replace('{!InvoiceFooterHeight}', invoiceFooterHeight)
                                              .replace('{!InvoiceTemplateId}', invoiceTemplateId)
                );
 
                attList.add(
                    new Attachment(
                        Name = 'Invoice #' + theQuote.Invoice_number__c + '.pdf',
                        Body = pageRef.getContent(),
                        ParentId = opp.Id
                    )
                );
            }
 
            //Create Attachments
            if (!attList.isEmpty()) insert attList;
             
            return '';
        } catch (Exception e) {
            System.debug(LoggingLevel.ERROR, e.getMessage());
 
            final String errorMsg = 'An error has occurred while generating the invoice. Details:\n\n' +
                                     e.getMessage() + '\n\n' +
                                     e.getStackTraceString();
             
            return errorMsg;
        }
    }
}


I hope you find the above information helpful. If so, please mark it as Best Answer to help others too.

Thanks.
 
AJ Mayerhofer
Hi Abhishek,

Your reply is very helpful and that information will come in handy to my team. Thank you for that.

However, I was more so wondering whether there is a Salesforce tool for converting a PDF to XML or JSON (for parsing). We want to be able to take a PDF of an invoice and extract the items, storing them in Salesforce.
Abhishek (Salesforce Developers)
No, I have only seen scenarios like copying data from a PDF into Salesforce.

I hope you find the above information helpful. If so, please mark it as Best Answer to help others too.
AJ Mayerhofer
Copying data from PDF to Salesforce sounds useful. Any resources for that?
Abhishek (Salesforce Developers)
Right now I only have this:

https://docparser.com/blog/pdf-salesforce-integration/