Is there a good way to strip all tags from an HTML email?



We have a requirement to update the case description with the body of the last inbound email. That part is easily handled with a workflow.


The workflow simply runs every time an inbound email is created in the system and then carries out a field update where the case description is set to the TextBody of the email.


However, some email clients only send HTML emails and do not include a plain text version of the email body. In that case the description is updated with nothing (an empty string). A possible workaround is to check whether TextBody is empty: if it is not, use TextBody; otherwise use HtmlBody.


The formula used for the field update looks like this: IF(ISBLANK(TextBody), HtmlBody, TextBody)


And this is where the topic question comes into the picture. The problem is that HtmlBody contains all the HTML tags as well, which makes the text impossible to read for users without HTML knowledge.


According to SF Premium Support, emails that arrive in the system are passed through a number of "parsers" before the text is populated in the case description field, and that is why the HTML tags don't show up by default.


My question is this: does anyone have experience with an HTML parser/stripper we could use in this situation? Basically, we need to remove all HTML tags, script tags, stylesheet tags and so on, and be left with only the plain text version of the email.
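
To make the desired behaviour concrete, here is a minimal sketch of the kind of stripping I mean. It is written in Python using only the standard library's html.parser, purely as an illustration; it is not Apex and would not run inside a workflow field update. The names strip_html and choose_body are hypothetical, and the list of block-level tags is my own assumption about where line breaks should go.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}
    # Assumption: these tags should produce a line break in the plain text.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Keep text only when we are outside <script>/<style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        raw = "".join(self._chunks)
        # Collapse runs of whitespace per line and drop empty lines.
        lines = (" ".join(line.split()) for line in raw.splitlines())
        return "\n".join(line for line in lines if line)


def strip_html(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def choose_body(text_body, html_body):
    """Mirror of the IF(ISBLANK(...)) fallback, with stripping applied to the HTML."""
    if text_body and text_body.strip():
        return text_body
    return strip_html(html_body or "")
```

Calling choose_body with an empty text body and an HTML body would return only the readable text of the HTML, which is the result we are after for the case description.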




Søren Nødskov Hansen