Extract Pattern from email text

  • 31 March 2023

Userlevel 1

Hi, 

I’m trying to use Formatter to extract a couple of data elements from emails, and I’m having trouble writing the Python expression to match the specific string I’m after. 

Other guides in this community suggest stripping the HTML out of the email first. However, when I do so, I end up with the following:

From: John Smith Message Body:Firstname: JohnLastname: SmithPhone: 0780000000Email: johnsmith@hotmailtest.co.ukJobTitle: CEOYearsOfService: Over two yearsSalary: £100,000 +Settled: NoMessage: Please call me on 0780000000

The <br> tags are stripped, so elements such as JobTitle and YearsOfService run together. 

I am able to extract the email address directly from the HTML version of the email with existing Transform options. 

Can someone help with what code to use in the Pattern field so that I can pull out the YearsOfService, Salary, and Settled components, please?


<html><head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>From: John Smith Message Body:<br>Firstname: John<br>Lastname: Smith<br>Phone: 0780000000<br>Email: johnsmith@hotmailtest.co.uk<br>JobTitle: CEO<br>YearsOfService: Over two years<br>Salary: £100,000 +<br>Settled: No<br>Message: Please call me on 0780000000 <br>

This is the code I’ve tried. 

(?<=Salary:)(.*?)(?=\n)

Thanks!
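
For anyone who wants to stay with a pattern: the lookahead (?=\n) cannot succeed on the stripped text, because no newline is left once the <br> tags are gone. One possible workaround is to stop at the next field label instead of at a newline. The sketch below is written in JavaScript so it can be run outside Zapier, and the variable names are illustrative. It assumes the labels always appear in the order shown in the sample email and never occur inside a value. The patterns use only lookbehind, lookahead, and lazy-quantifier syntax that Python's re module also accepts.

```javascript
// Text as it looks after the HTML has been stripped (copied from the post above).
const stripped =
  "From: John Smith Message Body:Firstname: JohnLastname: SmithPhone: 0780000000" +
  "Email: johnsmith@hotmailtest.co.ukJobTitle: CEOYearsOfService: Over two years" +
  "Salary: £100,000 +Settled: NoMessage: Please call me on 0780000000";

// The original pattern needs a newline after the value, and none is left.
const originalMatch = stripped.match(/(?<=Salary:)(.*?)(?=\n)/);

// Alternative: capture lazily up to the next field label.
const yearsOfService = stripped.match(/(?<=YearsOfService:)\s*(.*?)\s*(?=Salary:)/)[1];
const salary = stripped.match(/(?<=Salary:)\s*(.*?)\s*(?=Settled:)/)[1];
const settled = stripped.match(/(?<=Settled:)\s*(.*?)\s*(?=Message:)/)[1];

console.log({ originalMatch, yearsOfService, salary, settled });
```

If a value could itself contain one of the labels (for example a message that mentions "Salary:"), this approach would cut the value short, so it is worth testing against a few real emails.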

Best answer by Todd Harper 31 March 2023, 19:53

5 replies

Userlevel 6

I don’t know Python, but here’s how I would do it with JavaScript:

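// YOUR_STRING is whatever key you define under Input Data in the Code step.
const htmlString = inputData.YOUR_STRING;

// Replace each <br> with a space, then strip the remaining HTML tags.
const newString = htmlString.replace(/<br>/g, " ").replace(/(<([^>]+)>)/ig, "");

// Each value is the text between its own label and the next label.
const outputObject = {
  yearsOfService: newString.split("YearsOfService: ")[1].split("Salary")[0].trim(),
  salary: newString.split("Salary: ")[1].split("Settled")[0].trim(),
  settled: newString.split("Settled: ")[1].split("Message")[0].trim()
};

// The values end up nested under an "outputObject" key in the step's output.
output = [{outputObject}];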

 

Userlevel 1

Where would you apply JS in Formatter? It only gives me the following options:

 

 

Userlevel 6

You would actually do this in a Code by Zapier step with “Run Javascript” as the event, not in a Formatter action. (Side bonus: you extract all of these values in one action rather than using a separate Formatter step for each.)
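
As a rough sketch of what the body of such a Code step could look like, here is a variant that turns each <br> into a newline and reads "Label: value" pairs line by line, so a missing label yields null instead of an error. This is one possible approach under stated assumptions, not the code from the accepted answer: `body` is a hypothetical Input Data key, `extractFields` is an illustrative helper name, and the `inputData` declaration only stands in for the object Zapier supplies so the snippet can run on its own.

```javascript
// Stand-in for the object Zapier passes to a Code step. Inside Zapier this
// declaration would be left out. "body" is a hypothetical Input Data key.
const inputData = {
  body: `<html><head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>From: John Smith Message Body:<br>Firstname: John<br>Lastname: Smith<br>Phone: 0780000000<br>Email: johnsmith@hotmailtest.co.uk<br>JobTitle: CEO<br>YearsOfService: Over two years<br>Salary: £100,000 +<br>Settled: No<br>Message: Please call me on 0780000000 <br>`,
};

function extractFields(html) {
  // Turn each <br> into a newline first, then drop the remaining tags.
  const text = html.replace(/<br\s*\/?>/gi, "\n").replace(/<[^>]+>/g, "");

  // Collect "Label: value" pairs, one per line, split at the first colon.
  const found = {};
  for (const line of text.split("\n")) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    found[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }

  // Labels that are absent come back as null instead of throwing.
  return {
    yearsOfService: found.YearsOfService ?? null,
    salary: found.Salary ?? null,
    settled: found.Settled ?? null,
  };
}

const result = extractFields(inputData.body);
console.log(result);
// Inside the Code step, the last line would instead assign the result:
// output = [extractFields(inputData.body)];
```

Because the values are returned at the top level rather than nested under another key, each one should show up as its own field when mapping the step's output in later actions.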

 

 

Userlevel 6

Also, for clarification, you do not need to type out the HTML in the Input Data box on the right side. You can just map the same Outlook body you placed in the “Input” field in your previous screenshot.

Userlevel 1

 

 

Worked perfectly! Thank you!