Hello, I am trying to convert text I retrieve from an invoicing tool. The text contains German umlauts (ä, ö, ü), and the data is formatted as HTML (i.e. ä = &auml; / ü = &uuml; etc.).
I used the Zapier Format/Convert HTML to Markdown action, but it does not convert the umlauts correctly: &auml; = a / &uuml; = u.
Is there a better way to convert HTML to text with umlauts?
Thanks :-)
@MarijnVerdult Great, works perfect! Thank you so much for your help!!!
Great question! For that, you can use the Regular Expression module (don’t forget to import it at the beginning of your Code Step)
It works by substituting everything between < and > with “” (i.e. nothing). The key here is .*?, which is the lazy (non-greedy) form of .* in RegEx: it matches as few characters as possible, so each tag is matched on its own. Without the question mark, the pattern would match from the first < to the last > and replace everything in between.
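A minimal sketch of that substitution, assuming a Python Code Step. The function name strip_tags and the sample string are only illustrative:

```python
import re

def strip_tags(text):
    # Lazy match: each "<...>" tag is matched and removed on its own.
    return re.sub(r"<.*?>", "", text)

def strip_tags_greedy(text):
    # Greedy match, shown for contrast: removes everything from the
    # first "<" to the last ">" on a line, including the text between tags.
    return re.sub(r"<.*>", "", text)

sample = "<strong>Hello</strong> world"
print(strip_tags(sample))         # Hello world
print(strip_tags_greedy(sample))  # " world" (the word "Hello" is lost)
```

Note that . does not match line breaks by default, so a tag split across two lines would not be removed by this pattern.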
@MarijnVerdult Sorry for a second question… how do I remove all HTML code (i.e. <strong></strong> etc.)?
@MarijnVerdult Thanks a lot! It did work and all special characters have been replaced. Thanks
@michael291 - I’m sorry, it looks like I both made a typo and mixed Python with JS code. In the proper code, you need to declare all the characters you want to replace.
You should then declare “firstname” as input, and your output will be the correct string.
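A minimal sketch of that replacement approach in Python. The mapping UMLAUT_ENTITIES and the function replace_entities are illustrative names, and the commented-out lines at the end assume that the Code Step exposes its inputs through an input_data dictionary:

```python
# Declare every HTML entity you want to replace, with its plain character.
UMLAUT_ENTITIES = {
    "&auml;": "ä", "&ouml;": "ö", "&uuml;": "ü",
    "&Auml;": "Ä", "&Ouml;": "Ö", "&Uuml;": "Ü",
    "&szlig;": "ß",
}

def replace_entities(text):
    # Replace each declared entity in turn; anything not declared is left as-is.
    for entity, char in UMLAUT_ENTITIES.items():
        text = text.replace(entity, char)
    return text

print(replace_entities("M&uuml;ller"))  # Müller

# In a Code Step with "firstname" declared as input (assumption):
# output = {"firstname": replace_entities(input_data["firstname"])}
```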
Pretty sure there might be a smarter way of doing this but this will get the job done!
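One candidate for such a smarter way, assuming the Code Step runs Python 3: the standard library function html.unescape decodes named and numeric HTML entities without a hand-written list. A minimal sketch (the function name decode_entities is illustrative):

```python
import html

def decode_entities(text):
    # html.unescape turns entities such as &auml; or &#252; into their characters.
    return html.unescape(text)

print(decode_entities("M&uuml;ller &amp; S&ouml;hne"))  # Müller & Söhne
```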