Question

Can you help me clean the text?


Hi.

Basically, we are pulling news articles into a different website using an RSS feed. However, when the content is added, some code is added along with it. How do I clean it or format it properly so that I can add just the plain text?


See sample

 


8 replies

Userlevel 7
Badge +14

Hi @ME12345 

You would need to be more specific about exactly what you mean by the “codes” that are added (with examples).

See the text in the photo. There are added codes like this: “|| {cmd: []}; googletag.cmd.push(function() { googletag.defineSlot('/21776187881/fw-responsive-main_content-slot2', [[468, 60], [728, 90], [300, 100], [320, 50]], 'div-gpt-ad-1665767472470-0').defineSizeMapping(gptSizeMaps.banner1).”

 

 

I managed to remove the HTML using the Remove HTML Tags option.
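The leftover `googletag` fragment looks like the body of an ad `<script>` block: removing only the tags leaves the JavaScript between them behind. One way around that is to drop whole `<script>` and `<style>` blocks before stripping the remaining tags. Below is a minimal Python sketch of that idea, assuming the feed delivers raw HTML. The function name `html_to_plain_text` is made up for illustration, and regular expressions are only a rough approach that may not cope with every feed. In a Code by Zapier Python step the text would typically come in through `input_data` and go back out through `output`, which is also an assumption about the setup.

```python
import html
import re

# Whole <script>...</script> and <style>...</style> blocks, including their bodies.
SCRIPT_OR_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
# Closing block-level tags and <br>, which should become paragraph breaks.
BLOCK_BREAK = re.compile(r"</(p|div|h[1-6]|li)\s*>|<br\s*/?>", re.IGNORECASE)
# Any remaining tag.
ANY_TAG = re.compile(r"<[^>]+>")


def html_to_plain_text(raw_html: str) -> str:
    """Return plain text with paragraphs separated by one blank line."""
    text = SCRIPT_OR_STYLE.sub("", raw_html)  # drop ad scripts and CSS first
    text = BLOCK_BREAK.sub("\n\n", text)      # keep the paragraph boundaries
    text = ANY_TAG.sub("", text)              # strip whatever tags are left
    text = html.unescape(text)                # &amp; -> &, &nbsp; -> U+00A0
    # Collapse runs of spaces, tabs and non-breaking spaces, then drop empty lines.
    lines = [re.sub(r"[ \t\u00a0]+", " ", line).strip() for line in text.splitlines()]
    return "\n\n".join(line for line in lines if line)


sample = (
    "<p>First para.</p>"
    "<script>window.googletag = window.googletag || {cmd: []};</script>"
    "<p>Second &amp; last.</p>"
)
print(html_to_plain_text(sample))
```

The order matters here: if the tags are stripped first, the script body can no longer be told apart from the article text.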

 

------------------------------------------------------------------

 

Now I want to separate the article into paragraphs, and I am using Code by Zapier for that. Do you think I can fix the format?

 

 

See how the spacing looks:

 

See how the article looks:

 

 

 

How can I retain that format?
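For the paragraph question, one possible approach in a Python code step is to treat every run of blank lines as a paragraph break, tidy the whitespace inside each paragraph, and rejoin the paragraphs with a single blank line. This is a rough sketch, assuming the text is already plain text and that paragraphs are separated by blank lines. The function names `split_paragraphs` and `format_article` are hypothetical, not something Zapier provides.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split plain text into paragraphs on runs of blank lines."""
    # Normalize Windows and old Mac line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove zero-width characters that show up as invisible "spaces".
    text = re.sub(r"[\u200b\u200c\u200d\ufeff]", "", text)
    paragraphs = []
    # A paragraph break is a newline, optional whitespace, and another newline.
    for chunk in re.split(r"\n\s*\n", text):
        cleaned = " ".join(chunk.split())  # collapse inner whitespace and line wraps
        if cleaned:
            paragraphs.append(cleaned)
    return paragraphs


def format_article(text: str) -> str:
    """Rejoin the paragraphs with exactly one blank line between them."""
    return "\n\n".join(split_paragraphs(text))


sample = "First line\nwraps here.\r\n\r\n \n\u200bSecond   paragraph.\n\n\n"
print(format_article(sample))
```

If the destination site expects HTML rather than plain text, each paragraph from `split_paragraphs` could be wrapped in `<p>` tags instead of being joined with blank lines.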

 

Thanks.

 

 

Userlevel 7
Badge +14

@ME12345 

Another option to consider trying is to use AI to help you parse and prep the data.

ChatGPT: https://zapier.com/apps/chatgpt/integrations#triggers-and-actions

OpenAI: https://zapier.com/apps/openai/integrations#triggers-and-actions

For some reason it has now become super messy. 😑

Userlevel 7
Badge +14

@ME12345 

Trying to parse and prep data that varies in format can be a challenge.

You may want to have AI summarize the contents, then format from that output.

Do you have any sample steps I can follow that show how you use AI to summarize the content?

 

Userlevel 7
Badge +14

@ME12345 

Really depends on your requirements.

Here’s an option...

Zap action: OpenAI - Summarize Text

 
