Question

Formatter chunking for AI prompts not feeding into the next Looping step


Userlevel 1

I have a Zap to summarize Gong call transcripts through OpenAI. I am using the ‘Split Text into Chunks for AI Prompts (beta)’ action in Formatter to chunk the transcripts into sizes that fit the OpenAI token limit. I can see in the Formatter results that it is successfully chunking. However, in the subsequent Looping step, the ‘Output Chunks Chunk’ value from the Formatter step is not being delivered as a chunk; it is still the entire transcription. As a result, I am not able to run the flow into OpenAI, as the prompt exceeds the token limit.
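
For reference, here is rough arithmetic on why the full transcript blows the prompt budget (a quick Python sketch using the common ~4-characters-per-token heuristic; the 8,192-token limit and file name are illustrative, not my actual model or data):

```python
TOKEN_LIMIT = 8192  # example context size; substitute your model's limit

def estimated_tokens(text: str) -> int:
    # Common rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

transcript = open("transcript.txt").read()  # hypothetical saved transcript
print(estimated_tokens(transcript), "estimated tokens vs limit of", TOKEN_LIMIT)
```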


What am I doing wrong? See the screenshots below for the current settings of the Formatter step, followed by the Looper step. I am trying to follow the Zapier-provided template for OpenAI chunking of long-form text.

Formatter screenshot:

In the test, Split Text into Chunks for AI Prompts is successfully outputting line items.

Looper screenshot:

Any/all subsequent steps are not recognizing the line-item breaks from the Formatter step.



11 replies

Userlevel 7
Badge +14

Hi @cashion 

Good question.

Can you post screenshots of what the output from the Formatter step looked like?

Have you tried testing the Looping step?

Userlevel 1

Here are screenshots of the Formatter output, where it appears to be working as expected in the preview, with two ‘chunks’ of a large text body.

However, in the Looper step the input ‘chunk’ block is not chunked; it is still pulling in the full body of the transcript. Notice that the ‘Output Chunks Chunk’ is still pulling in the whole conversation (from ‘Hello’ to ‘bye’).

And you can see the ‘chunks’ are not actually getting split out in Looper.

Userlevel 7
Badge +14

@cashion 

Read below carefully for context.

This variable shows that step 3 output only 1 chunk.

Looking at the Formatter output, it actually shows 1 line item (see the first “1” directly below “output”, which corresponds to the “1” above for the “output # of Chunks”) that contains an array of 2 chunks.

When I tested with lorem ipsum, my Formatter output didn’t have that “1” line item directly below “output”, so that’s likely what is causing the issue with the Looping step.
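To picture the difference, here is a sketch of the two shapes in Python literals (made-up values and field names, not the exact Zapier output):

```python
# What Looping needs: one line item per chunk, so each iteration
# receives one prompt-sized piece of text.
expected = {"chunks": ["chunk 1 text...", "chunk 2 text..."]}  # 2 line items

# What the Formatter step produced here: a single line item whose value
# is itself the array of chunks. Looping sees 1 item, so the entire
# transcript arrives in one iteration.
actual = {"chunks": [["chunk 1 text...", "chunk 2 text..."]]}  # 1 line item
```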

Userlevel 1

Troy, I think you’ve keyed in on the issue: the output of my Formatter step is generating that extra ‘1’ before ‘chunks’, and it’s throwing off the following steps, which assume that everything below is just one big chunk again. I fixed what you suggested, but I’m still seeing a 2-chunk input feed into just 1 large chunk in the next step. What do you think is going on with that first ‘1’, and how can I get rid of it?

Userlevel 7
Badge +14

@cashion 

We would need to see updated screenshots of how the Zap steps are configured, along with data examples from the step outputs.

Userlevel 7
Badge +14

@cashion 

It may be related to how the original data received by the webhook is formatted, which we would need to see screenshots of.

Userlevel 1

@Troy Tessalone 

Here’s the JSON from the webhook, used to query the Gong API to grab a transcript.

Here’s what is pulled in from the webhook (two screenshots from the start of the result, and one from the end).

Userlevel 7
Badge +14

@cashion 

Ah, OK, as suspected: the data from the Gong API for the transcription is an array of items.

Try adding this Zap action: Formatter > Utilities > Line Items to Text

Use the “text” field so it joins all of the transcription “text” values into one “text” field.

Then you can pass that to this Zap action: Formatter > Text > Split Text into Chunks for AI Prompts
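If you ever want to do both of those in one step instead, a Code by Zapier (Python) step could approximate them. This is only a sketch, with an assumed input field name (“transcript_text”) and an illustrative chunk size:

```python
CHUNK_CHARS = 8000  # ~2,000 tokens at the rough 4-characters-per-token rule

# Code steps expose mapped fields via input_data; "transcript_text" is a
# hypothetical field mapped from the Gong "text" line items (Zapier joins
# line items into a single comma-separated string when passing them here).
text = input_data["transcript_text"]

# Split on whitespace so chunks break between words, not mid-word.
chunks, current, length = [], [], 0
for word in text.split():
    if length + len(word) + 1 > CHUNK_CHARS and current:
        chunks.append(" ".join(current))
        current, length = [], 0
    current.append(word)
    length += len(word) + 1
if current:
    chunks.append(" ".join(current))

# Returning a list of dicts emits one line item per chunk, which a
# Looping step can then iterate over one chunk at a time.
output = [{"chunk": c} for c in chunks]
```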

Userlevel 1

Wonderful! That seems to have done it. Thank you @Troy Tessalone 

Userlevel 1

Hey @Troy Tessalone, I have another related question for you: you can see that my transcript is now just text, but I’ve lost all the information about who is saying what (speaker ID, or better yet the name of the speaker). Any suggestions for how I can get the names of my speakers into the transcription? It’s leading to ChatGPT not recognizing who is saying what and getting it wrong in summaries.

Userlevel 7
Badge +14

@cashion 

You’d have to include the speakerId, but there can be multiple sentences for a single speaker, so you may have to handle the data differently.

Your best bet is to get the raw JSON and use that, but that’s an advanced approach.
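
For example, a Code by Zapier (Python) step along these lines could rebuild a speaker-labeled transcript from the raw JSON. The shape assumed below (callTranscripts > transcript > monologues with a speakerId and sentences) follows the Gong /v2/calls/transcript response, but verify the field names against your actual payload:

```python
import json

# Assumption: the raw Gong transcript response is passed into the Code
# step as a string field named "raw_json".
payload = json.loads(input_data["raw_json"])

lines = []
for call in payload.get("callTranscripts", []):
    for monologue in call.get("transcript", []):
        speaker = monologue.get("speakerId", "unknown")
        text = " ".join(s.get("text", "") for s in monologue.get("sentences", []))
        lines.append(f"{speaker}: {text}")

# One labeled line per monologue; map this field into the chunking steps.
output = {"labeled_transcript": "\n".join(lines)}
```

You’d still need a separate lookup to turn speaker IDs into names; the sketch only preserves the IDs.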

