Skip to main content

I want to build a Zap to track U.S. Department of Defense contracts on a spreadsheet. Here's what's up:

Every day, the DoD publishes a new story to an RSS feed detailing the day's new contracts. Here's the RSS feed: https://www.defense.gov/DesktopModules/ArticleCS/RSS.ashx?ContentType=400&Site=945&max=10

And here's the story in the latest <item>: https://www.defense.gov/News/Contracts/Contract/Article/3916067/

That story describes 34 new contracts, each described in a different paragraph. (Another day, it might be 23, or 50, or 7.) I want to use something like ChatGPT or Gemini to extract data from each contract and put it into a Google Sheet (or other sheet) with a standard set of headings: Service, Awardee, Location, Contract Value, Contract Type, Contract Details, Work Locations, Procurement, Contracting Activity.

The problem I'm running into is that variable number of contracts/paragraphs, which means there would be a variable number of outputs each time this is run. I've been toying around with ChatGPT's "extract structured data," but it seems like it's set up for a single task, not a task of variable length.

I feel like I need to do something like:
1. Trigger: New item in RSS Feed
2. Fetch the copy from the linked item
3. Break the copy into 'n' chunks
4. Start a loop of length 'n'
5. For each 'n' extract the data and push it to a new row of the Google Sheet

I guess I just don't really know how to do Steps 3 and 4 here, and I could use some help. Am I thinking about this the right way? Or is there a faster, all-in-one approach to this?

Hi and welcome to the Community, @Matt Gross! 🎉

Hmm, perhaps you could use the Looping by Zapier app for this? 🤔

It would create a set of loops from the list of different contracts it receives, allowing the Zap to run the AI actions and Google Sheets for each of those contracts separately. It can handle a varying number of items (max: 500), so it wouldn’t have to be the same number of contracts each time. Check out our Loop your Zap actions guide to learn more.

In terms of breaking up the contracts for each loop if the RSS trigger doesn’t output the different contracts as line items you could use a Formatter action to turn the text it into line items - see: Create line items in Zaps > Text to Line-item.

Hope that helps to get you pointed in the right direction. If you run into any issues or have further questions just let us know!


Thanks for this! The problem I’m having now, though, is actually fetching the text properly in the first place. All the AI engines (GPT, Gemini, AI by Zapier) I’m trying either tell me they can’t visit live websites (which shouldn’t be true) or return incomplete text.

For instance, I believe there are 34 paragraphs in this Department of Defense writeup: https://www.defense.gov/News/Contracts/Contract/Article/3916067/, each of which describes a different contract. Yet no matter how I instruct the AIs—fetch the text, copy the text, etc.—in tests they return at most 6 paragraphs, often rewritten or summarized. (And often they don’t even know how many paragraphs they’re returning!) Any idea what instructions (and which agent) I should be using here?


Hey @Matt Gross🙂

Does the RSS trigger output the full set of paragraphs for an article?

If so, you could select them directly from the RSS trigger rather than attempting to get an AI action to visit the web page and extract them. But you may need to use a Formatter action as I mentioned to split up the paragraphs into different line items. 

If it doesn’t then output all the paragraphs then I would suggest using an app like Web Parser by Zapier to extract the content from the web page. Or another app like Browse AI that’s specifically designed for web ‘scraping’ content from a page - see Scrape any website without any code using Browse AI. that should be able to get you the content without rewriting or summarising it.

Do you think that would work?


Reply