
The question is in the title: how can you build your own deduplication system/steps into a Zap?

But first a quick primer on deduplication!

There are two types of trigger in Zaps:

  1. "Instant" webhook triggers give us whatever they receive at the time, with zero context. This helps them move quickly.
  2. Polling triggers run less often, but save the most recent version of whatever information they received. The next time the Zap checks for new information, it compares what it finds against the previous version to see whether that info has been sent already. If it hasn't, the Zap will trigger on that information.

The list of information that the Zap has already received is called the deduplication list, because we don't want to trigger on the same item twice. You can learn more about deduplication in this help doc.
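To make the polling behaviour concrete, here is a minimal sketch of the idea in Python. It is not Zapier's actual implementation, which this thread doesn't show. The `id` field and the function name are assumptions made up for illustration.

```python
def poll_for_new_items(fetched_items, seen_ids):
    """Return only the items not seen on an earlier poll, and remember them.

    `seen_ids` plays the role of the deduplication list. The "id" key is an
    assumed unique field; a real app may use something else.
    """
    new_items = []
    for item in fetched_items:
        if item["id"] not in seen_ids:
            seen_ids.add(item["id"])
            new_items.append(item)
    return new_items


seen_ids = set()

# First poll: nothing has been seen yet, so both items would trigger.
first_poll = poll_for_new_items([{"id": 1}, {"id": 2}], seen_ids)

# Second poll: item 2 is already on the list, so only item 3 would trigger.
second_poll = poll_for_new_items([{"id": 2}, {"id": 3}], seen_ids)
```

An instant webhook trigger has no equivalent of `seen_ids`, which is why a repeated webhook can fire the Zap twice.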

Sometimes (for various reasons) webhooks will send information that is the same as previous info, and a Zap will fire twice for the same event.

Can you think of any ways to create your own 'deduplication' list/rules?


Awesome question! I've got a few different techniques for handling this.


The first is to use filters and stop-gaps whenever possible. The way I set this up is to have a step at the end of every Zap that updates a Storage by Zapier variable, a Google Sheet, or something similar with a specific value to say that the Zap has run (essentially an external record). Then I always include a filter as the second step of the Zap that checks that value to ensure the Zap did not already run. If two webhooks are received at the exact same instant, this might not work, but usually they arrive with at least a slight delay between them, allowing the value to be set and checked before the Zap runs the second time.
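Here is a rough sketch of that check-then-mark pattern in Python. The plain dictionary is only a stand-in for Storage by Zapier or a Google Sheet, and the function names and the `order-42` key are hypothetical.

```python
def should_continue(store, event_key):
    """Filter step: continue only if no 'has run' marker exists for this event."""
    return event_key not in store


def mark_as_run(store, event_key):
    """Final step: record that the Zap has run for this event."""
    store[event_key] = True


def run_zap(store, event_key, actions_log):
    """Run the Zap's work once per event key. Returns True if it ran."""
    if not should_continue(store, event_key):
        return False
    actions_log.append(event_key)  # placeholder for the Zap's real actions
    mark_as_run(store, event_key)
    return True


store = {}  # assumed stand-in for the external record
log = []

first = run_zap(store, "order-42", log)   # no marker yet, so the Zap runs
second = run_zap(store, "order-42", log)  # marker found, so the filter stops it
```

Note that the check and the mark are separate steps, so two runs that start at the same instant can both pass the filter before either one writes the marker. That is the simultaneous-webhook gap mentioned above.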


The second is for when I'm including a step to create something like a Google Sheet Row, Asana Task, Trello Card, CRM Lead, or Process Street checklist. Many services that connect to Zapier have both a 'Create' Action and a 'Find' Search. Of these, many have a 'Find and Create if not Found' checkbox option. Whenever possible, use this option so that your Zap defaults to finding the record instead of always creating a new one. Then, if a duplicate comes through, the Zap will just find the record and harmlessly end. If it is running for the first time, it will create the record, which is what you were going to do anyway.



Those are great options @BlakeBailey, thanks!

Of these, many have a 'Find and Create if not Found' checkbox option

That's given me an idea! If you used the Google Sheets Find or Create New Row action, you could use that as a deduplication table in and of itself.

After the trigger, add a Google Sheets Find or Create New Row step. Then add a filter step. Have the filter set to look at the field called 'Zap Data was found' and set it to 'Boolean - is False'. That way a Zap will only continue if a new row was created, and not if it was found. In other words the Zap will only continue if the data is new!
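As a sketch of the logic only (this is not the Google Sheets integration itself), the find-or-create step plus the filter behaves roughly like the Python below. The in-memory list, the `key` column and the function names are assumptions for illustration.

```python
def find_or_create_row(sheet, lookup_value):
    """Return (row, was_found). Appends a new row when no match exists."""
    for row in sheet:
        if row["key"] == lookup_value:
            return row, True
    new_row = {"key": lookup_value}
    sheet.append(new_row)
    return new_row, False


def filter_passes(was_found):
    """Mirror of the filter: 'Zap Data was found' is False."""
    return was_found is False


sheet = []  # hypothetical stand-in for the deduplication sheet
results = []
for event_id in ["evt-1", "evt-2", "evt-1"]:
    _, was_found = find_or_create_row(sheet, event_id)
    results.append(filter_passes(was_found))
```

The third event repeats `evt-1`, so the row is found rather than created and the filter stops the Zap. The lookup value should be something that is unique per event, such as an ID sent in the webhook payload.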



Chiming in on @BlakeBailey's solution - if you're worried about multiple simultaneous triggers, then you can always add a randomised 1-10 second delay after the trigger:

How to create zap delays of less than 1 minute (between 1 - 10 seconds)
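The linked post's exact method isn't reproduced here, but the general idea of a randomised short delay can be sketched in Python like this. The function name and the injectable `sleep` parameter are my own assumptions, and whether a sleep this long fits inside a code step's time limit depends on your plan.

```python
import random
import time


def random_delay(min_seconds=1, max_seconds=10, sleep=time.sleep):
    """Pause for a random whole number of seconds and return that number.

    `sleep` is injectable so the pause can be skipped when testing.
    """
    seconds = random.randint(min_seconds, max_seconds)  # inclusive on both ends
    sleep(seconds)
    return seconds
```

Two runs triggered at the same instant will usually pick different delays, which gives the first run a chance to write its marker before the second one checks for it. They can still pick the same delay, so this lowers the odds of a collision rather than ruling it out.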



Going further with this: from reading this solution, it seems it would only work with webhooks sent from the source software's API, not when polling the source software's API manually?

I have source software that doesn’t have webhooks enabled, so there is no way to “send” an event or update to Zapier. I have to get Zapier to “poll: get” data.

The problem is Zapier won’t retrieve anything unless there is a new record in the deduplication field!

So I don’t think I can even get Zapier to pull everything across into an intermediary software like Google Sheets unless there is a new record created.

And it certainly won’t pull over “changed” records, as there is no “changed/update” field to begin with!


Hi @gwpac You’re right that these solutions are more about creating deduplication if you’re using webhooks to get data to a Zap.

As you’re creating a poll, I’m assuming that you’re creating your own app integration using the Zapier Platform, is that right? If you’re having some trouble with that then I’d recommend either creating a new post in the Developer Discussion category with your question, or contacting the Support Team using the contact form: https://zapier.com/app/contact-us

 

I hope that helps!


Hi @Danvers I’m not using the developer platform, but it has been recommended to me. I’m just using a webhooks poll step in my Zap. It works, other than this issue.


Ah, I’m with you now @gwpac. The way that polling works is that the Zap will go to the app and check to see if there are any new records, based on the deduplication criteria that’s specified. If there’s nothing new then, no, it won’t retrieve anything.

 

I would definitely create a new post in the Developer Discussion category explaining exactly what you’re trying to do (the information that you’re trying to get, the issue that you’re running into) to see if anyone has any thoughts. It’s also very possible that using the Developer Platform to create a custom integration will be the way to go here. You don’t need to create a whole integration with multiple actions and triggers - you can just build the one that you need. 

