Handling deduplication on restarting a Zap after a month?

  • 10 March 2022
  • 3 replies

Userlevel 1

Scenario:
I have a Zap that I ran for a whole month and then switched off. Assume there are 10,000 ids in the deduplication table.

Exactly a month later, I restarted the Zap. There are 10,000 new records in the source system, and the first poll now returns all 10,000 of them.

How will the Zap handle these 10,000 new records, given the constraints on Zap execution, i.e. the 30-second limit and the fact that the Zap considers only the top 100 records?


3 replies

Userlevel 7

When a user pauses a Zap and turns it back on, the de-dup table is reset and repopulated. Only new items created after the Zap was re-enabled will be processed. The 10,000 objects will be ignored.

If Zapier pauses a Zap because the monthly task limit was exceeded, or for some other reason, those tasks may be replayable via Zap History.

To process that volume of tasks at once and “catch up”, you might have a look at the new Transfer feature. To see how to enable support in your integration, see this article.
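
To make the reset behaviour described above concrete, here is a minimal sketch in Python. It is a toy model only, not Zapier's actual implementation: the class name `PollingDeduper` and its methods are hypothetical, and it assumes every record carries a unique `id` field.

```python
class PollingDeduper:
    """Toy model of a polling trigger's dedup table (hypothetical, not Zapier's code)."""

    def __init__(self):
        self.seen_ids = set()
        self.enabled = False

    def enable(self, current_poll):
        """Turn the Zap on: reset the table and seed it with the first poll's ids."""
        self.seen_ids = {record["id"] for record in current_poll}
        self.enabled = True

    def pause(self):
        """Turn the Zap off; polls are ignored until enable() is called again."""
        self.enabled = False

    def poll(self, records):
        """Return only records whose id has not been seen since the last enable()."""
        if not self.enabled:
            return []
        fresh = [r for r in records if r["id"] not in self.seen_ids]
        self.seen_ids.update(r["id"] for r in fresh)
        return fresh


# The Zap is re-enabled while 10,000 records already exist in the source system.
deduper = PollingDeduper()
backlog = [{"id": i} for i in range(10000)]
deduper.enable(backlog)

# Only a record created after re-enabling triggers the Zap.
triggered = deduper.poll(backlog + [{"id": 10000}])
```

In this model the backlog is absorbed into the table when the Zap is turned on, so `triggered` contains only the one record created afterwards.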

Userlevel 1

@Zane Thanks for the reply.
I have a couple more scenarios that I need answers for.

Scenario 2 :

The API I use in the Zap may return more than 10,000 records at once. Given the 30-second execution time per Zap, I need to filter on the timestamp of the last successful Zap execution. But there is currently no way to know the timestamp of the last successful execution. How can I handle this scenario?

Scenario 3 :

The docs mention that Zapier picks only the top ~100 ids for every Zap. There can be a case where my API returns 200 new ids since the last successful Zap execution, and this is a valid scenario for my application. What happens to the 100 records that are left out, and is there any way to handle this?

@Zane It would be helpful if you could answer these questions. Thanks in advance.

Userlevel 7

With 10,000 objects to handle at once, I’d look to a Transfer Zap, rather than a regular Zap (see links above).

If a running Zap with a polling trigger ends up bringing back, say, 200 new ids, Zapier will process the latest 100 and throttle/enqueue the others. The user gets an email saying that some tasks were throttled and can go to the History UI and choose to run them from there.
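
As a rough illustration of the process-some, hold-the-rest behaviour, the sketch below splits a batch of new records at a cutoff. The function name `split_for_throttling` is hypothetical, and `limit=100` is an assumed value taken from the ~100 figure mentioned earlier in the thread; the real threshold and ordering are decided by Zapier.

```python
def split_for_throttling(new_records, limit):
    """Split newly seen records into those run now and those held for manual replay.

    Assumes new_records is ordered newest first (an assumption of this sketch).
    """
    run_now = new_records[:limit]
    held = new_records[limit:]
    return run_now, held


# 200 new ids arrive in one poll, newest first (ids 200 down to 1).
new_ids = [{"id": n} for n in range(200, 0, -1)]
run_now, held = split_for_throttling(new_ids, limit=100)
```

Here `run_now` holds the 100 newest records and `held` holds the 100 older ones, which in the thread's description would be the tasks a user replays from the History UI.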

To take complete control of this, have a look at REST Hook triggers. Your API will push new objects as soon as they’re created. The deduplicator is not involved (your app will be responsible for not sending the same event twice). Your hook-sending implementation could spread out the messages so you’re not feeding Zapier 10,000 objects at once.
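
One way the sending side might spread out its messages is sketched below. Everything here is an assumption for illustration: `send_in_batches`, `batch_size`, and `pause_seconds` are hypothetical names, and `post` is a caller-supplied function that delivers a batch in whatever form the hook endpoint expects. The sketch also skips repeated ids within a single call, since the sender is responsible for not delivering the same event twice.

```python
import time


def send_in_batches(events, post, batch_size=100, pause_seconds=1.0, sleep=time.sleep):
    """Deliver events in small batches with a pause between them.

    Events whose id was already seen in this call are skipped.
    Returns the number of batches delivered.
    """
    unique = []
    sent_ids = set()
    for event in events:
        if event["id"] not in sent_ids:
            sent_ids.add(event["id"])
            unique.append(event)

    batches = 0
    for start in range(0, len(unique), batch_size):
        if start > 0:
            sleep(pause_seconds)  # space out deliveries instead of sending all at once
        post(unique[start:start + batch_size])
        batches += 1
    return batches


# Usage with a stand-in for the real HTTP call: collect batches in a list.
delivered = []
events = [{"id": n} for n in range(250)] + [{"id": 0}]  # last event is a repeat
count = send_in_batches(events, post=delivered.append, batch_size=100, sleep=lambda s: None)
```

In a real sender, `post` would wrap the HTTP request to the subscribed hook URL, and the set of sent ids would need to be persisted across calls rather than kept in memory.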

Note: as you’re thinking through this volume of data, please consider that your users will need a Zapier subscription that includes an appropriate number of tasks to support the workflow, and set this expectation very clearly with them. This is a factor in the design of these limits. Your users will not be happy if you suddenly eat through their monthly plan by surprise. The Transfer feature is more squarely targeted at folks who want to process lots of data at once.