Using JS and Storage to mitigate webhook race conditions

  • 21 March 2021
  • 4 replies
  • 317 views

Userlevel 1

I sometimes have situations where an external system sends the same event to a webhook multiple times simultaneously, but I only want one of them to make it through. In other words, I need to delay each event by a different amount so they arrive at different times, then use a logic gate to allow only the first one through. (You can’t use the built-in “delay” option, because this will simply delay all of them by an equal amount -- they’ll still arrive simultaneously, just… later.)

After continued iteration, this is my current solution:

  1. Use the JS variable delay trick first suggested by @AndrewJDavison_Luhhu in this post to force every task run to experience a delay between 0 and 10 seconds (exclusive).
  2. Use Zapier’s Storage as a simple state manager to quickly shut the gate after the first winning task run. I use Python, but you can also use JS. You want to pick a reference for the incoming event that can be used to identify duplicates (e.g., an external ID of some kind), and some sort of timing within which only one event should go through.
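For anyone who would rather keep both steps in Python, here is a minimal sketch of the variable delay from step 1. It is not the JS code from the linked post, just the same idea under my own assumptions: the names pick_delay, wait_random_delay, and MAX_DELAY_SECONDS are made up for illustration.

```python
import random
import time

MAX_DELAY_SECONDS = 10.0  # assumption: matches the 0-10 second window in step 1


def pick_delay(rng=random):
    """Return a random delay in seconds, from 0 up to MAX_DELAY_SECONDS."""
    # rng.random() yields a float in [0.0, 1.0), so each task run
    # gets its own, effectively unique, wait time.
    return rng.random() * MAX_DELAY_SECONDS


def wait_random_delay(sleep=time.sleep, rng=random):
    """Pause for a random interval and report how long the pause was."""
    delay = pick_delay(rng)
    sleep(delay)
    return delay
```

Calling wait_random_delay() at the top of the step spreads simultaneous task runs apart before they reach the Storage gate. The sleep parameter is only there so the function can be tried out without actually waiting.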

In this example, I’m receiving external object IDs and I only want one per day to make it through. With Storage I keep track of every external_id and the most recent date on which it was updated. If that stored date already equals today, I stop the task from proceeding, because another task already updated it. Otherwise, it can proceed, and as part of doing so it sets the value to “today”. Because the check and the update happen in a single Python step, it should be very fast -- faster than the gap that the random variable delay puts between duplicate events.

Example:

external_id = input_data['external_id'] # value provided by the webhook event
date_now = input_data['now'][:10] # the date part of {{zap_meta_utc_iso}}
store = StoreClient('3bbb62ee-019b-421c-9a00-b862ef67398d') # the UUID of the StoreClient "app" I already created in Zapier
date_last_stored = store.get(external_id) # retrieve existing date value for this external_id, if any

if date_now == date_last_stored:
    continue_workflow = False # It's already been triggered today. Do not continue.
else:
    continue_workflow = True # Continue
    store.set(external_id, date_now) # Immediately update the stored value so no other tasks make it through today

output = [{'continue': continue_workflow}] # Filter on this in the next action step
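To try the gate logic outside Zapier, here is a rough sketch that swaps StoreClient for a plain in-memory dictionary. InMemoryStore and should_continue are hypothetical names used only for this illustration, and the IDs and dates are made-up sample values.

```python
class InMemoryStore:
    """Hypothetical stand-in for StoreClient so the gate can be run locally."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None when the key has never been set

    def set(self, key, value):
        self._data[key] = value


def should_continue(store, external_id, date_now):
    """Return True for the first event per external_id per day, else False."""
    if store.get(external_id) == date_now:
        return False  # already triggered today, so shut the gate
    store.set(external_id, date_now)  # record today's date for later events
    return True


store = InMemoryStore()
results = [
    should_continue(store, 'abc-123', '2021-03-21'),  # first event today
    should_continue(store, 'abc-123', '2021-03-21'),  # duplicate, same day
    should_continue(store, 'abc-123', '2021-03-22'),  # same ID, next day
    should_continue(store, 'xyz-789', '2021-03-22'),  # different ID
]
print(results)  # [True, False, True, True]
```

Note that the get and the set are still two separate calls, so two task runs landing at exactly the same instant could in principle both pass. That is why the random delay in step 1 still matters.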

 



4 replies

Userlevel 7

Thanks so much for sharing this, @robbie_frame_ai! I’m sure it’ll be really helpful for folks 🙂

@AndrewJDavison_Luhhu have you seen this variation on your variable delay solution?

Userlevel 7

Hi @robbie_frame_ai 

Try using Delay After Queue.

TIP: You can set the Delay for less than 1 minute using decimals for the value.

Delay After Queue

This is an advanced version of Delay For.
Instead of resuming after a set amount of time from now, it will lookup when the last delay for this step or a given shared queue will resume, and use that instead if found.

This action is used mostly to prevent Race Conditions and Rate Limiting, but can be used in any scenario where a Zap or multiple Zaps may trigger many tasks in parallel and you want to force them to run in series.
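As a rough illustration of the queueing behaviour described above (not Zapier's actual implementation), the sketch below simulates how tasks that arrive together end up resuming in series. The function name schedule_resume_times is hypothetical, and the handling of an empty or already-drained queue is my own assumption.

```python
def schedule_resume_times(arrival_times, delay_seconds):
    """Simulate a shared delay queue and return each task's resume time.

    Each task waits delay_seconds after the previous task's resume time.
    Assumption: if the queue is empty or its last resume time has already
    passed, the task's own arrival time is used as the starting point.
    """
    last_resume = None
    resume_times = []
    for arrival in arrival_times:
        if last_resume is None or last_resume < arrival:
            start = arrival  # nothing pending in the queue
        else:
            start = last_resume  # line up behind the previous task
        last_resume = start + delay_seconds
        resume_times.append(last_resume)
    return resume_times


# Three duplicate events arriving at the same moment, with a 0.5 second delay.
print(schedule_resume_times([0.0, 0.0, 0.0], 0.5))  # [0.5, 1.0, 1.5]
```

The point is that simultaneous arrivals come out spaced apart, which is the same effect the random delay aims for, but with a guaranteed order.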

Userlevel 1

@Troy Tessalone Thanks! I was under the impression that a delay of less than 1.0 minutes is not possible due to the underlying AWS infrastructure. Are you double sure this works? If so, the JS-based variable timer can be permanently retired.

Userlevel 7

@robbie_frame_ai

NOTE: There could be slight variances based on infrastructure load but still better than waiting 1 minute.

TIP: When in doubt, test it out.