
Hi everyone

So, following up on a recent discussion on how to scrape data from the web into GSheets via IMPORTXML (THANKS A LOT! 😊): https://community.zapier.com/discussion/152/does-anyone-know-of-any-other-smart-ways-to-scrape-data-from-websites#latest (Or, better, via some third-party GSheets add-on for importing XML, since IMPORTXML doesn't always work for me somehow. *Have you encountered the same?*)

I now have the issue that IMPORTXML is still loading data when my Zap runs and tries to extract the result, so the Zap sometimes gets "Loading..." back instead of the data.

How would you go about this? I can insert a wait step, of course, use only one row for the data extraction, and then append the values only to a second worksheet containing the raw data. Is that what you would do? Somehow it doesn't look that sexy to me.



@davidweiss What is wrong with using a delay step here? Delay it by one minute, or maybe even two, and it should work.

The concern I have is how often this Zap is triggered. If instance A and instance B are triggered too close together, instance A will put its URL into the sheet and then delay. While it is delaying, instance B could change the URL in the sheet and then delay as well. When instance A finishes delaying, it will grab the content of the sheet, which will be the results for the URL from instance B. Then instance B will finish delaying and grab the same output.


This can be prevented by adding a "Delay After Queue" step in front of the step that writes the URL to the spreadsheet, so that runs go through one at a time.
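To make the race and the fix a bit more concrete, here is a rough Python sketch. It is only an analogy, not how Zapier implements its queue: the `Sheet` class, `run_instance`, and the lock are all made up for illustration. The lock plays the role of the queue, so one run writes its URL, waits, and reads its result before the next run is allowed to touch the sheet.

```python
import threading


class Sheet:
    """Hypothetical stand-in for the single import row in the spreadsheet."""

    def __init__(self):
        self.url = None

    def write_url(self, url):
        # Corresponds to the Zap step that puts the URL into the sheet.
        self.url = url

    def read_result(self):
        # Pretend IMPORTXML has finished and returns data for the current URL.
        return f"data from {self.url}"


# Assumption: one lock shared by all runs, standing in for the queue.
sheet_lock = threading.Lock()


def run_instance(sheet, url, results):
    # Only one run at a time may write, wait, and read.
    with sheet_lock:
        sheet.write_url(url)
        # The delay step would sit here while IMPORTXML loads.
        results[url] = sheet.read_result()


def run_all(urls):
    sheet = Sheet()
    results = {}
    threads = [
        threading.Thread(target=run_instance, args=(sheet, url, results))
        for url in urls
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Without the lock, two runs could interleave exactly as described: both write, both wait, and both read the result for whichever URL was written last.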


Alternatively, you could add paths after grabbing the data from the spreadsheet and test for "Loading...". In Path A, where the content is "Loading...", delay for 2 minutes, grab the data again, and then do what the Zap does. In Path B, where the content isn't "Loading...", just do what the Zap does.

I personally do not like this method, as it isn't DRY (don't repeat yourself): you could forget later on that you need to edit both paths to change how things work. To fix that, I would keep the paths but call a webhook to trigger a second Zap that does the data processing. The first Zap simply puts the URL in the spreadsheet and has a few paths to test for "Loading...". Once it has the data, it passes that along to the second Zap (no matter which path or sub-path you're on), and the second Zap does the processing.
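If it helps to see the shape of that logic outside of Zapier, here is a minimal Python sketch of "check for Loading..., wait, retry, then hand off to one processing step". Everything in it is an assumption for illustration: `read_cell` stands for whatever step reads the sheet, `process` stands for the second Zap, and the retry count and delay are arbitrary.

```python
import time
from typing import Callable

LOADING = "Loading..."


def wait_for_value(
    read_cell: Callable[[], str],
    attempts: int = 3,
    delay_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Re-read the cell until it no longer shows the loading placeholder."""
    for attempt in range(attempts):
        value = read_cell()
        if value != LOADING:
            return value
        # Wait before the next read, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"cell still shows {LOADING!r} after {attempts} reads")


def process(value: str) -> str:
    """Single place for the data processing (the 'second Zap' in the text)."""
    return value.strip()


def run(read_cell: Callable[[], str], **kwargs) -> str:
    # Whichever read succeeds, the value goes through the same processing.
    return process(wait_for_value(read_cell, **kwargs))
```

The point is the same as with the webhook: the waiting and retrying lives in one place, the processing in another, so a later change to the processing only has to be made once.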


