
I have a workflow that needs to EDIT an existing image (remove the background, adjust lighting, add a shadow, color correct, etc.), and I want to do this using OpenAI GPT. When testing this in the OpenAI API sandbox, I can submit a file (an image such as a JPG) as part of the prompt.

EXAMPLE: 

 

However, the Zapier “Generate an image” action event does not allow me to include a file in the field, and I get incredibly shoddy / inconsistent results when I simply reference the file via URL directly in the prompt, compared with the direct upload I tested in the OpenAI sandbox.

 


Other action events, such as “Conversation”, support file upload. However, these configurations limit the response to ONLY text, and I cannot coax ChatGPT into providing the file in plain-text format (Base64) using this approach no matter what model I use (including multi-modal models like GPT-5).


It appears as though Zapier locks down the multi-modal aspect of these models to a limited degree, at least in the response. 

 

 

Does anyone have a viable workaround for this, or can someone from Zapier address when we can expect an image upload option on the “Generate an Image” action event for this module? Ideally with the image fidelity options described in OpenAI's documentation here: https://cookbook.openai.com/examples/generate_images_with_high_input_fidelity

Hi ​@VACoffeeGuy, welcome to the Community! 🎉

I checked on this and you’re right. It seems the Generate An Image action doesn’t have any field that would allow you to pass it an existing image as input, and while the Conversation action can accept image files, it won’t output them.

It would be awesome to have this functionality added, so I’d suggest contacting our Support team to put in a feature request for image file input in the Generate An Image action.

In the meantime, you might be able to use a Custom Action or API Request action to pass OpenAI’s API an input image to use to generate the image and set the input_fidelity option to high.

Hope that helps to get you pointed in the right direction. If you run into any trouble on that or need help with anything else just let us know! 🙂


Hey ​@VACoffeeGuy,

Just to add on to the above: if you do end up taking the API request path and using the OpenAI API, you may want to look into this particular API endpoint (https://api.openai.com/v1/images/edits) for editing existing images. See the image edit endpoint documentation here: https://platform.openai.com/docs/api-reference/images/createEdit. Hope it helps!
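For anyone going the API route from Python, here is a rough sketch of what a request to that edits endpoint could look like using only the standard library. Treat it as a starting point, not a tested integration: the helper names (`build_edit_request`, `send_edit_request`), the default `gpt-image-1` model, and the default file name are my own assumptions for illustration, and the field names follow my reading of the docs linked above.

```python
import json
import urllib.request
import uuid

EDITS_URL = "https://api.openai.com/v1/images/edits"


def build_edit_request(api_key, prompt, image_bytes, filename="input.jpg",
                       mime_type="image/jpeg", model="gpt-image-1",
                       input_fidelity="high"):
    """Return (headers, body) for a multipart/form-data POST to the edits endpoint.

    Hypothetical helper: the defaults (model, filename) are assumptions.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields.
    for name, value in (("model", model), ("prompt", prompt),
                        ("input_fidelity", input_fidelity)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The source image, sent as a file part.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f'Content-Type: {mime_type}\r\n\r\n'.encode("utf-8")
        + image_bytes + b"\r\n"
    )
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return headers, b"".join(parts)


def send_edit_request(headers, body, timeout=120):
    """POST the request and return the parsed JSON response (network call)."""
    req = urllib.request.Request(EDITS_URL, data=body, headers=headers,
                                 method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

You would call `build_edit_request(...)` with your key, prompt, and the raw image bytes, then pass the result to `send_edit_request(...)`. As far as I understand, the edited image comes back Base64-encoded in the JSON response, so it still needs decoding before it can be used as a file.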

PS: Need professional help? Reach me through my Zapier Solution Partner page here :)


@SamB & @Sparsh from Automation Jinn - The issue that I'm running into with the API approach (I'm actually trying to call it from Python via Code by Zapier) is that GPT takes longer than 30 seconds to generate an image. That's a problem because Zapier times out at 30 seconds. I'm pretty certain that Zapier has extended that 30-second timeout for its ChatGPT-specific Generate an Image module for that reason. However, I can't extend the timeout in the Code module.

 

I'm currently trying Gemini's Nano Banana model instead, which generates images in about 15 seconds. However, Gemini only returns a Base64 response, which Zapier struggles to load as a file object for use downstream in other modules. Zapier evidently requires a file to be "hydrated" but does not appear to provide a mechanism to hydrate an outside file.
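To make the Base64 part concrete, this is roughly how the image bytes can be pulled out of a Gemini-style JSON response in Python. It is only a sketch: the function name `extract_inline_image` is made up for illustration, and the response shape (`candidates` → `content` → `parts` → `inlineData` with `mimeType` and `data`) is an assumption based on my understanding of the REST response, so check it against what you actually receive. It does not solve the Zapier hydration problem; it only gets you from the Base64 string to raw bytes.

```python
import base64


def extract_inline_image(response):
    """Return (mime_type, image_bytes) for the first inline image part found.

    Assumes a Gemini-style response dict; both camelCase and snake_case
    key spellings are checked because the exact shape is an assumption.
    """
    for candidate in response.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and inline.get("data"):
                mime_type = (inline.get("mimeType")
                             or inline.get("mime_type")
                             or "image/png")
                # Decode the Base64 payload into raw image bytes.
                return mime_type, base64.b64decode(inline["data"])
    raise ValueError("no inline image data found in response")
```

From there the bytes still have to end up somewhere Zapier can read them as a file, which is the part I have not found a clean mechanism for.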


Hey  ​@VACoffeeGuy,

Did you try using a POST request in Webhooks by Zapier? It might not have the limit that Code has. See more here: https://docs.zapier.com/platform/build/troubleshoot-action-timeouts


@Sparsh from Automation Jinn - Unfortunately, it was not documented very well, so I didn’t understand the process. From what I could tell, I could post a webhook out to another service, then catch the response on the other end. However, that means I’ve got to spin up, maintain, and pay for a Python environment container and keep it running. Honestly, if I’ve got to do that, I’ll do the entire thing in Python instead of low-code in Zapier. It defeats the purpose.


Hey ​@VACoffeeGuy,

Yeah, I can understand. I think it’s one of the few workarounds for the 30s limit, but yeah, it becomes more technical, as you said. The quick way to fix it in Zapier is to have a Zapier Enterprise plan, as they have a 2-minute time limit. You can see more about it here: https://help.zapier.com/hc/en-us/articles/29971850476173-Code-by-Zapier-rate-limits#h_01J70X4S4TYK18HZT9GCV54VKT. Hope it helps!

