I’m currently building a Zap that uses PDF.co to extract text fields (e.g., date, invoice number, billing company, supplier company, amount) from PDF files stored in Dropbox. My goal is to rename these PDF files using the extracted data in the following format:
I run into trouble when I reach step 3. I hope someone can help me with this. Thank you very much.
Hi @Golden Fish,
Where are the PDF files coming from? You mentioned checking a Dropbox folder, but I don’t see that in the screenshot of your Zap.
I no longer pay for a PDF.co license, so I handle a similar workflow with ChatGPT instead (see my Zap template here).
The ChatGPT Conversation action allows you to directly pass in a file object or a public URL to a file, and then ask questions about it:
Here’s the prompt I use to get a file name with the following naming convention: ‘YYYY-MM-DD Vendor Name’ for invoices:
You are an expert data extraction assistant specializing in analyzing receipts and invoices. Your task is to carefully examine a digital receipt file provided and extract specific information with high accuracy.
Instructions:
1. Access and read the contents of the provided receipt file.
2. Analyze the document thoroughly to identify and extract the following key information:
a) Vendor Name: The full and correct name of the business or entity that issued the receipt.
b) Receipt Date: The date when the transaction occurred, formatted as YYYY-MM-DD.
Important considerations:
- Ensure the vendor name is extracted exactly as it appears on the receipt, including any legal suffixes (e.g., Inc., LLC, Ltd.).
- For the date, if multiple dates are present (e.g., transaction date, print date), prioritize the actual transaction or purchase date.
- If the date on the receipt is in a different format, convert it accurately to the YYYY-MM-DD format.
- Be vigilant about potential OCR errors and use context clues to verify the extracted information.
- If any information is unclear or missing, indicate this in your response.
Output format:
Please provide the extracted information in the following format in plain text, with no additional commentary: receipt_date - vendor_name
If you encounter any issues or uncertainties during the extraction process, please note them briefly on a separate line after the extracted output.
Your accuracy in extracting this information is crucial, as it will be used for financial record-keeping and analysis purposes.
Note that this approach requires your own OpenAI API key. It doesn’t work with the AI by Zapier action, because that action doesn’t (yet) support file objects or URLs as inputs. In that case, the workaround is to add steps that extract the raw text from the PDF and pass that text to AI by Zapier. If you want or need to try that approach, I can share more details.
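For the renaming step itself, the model's reply still has to be turned into a safe file name. Here is a minimal sketch in Python of how that could look, assuming the reply's first line follows the `receipt_date - vendor_name` format from the prompt above. The function name `build_file_name`, the `.pdf` default, and the set of characters being replaced are my own assumptions for illustration, not anything Zapier or Dropbox prescribes.

```python
import re
from datetime import datetime


def build_file_name(reply: str, extension: str = ".pdf") -> str:
    """Turn a reply like 'YYYY-MM-DD - Vendor Name' into 'YYYY-MM-DD Vendor Name.pdf'.

    Hypothetical helper: assumes the first line of the reply holds the
    extracted data and any notes from the model come on later lines.
    """
    lines = reply.strip().splitlines()
    first_line = lines[0].strip() if lines else ""

    # Expect a date, a dash separator, then the vendor name.
    match = re.match(r"^(\d{4}-\d{2}-\d{2})\s*-\s*(.+)$", first_line)
    if not match:
        raise ValueError(f"Unexpected reply format: {first_line!r}")
    date_text, vendor = match.group(1), match.group(2)

    # Reject impossible dates such as 2024-13-40 (raises ValueError).
    datetime.strptime(date_text, "%Y-%m-%d")

    # Replace characters that are commonly invalid in file names.
    vendor = re.sub(r'[\\/:*?"<>|]', " ", vendor)
    # Collapse repeated whitespace, then trim stray spaces and dots
    # so a name ending in 'Inc.' does not produce 'Inc..pdf'.
    vendor = re.sub(r"\s+", " ", vendor).strip(" .")
    if not vendor:
        raise ValueError("Vendor name is empty after cleaning")

    return f"{date_text} {vendor}{extension}"


print(build_file_name("2024-03-15 - Acme Inc."))  # → 2024-03-15 Acme Inc.pdf
```

Raising an error on an unexpected reply is deliberate: it is usually better for the Zap to stop than to rename a file to something wrong. If you run this in a Code by Zapier step, you would need to map the ChatGPT reply into the step's input and pass the result on to the Dropbox rename action; that wiring is not shown here.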
Hi @Golden Fish,
We just wanted to see how everything is going with your Zap. Did DennisWF's recommendation get the job done? Feel free to reach out if you need further assistance with your Zap. We're glad to address any concerns and assist you.