Question

Automating PDF File Renaming for Invoices and Letters in Dropbox


I need to automate the process of reading and renaming PDF scans uploaded to Dropbox. The PDFs are either invoices/bills from various vendors or official letters (mainly from tax authorities). The goal is to:

  1. Detect a new file upload in Dropbox.
  2. Extract content from the PDF.
  3. Rename the file based on its content using the formats:
    • Invoices/Bills: <invoice_issuer_name>-<invoice_number>-<invoice_date>.pdf
    • Letters: <short_letter_subject>-<letter_issued_date>.pdf
  4. Save the renamed file in Dropbox.
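
For illustration, the two naming formats above could be produced by a small helper once the fields have been extracted. This is only a sketch: the function names, the underscore rule for cleaning text, and the YYYYMMDD date format are assumptions, not part of Dropbox, Zapier, or any other tool mentioned here.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Replace each run of non-alphanumeric characters with '_' and trim the ends."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


def invoice_filename(issuer: str, number: str, issued: date) -> str:
    # <invoice_issuer_name>-<invoice_number>-<invoice_date>.pdf
    # Assumption: dates are written as YYYYMMDD so '-' stays a field separator.
    return f"{slugify(issuer)}-{slugify(number)}-{issued.strftime('%Y%m%d')}.pdf"


def letter_filename(subject: str, issued: date) -> str:
    # <short_letter_subject>-<letter_issued_date>.pdf
    return f"{slugify(subject)}-{issued.strftime('%Y%m%d')}.pdf"


if __name__ == "__main__":
    print(invoice_filename("ACME GmbH & Co.", "INV/2025/001", date(2025, 2, 26)))
    print(letter_filename("Tax assessment notice", date(2025, 1, 15)))
```

Cleaning the fields before joining them keeps characters such as `/` or `&` out of the file name, which matters because invoice numbers often contain them.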

I currently use Dropbox + OpenAI + PDF.co via Zapier, but parsing is messy and inconsistent due to varying PDF formats. Is there a better, more reliable way to automate this process? Any tool or workflow recommendations would be appreciated.

Thanks in advance!

7 replies

JammerS
  • Zapier Staff
  • 2344 replies
  • February 26, 2025

Hi @Demonil,

Welcome to the Community.

Extracting specific data from various PDF formats can be challenging, but using OCR tools like Adobe Acrobat Pro may help. Document parsing tools like Docparser or Parseur and AI-based solutions like Rossum can also improve accuracy and integrate with Zapier. Once the data is extracted, Zapier can automate the renaming and saving of files. Success depends on document consistency, so consulting support teams for these tools may be beneficial.

I hope this helps. Please let us know if you have any more questions or issues.


michaeltoth

Hey @Demonil,

Can you share more specifics about the issues you’re having with your current setup? I built something similar for a client before, and I also used Dropbox, OpenAI, and PDF.co with success. 

I had the most success when I split the OpenAI requests into a few different steps, rather than trying to do all of it in a single prompt. In your case, it would probably be something like this:

  1. Trigger: New File in Folder in Dropbox
  2. Action: Filter - file extension is .pdf
  3. Action: OpenAI Conversation with Assistant
    1. Upload the PDF file
    2. Ask assistant to identify whether the file is an invoice or a letter. Provide examples and details
  4. Action: OpenAI Extract Structured Data
    1. Extract Letter or Invoice
      1. This lets you extract the classification as a single word, which you can then use to branch into paths. It is more reliable because the Conversation with Assistant step may return this info in multiple formats
  5. Paths: One path for Letter and one for Invoice
  6. Letter Path
    1. Action: Conversation with Assistant
      1. Ask assistant to extract letter subject and letter issue date
    2. Action: OpenAI Extract Structured Data
      1. Extract letter subject and letter issue date. As before, this is more reliable and gives you better control over formatting
  7. Invoice Path
    1. Action: Conversation with Assistant
      1. Ask assistant to extract issuer name, invoice number, and invoice date
    2. Action: OpenAI Extract Structured Data
      1. Extract issuer name, invoice number, and invoice date. As before, this is more reliable and gives you better control over formatting
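
If you want an extra guard before the Paths step, the classification could also be normalized in a small code step. Here is a rough sketch of the idea, assuming the raw model output arrives as a plain string; `normalize_doc_type` and its keyword rules are made up for illustration and are not a Zapier or OpenAI feature:

```python
def normalize_doc_type(raw: str) -> str:
    """Map free-form classifier output to 'invoice', 'letter', or 'unknown'."""
    text = raw.strip().lower()
    # Assumption: bills are treated the same as invoices.
    has_invoice = "invoice" in text or "bill" in text
    has_letter = "letter" in text
    if has_invoice and not has_letter:
        return "invoice"
    if has_letter and not has_invoice:
        return "letter"
    # Ambiguous or empty answers should not be routed to either path.
    return "unknown"


if __name__ == "__main__":
    for answer in ["Invoice.", "This is a LETTER from the tax office", "invoice or letter"]:
        print(repr(answer), "->", normalize_doc_type(answer))
```

Returning `unknown` for ambiguous answers gives you a third branch where the file can be left untouched or flagged for manual review instead of being renamed incorrectly.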

Let me know if this helps you to solve your issue!


  • Author
  • Beginner
  • 3 replies
  • February 27, 2025

Hey @michaeltoth!

First off, thanks a lot for the detailed instructions! I’ve made it to Step 6, but I had to upload the PDF to OpenAI separately using the File Upload action (Step 3). That part works fine—OpenAI correctly identifies the document type as either an invoice or a letter.

So, my current Zap looks like this:

[Screenshot of the Zap not preserved]

📌 The Issue:
I’m unable to pass the already uploaded file to "Conversation with Assistant" in later steps. The assistant doesn’t recognize the file, and at Step 7 and Step 9, I receive the following response:

"It seems I couldn't find the specific details directly from the document. Please provide more context or details from the document, such as the section where these details might be located, or try uploading the document again if there was an issue."

Any ideas on how to make OpenAI Conversation With Assistant recognize the previously uploaded file?


michaeltoth

Hey @Demonil,

In the Dropbox screenshot you sent over, it looks like the file you’re testing with is example data from Dropbox (it says Example Input ABC Corporation). Most likely, they did not include an actual file for the testing. Try this:

  1. Go to your Dropbox trigger
  2. Go to the Test tab
  3. Click Find new records to pull up some files from your Dropbox account
    1. If nothing comes up, manually add a file to the right folder in Dropbox and try again
  4. Select a new file and at the bottom click Continue with selected record
  5. Go back to your OpenAI step and test again. It should now be using the file you are testing with, which should hopefully get it to work properly!

  • Author
  • Beginner
  • 3 replies
  • February 27, 2025

Hey @michaeltoth!

I’ve updated the response. Apologies for the delay! As I mentioned earlier, it recognizes the type after adding an extra action (File Upload), but now I can't read the file content to retrieve the IDs, dates, etc., even though it has already been uploaded and processed by the OpenAI Assistant.


michaeltoth

Hey @Demonil,

I remember running into similar issues. Two things:

  1. Make sure that you tell the assistant which file to use:
    1. In step 3, when you upload the file, it should return some type of file ID
    2. In step 4, in the conversation, there should be a field (I can’t remember the exact name) that requests a file. Pass the file ID from step 3 here
  2. Insert a delay step between steps 3 and 4. OpenAI takes a bit of time to process the new file upload, and this accounts for that. Try something like 30 seconds or 60 seconds
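
If you ever move this logic into your own script, the fixed delay could be swapped for polling. This is a generic sketch rather than anything Zapier or OpenAI provides: `is_ready` is a hypothetical callable that you would write yourself to report whether the uploaded file has finished processing, and the timeout and interval values are placeholders. A code step with a short execution limit may not be able to wait this long, in which case the plain delay step is the simpler option.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call is_ready() until it returns True or the timeout elapses.

    Returns True if the resource became ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third call; a real check is up to you.
    attempts = {"n": 0}

    def fake_check() -> bool:
        attempts["n"] += 1
        return attempts["n"] >= 3

    print(wait_until_ready(fake_check, timeout=10, interval=0.1))
```

Polling returns as soon as the file is usable, while a fixed 30 or 60 second delay always waits the full time and can still be too short on a slow day.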

  • Author
  • Beginner
  • 3 replies
  • March 2, 2025

Hey @michaeltoth,

Thank you very much for your ideas and guidance. I've implemented a solution with some minor adjustments - I especially appreciated the suggestion to separate the different correspondence types.

Below is the latest version that works. I realize it's not perfect yet (and that's an understatement :) ), and it uses an extra tool (PDF.co), so I'm considering it version 0.1.