
Convert invoice PDF to Excel on Mac (2026 batch workflow)

Turn a folder of vendor invoice PDFs into one clean Excel sheet on Mac — on-device on macOS 14.4+, with line items, totals, tax, and source-file provenance per row.


You have forty vendor invoices in a folder on your Mac. Some came from suppliers’ billing portals, some were forwarded from Gmail, some were saved out of OneDrive by a contractor who insists on PDFs. Your accountant wants them in a single Excel workbook: line items in one tab, header metadata in another, totals reconciled, tax broken out by rate. The deadline is Friday. You’re at the desk because this is a desk job: forty invoices is a Mac batch, not an iPad share-sheet flow.

The single-vendor case is easy. The batch case is what eats Tuesday afternoon. Templates differ. One supplier puts the PO number top-right; another buries it in the line-item description. Half the invoices are clean text PDFs; a third are scans of paper invoices a contractor mailed in. Two are password-protected because the supplier’s billing system encrypts everything.

This guide is the Mac-native version of that workflow: convert invoice PDFs to Excel on Mac, in batch, on-device on macOS 14.4+, with one consolidated XLSX, source-file provenance per row, and totals that reconcile against each invoice’s printed grand total.

Why Mac (not iPad) for invoice batches

The iPad invoice walkthrough exists for the case where one invoice arrives in Mail and you process it on the device that’s already on. The Mac version exists for the batch — the month-end stack, the audit prep, the contractor reconciliation. Three reasons the Mac wins for the multi-invoice case:

  • Drag-and-drop a folder. A folder of forty invoices in one drop. Mobile share sheets handle a few PDFs at once; a folder of forty is a Mac job.
  • Big screen for the spot-check pass. A 27” display with the consolidated Excel sheet on one half and the original invoice PDFs on the other makes the three reconciliation checks (line totals, tax, vendor name) take seconds per invoice rather than minutes.
  • macOS 14.4+ on Apple Silicon runs the on-device model fast. A typical 1–3 page invoice extracts in 3–5 seconds on an M2 or newer Mac; a batch of forty finishes in two to three minutes. The same batch on a hosted pipeline takes longer and means uploading forty pages of supplier and pricing detail.

If your Mac is Intel-era (T2 or earlier), the on-device path isn’t available; ignitai falls back to a hosted pipeline with documented zero retention. For Apple Silicon Macs (M1 onward) on macOS 14.4+, the entire batch runs locally.

Why XLSX (not CSV) is the right output for invoices

For bank statements the right answer is CSV — the destination is QuickBooks or Xero’s bank-import flow, which strips formatting and wants a flat list of transactions. Invoices are different. The destination is usually the spreadsheet itself: a workbook your accountant or AP system uses to track payables, run pivot tables on vendor × month × tax_rate, and reconcile against POs. XLSX is the right format because:

  1. Multi-sheet structure. An invoice export wants two sheets minimum: one for line items (one row per line, many rows per invoice) and one for header/totals (one row per invoice). CSV is a single flat table; jamming both into one CSV means denormalized rows that your pivot tables won’t enjoy.
  2. Number formatting per column. Currency formatting on amounts, percentage formatting on tax rates, ISO date formatting on issue/due dates. XLSX preserves these; CSV is plain text and the receiving spreadsheet has to re-infer types per column.
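  3. Formulas survive. A =SUMIF(line_items.invoice_id, A2, line_items.line_total) in the header sheet, with the two column names standing in for the actual ranges on the line_items sheet, reconciles line-item totals against the printed totals and is a one-time setup in XLSX. In a flat CSV you’d re-derive it every time.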
  4. Excel and Numbers both open it cleanly. No encoding question, no “treat first row as header?” prompt, no quoted-string quirks.

For invoices destined for bookkeeping review, XLSX is what you want. CSV is fine if you’re piping into a script.

Why invoice PDFs are harder than they look

“Invoice to Excel” sounds like one problem. It’s actually four, and most converters only handle the first:

  1. Header metadata. Invoice number, issue date, due date, PO number, vendor name and address, your billing address. These are scattered across the top third of the page in boxes, not in a table. A generic table extractor skips them or smashes them into the line-item grid.
  2. Line items. The actual table: description, quantity, unit price, line total, sometimes tax per line, sometimes a discount column. Templates vary wildly between vendors — even two invoices from the same supplier can diverge if they updated their billing software midway through the year.
  3. Totals block. Subtotal, tax (often split across multiple rates — standard VAT, reduced VAT, zero-rated), shipping, discounts, grand total. Visually this is a small table; structurally it’s a set of key-value pairs that need to land in specific columns alongside the line items.
  4. Payment terms and footer. Bank details, payment reference, late-fee policy. Usually irrelevant for bookkeeping export, but sometimes you need the payment reference in a separate column for AP matching.

A tool that grabs “all the tables” gives you a mess. A tool that understands an invoice as an invoice puts header metadata in normalized columns, line items in their own rows with an invoice_id foreign key, and totals in a reconciled summary. That’s what you want exported to Excel — because the whole point of consolidating is the pivot tables.
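The header/line-item split described above can be pictured as two record types joined by invoice_id. This is a minimal sketch, not any tool's actual schema: the field names follow the columns used later in this guide, while the class names and the to_sheets helper are illustrative assumptions.

```python
from dataclasses import dataclass, field, asdict
from decimal import Decimal


@dataclass
class LineItem:
    invoice_id: str      # foreign key back to the invoice header
    description: str
    quantity: Decimal
    unit_price: Decimal
    line_total: Decimal
    tax_rate: Decimal    # e.g. Decimal("0.20") for 20%


@dataclass
class Invoice:
    invoice_id: str
    vendor_name: str
    issue_date: str      # ISO 8601, e.g. "2026-01-15"
    subtotal: Decimal
    total_tax: Decimal
    grand_total: Decimal
    lines: list[LineItem] = field(default_factory=list)


def to_sheets(invoices):
    """Flatten invoices into (header_rows, line_item_rows) for a two-sheet workbook."""
    header_rows, line_rows = [], []
    for inv in invoices:
        row = asdict(inv)
        row.pop("lines")  # line items go on their own sheet
        header_rows.append(row)
        line_rows.extend(asdict(li) for li in inv.lines)
    return header_rows, line_rows
```

One header row per invoice and many line rows per invoice is the shape that keeps pivot tables simple: every line row can be joined back to its header through invoice_id.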

Method 1: ignitai on Mac (the on-device way)

ignitai handles invoice extraction as a language task, not a table-coordinate task. The full Mac batch flow:

  1. Drag the folder of invoices into ignitai. Or drag a Finder selection of PDFs. Or a mix. ignitai flattens them into a single batch queue — up to 500 invoices in one pass on an M-series Mac. Mixed text-PDF and scanned-PDF inputs are fine; the app routes each file through the right pipeline automatically.

  2. Describe what to extract, once for the whole batch. Plain English. The prompt that works for almost every standard US/EU vendor template:

    “For each line item, return invoice_id (use the invoice number), description, quantity, unit price, line total, and tax rate. In a separate sheet, return invoice_id, invoice_number, issue_date (ISO 8601), due_date (ISO 8601), vendor_name, vendor_address, po_number, subtotal, total_tax, shipping, and grand_total.”

    Save it as a preset. Next month it’s one click.

  3. Pick XLSX as the output format. Two-sheet workbook: line_items and invoices, joined by invoice_id. Currency formatting auto-applied to amount columns, percentage formatting to tax rates.

  4. Hit Extract. ignitai runs each PDF through the on-device model, streams results into the workbook, and shows live per-file progress. A 40-invoice batch on an M2 Mac takes about three minutes.

  5. Review the consolidated workbook. Every row carries a source_file column with the original invoice filename and a source_page column for multi-page invoices. Open in Excel for Mac or Numbers for review.

  6. Re-run failures, not the whole batch. If any PDFs failed (corrupted, password-protected, blank scans), they get listed separately. Fix them, re-run just those files, append to the workbook.

The whole batch lives on your Mac. Forty invoices’ worth of supplier names, prices, and bank details never leave the device.
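If you would rather catch the failures from step 6 before the batch runs, a rough pre-flight pass over the folder can flag empty and password-protected files. The sketch below is a byte-level heuristic, not a PDF parser and not part of ignitai; the function names and result labels are assumptions made for illustration.

```python
from pathlib import Path


def triage_pdf(path):
    """Return 'ok', 'encrypted', 'empty', or 'not_pdf' using cheap byte-level checks."""
    data = Path(path).read_bytes()
    if not data:
        return "empty"
    # The %PDF- marker normally sits at the very start of the file.
    if b"%PDF-" not in data[:1024]:
        return "not_pdf"
    # Encrypted PDFs reference an /Encrypt dictionary in the trailer.
    # Heuristic only: the file structure is not actually parsed.
    if b"/Encrypt" in data:
        return "encrypted"
    return "ok"


def triage_folder(folder):
    """Group the PDF filenames in a folder by triage result."""
    groups = {}
    for pdf in sorted(Path(folder).glob("*.pdf")):
        groups.setdefault(triage_pdf(pdf), []).append(pdf.name)
    return groups
```

Calling triage_folder("./invoices") returns a dict such as {"ok": [...], "encrypted": [...]}. A blank scan still counts as "ok" here, because detecting an empty page image needs rendering, which this sketch does not attempt.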

Method 2: tabula-py + a Python script (the DIY path)

If your invoices all come from one supplier with a stable template and you want to own the pipeline:

brew install --cask temurin   # Java runtime; tabula-py bundles the tabula-java jar itself
pip install tabula-py pandas openpyxl

Then a script that opens each PDF, extracts the line-item table, and writes one consolidated XLSX with line-item and header sheets:

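from pathlib import Path

import pandas as pd
import tabula

line_items, headers = [], []
for pdf_path in sorted(Path("./invoices").glob("*.pdf")):
    # lattice=True detects cells from drawn ruling lines
    tables = tabula.read_pdf(str(pdf_path), pages="all", lattice=True)
    for tbl in tables:
        tbl["source_file"] = pdf_path.name  # provenance per row
        line_items.append(tbl)
    # header metadata: separate parsing pass needed (text extraction + regex);
    # until then, record one placeholder row per invoice
    headers.append({"source_file": pdf_path.name})

if not line_items:
    raise SystemExit("No tables found in ./invoices")  # pd.concat fails on an empty list

with pd.ExcelWriter("invoices.xlsx", engine="openpyxl") as writer:
    pd.concat(line_items, ignore_index=True).to_excel(writer, sheet_name="line_items", index=False)
    pd.DataFrame(headers).to_excel(writer, sheet_name="invoices", index=False)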

This works for one supplier, one stable template, all text-based PDFs with a lattice (drawn-line) layout. It breaks the moment any of the following happens:

  • The supplier changes its template. tabula infers grids from drawn lines; a redesign re-anchors the grid and column order shifts silently.
  • One invoice is a scan. A contractor mailed in a paper invoice they photographed; tabula returns nothing. You’d layer in pytesseract and accuracy collapses on dense line-item tables.
  • Multi-line descriptions. A SKU on one line and a model number on the next becomes two rows in your sheet unless you write join logic per template.
  • Header parsing. tabula extracts tables. Invoice numbers, dates, and vendor blocks need a separate text-extraction pass with vendor-specific regex per supplier.

For one-vendor stability, the script is a one-time cost. For real-world AP across a small business with twenty different suppliers, the maintenance burden eats the savings.
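To make that maintenance burden concrete, here is roughly what the per-template patches for two of the failure modes (header parsing and multi-line descriptions) look like. This is a sketch under stated assumptions: the regex patterns are hypothetical and fit one imagined vendor layout, and the row format (plain dicts with description and quantity keys, empty cells as "" or None) is an assumption, so DataFrame rows would need NaN converted first.

```python
import re

# Hypothetical patterns for ONE vendor's template; every supplier needs its own set.
HEADER_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#|Number)\s*:?\s*(\S+)", re.I),
    "issue_date": re.compile(r"(?:Invoice|Issue)\s+Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.I),
    "po_number": re.compile(r"\bPO\s*(?:No\.?|#|Number)\s*:?\s*(\S+)", re.I),
}


def parse_header(text):
    """Pull header fields out of page text; fields that do not match come back as None."""
    out = {}
    for name, pattern in HEADER_PATTERNS.items():
        m = pattern.search(text)
        out[name] = m.group(1) if m else None
    return out


def join_continuation_rows(rows):
    """Merge rows that have no quantity into the description of the previous row."""
    merged = []
    for row in rows:
        if merged and not row.get("quantity"):
            # Continuation line: append its text to the row above.
            merged[-1]["description"] += " " + row["description"]
        else:
            merged.append(dict(row))
    return merged
```

Both functions encode assumptions about a single layout (label wording, ISO dates, an empty quantity cell marking a continuation). A second supplier usually breaks at least one of them, which is where the per-template upkeep comes from.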

Method 3: Preview + Numbers (the no-install fallback)

For one clean text-based invoice you don’t want to install anything for: open the PDF in Preview, select the line-item table, Cmd-C, paste into Numbers, use Table → Convert Text to Columns, type header metadata into a separate sheet by hand, then File → Export To → Excel. Works only on clean grids — multi-line descriptions, merged cells, or any scan and the paste lands as a mashed string. For repeat AP work, save the preset in Method 1.

Method 4: Web converters (and why not for invoices)

Half a dozen “PDF invoice to Excel” web tools exist. The reasons not to use them for live AP from a Mac: you’re uploading supplier bank details and pricing to a server you don’t control; free tiers gate at one to three files before paywalls; invoice-aware extraction is rare, so most generic tools miss the header-vs-line-item separation and multi-rate tax breakouts; and the few invoice-specific tools charge per file, which adds up to a recurring fee in exchange for uploading your AP to a third party. For tutorial-level invoices with no real data, fine. For your actual vendor stack, no.

The three reconciliation checks, on Mac

Once the workbook is written, three checks separate “I have a file” from “I have an audit-ready AP sheet”:

  1. Line-total reconciliation. In the header sheet, add a column: =SUMIF(line_items.invoice_id, A2, line_items.line_total), substituting the actual column ranges on the line_items sheet for the two column names. It should equal the extracted subtotal for that invoice. Any divergence means a line was missed, duplicated, or misread. This single check catches almost every extraction error.
  2. Tax-rate sanity. Group line items by tax_rate. Each group’s line_total summed and multiplied by the rate should equal the tax broken out for that rate in the totals block. For multi-rate invoices (say, standard VAT and reduced VAT), this is the only way to catch a line item that landed under the wrong rate.
  3. Vendor name normalization. Sort the header sheet by vendor_name. Look for duplicates that differ only in punctuation or whitespace — “Acme, Inc.”, “Acme Inc”, “Acme Inc.” — and pick one canonical spelling. Push the rule into the saved prompt for next month: “Normalize vendor names to the form already used in the existing AP ledger; ask if uncertain.”

Skip these three and the AP error surfaces during reconciliation, when one invoice is over-paid by $47.83 and you have no idea which.
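For anyone who prefers to script the checks rather than build them as spreadsheet formulas, the three can be sketched in a few lines of Python. The function names, the one-cent tolerance, and the half-up rounding are assumptions; rows are assumed to be dicts keyed by the column names from the prompt, with amounts already parsed as Decimal. Vendors that round tax per line rather than per rate group can legitimately differ by a cent or two.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def check_line_totals(invoices, line_items, tolerance=CENT):
    """Return invoice_ids whose summed line totals differ from the extracted subtotal."""
    sums = defaultdict(Decimal)
    for li in line_items:
        sums[li["invoice_id"]] += li["line_total"]
    return [
        inv["invoice_id"]
        for inv in invoices
        if abs(sums[inv["invoice_id"]] - inv["subtotal"]) > tolerance
    ]


def tax_by_rate(line_items):
    """Expected tax per (invoice_id, tax_rate): group total times rate, rounded to cents."""
    base = defaultdict(Decimal)
    for li in line_items:
        base[(li["invoice_id"], li["tax_rate"])] += li["line_total"]
    return {
        key: (total * key[1]).quantize(CENT, rounding=ROUND_HALF_UP)
        for key, total in base.items()
    }


def vendor_variants(invoices):
    """Group vendor spellings that match once case, punctuation, and spacing are ignored."""
    groups = defaultdict(set)
    for inv in invoices:
        key = "".join(ch for ch in inv["vendor_name"].lower() if ch.isalnum())
        groups[key].add(inv["vendor_name"])
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

check_line_totals returns the invoices to reopen, tax_by_rate gives the figures to compare against each invoice's totals block, and vendor_variants lists the spellings that need one canonical form.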

Importing the XLSX into your accounting tool

Most small-business accounting tools have an Excel-import path for AP. The flow is similar across them.

QuickBooks Online (Bills import):

  1. Expenses → Bills → Batch upload.
  2. Drag the XLSX from Finder.
  3. Map columns: vendor_name → Vendor, invoice_number → Bill no., issue_date → Bill date, due_date → Due date, grand_total → Amount. Leave source_file and source_page unmapped (or import as Memo for the audit trail).
  4. Review the preview, accept.

Xero (Bills CSV/XLSX import):

  1. Business → Bills to pay → New → Upload.
  2. Pick the XLSX.
  3. Map the same fields. Xero handles the line-item sheet separately if you select “Import line items” in the upload dialog.
  4. Confirm and post.

Wave, FreshBooks, Zoho Books: all accept XLSX imports for bills. The column-mapping step is the same idea everywhere — your XLSX has the columns the tool wants, plus a few it doesn’t, and you skip the extras.
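If you would rather rename columns before upload than click through the mapping dialog each month, the mapping can live in a small script. The sketch below uses the QuickBooks Online field names from the steps above; whether a given tool auto-matches those header names on import is an assumption to verify, and the function names are illustrative. It emits CSV text only to stay within the standard library.

```python
import csv
import io

# XLSX column -> target field, as listed in the QuickBooks Online steps above.
QBO_BILL_MAP = {
    "vendor_name": "Vendor",
    "invoice_number": "Bill no.",
    "issue_date": "Bill date",
    "due_date": "Due date",
    "grand_total": "Amount",
    "source_file": "Memo",  # optional: keeps the audit trail
}


def remap_rows(rows, mapping):
    """Rename mapped columns and drop every column that is not in the mapping."""
    return [{target: row.get(src, "") for src, target in mapping.items()} for row in rows]


def to_csv(rows, mapping):
    """Serialize remapped rows as CSV text, using the target field names as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(mapping.values()))
    writer.writeheader()
    writer.writerows(remap_rows(rows, mapping))
    return buf.getvalue()
```

Keeping one mapping dict per accounting tool means the extras (source_page here) are dropped in one place rather than skipped by hand on every import.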

Batch mode at month-end: the actual workflow

The full month-end loop, end to end:

  1. In Finder, gather all the invoices that landed in the month into one folder. Subfolders by supplier are fine if ignitai is configured to recurse.
  2. Drag the folder into ignitai.
  3. Apply the saved prompt preset.
  4. Pick XLSX.
  5. Extract.
  6. Run the three reconciliation checks above.
  7. Upload the XLSX to QuickBooks / Xero / your AP tool.
  8. Archive the source folder to iCloud Drive or your accounting-tool’s document store, with the consolidated XLSX next to it.

For a month of forty invoices, this is the difference between Tuesday afternoon and twenty minutes after the second espresso.
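Step 1 of the loop (gathering the month's invoices into one folder) can also be scripted. This is a minimal sketch that assumes a file's modification time is a fair proxy for when the invoice landed, which is not always true for re-saved or synced files. The function name and the folder-name prefix are illustrative choices, and the destination should sit outside the inbox folder so later runs do not pick up their own copies.

```python
import shutil
from datetime import datetime
from pathlib import Path


def gather_month(inbox, dest, year, month):
    """Copy PDFs under `inbox` whose modification time falls in the given month into `dest`."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(Path(inbox).rglob("*.pdf")):
        stamp = datetime.fromtimestamp(pdf.stat().st_mtime)
        if (stamp.year, stamp.month) != (year, month):
            continue
        # Prefix with the parent folder name so same-named files from
        # different supplier subfolders do not overwrite each other.
        target = dest / f"{pdf.parent.name}__{pdf.name}"
        shutil.copy2(pdf, target)  # copy2 keeps the original timestamps
        copied.append(target.name)
    return copied
```

For example, gather_month("Invoices/Inbox", "Invoices/2026-01", 2026, 1) would produce a flat folder ready to drag into the app, with the supplier subfolder preserved in each filename.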

When this workflow doesn’t fit

Honest edge cases:

  • Invoices that arrive as EDI or XML (EDIFACT, UBL, PEPPOL). Don’t extract from PDF — your accounting tool’s e-invoicing connector parses the structured payload directly. PDF extraction is for the PDF-only suppliers, which in 2026 is most small vendors but rarely large ones.
  • Subscription invoices that are really receipts. Stripe, AWS, Google Workspace, Apple Developer all have CSV exports under their billing portals. Use those. Extracting from a hundred Stripe receipt PDFs when Stripe will hand you one CSV is a workflow built backwards.
  • Multi-page invoices with a continuation table. Tighten the prompt: “line items continue across pages until the totals block; do not treat each page as a separate invoice.” Save the preset, batch the rest.

Bottom line

For a folder of vendor invoice PDFs that needs to end up as one Excel workbook on Mac, ready for QuickBooks, Xero, or your AP system: install ignitai, drag the folder in, write the prompt once, pick XLSX, hit Extract, run the three reconciliation checks. For a single-supplier workflow with a stable template that you want to own end-to-end, a tabula-plus-pandas script is a valid alternative if you don’t mind maintaining the per-template parsing. For everything else (multiple vendors, scans, mixed templates, anything you’d rather not upload), the on-device Mac batch is the shortest distance from invoice folder to AP-ready workbook.

The single-invoice iPad walkthrough is in the iPad invoice guide; the parallel bank-statement Mac flow (CSV instead of XLSX, transactions instead of line items) is in the Mac bank-statement guide; the broader Mac batch pattern across mixed document types is in the Mac batch guide. Same app, same presets, same iCloud-synced output. The Mac is the desk batch surface; the mobile devices handle capture as invoices arrive.

Get ignitai on the App Store — free download, $19.99/mo unlocks unlimited batch extractions after the 3-day trial.