# Set up an invoice extractor with your AI agent

> Step-by-step guide to creating an Affinda workspace and invoice extraction schema through an MCP-connected AI agent, covering OCR mode, fields, and validation.

This guide walks through everything your AI agent does when you ask it to set up an invoice extractor. You can follow along to understand each step, or simply ask your agent and let it run the sequence for you.

**Trigger phrases that start this workflow:**

* "Set up an invoice extractor"
* "I need to extract invoice data"
* "How do I process invoices in Affinda?"

<Note>
  Before starting, your AI client must be connected to the Affinda MCP server. See [MCP Connector and Plugin](/integrations/mcp-connector) for setup instructions.
</Note>

***

## Steps

<Steps>
  <Step title="Locate your organisation">
    The agent calls `list_organizations` to find your Affinda organisation. If you have more than one, it asks which one to use. If the call returns none, your Affinda account does not have an organisation yet; sign in to the Affinda app and create one first.
  </Step>

  <Step title="Create the workspace">
    The agent creates a workspace named "Invoices" (or a name you specify) with these defaults:

    | Setting                          | Default          | When to change                                                                         |
    | -------------------------------- | ---------------- | -------------------------------------------------------------------------------------- |
    | `visibility`                     | `organization`   | Change to `private` if only one team should access it                                  |
    | `ocr_mode`                       | `always-partial` | Change to `always-full` if you receive scanned or photographed invoices                |
    | `enable_document_splitting`      | `false`          | Enable if you upload batches of invoices as a single file                              |
    | `enable_document_classification` | `true`           | Leave enabled to catch mis-uploaded documents                                          |
    | `reject_duplicates`              | `true`           | Leave enabled to avoid processing the same invoice twice                               |
    | `model_memory_strategy`          | `auto`           | Leave on `auto`; accuracy improves noticeably after the first \~10 confirmed documents |

    <Tip>
      If you receive scanned invoices, tell your agent explicitly: *"We receive scanned invoices by email."* The agent will set `ocr_mode` to `always-full`, which is the single most important configuration choice for image-based documents.
    </Tip>

    If you upload batches containing multiple invoices per file, mention this and the agent will enable splitting and select the appropriate document splitter automatically.
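
The defaults and overrides in the table can be sketched as plain data. This is a minimal illustration, not Affinda client code: the helper `build_workspace_settings` and its keyword arguments are hypothetical, and only the setting names and values come from the table above.

```python
# Hypothetical helper: the setting names and values mirror the table above,
# but this makes no API calls and is not how the agent is implemented.
DEFAULT_WORKSPACE_SETTINGS = {
    "visibility": "organization",
    "ocr_mode": "always-partial",
    "enable_document_splitting": False,
    "enable_document_classification": True,
    "reject_duplicates": True,
    "model_memory_strategy": "auto",
}


def build_workspace_settings(name="Invoices", *, scanned=False, batched=False, private=False):
    """Return the defaults with the overrides from the 'When to change' column."""
    settings = {"name": name, **DEFAULT_WORKSPACE_SETTINGS}
    if scanned:
        # Scanned or photographed invoices need full OCR.
        settings["ocr_mode"] = "always-full"
    if batched:
        # Several invoices arrive in one file, so splitting is required.
        settings["enable_document_splitting"] = True
    if private:
        # Only one team should see the workspace.
        settings["visibility"] = "private"
    return settings
```

For example, `build_workspace_settings(scanned=True, batched=True)` corresponds to telling the agent that you receive scanned invoices in batches.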
  </Step>

  <Step title="Create the document type">
    The agent creates an "Invoice" document type under your organisation and attaches it to the workspace.
  </Step>

  <Step title="Create the field schema">
    The agent bulk-creates all fields in a single call. The standard invoice schema is:

    | Field slug              | Label          | Type  |
    | ----------------------- | -------------- | ----- |
    | `invoiceNumber`         | Invoice Number | text  |
    | `invoiceDate`           | Invoice Date   | date  |
    | `dueDate`               | Due Date       | date  |
    | `vendorName`            | Vendor Name    | text  |
    | `vendorAddress`         | Vendor Address | text  |
    | `billTo`                | Bill To        | text  |
    | `subtotal`              | Subtotal       | float |
    | `taxAmount`             | Tax            | float |
    | `totalAmount`           | Total          | float |
    | `currency`              | Currency       | text  |
    | `lineItems`             | Line Items     | table |
    | `lineItems.description` | Description    | text  |
    | `lineItems.quantity`    | Quantity       | float |
    | `lineItems.unitPrice`   | Unit Price     | float |
    | `lineItems.amount`      | Amount         | float |

    You can ask the agent to adjust this schema before or after creation. Adding fields later is fine, though fields added early benefit from model memory on all subsequent documents.
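
To make the structure of the table concrete, the sketch below expresses the same schema as Python data, with the four `lineItems.*` columns nested under the `lineItems` table field. The dictionary keys (`slug`, `label`, `type`, `children`) are assumptions chosen for illustration; the actual payload the agent sends may be shaped differently.

```python
# Illustrative only: the keys "slug", "label", "type" and "children" are
# assumptions for this sketch, not the payload the MCP tool actually sends.
TOP_LEVEL_FIELDS = [
    ("invoiceNumber", "Invoice Number", "text"),
    ("invoiceDate", "Invoice Date", "date"),
    ("dueDate", "Due Date", "date"),
    ("vendorName", "Vendor Name", "text"),
    ("vendorAddress", "Vendor Address", "text"),
    ("billTo", "Bill To", "text"),
    ("subtotal", "Subtotal", "float"),
    ("taxAmount", "Tax", "float"),
    ("totalAmount", "Total", "float"),
    ("currency", "Currency", "text"),
]

LINE_ITEM_COLUMNS = [
    ("description", "Description", "text"),
    ("quantity", "Quantity", "float"),
    ("unitPrice", "Unit Price", "float"),
    ("amount", "Amount", "float"),
]


def build_invoice_schema():
    """Return the standard schema as a list, with line-item columns nested."""
    fields = [
        {"slug": slug, "label": label, "type": kind}
        for slug, label, kind in TOP_LEVEL_FIELDS
    ]
    fields.append(
        {
            "slug": "lineItems",
            "label": "Line Items",
            "type": "table",
            "children": [
                {"slug": f"lineItems.{slug}", "label": label, "type": kind}
                for slug, label, kind in LINE_ITEM_COLUMNS
            ],
        }
    )
    return fields
```

Keeping the schema as data like this makes it easy to review or adjust the field list before asking the agent to create it.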
  </Step>

  <Step title="Add validation rules (optional)">
    The agent will offer to add two validation rules:

    1. **Subtotal + tax = total** — flags documents where the arithmetic doesn't add up.
    2. **Line-item sum = subtotal** — flags documents where individual line amounts don't match the subtotal.

    These rules send flagged documents to the review queue rather than auto-confirming them. Accept or decline based on your workflow.
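
The arithmetic behind the two rules can be sketched as follows. This is a local illustration of the checks, not how Affinda evaluates validation rules: the function name, the rule identifiers, and the 0.01 rounding tolerance are all assumptions.

```python
from decimal import Decimal

# Assumed tolerance for rounding differences; not a documented Affinda value.
TOLERANCE = Decimal("0.01")


def failed_rules(invoice):
    """Return the names of the rules that fail; an empty list means both pass."""
    subtotal = Decimal(str(invoice["subtotal"]))
    tax = Decimal(str(invoice["taxAmount"]))
    total = Decimal(str(invoice["totalAmount"]))

    failures = []

    # Rule 1: subtotal + tax = total
    if abs(subtotal + tax - total) > TOLERANCE:
        failures.append("subtotal_plus_tax_equals_total")

    # Rule 2: sum of line-item amounts = subtotal
    line_sum = sum(
        (Decimal(str(item["amount"])) for item in invoice["lineItems"]),
        Decimal("0"),
    )
    if abs(line_sum - subtotal) > TOLERANCE:
        failures.append("line_items_sum_equals_subtotal")

    return failures
```

A document for which this returns a non-empty list is the kind that would go to the review queue instead of being auto-confirmed.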
  </Step>

  <Step title="Workspace is ready">
    Your workspace and document type are configured. You can now:

    * Upload invoice files via the Affinda app (drag and drop) or the API.
    * Use the agent to upload: *"Upload this invoice: \[URL or file path]."*
    * Ask the agent to check the review queue: *"What invoices are waiting for review?"*
  </Step>
</Steps>

***

## Variants

<AccordionGroup>
  <Accordion title="Tax-inclusive jurisdictions">
    If your invoices do not show a separate subtotal and tax (e.g. all amounts are GST-inclusive), remove `subtotal` and `taxAmount` from the schema and keep only `totalAmount`. Tell the agent: *"Our invoices only show a total amount inclusive of tax."*
  </Accordion>

  <Accordion title="Multi-currency invoices">
    For organisations that receive invoices in multiple currencies, connect the `currency` field to a data source of accepted ISO currency codes. Ask the agent: *"Connect the currency field to a list of accepted currencies."* The agent will walk through the `connect-validation-data` workflow.
  </Accordion>

  <Accordion title="Purchase order reference">
    If you need to match invoices against purchase orders, add a `purchaseOrderNumber` (text) field. Tell the agent: *"Add a purchase order number field."*
  </Accordion>

  <Accordion title="Recurring vendor matching">
    Connect the `vendorName` field to a data source of approved vendors so the system flags invoices from unknown suppliers. Ask: *"Validate vendor names against our approved vendor list."*
  </Accordion>

  <Accordion title="Batch files (multiple invoices in one PDF)">
    If your accounting software exports multiple invoices as a single PDF, enable document splitting. Tell the agent: *"We receive batches of invoices as a single file."* The agent will enable splitting and select the General Document Splitter automatically.
  </Accordion>
</AccordionGroup>
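
As a rough illustration of how the tax-inclusive and purchase-order variants change the field list, the sketch below applies them to the standard top-level slugs. The function `apply_variants` and its flags are hypothetical; in practice you describe the variant to the agent and it adjusts the schema for you.

```python
# Top-level slugs from the standard invoice schema.
STANDARD_SLUGS = [
    "invoiceNumber", "invoiceDate", "dueDate", "vendorName", "vendorAddress",
    "billTo", "subtotal", "taxAmount", "totalAmount", "currency", "lineItems",
]


def apply_variants(slugs, *, tax_inclusive=False, purchase_orders=False):
    """Return a new slug list with the selected variants applied (hypothetical helper)."""
    dropped = {"subtotal", "taxAmount"} if tax_inclusive else set()
    result = [slug for slug in slugs if slug not in dropped]
    if purchase_orders and "purchaseOrderNumber" not in result:
        result.append("purchaseOrderNumber")
    return result
```

Note that both optional validation rules reference `subtotal`, so they would presumably not apply once the tax-inclusive variant removes that field.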

***

## What to expect after setup

* The first few uploads will be extracted immediately, but confidence on vendor-specific fields may be moderate until model memory has seen confirmed examples.
* After approximately 10 confirmed documents, extraction accuracy improves noticeably for recurring vendors and layouts.
* If accuracy is still low after about 10 confirmed documents, see [Debug low-confidence results](/integrations/mcp-workflows/debug-low-confidence-results).
