Debug low-confidence extraction results with your AI agent

When extraction accuracy is lower than expected, this workflow walks your AI agent through a structured diagnosis, checking the most impactful configuration settings before suggesting schema changes. Trigger phrases that start this workflow:

“Why are my extractions wrong?”
“Confidence scores are low”
“Field X keeps coming out wrong”
“Can we improve accuracy?”

Before starting, your AI client must be connected to the Affinda MCP server. See MCP Connector and Plugin for setup instructions.

Steps

Identify the document type

If you have not named the document type causing the issue, the agent lists your document types and asks you to confirm which one to investigate.

Read the document type configuration

The agent reads the full document type configuration, including the field list, field types, `transformation_prompt` values, and attached validation rules.

Check workspace processing settings

The agent reads the workspace settings. Three settings explain the majority of low-confidence cases:

| Setting | Problem scenario | Fix |
| --- | --- | --- |
| `ocr_mode: skip` | Document is an image or scanned PDF | Change to `always-partial` or `always-full` |
| `ocr_mode: always-partial` | Document is a scanned image, not a digital PDF | Change to `always-full` |
| `enable_document_classification: false` | Multiple document types in one workspace; wrong schema applied to uploads | Enable classification |
| `model_memory_strategy: manual` | No confirmed examples marked | Switch to `auto` |

Setting `ocr_mode` to `skip` on a workspace that receives any scanned or photographed documents will produce empty extractions, not just low-confidence ones. This is the most common cause of complete extraction failure.
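The checks in the table above can be sketched as a small function. This is only an illustration: the setting names and values come from the table, but the dictionary shape and the three boolean inputs (`has_scans`, `multiple_doc_types`, `has_confirmed_examples`) are assumptions, not part of the Affinda API.

```python
def check_workspace_settings(settings, has_scans, multiple_doc_types,
                             has_confirmed_examples):
    """Return (setting, suggested fix) pairs for the scenarios in the table.

    `settings` is a hypothetical dict of workspace settings; the real
    response shape from the MCP server may differ.
    """
    findings = []

    ocr_mode = settings.get("ocr_mode")
    if has_scans and ocr_mode == "skip":
        findings.append(("ocr_mode", "Change to always-partial or always-full"))
    elif has_scans and ocr_mode == "always-partial":
        findings.append(("ocr_mode", "Change to always-full"))

    if multiple_doc_types and settings.get("enable_document_classification") is False:
        findings.append(("enable_document_classification", "Enable classification"))

    if not has_confirmed_examples and settings.get("model_memory_strategy") == "manual":
        findings.append(("model_memory_strategy", "Switch to auto"))

    return findings


if __name__ == "__main__":
    example = {
        "ocr_mode": "skip",
        "enable_document_classification": False,
        "model_memory_strategy": "manual",
    }
    for setting, fix in check_workspace_settings(example, True, True, False):
        print(f"{setting}: {fix}")
```

The OCR check comes first because it is the only one of these settings that can cause empty extractions rather than low-confidence ones.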

Sample recent extractions

The agent pulls 3–5 recent confirmed documents and reads their per-field confidence scores. The pattern tells you where to focus:

| Pattern | Diagnosis |
| --- | --- |
| One field consistently low across documents | That field’s `transformation_prompt` needs refinement, or the field type is wrong (e.g. `text` where `date` is expected) |
| All fields low on specific documents | Those documents likely have an OCR-mode mismatch; check whether they are scans or digital files |
| All fields low across all documents | Upstream issue: check OCR mode first, then classification, then field schema |
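A minimal sketch of how the three patterns above could be told apart. The 0.7 cut-off for "low" and the input shape (one dict of field name to confidence per sampled document) are assumptions made for illustration; the source does not define a threshold.

```python
LOW_CONFIDENCE = 0.7  # assumed threshold, not an Affinda default


def diagnose(samples):
    """Classify per-field confidence scores into one of the table's patterns.

    `samples` is a list of dicts mapping field name -> confidence (0.0 to 1.0).
    Returns a (pattern, detail) tuple.
    """
    if not samples:
        return ("no_samples", [])

    # For each document, are all of its fields below the threshold?
    doc_all_low = [all(c < LOW_CONFIDENCE for c in doc.values()) for doc in samples]

    if all(doc_all_low):
        return ("all_fields_low_everywhere", [])
    if any(doc_all_low):
        # Report the indexes of the affected documents.
        return ("all_fields_low_on_some_docs",
                [i for i, low in enumerate(doc_all_low) if low])

    fields = set().union(*(doc.keys() for doc in samples))
    consistently_low = sorted(
        f for f in fields
        if all(doc.get(f, 1.0) < LOW_CONFIDENCE for doc in samples)
    )
    if consistently_low:
        return ("one_field_consistently_low", consistently_low)
    return ("no_clear_pattern", [])
```

For example, `diagnose([{"vendorAddress": 0.4, "total": 0.95}, {"vendorAddress": 0.5, "total": 0.9}])` points at `vendorAddress` as the field to refine.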

Check manual annotation patterns

The agent calls `list_recent_field_annotations` to see which fields users have been correcting most frequently. Repeated manual corrections on the same field are a strong signal that the field’s prompt or type is wrong.
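As a rough illustration of this step, the corrections can be tallied per field. The record shape used here (a `field` name and a `was_corrected` flag) is hypothetical; the actual output of `list_recent_field_annotations` may look different.

```python
from collections import Counter


def most_corrected_fields(annotations, top_n=3):
    """Count manual corrections per field and return the most frequent ones.

    `annotations` is a list of dicts with assumed keys "field" and
    "was_corrected"; records without a truthy "was_corrected" are ignored.
    """
    counts = Counter(
        a["field"] for a in annotations if a.get("was_corrected")
    )
    return counts.most_common(top_n)
```

A field that dominates this tally is the first candidate for a prompt or type change.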

Review recommended changes

Based on the evidence gathered, the agent presents at most three recommendations in order of expected impact:

OCR mode change: largest single lever; takes effect on the next upload.
Field-level fix: refine the `transformation_prompt` on the problematic field, or change its type if the current type is wrong.
Validation rule: add a rule to flag suspect extractions for human review rather than relying on raw confidence.

The agent shows you the supporting evidence (for example: “4 of 5 sampled documents had low confidence on `vendorAddress`”) before proposing changes.
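The ordering rule can be sketched as a sort followed by a cap of three. The `kind` labels and the `evidence` key below are hypothetical names chosen for this example; only the order of impact (OCR mode, then field-level fix, then validation rule) comes from the list above.

```python
# Order of expected impact as described above; the labels are illustrative.
IMPACT_ORDER = ["ocr_mode", "field_fix", "validation_rule"]


def rank_recommendations(candidates, limit=3):
    """Sort candidate recommendations by expected impact and keep at most `limit`.

    Each candidate is a dict with an assumed "kind" key (one of IMPACT_ORDER)
    and an "evidence" string to show the user. An unknown kind raises ValueError.
    """
    ranked = sorted(candidates, key=lambda c: IMPACT_ORDER.index(c["kind"]))
    return ranked[:limit]
```

Because Python's sort is stable, candidates of the same kind keep the order in which they were gathered.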

Approve changes

The agent will not apply any changes (`update_workspace`, `update_field`, or `create_validation_rule`) until you explicitly approve each one. Review the recommendations and confirm which you want applied.
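The approval gate amounts to applying only the changes you have confirmed. A minimal sketch, assuming each recommendation carries a hypothetical `id` and that `apply_fn` stands in for whichever tool call (`update_workspace`, `update_field`, or `create_validation_rule`) the change needs:

```python
def apply_approved(recommendations, approved_ids, apply_fn):
    """Apply only explicitly approved recommendations, in their given order.

    `recommendations` is a list of dicts with an assumed "id" key.
    `approved_ids` is the set of ids the user confirmed.
    `apply_fn` is a callable that performs the change (hypothetical stand-in
    for the real tool call). Returns the ids that were applied.
    """
    applied = []
    for rec in recommendations:
        if rec["id"] in approved_ids:
            apply_fn(rec)
            applied.append(rec["id"])
    return applied
```

With an empty `approved_ids`, nothing is applied, which mirrors the default behaviour described above.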

What this workflow does not fix

Genuinely ambiguous or unreadable documents

Some documents are legitimately difficult — low-quality scans, handwritten notes, inconsistent layouts. Confidence will be low on these regardless of configuration. The workflow flags this case rather than suggesting spurious fixes.

Sparse training signal on a new document type

A brand-new document type with fewer than ten confirmed documents will improve substantially over the next ten confirmations regardless of configuration tuning. If you have only just created the document type, the best action is to confirm a batch of documents and let model memory do its work.

OCR provider issues

If `always-full` OCR is producing unusable or garbled text, the problem is upstream of the schema. The agent will surface this and suggest contacting support@affinda.com rather than continuing to chase field-level fixes.

Related

Human review queue: if users are manually correcting many fields, run the review queue workflow to catch patterns early.
Configuration guide: Confidence (how confidence scores are calculated and how to interpret them).
Configuration guide: OCR (OCR mode options explained).
