Metadata

In addition to the specific data extracted from the documents, the API response includes field- and document-level metadata to assist with document processing.

Field-level metadata

string

Identifier associated with the specific data point

rectangle

object

x/y coordinates for the rectangular bounding box containing the data

pageIndex

number

Index of the page on which the data is found

raw

string

Raw data extracted before any processing and formatting

confidence

number

Overall confidence that the extracted data is correct

classificationConfidence

number

Confidence that the model classified the data correctly

textExtractionConfidence

number

Confidence that text was correctly extracted from the document

isVerified

boolean

Indicates whether the data has been validated by any means

isClientVerified

boolean

Indicates whether the data has been validated by a human

isAutoVerified

boolean

Indicates whether the data was auto-validated

dataPoint

string

deprecated

Unique identifier associated with this data field

contentType

enum

Type of data (text, date, date-time, enum, location, float or decimal)

parsed

string

Parsed data after post-processing and mapping
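
As a rough illustration of how the confidence and verification flags above might be combined, the sketch below decides whether a field should be sent for human review. It assumes the field metadata has already been parsed into a dictionary keyed by the names listed above, and that confidence values are on a 0 to 1 scale. The `needs_review` helper, the `REVIEW_THRESHOLD` value, and the sample values are hypothetical and not part of the API.

```python
from typing import Any, Mapping

# Hypothetical cut-off; assumes confidence values are reported on a 0-1 scale.
REVIEW_THRESHOLD = 0.9


def needs_review(field: Mapping[str, Any], threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when a field should be queued for human review."""
    # Anything already validated (by a human or automatically) is skipped.
    if field.get("isVerified"):
        return False
    confidence = field.get("confidence")
    # A missing confidence is treated the same as a low one.
    return confidence is None or confidence < threshold


# Illustrative field metadata; the values are made up for this example.
sample_field = {
    "raw": "12/03/2024",
    "parsed": "2024-03-12",
    "contentType": "date",
    "pageIndex": 0,
    "confidence": 0.72,
    "isVerified": False,
}

print(needs_review(sample_field))  # True
```

A suitable threshold depends on the document type and on how costly an extraction error is, so it is worth tuning against real data rather than reusing the value shown here.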

Document-level metadata

identifier

string

Unique identifier for the document (can be supplied on upload)

fileName

string

Optional file name of the document

ready

boolean

True when the document has finished processing

readyDt

date-time

Date-time when the document became ready

failed

boolean

True if an exception occurred during processing

expiryTime

date-time

ISO-8601 date-time when the document will be auto-deleted

language

string

The document’s language

pdf

string

URL to the PDF version of the document

parentDocument.identifier

string

Identifier of the original document if this one was split

childDocuments.identifier

string

Identifiers of child documents if this one was split further

pages

number

Total number of pages

isOcrd

boolean

Whether OCR was applied to extract text

ocrConfidence

number

Overall confidence in OCR text extraction

reviewUrl

string

Signed URL, valid for 60 minutes, for human review of the document

extractor

string

deprecated

Extractor (AI model) associated with the collection

collection

string

deprecated

Collection that the document belongs to

workspace

string

Workspace containing the collection and document

archivedDt

date-time

When the document was archived

isArchived

boolean

Whether the document is archived

confirmedDt

date-time

When the document was confirmed

isConfirmed

boolean

Whether the document is confirmed

rejectedDt

date-time

When the document was rejected

isRejected

boolean

Whether the document is rejected

createdDt

date-time

When the document was created in Affinda

errorCode

string

Error code if processing fails. See Error Glossary for more details.

errorDetail

string

Error detail if processing fails. See Error Glossary for more details.

file

string

URL to view the original file
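
The processing flags and timestamps above can be read together to work out what state a document is in. The following is a minimal sketch, assuming the document-level metadata is available as a dictionary keyed by the names listed above and that `expiryTime` is an ISO-8601 string. The helper names `document_status` and `seconds_until_expiry` are hypothetical, as is the choice to treat a timestamp without an offset as UTC.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def document_status(meta: Mapping[str, Any]) -> str:
    """Summarise processing state from the failed/ready flags."""
    if meta.get("failed"):
        # errorCode is only expected to be populated when processing fails.
        code = meta.get("errorCode") or "unknown"
        return f"failed ({code})"
    if not meta.get("ready"):
        return "processing"
    return "ready"


def seconds_until_expiry(
    meta: Mapping[str, Any], now: Optional[datetime] = None
) -> Optional[float]:
    """Return seconds until auto-deletion, or None if no expiryTime is set."""
    expiry = meta.get("expiryTime")
    if not expiry:
        return None
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards.
    expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()
```

For example, a caller polling for results might wait while `document_status` returns "processing", and inspect `errorCode` and `errorDetail` once it reports a failure.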
