In addition to the specific data extracted from the documents, the API response includes field- and document-level metadata to assist with document processing.

Field-level metadata

id
string
Identifier associated with the specific data point
rectangle
object
x/y coordinates for the rectangular bounding box containing the data
pageIndex
number
Index of the page on which the data is found
raw
string
Raw data extracted before any processing and formatting
confidence
number
Overall confidence that the extracted data is correct
classificationConfidence
number
Confidence that the model classified the data correctly
textExtractionConfidence
number
Confidence that text was correctly extracted from the document
isVerified
boolean
Indicates whether the data has been validated by any means
isClientVerified
boolean
Indicates whether the data has been validated by a human
isAutoVerified
boolean
Indicates whether the data was auto-validated
dataPoint
string
deprecated
Unique identifier associated with this data field
contentType
enum
Type of data (text, date, date-time, enum, location, float or decimal)
parsed
string
Parsed data after post-processing and mapping
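
The confidence and verification fields above can be combined to decide which data points need a human check. The following is a minimal sketch, not part of the API: it assumes each field has already been decoded into a Python dict keyed by the names listed above, and the `needs_review` helper, the 0.8 threshold, and the sample values are all hypothetical.

```python
from typing import Any

# Hypothetical cut-off; choose a value that suits your own documents.
REVIEW_THRESHOLD = 0.8


def needs_review(field: dict[str, Any], threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True if a field should be routed to a human reviewer."""
    # Anything already validated is skipped.
    if field.get("isVerified"):
        return False
    confidence = field.get("confidence")
    # A missing confidence value is treated as low confidence.
    return confidence is None or confidence < threshold


# Illustrative field objects, not real API output.
fields = [
    {"id": "a1", "raw": "12/03/2024", "parsed": "2024-03-12",
     "confidence": 0.97, "isVerified": False},
    {"id": "b2", "raw": "ACME Pty", "parsed": "ACME Pty",
     "confidence": 0.55, "isVerified": False},
    {"id": "c3", "raw": "1O0.00", "parsed": "100.00",
     "confidence": 0.40, "isVerified": True},
]

to_review = [f["id"] for f in fields if needs_review(f)]
print(to_review)  # ['b2']
```

Only `b2` is flagged: `a1` is above the threshold and `c3` is already verified despite its low confidence.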

Document-level metadata

identifier
string
Unique identifier for the document (can be supplied on upload)
fileName
string
Optional file name of the document
ready
boolean
True when the document has finished processing
readyDt
date-time
Date-time when the document became ready
failed
boolean
True if an exception occurred during processing
expiryTime
date-time
ISO-8601 date-time when the document will be auto-deleted
language
string
The document’s language
pdf
string
URL to the PDF version of the document
parentDocument.identifier
string
Identifier of the original document, if this document was created by splitting it
childDocuments.identifier
string
Identifiers of child documents if this one was split further
pages
number
Total number of pages
isOcrd
boolean
Whether OCR was applied to extract text
ocrConfidence
number
Overall confidence in OCR text extraction
reviewUrl
string
Signed URL for human review, valid for 60 minutes
extractor
string
deprecated
Extractor (AI model) associated with the collection
collection
string
deprecated
Collection that the document belongs to
workspace
string
Workspace containing the collection and document
archivedDt
date-time
When the document was archived
isArchived
boolean
Whether the document is archived
confirmedDt
date-time
When the document was confirmed
isConfirmed
boolean
Whether the document is confirmed
rejectedDt
date-time
When the document was rejected
isRejected
boolean
Whether the document is rejected
createdDt
date-time
When the document was created in Affinda
errorCode
string
Error code if processing fails. See Error Glossary for more details.
errorDetail
string
Error detail if processing fails. See Error Glossary for more details.
file
string
URL to view the original file
tags
string
Tags applied to the document
confirmedBy
string
User who last confirmed the document
archivedBy
string
User who last archived the document
sourceEmail
string
Email file URL if the document was created via email ingestion
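
The status flags and timestamps above are typically read together when polling for a result. The sketch below is one possible way to do that, under stated assumptions: the metadata is taken to be a flat Python dict keyed by the names listed above (the actual response may nest it differently), `document_status` and `seconds_until_expiry` are hypothetical helpers, and the sample values, including the error code, are illustrative only.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def document_status(meta: dict[str, Any]) -> str:
    """Summarise the processing state from document-level metadata."""
    if meta.get("failed"):
        return f"failed: {meta.get('errorCode')} ({meta.get('errorDetail')})"
    if not meta.get("ready"):
        return "processing"
    if meta.get("isRejected"):
        return "rejected"
    if meta.get("isConfirmed"):
        return "confirmed"
    return "ready"


def seconds_until_expiry(meta: dict[str, Any],
                         now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds until auto-deletion, or None if no expiry is set."""
    expiry = meta.get("expiryTime")
    if not expiry:
        return None
    # Older Python versions reject a trailing "Z", so normalise it first.
    expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
    if expires_at.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()


# Illustrative metadata, not real API output.
meta = {"identifier": "abc123", "ready": True, "failed": False,
        "isConfirmed": False, "isRejected": False,
        "expiryTime": "2030-01-01T00:01:00Z"}

print(document_status(meta))  # ready
print(seconds_until_expiry(meta, now=datetime(2030, 1, 1, tzinfo=timezone.utc)))  # 60.0
```

Checking `failed` before `ready` is a design choice in this sketch, so that a processing error is reported regardless of how the `ready` flag is set.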