In addition to the specific data extracted from the documents, the API response includes field- and document-level metadata to assist with document processing.

Field Level Metadata

Metadata	Description
id	Identifier associated with the specific data point
rectangle(s)	x/y coordinates for the rectangular bounding box containing the data
pageIndex	The page that the data is found on
raw	Raw data extracted before any processing and formatting
confidence	Overall confidence that the extracted data is correct. This combines the classification and text extraction confidence scores
classificationConfidence	The model's confidence that the data returned is correct
textExtractionConfidence	A value that indicates the confidence that the text extracted from the document is correct (relevant for scanned documents)
isVerified	Indicates whether the data has been validated, either by a human using our validation tool or through auto-validation rules
isClientVerified	Indicates whether the data has been validated by a human
isAutoVerified	Indicates whether the data has been auto-validated
dataPoint	A unique identifier associated with that data field
contentType	Type of data. Options include text, date, date-time, enum, location, float, and decimal.
parsed	Parsed data extracted after post-processing steps, including reformatting or mapping to a defined taxonomy
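
As a rough illustration of how the field-level metadata above might be used, the sketch below flags fields that probably need human review. It uses only the `confidence` and `isVerified` keys from the table. The function name `fields_needing_review`, the 0.8 threshold, the assumption that confidence is reported on a 0 to 1 scale, and the sample field dictionaries are all hypothetical and not taken from the API reference.

```python
from typing import Any


def fields_needing_review(
    fields: list[dict[str, Any]], threshold: float = 0.8
) -> list[dict[str, Any]]:
    """Return unverified fields whose confidence is missing or below threshold.

    Assumes each field is a dict carrying the metadata keys listed above and
    that `confidence` is on a 0 to 1 scale (an assumption, not confirmed here).
    """
    flagged = []
    for field in fields:
        # Fields already validated (by a human or by auto-validation) are skipped.
        if field.get("isVerified"):
            continue
        confidence = field.get("confidence")
        if confidence is None or confidence < threshold:
            flagged.append(field)
    return flagged


# Hypothetical sample data shaped like the field-level metadata.
sample_fields = [
    {"id": 1, "raw": "2021-03-04", "parsed": "2021-03-04", "confidence": 0.97, "isVerified": False},
    {"id": 2, "raw": "ACME Ltd", "parsed": "ACME Ltd", "confidence": 0.41, "isVerified": False},
    {"id": 3, "raw": "1O0.00", "parsed": 100.0, "confidence": 0.35, "isVerified": True},
]

review_ids = [field["id"] for field in fields_needing_review(sample_fields)]
print(review_ids)  # prints [2]
```

Field 3 has a low confidence score but is excluded because it has already been verified. A suitable threshold depends on the document type and on how costly an extraction error is.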

Document Level Metadata

Metadata	Description
identifier	A unique identifier associated with the document. Can be specified on upload; otherwise Affinda generates one at random
fileName	An optional filename of the file
ready	If true, the document has finished processing. Particularly useful if an endpoint request specified wait=False: when polling, use this variable to determine when to stop
readyDt	The date-time when the document was ready
failed	If true, some exception was raised during processing. Check the 'error' field of the main return object
expiryTime	The date/time in ISO-8601 format when the document will be automatically deleted. Defaults to no expiry
language	The document's language
pdf	The URL to the document's PDF (if the uploaded document is not already PDF, it's converted to PDF as part of the parsing process)
parentDocument.identifier	If this document is part of a split document, this attribute points to the original document that this document is split from
childDocuments.identifier	If this document has been split into a number of child documents, this attribute points to those child documents
pages	The number of pages in the document
isOcrd	Boolean indicating whether the document has had OCR applied to extract text (if false, the text was extracted from an existing text layer in the document)
ocrConfidence	Overall confidence in the accuracy of text extracted from the document by OCR
reviewUrl	A signed URL, valid for 60 minutes, that can be used to review and validate the data extracted by the model. Learn more in Embedded Mode.
collection	The Collection that the document is within
extractor	The Extractor that is associated with the Collection. An Extractor is an AI model used to extract data from documents
workspace	The Workspace that the Collection and document are within
archivedDt	The date-time when the document was archived
isArchived	Boolean to show if the document has been archived
confirmedDt	The date-time when the document was confirmed
isConfirmed	Boolean to show if the document has been confirmed
rejectedDt	The date-time when the document was rejected
isRejected	Boolean to show if the document has been rejected
createdDt	The date-time when the document was created in Affinda
errorCode	If document processing fails, the error code that was returned
errorDetail	If document processing fails, details of the error that was identified
file	URL to view the file
tags	Tags applied to documents to enable filtering and searching
confirmedBy	Details of the user that last confirmed the document
archivedBy	Details of the user that last archived the document
sourceEmail	If the document is created via email ingestion, this field stores the email file's URL.
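
The `ready`, `failed`, `errorCode` and `errorDetail` fields above suggest a simple polling loop for requests made with wait=False. The following is a minimal sketch, not the client library's own method. The helper `wait_until_ready`, its `interval` and `timeout` parameters, and the `fetch_meta` callable (standing in for whatever call retrieves the document-level metadata as a dictionary) are all hypothetical names introduced for illustration.

```python
import time
from typing import Any, Callable


def wait_until_ready(
    fetch_meta: Callable[[], dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 120.0,
) -> dict[str, Any]:
    """Poll document-level metadata until `ready` is true.

    `fetch_meta` is a hypothetical callable that returns the document-level
    metadata as a dict; how it talks to the API is outside this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        meta = fetch_meta()
        # Check `failed` first so that a processing error is never mistaken for success.
        if meta.get("failed"):
            raise RuntimeError(
                f"processing failed: {meta.get('errorCode')}: {meta.get('errorDetail')}"
            )
        if meta.get("ready"):
            return meta
        if time.monotonic() >= deadline:
            raise TimeoutError("document was not ready before the timeout")
        time.sleep(interval)


# Simulated responses standing in for successive metadata requests.
responses = iter([
    {"identifier": "abc", "ready": False, "failed": False},
    {"identifier": "abc", "ready": False, "failed": False},
    {"identifier": "abc", "ready": True, "failed": False, "pages": 3},
])

final = wait_until_ready(lambda: next(responses), interval=0.01)
print(final["pages"])  # prints 3
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP client, so it can be exercised with canned responses as shown here.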