Upload a document for parsing

curl --request POST \
  --url https://{region}.affinda.com/v3/documents \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form url=https://api.affinda.com/static/sample_resumes/example.docx \
  --form collection=mEFayXdO \
  --form 'documentType=<string>' \
  --form workspace=mEFayXdO \
  --form wait=true \
  --form 'identifier=<string>' \
  --form 'customIdentifier=<string>' \
  --form fileName=Document.pdf \
  --form expiryTime=2023-11-07T05:31:56Z \
  --form language=en \
  --form rejectDuplicates=true \
  --form 'regionBias={"country": "vn"}' \
  --form lowPriority=true \
  --form compact=true \
  --form deleteAfterParse=true \
  --form enableValidationTool=true \
  --form useOcr=true \
  --form 'llmHint=<string>' \
  --form 'limitToExamples=<string>' \
  --form 'warningMessages={
  "warningCode": "too_many_pages",
  "warningDetail": "File exceeds maximum number of pages allowed, parsing the first 10 pages only."
}'

{
  "extractor": "<string>",
  "meta": {
    "identifier": "<string>",
    "pages": [
      {
        "id": 2,
        "pageIndex": 0,
        "image": "https://affinda-api.s3.amazonaws.com/media/pages/Page.png?AWSAccessKeyId=KEY&Signature=SIG&Expires=1663302062",
        "height": 700,
        "width": 500,
        "rotation": 90,
        "imageTranslated": "https://affinda-api.s3.amazonaws.com/media/pages/PageTranslated.png?AWSAccessKeyId=KEY&Signature=SIG&Expires=1663302062"
      }
    ],
    "workspace": {
      "identifier": "mEFayXdO",
      "name": "<string>"
    },
    "customIdentifier": "46ab8b02-0e5b-420c-877c-8b678d46a834",
    "fileName": "Document.pdf",
    "ready": true,
    "readyDt": "2020-12-10T01:43:32.276724Z",
    "failed": false,
    "expiryTime": "2023-11-07T05:31:56Z",
    "language": "en",
    "pdf": "https://affinda-api.s3.amazonaws.com/media/documents/Document.pdf?AWSAccessKeyId=KEY&Signature=SIG&Expires=1663302062",
    "parentDocument": {
      "identifier": "<string>",
      "customIdentifier": "46ab8b02-0e5b-420c-877c-8b678d46a834"
    },
    "childDocuments": [
      {
        "identifier": "<string>",
        "customIdentifier": "46ab8b02-0e5b-420c-877c-8b678d46a834"
      }
    ],
    "isOcrd": true,
    "ocrConfidence": 123,
    "reviewUrl": "<string>",
    "documentType": "<string>",
    "collection": {
      "identifier": "mEFayXdO",
      "name": "<string>",
      "extractor": {
        "identifier": "resume",
        "name": "<string>",
        "baseExtractor": "<string>",
        "validatable": true
      },
      "validationRules": [
        {
          "slug": "supplier_name_is_alphanumeric",
          "dataPoints": [
            "<string>"
          ]
        }
      ],
      "autoRefreshValidationResults": true
    },
    "archivedDt": "2023-11-07T05:31:56Z",
    "isArchived": true,
    "skipParse": true,
    "confirmedDt": "2023-11-07T05:31:56Z",
    "confirmedBy": {
      "id": 1,
      "name": "Carl Johnson",
      "username": "carljohnson",
      "email": "[email protected]",
      "avatar": "https://affinda-api.s3.amazonaws.com/media/user-avatar.png?AWSAccessKeyId=KEY&Signature=SIG"
    },
    "isConfirmed": true,
    "rejectedDt": "2023-11-07T05:31:56Z",
    "rejectedBy": {
      "id": 1,
      "name": "Carl Johnson",
      "username": "carljohnson",
      "email": "[email protected]",
      "avatar": "https://affinda-api.s3.amazonaws.com/media/user-avatar.png?AWSAccessKeyId=KEY&Signature=SIG"
    },
    "archivedBy": {
      "id": 1,
      "name": "Carl Johnson",
      "username": "carljohnson",
      "email": "[email protected]",
      "avatar": "https://affinda-api.s3.amazonaws.com/media/user-avatar.png?AWSAccessKeyId=KEY&Signature=SIG"
    },
    "isRejected": true,
    "createdDt": "2023-11-07T05:31:56Z",
    "errorCode": "document_conversion_failed",
    "errorDetail": "Unable to convert word document",
    "file": "<string>",
    "html": "<string>",
    "llmHint": "<string>",
    "tags": [
      {
        "id": 1,
        "name": "<string>",
        "workspace": "mEFayXdO",
        "documentCount": 1
      }
    ],
    "createdBy": {
      "id": 1,
      "name": "Carl Johnson",
      "username": "carljohnson",
      "email": "[email protected]",
      "avatar": "https://affinda-api.s3.amazonaws.com/media/user-avatar.png?AWSAccessKeyId=KEY&Signature=SIG"
    },
    "sourceEmail": "<string>",
    "sourceEmailAddress": "<string>",
    "regionBias": {
      "country": "<string>",
      "countries": [
        "<string>"
      ],
      "squareCoordinates": [
        123
      ],
      "strict": true
    }
  },
  "data": {},
  "error": {
    "errorCode": "document_conversion_failed",
    "errorDetail": "Unable to convert word document"
  },
  "warnings": [
    {
      "warningCode": "too_many_pages",
      "warningDetail": "File exceeds maximum number of pages allowed, parsing the first 10 pages only."
    }
  ]
}
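
The sample response above carries its status in `meta.ready`, `meta.failed`, `error` and `warnings`. As a rough sketch of how a client might condense those fields, the helper below (a hypothetical `summarize_document`, not part of any Affinda SDK) reads them from an already-decoded response. The trimmed sample payload and its `exampleId` identifier are invented for illustration, and treating a missing or null `error` as "no error" is an assumption.

```python
import json


def summarize_document(response: dict) -> dict:
    """Collect the status-related fields from a decoded upload response."""
    meta = response.get("meta") or {}
    error = response.get("error") or {}
    warnings = response.get("warnings") or []
    return {
        "identifier": meta.get("identifier"),
        "ready": bool(meta.get("ready")),
        "failed": bool(meta.get("failed")),
        # Assumption: a null or absent errorCode means no error occurred.
        "error_code": error.get("errorCode"),
        "warning_codes": [w.get("warningCode") for w in warnings],
    }


# Trimmed, made-up payload that follows the shape of the sample response.
SAMPLE = json.loads("""
{
  "extractor": "resume",
  "meta": {"identifier": "exampleId", "ready": true, "failed": false},
  "data": {},
  "error": {"errorCode": null, "errorDetail": null},
  "warnings": [
    {"warningCode": "too_many_pages", "warningDetail": "example detail"}
  ]
}
""")

summary = summarize_document(SAMPLE)
print(summary["ready"], summary["warning_codes"])
```

Keeping the summary separate from the HTTP call makes it easy to reuse for responses fetched later from the GET endpoint.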

POST /v3/documents

Authorizations

Authorization

string

header

required

Bearer authentication using an API key, e.g. {Authorization: Bearer aff_0bb4fbdf97b7e4111ff6c0015471094155f91}. You can find your API key on the Settings page of the Affinda web app. If you do not have an API key yet, you can obtain one by signing up for a free trial.
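
A minimal sketch of building the header described above. The function name `auth_headers` is hypothetical, and stripping surrounding whitespace from the key is an assumption made for convenience, not documented behaviour.

```python
def auth_headers(api_key: str) -> dict:
    """Return the Authorization header for a given API key."""
    if not api_key or not api_key.strip():
        raise ValueError("API key must be a non-empty string")
    # The key is sent as a Bearer token, as in the curl sample.
    return {"Authorization": f"Bearer {api_key.strip()}"}


headers = auth_headers("aff_example_key")  # placeholder key, not a real one
print(headers)
```

In practice the key would usually come from an environment variable or a secrets store rather than a literal in the source.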

Query Parameters

snake_case

boolean

Whether to return the response in snake_case instead of camelCase. Default is false.
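
To illustrate, the sketch below appends the `snake_case` query parameter to the endpoint URL. The helper name `documents_url` and the region value `api` are assumptions for the example, as is serializing the boolean as the lowercase string `true`.

```python
from urllib.parse import urlencode


def documents_url(region: str, snake_case: bool = False) -> str:
    """Build the upload URL, optionally requesting snake_case keys."""
    base = f"https://{region}.affinda.com/v3/documents"
    if not snake_case:
        # The default is camelCase, so the parameter can be left out.
        return base
    return f"{base}?{urlencode({'snake_case': 'true'})}"


print(documents_url("api"))
print(documents_url("api", snake_case=True))
```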

Body

multipart/form-data

Document to upload, either via file upload or URL to a file
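
As a sketch of the "file upload or URL" choice, the helper below picks which form field to send. The name `source_field` is hypothetical, and rejecting a request that supplies both sources is an assumption of this example; the text above does not say how the API treats that case.

```python
from typing import Optional, Tuple


def source_field(file_path: Optional[str] = None,
                 url: Optional[str] = None) -> Tuple[str, str]:
    """Return the (form field name, value) pair for the document source."""
    if file_path and url:
        # Assumption: treat the two sources as mutually exclusive.
        raise ValueError("provide either file_path or url, not both")
    if file_path:
        return ("file", file_path)
    if url:
        return ("url", url)
    raise ValueError("a document source is required: file_path or url")


print(source_field(url="https://api.affinda.com/static/sample_resumes/example.docx"))
```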

file

File as binary data blob. Supported formats: PDF, DOC, DOCX, TXT, RTF, HTML, PNG, JPG, TIFF, ODT, XLS, XLSX

url

string | null

URL to download the document.

Example:

"https://api.affinda.com/static/sample_resumes/example.docx"

collection

string

The unique identifier of a collection.

Example:

"mEFayXdO"

documentType

string | null

The document type's identifier. Provide if you already know the document type.

workspace

string

The unique identifier of a workspace.

Example:

"mEFayXdO"

wait

boolean

default: true

If true (default), the response is returned only after processing has completed. If false, the response is returned immediately with an empty data object, and the document can be polled at the GET endpoint until processing is complete.

Example:

true
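
When `wait` is false, the client has to poll. The sketch below shows one way to structure that loop. The function `poll_until_done` is hypothetical, the actual GET request is left to a caller-supplied `fetch` callable because the GET endpoint is not described in this section, and using `meta.ready` or `meta.failed` as the stop condition is an assumption based on the fields in the sample response.

```python
import time
from typing import Callable


def poll_until_done(fetch: Callable[[], dict],
                    interval: float = 2.0,
                    max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the document reports ready or failed."""
    for attempt in range(max_attempts):
        document = fetch()
        meta = document.get("meta") or {}
        if meta.get("ready") or meta.get("failed"):
            return document
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"document not ready after {max_attempts} attempts")


# Stand-in for real GET calls: the first reply is still processing.
_fake_replies = iter([
    {"meta": {"ready": False, "failed": False}, "data": {}},
    {"meta": {"ready": True, "failed": False}, "data": {"parsed": True}},
])
result = poll_until_done(lambda: next(_fake_replies), interval=0)
print(result["meta"]["ready"])
```

Returning on `failed` as well as `ready` keeps the loop from spinning until the timeout when parsing has already given up.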

identifier

string

deprecated

Deprecated in favor of customIdentifier.

customIdentifier

string

Optional custom identifier for the document. It does not need to be unique.

fileName

string | null

Optional name for the file.

Example:

"Document.pdf"

expiryTime

string<date-time> | null

The date/time in ISO-8601 format when the document will be automatically deleted. Defaults to no expiry.
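
A small sketch of producing an `expiryTime` value in the same shape as the request sample (`2023-11-07T05:31:56Z`). The helper name `expiry_in` is hypothetical, and sending the timestamp in UTC with a trailing `Z` is an assumption taken from that sample.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_in(days: int, now: Optional[datetime] = None) -> str:
    """Return an ISO-8601 UTC timestamp `days` days after `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC so the trailing "Z" is accurate for aware datetimes.
    expiry = (now + timedelta(days=days)).astimezone(timezone.utc)
    return expiry.strftime("%Y-%m-%dT%H:%M:%SZ")


print(expiry_in(7))
```

Pass a timezone-aware `now` when calling this with a fixed time; a naive datetime would be interpreted as local time by `astimezone`.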

language

string | null

Language code in ISO 639-1 format. Must specify zh-cn or zh-tw for Chinese.

Example:

"en"

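The Chinese rule for `language` can be checked before sending the request. The sketch below uses a hypothetical helper `language_field`; lowercasing the code and accepting an underscore in place of the hyphen are conveniences assumed for this example, not documented API behaviour.

```python
def language_field(code: str) -> str:
    """Normalize a language code and reject a bare 'zh'."""
    normalized = code.strip().lower().replace("_", "-")
    if normalized == "zh":
        # Chinese must be sent as zh-cn or zh-tw.
        raise ValueError("Chinese must be given as 'zh-cn' or 'zh-tw'")
    return normalized


print(language_field("en"))
print(language_field("zh_TW"))
```
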
rejectDuplicates

boolean | null

If "true", parsing will fail when the uploaded document is a duplicate of an existing document, and no credits will be consumed. If "false", the document will be parsed normally whether it's a duplicate or not. If not provided, the workspace settings are used as a fallback.

Example:

true

regionBias

string

A JSON representation of the RegionBias object.

Example:

"{\"country\": \"vn\"}"

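Because `regionBias` is sent as a JSON string inside a form field, it has to be serialized before upload. The sketch below uses a hypothetical helper `region_bias_field`; the keys `country`, `countries` and `strict` are taken from the `regionBias` object in the sample response, and omitting unset keys is an assumption.

```python
import json
from typing import List, Optional


def region_bias_field(country: Optional[str] = None,
                      countries: Optional[List[str]] = None,
                      strict: Optional[bool] = None) -> str:
    """Serialize a RegionBias object to the string form field value."""
    bias = {}
    if country is not None:
        bias["country"] = country
    if countries is not None:
        bias["countries"] = countries
    if strict is not None:
        bias["strict"] = strict
    return json.dumps(bias)


print(region_bias_field(country="vn"))
```
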
lowPriority

boolean

Explicitly mark this document as low priority.

Example:

true

compact

boolean

If true, the returned parse result (assuming wait is also true) will be a compact version of the full result.

Example:

true

deleteAfterParse

boolean

If true, no data will be stored after parsing. Only compatible with requests where wait is true.

Example:

true
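
The dependency between `deleteAfterParse` and `wait` can be enforced client-side before the request is sent. This is a sketch only: the helper `parse_options` is hypothetical, and encoding booleans as the lowercase strings `true` and `false` follows the curl sample rather than a documented rule.

```python
def parse_options(wait: bool = True,
                  compact: bool = False,
                  delete_after_parse: bool = False) -> dict:
    """Validate related flags and return them as form field strings."""
    if delete_after_parse and not wait:
        raise ValueError("deleteAfterParse is only compatible with wait=true")

    def as_form(value: bool) -> str:
        return "true" if value else "false"

    return {
        "wait": as_form(wait),
        "compact": as_form(compact),
        "deleteAfterParse": as_form(delete_after_parse),
    }


print(parse_options(wait=True, delete_after_parse=True))
```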

enableValidationTool

boolean

If true, the document will be viewable in the Affinda Validation Tool. Set to false to optimize parsing speed.

Example:

true

useOcr

boolean | null

If true, the document will be treated like an image, and the text will be extracted using OCR. If false, the document will be treated like a PDF, and the text will be extracted using the parser. If not set, we will determine whether to use OCR based on whether words are found in the document.
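
`useOcr` has three states: true, false, and not set. One way to preserve the third state is to leave the field out of the form entirely when the value is unset, sketched below. The helper `optional_bool_fields` is hypothetical, and the assumption that omitting a field is equivalent to "not set" is not confirmed by this section.

```python
from typing import Dict, Optional


def optional_bool_fields(**flags: Optional[bool]) -> Dict[str, str]:
    """Encode nullable booleans, dropping the ones left as None."""
    return {
        name: ("true" if value else "false")
        for name, value in flags.items()
        if value is not None
    }


# useOcr is left unset, so only rejectDuplicates is sent.
print(optional_bool_fields(useOcr=None, rejectDuplicates=False))
```

The same pattern fits `rejectDuplicates`, which also falls back to another setting when it is not provided.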

llmHint

string | null

Optional hint inserted into the LLM prompt when processing this document.

limitToExamples

string[] | null

Restrict LLM example selection to the specified document identifiers.

warningMessages

object[]

Response

Returns the created document. Only returned when wait is true.

extractor

string

required
