Uploading Documents via the API
While a range of API endpoints drive the Affinda solution, uploading documents for processing (and receiving the response) is core to the product.
Request Body
The following parameters may be included in the API POST request to Affinda using the Upload a document for parsing endpoint.
Note: every individual parameter is optional, but exactly one option from each of the following pairs must be specified:
- File or URL
- Workspace or Collection
Parameter | Description |
---|---|
file | File as binary data blob. Supported formats: PDF, ZIP, DOC, DOCX, XLS, XLSX, ODT, RTF, TXT, HTML, PNG, JPG, JPEG, TIFF |
url | URL to a document to download and process |
collection | Unique identifier for the Collection to upload the document to. The Collection identifier can be found either by using the Get list of all collections endpoint or through the app. If Collection is specified, the document will not be reclassified to a different Collection if it is the wrong type (however, it may be rejected depending on settings). |
workspace | Unique identifier for the Workspace to upload the document to. The Workspace identifier can be found either by using the Get list of all workspaces endpoint or through the app. If Workspace is specified, Affinda will attempt to classify the document into a relevant Collection based on document type. |
wait | If "true" (default), will return a response only after processing has been completed. If "false", will return an empty data object which can be polled at the GET endpoint until processing is complete. |
customIdentifier | Specify a custom identifier for the document. |
fileName | Optional filename for the uploaded file. |
expiryTime | The date/time in ISO-8601 format when the document will be automatically deleted. Defaults to no expiry. |
language | Language code in ISO 639-1 format. Must specify zh-cn or zh-tw for Chinese. |
rejectDuplicates | If "true", parsing will fail when the uploaded document is a duplicate of an existing document, and no credits will be consumed. If "false", the document will be parsed normally whether it is a duplicate or not. If not provided, falls back to the workspace settings. |
regionBias | A JSON representation of the RegionBias object. Influences geocoding and other results. |
lowPriority | Explicitly mark this document as low priority. |
compact | If true, the returned parse result (assuming wait is also true) will be a compact version of the full result. |
deleteAfterParse | If true, no data will be stored after parsing. Only compatible with requests where wait is true. |
enableValidationTool | If true, the document will be viewable in the Affinda Validation Tool. Set to false to optimise parsing speed. |
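The pairing rules and parameters above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `build_upload_fields` and `send_upload` are hypothetical, the endpoint URL is supplied by the caller, and the Bearer-token header and the encoding of booleans as lowercase strings are assumptions that should be checked against the API reference.

```python
from typing import Any, Dict, Optional


def build_upload_fields(
    *,
    file_path: Optional[str] = None,
    url: Optional[str] = None,
    workspace: Optional[str] = None,
    collection: Optional[str] = None,
    **options: Any,
) -> Dict[str, str]:
    """Check the file/url and workspace/collection rules, then return
    the non-file form fields (hypothetical helper)."""
    # Exactly one of file or url must be provided.
    if (file_path is None) == (url is None):
        raise ValueError("Specify exactly one of file or url")
    # Exactly one of workspace or collection must be provided.
    if (workspace is None) == (collection is None):
        raise ValueError("Specify exactly one of workspace or collection")

    fields: Dict[str, str] = {}
    if url is not None:
        fields["url"] = url
    if workspace is not None:
        fields["workspace"] = workspace
    if collection is not None:
        fields["collection"] = collection
    for name, value in options.items():
        # Assumption: booleans are sent as "true"/"false" strings.
        fields[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return fields


def send_upload(endpoint: str, api_key: str, fields: Dict[str, str],
                file_path: Optional[str] = None):
    """POST the fields (and the file, if any) to the caller-supplied endpoint.
    Requires the third-party `requests` package."""
    import requests  # imported lazily so the helper above has no dependencies

    # Assumption: the API key is passed as a Bearer token.
    headers = {"Authorization": f"Bearer {api_key}"}
    if file_path is None:
        return requests.post(endpoint, headers=headers, data=fields)
    with open(file_path, "rb") as fh:
        # The document itself goes in the multipart "file" part.
        return requests.post(endpoint, headers=headers, data=fields,
                             files={"file": fh})
```

For example, `build_upload_fields(url="https://example.com/cv.pdf", workspace="abc", wait=False)` yields the `url`, `workspace`, and `wait` fields, while passing both `file_path` and `url` raises a `ValueError` before any request is made.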
Adjust parameters to reduce parsing time
To optimise response times, include these parameters when submitting a document:
- enableValidationTool: False
- deleteAfterParse: True
- compact: True
Note: with the above parameters, the document cannot be viewed in the Affinda interface, and its data cannot be retrieved at a later date.
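As a rough sketch, the speed-oriented settings can be merged into a request's form fields like this. The helper name `with_fast_options` is hypothetical, and sending booleans as lowercase strings is an assumption; the check on `wait` reflects the stated rule that deleteAfterParse is only compatible with wait being true.

```python
from typing import Dict

# Speed-oriented settings from the list above. "wait" is included because
# deleteAfterParse is only compatible with wait being true.
FAST_PARSE_OPTIONS: Dict[str, str] = {
    "enableValidationTool": "false",
    "deleteAfterParse": "true",
    "compact": "true",
    "wait": "true",
}


def with_fast_options(fields: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `fields` with the fast-parsing options applied."""
    # Refuse to silently override an explicit asynchronous request.
    if fields.get("wait", "true").lower() != "true":
        raise ValueError("deleteAfterParse requires wait to be true")
    return {**fields, **FAST_PARSE_OPTIONS}
```

Because nothing is stored after parsing, the caller has to keep the response returned by the upload request; it is the only copy of the parsed data.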