Submission and Retrieval Options
When integrating document data extraction (parsing), there are three primary ways to submit documents and retrieve the parsed data.
Synchronous Parsing (wait=true)
Explanation:
- Users submit a document, and the API processes it immediately
- The user application waits and the request stays open (wait=true) until parsing is complete
- When finished, the API directly returns the parsed document data
Advantages:
- Simplest integration
- Immediate retrieval of results after completion
- Suitable for documents that need to be processed quickly
Limitations:
- Not suitable for large documents or high-volume scenarios
- Can lead to timeouts for documents requiring lengthy processing
Best suited for:
- Interactive apps where quick, synchronous response times are critical
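The synchronous flow described above can be sketched as a single blocking call. This is a minimal sketch, not the official client: only the `wait` flag comes from this page, while the endpoint URL, the bearer-token header, and the `url` field name are assumptions to be checked against the API reference. The `http` argument is any object with a `requests`-style `post` method.

```python
# Assumed values: the endpoint URL, bearer-token auth, and the "url" field
# name are illustrative; only `wait` is taken from the text above.
API_URL = "https://api.affinda.com/v3/documents"


def parse_sync(http, api_key, document_url, timeout_s=120):
    """Submit a document by URL and block until the parsed data comes back.

    `http` is any object with a requests-style `post` method, for example
    the `requests` module itself or a `requests.Session`.
    """
    response = http.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        data={"url": document_url, "wait": "true"},
        timeout=timeout_s,  # client-side guard against very long parses
    )
    response.raise_for_status()  # surface HTTP errors instead of bad data
    return response.json()
```

With the `requests` package installed, a call would look like `parse_sync(requests, "MY_API_KEY", "https://example.com/invoice.pdf")`. The explicit timeout reflects the limitation noted above: a lengthy parse holds the connection open, so the caller should decide how long it is willing to wait.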
Asynchronous Parsing with Polling (wait=false)
Explanation:
- Users submit a document and receive an immediate acknowledgment (document ID)
- The user application periodically checks (polls) the API endpoint using repeated GET requests to determine when parsing is complete
- Once processing finishes, the user application retrieves the parsed data
Advantages:
- Avoids connection timeouts; ideal for longer or variable processing times
- Better suited for handling multiple simultaneous document submissions
Limitations:
- Additional complexity with polling logic
- Generates higher API call volume (frequent polling checks)
- Slight delay between actual completion and data retrieval, depending on polling interval
Best suited for:
- High-volume scenarios, large documents, or batch processing jobs where exact completion timing isn't critical
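The polling logic described above can be sketched as a loop with a fixed interval and an overall deadline. This is a sketch under stated assumptions: `fetch_document` is a hypothetical callable that performs the GET request for one document ID and returns the decoded JSON, and the `meta.ready` flag used to detect completion is an assumed response field, not something confirmed on this page.

```python
import time


def poll_until_ready(fetch_document, document_id, interval_s=5.0,
                     max_wait_s=300.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the document reports ready, or raise TimeoutError.

    `fetch_document` is a hypothetical callable that performs the GET request
    for one document ID and returns the decoded JSON response as a dict.
    """
    deadline = clock() + max_wait_s
    while True:
        document = fetch_document(document_id)
        # Assumed readiness flag: {"meta": {"ready": true}} in the response.
        if document.get("meta", {}).get("ready"):
            return document
        # Stop if the next check would land after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(
                f"document {document_id} not ready after {max_wait_s}s"
            )
        sleep(interval_s)
```

The `interval_s` value is the trade-off named in the list above: a shorter interval narrows the gap between completion and retrieval but raises API call volume, while a longer one does the reverse.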
Asynchronous Parsing with Webhooks (resthook)
Explanation:
- Users submit a document, receiving an immediate acknowledgment
- Rather than polling, your application receives a webhook (callback notification) directly when the document is ready for export
- Depending on the use case, the document is ready for export either when parsing finishes or when the document has been validated
- After receiving the webhook, your application retrieves the parsed data with a GET request.
Advantages:
- Most efficient asynchronous method; reduces unnecessary polling
- Lower overall API usage
- Provides real-time notifications upon completion
Limitations:
- Slightly higher setup complexity (webhook listener infrastructure required)
Best suited for:
- Real-time workflows or event-driven architectures where timely data retrieval is essential, or API usage optimization is needed
- Scenarios where the application must wait until the document has been fully validated before receiving results
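The webhook listener mentioned above can be sketched with Python's standard `http.server`. This is a rough illustration only: the notification body shape (`{"payload": {"identifier": "..."}}`) is an assumption, `on_document_ready` is a hypothetical callback that would perform the follow-up GET request, and any subscription setup or signature verification the platform requires is not shown.

```python
import json
from http.server import BaseHTTPRequestHandler


def extract_document_id(body: bytes) -> str:
    """Pull the document identifier out of a webhook body.

    Assumed payload shape: {"payload": {"identifier": "..."}}.
    """
    event = json.loads(body)
    return event["payload"]["identifier"]


def make_handler(on_document_ready):
    """Build a request handler that passes each document ID to a callback."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                document_id = extract_document_id(body)
            except (ValueError, KeyError, TypeError):
                # Malformed or unexpected body: reject it.
                self.send_response(400)
                self.end_headers()
                return
            # Acknowledge the notification, then hand the ID to the app,
            # which retrieves the parsed data with a GET request.
            self.send_response(200)
            self.end_headers()
            on_document_ready(document_id)

    return WebhookHandler
```

To try it locally, something like `HTTPServer(("", 8000), make_handler(print)).serve_forever()` (with `HTTPServer` imported from `http.server`) would print each received document ID. In production the callback would more likely enqueue the ID for a worker, so the listener can respond quickly.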
Request Body
The following parameters may be included in the API POST request to Affinda using the Upload a document for parsing endpoint. Note that all individual parameters are optional; however, one of the following must be specified:
- File
- URL
Workspace (and optionally documentType, specified if the document type is known at upload).
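The rule above (a file or a URL must be present, everything else optional) can be checked client-side before sending the request. A minimal sketch follows; the lowercase field names `file`, `url`, and `workspace` are assumptions based on the parameter names listed here, and `build_upload_params` is a hypothetical helper, not part of any official client.

```python
def build_upload_params(file_path=None, url=None, workspace=None,
                        document_type=None):
    """Collect upload parameters, requiring a file or a URL.

    Field names are assumed from the parameter list above. A file would
    normally be sent as a multipart part rather than a plain form field.
    """
    if file_path is None and url is None:
        raise ValueError("Specify either a file or a URL")
    # This page does not say what happens when both are supplied, so this
    # sketch does not restrict that case.
    candidates = {
        "file": file_path,
        "url": url,
        "workspace": workspace,
        "documentType": document_type,
    }
    # Drop anything the caller did not set, since every field is optional.
    return {name: value for name, value in candidates.items()
            if value is not None}
```

For example, `build_upload_params(url="https://example.com/a.pdf", workspace="ws1")` returns a dict with only the `url` and `workspace` keys, while calling it with neither a file nor a URL raises `ValueError`.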