Pre-Processing
Remove duplicate documents
Configure Affinda’s de-duplication to detect and remove duplicate documents during ingestion, keeping your workflows clean and avoiding double-processing.
Settings for removing duplicates can be found for each workspace in the Workflow Settings under ‘Pre-processing’. If the setting is enabled and Affinda identifies a duplicate, the document is automatically marked as Rejected. A user can still override this manually and move the document back into a workspace if required.
Affinda uses Document Binary Match to detect and reject duplicates. Enabling Remove Duplicates will not reject documents that only have matching fields (e.g. the same Invoice Number). To remove duplicates of that kind, users will need to implement the matching logic on their side and can use our APIs to delete unwanted documents.
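As a rough illustration of the difference, the sketch below contrasts a binary match with a field match. It is not Affinda's implementation: it assumes that a binary match means the file bytes are identical, and the names binary_digest, find_field_duplicates, the dictionary keys (id, invoice_number, content) and the whitespace and case normalisation are all hypothetical choices made for this example. No Affinda API call is shown; the returned identifiers would be passed to whichever delete call your integration uses.

```python
import hashlib
from typing import Iterable


def binary_digest(content: bytes) -> str:
    """Fingerprint of the raw file bytes (assumed meaning of a binary match)."""
    return hashlib.sha256(content).hexdigest()


def find_field_duplicates(documents: Iterable[dict], field: str = "invoice_number") -> list[str]:
    """Return the ids of documents whose `field` value repeats an earlier one.

    The first document seen with a given value is kept; later ones are flagged.
    Documents with a missing or empty value are never flagged.
    """
    seen: dict[str, str] = {}
    duplicates: list[str] = []
    for doc in documents:
        value = doc.get(field)
        if not value:
            continue
        # Hypothetical normalisation: ignore surrounding whitespace and case.
        key = str(value).strip().lower()
        if key in seen:
            duplicates.append(doc["id"])
        else:
            seen[key] = doc["id"]
    return duplicates


# Hypothetical records: two different files that carry the same invoice number.
docs = [
    {"id": "a1", "invoice_number": "INV-001", "content": b"scan of invoice"},
    {"id": "b2", "invoice_number": "inv-001 ", "content": b"photo of the same invoice"},
    {"id": "c3", "invoice_number": "INV-002", "content": b"another invoice"},
]

# The files differ byte for byte, so a binary comparison would not match them.
same_bytes = binary_digest(docs[0]["content"]) == binary_digest(docs[1]["content"])

# A field comparison flags the second document as a duplicate of the first.
to_delete = find_field_duplicates(docs)
```

Here same_bytes is False while to_delete contains only "b2", which is the case Remove Duplicates does not cover and that your own logic would need to handle.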