Document Splitting

Affinda's AI models are trained to assume that a file contains a single document, such as one invoice. In practice, a file often contains several different documents, which need to be separated for the model to pick up all the relevant fields. This is particularly common in the Accounts Payable use case, where paper invoices are scanned together into a single file.

Automatic Document Splitting

Customers have the option to turn on 'Automatic Document Splitting'. When a file is uploaded, Affinda checks whether the original file contains multiple different documents (e.g. multiple invoices) and applies splits automatically where they are identified. This saves time and helps ensure that no documents are missed.

Enabling automatic document splitting

Activating the Automatic Document Splitting feature is easy. Navigate to the Workspace settings within the app and select the appropriate document splitting model for your use case and the documents in your Workspace.

By default, we offer a document splitting model for the Accounts Payable use case. This splitting model has been pre-configured to split when a new invoice or other AP document has been identified.

Additional custom splitting models, tailored to the specific use case, the documents processed and the customer's requirements, can be applied to a customer's Organization. To learn more about creating a custom splitting model, get in touch with Affinda.

Editing the split

Although the document splitter splits a document automatically, users can still manually split or combine documents afterwards through the 'Edit Pages' functionality. This gives you full control over your documents, even after the initial split (see below for more information).

Manual Splitting (or Editing the Automated Split)

Within the validation interface, we provide the option for users to 'Edit Pages'. This opens a new interface where users can split the document into multiple parts, delete irrelevant pages, and rotate pages to be the right way up.

What happens when a document is split?

If any edits are made to the file, the AI model will re-parse the data to give the most accurate predictions. Any field validations made before the edit will be lost.

When a document is split into multiple components, new files are created in your account. These new files are created with a numeric suffix added to the file name (e.g. [filename]_1, [filename]_2, etc.).
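
The naming pattern above can be illustrated with a short Python sketch. It is not part of Affinda's product or API. The helper name expected_split_names is hypothetical, and the sketch assumes that numbering starts at 1 and that the suffix is appended directly to the file name as it appears in your account. How a file extension is handled is not covered here.

```python
def expected_split_names(filename: str, parts: int) -> list[str]:
    """Return the names expected after a split, following the [filename]_N pattern.

    Hypothetical helper for illustration only: it assumes numbering starts
    at 1 and that the suffix is appended directly to the given name.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    return [f"{filename}_{i}" for i in range(1, parts + 1)]


if __name__ == "__main__":
    # A scanned batch split into three invoices.
    for name in expected_split_names("scanned_invoices", 3):
        print(name)
```

A helper like this can be handy when reconciling the files in your account against the original upload.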

Within the API response for the original file, users can also find the identifiers of the new files created, and use them to retrieve the data from these newly created files. The PDF file of the documents is also included in the response, so that the new documents created can be added to your platform.
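
As a rough sketch of how the identifiers mentioned above might be collected, the Python snippet below works on an already-parsed response represented as a dictionary. The field names used here ("identifier" and "children") and the sample payload are assumptions made purely for illustration; they are not taken from Affinda's API reference, so check the actual response schema before relying on them.

```python
from typing import Any


def split_document_ids(
    original: dict[str, Any],
    children_key: str = "children",  # hypothetical field name
    id_key: str = "identifier",      # hypothetical field name
) -> list[str]:
    """Collect the identifiers of the files created by a split.

    `original` is the parsed response for the original file. Both key
    names are illustrative assumptions, not the documented schema.
    """
    children = original.get(children_key) or []
    return [child[id_key] for child in children if id_key in child]


if __name__ == "__main__":
    # Illustrative payload only; the real response shape may differ.
    sample_response = {
        "identifier": "orig123",
        "children": [
            {"identifier": "childA"},
            {"identifier": "childB"},
        ],
    }
    print(split_document_ids(sample_response))
```

Each identifier returned could then be used in a follow-up request to retrieve the data for that newly created file.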