Document Splitting

Affinda's AI models are trained on the assumption that a document contains a single instance of an invoice or other document type. In practice, a document often contains multiple invoices, which need to be separated for the model to pick up all the relevant fields. This is particularly common for paper invoices that have been scanned together into a single file.

Automatic Document Splitting

Customers have the option to turn on 'Automatic Document Splitting'. When a document is uploaded, Affinda checks whether the original file contains multiple separate documents (e.g. multiple invoices) and applies splits automatically where they are identified. This saves time and helps ensure that no documents are missed.

Enabling automatic document splitting

Activating the Automatic Document Splitting feature is easy! Simply navigate to the Workspace settings within the app, and you'll find the option to enable it.

Sensitivity

We understand that each customer may have unique requirements, which is why we provide users with four different modes to adjust the sensitivity of splitting. The mode can be selected in the workspace settings.

Note: A split is required when a document contains multiple documents within itself (e.g. multiple invoices attached together).

The four available modes

  • Leave: The automatic splitting feature is turned off.
  • Conservative: Documents that appear to require a split will present a warning.
  • Recommended: Documents with high likelihood of requiring a split are automatically split. Documents that may require a split will only present a warning.
  • Aggressive: All documents that appear to require a split will be split automatically.
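
The four modes above can be read as a small decision table. The sketch below is illustrative only: the names SplitMode, Likelihood, Action and decide are hypothetical and not part of Affinda's API, and it assumes the splitter's assessment can be summarised as one of three levels (no split needed, may require a split, high likelihood of requiring a split).

```python
from enum import Enum


class SplitMode(Enum):
    # The four sensitivity modes described in the workspace settings.
    LEAVE = "leave"
    CONSERVATIVE = "conservative"
    RECOMMENDED = "recommended"
    AGGRESSIVE = "aggressive"


class Likelihood(Enum):
    # Hypothetical three-level summary of the splitter's assessment.
    NONE = 0      # no sign that a split is needed
    POSSIBLE = 1  # the document may require a split
    HIGH = 2      # high likelihood of requiring a split


class Action(Enum):
    NONE = "none"
    WARN = "warn"
    SPLIT = "split"


def decide(mode: SplitMode, likelihood: Likelihood) -> Action:
    """Return what happens to a document under the given mode."""
    # Leave turns the feature off; a document with no sign of needing a
    # split is left untouched in every mode.
    if mode is SplitMode.LEAVE or likelihood is Likelihood.NONE:
        return Action.NONE
    # Conservative only ever warns.
    if mode is SplitMode.CONSERVATIVE:
        return Action.WARN
    # Recommended splits high-likelihood documents and warns on the rest.
    if mode is SplitMode.RECOMMENDED:
        return Action.SPLIT if likelihood is Likelihood.HIGH else Action.WARN
    # Aggressive splits everything that appears to require a split.
    return Action.SPLIT
```

For example, decide(SplitMode.RECOMMENDED, Likelihood.POSSIBLE) yields a warning rather than a split, whereas the same document under SplitMode.AGGRESSIVE would be split automatically.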

Editing the split

Even after the document splitter has automatically split a document, users can still manually split or recombine documents through the 'Edit Pages' functionality. This gives you full control over your documents, even after the initial split (see below for more information).

Manual Splitting (or Editing the Automated Split)

Within the validation interface, we provide the option for users to 'Edit Pages'. This brings up a new interface that allows users to split the document into multiple parts, delete irrelevant pages, and rotate pages so that they are the right way up.

What happens when a document is split?

If any edits are made to the file, the AI model will re-parse the document to give the most accurate predictions. Any field validations already made will be lost.

When a document is split into multiple components, new files are created in your account. These new files are created with a numeric suffix added to the file name (e.g. [filename]_1, [filename]_2, etc.).
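
As a quick illustration of the naming pattern, the sketch below generates the names you would expect to see after a split. The function split_file_names is hypothetical, and it assumes the suffix is appended to the file name exactly as given; how file extensions are handled is not covered here.

```python
from __future__ import annotations


def split_file_names(filename: str, parts: int) -> list[str]:
    """Return the expected names for a file split into `parts` components."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Suffixes start at 1, following the [filename]_1, [filename]_2 pattern.
    return [f"{filename}_{i}" for i in range(1, parts + 1)]
```

This can be handy when matching newly created files back to the original upload in your own records.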

The API response for the original file includes the identifiers of the newly created files, so that users can then retrieve the data from them. The PDF file of the documents is also included in the response, so that the newly created documents can be added to your platform.
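
The sketch below shows one way a client might collect those identifiers once it has the response for the original file. The response shape is an assumption made for illustration only: the keys "split_documents", "identifier" and "pdf" are hypothetical placeholders, so check the API reference for the actual field names before relying on them.

```python
from __future__ import annotations

from typing import Any, Optional


def child_documents(response: dict[str, Any]) -> list[tuple[str, Optional[str]]]:
    """Return (identifier, pdf) pairs for the files created by a split.

    `response` is the parsed JSON of the original file's API response.
    All key names used here are hypothetical placeholders.
    """
    # An unsplit document is assumed to have no child entries at all.
    children = response.get("split_documents", [])
    return [(child["identifier"], child.get("pdf")) for child in children]


# Hand-written sample data, not a real API response.
sample = {
    "identifier": "orig",
    "split_documents": [
        {"identifier": "child-1", "pdf": "child-1.pdf"},
        {"identifier": "child-2", "pdf": "child-2.pdf"},
    ],
}
pairs = child_documents(sample)
```

Each identifier can then be used to request the parsed data for that new file, and the PDF reference can be stored alongside it in your own platform.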