Types of Data Extraction Models
Affinda’s data extraction models are called ‘Extractors’. Extractors can be configured in different ways to suit your use case and the way your end customers will interact with the platform.
There are four main types of Extractors, each with different capabilities.
Extractors | Self-Learning | Description |
---|---|---|
Base Extractors | No | - Pre-built to work out of the box for many customers - No self-learning component, so performance will not improve as documents are validated in your Organisation - Has a defined standard schema, some fields of which can be disabled via the Collection settings - e.g., Invoice or Receipts model |
Tailored Base Extractors with Standard Fields | Yes | - A Tailored Extractor is a self-learning model: it starts from a Base Extractor, then learns and improves on the document formats confirmed through the validation process - Works out of the box and then improves over time - Fields can be disabled as with the Base Extractor |
Tailored Base Extractors with Custom Fields | Yes | - Similar to the above, but with added custom fields for data not captured by the standard schema - Typically limited to a maximum of 5 custom fields per Extractor |
Custom Extractors for a Bespoke Document Type | Yes | - New Extractor created for a document type not currently supported by Affinda’s Base Extractors - Requires initial setup (annotation and training) before it will predict data - Has self-learning capability |
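
The table above can be read as a simple decision path. The sketch below is one illustrative way to express it in Python. It is not part of the Affinda API: `ExtractorType`, `choose_extractor_type`, `exceeds_typical_custom_field_limit` and their parameters are hypothetical names introduced here, and the limit of 5 custom fields is the "typical" figure from the table rather than a hard rule.

```python
from enum import Enum


class ExtractorType(Enum):
    """The four Extractor types described in the table (names are illustrative)."""
    BASE = "Base Extractor"
    TAILORED_STANDARD = "Tailored Base Extractor with Standard Fields"
    TAILORED_CUSTOM = "Tailored Base Extractor with Custom Fields"
    CUSTOM = "Custom Extractor for a Bespoke Document Type"


# Whether each type is self-learning, as listed in the table.
SELF_LEARNING = {
    ExtractorType.BASE: False,
    ExtractorType.TAILORED_STANDARD: True,
    ExtractorType.TAILORED_CUSTOM: True,
    ExtractorType.CUSTOM: True,
}

# Typical upper bound on custom fields per Extractor, taken from the table.
TYPICAL_MAX_CUSTOM_FIELDS = 5


def choose_extractor_type(
    base_extractor_exists: bool,
    wants_self_learning: bool,
    custom_field_count: int = 0,
) -> ExtractorType:
    """Pick an Extractor type from three simplified (assumed) inputs."""
    if custom_field_count < 0:
        raise ValueError("custom_field_count must not be negative")
    if not base_extractor_exists:
        # No Base Extractor covers this document type: a bespoke model is needed.
        return ExtractorType.CUSTOM
    if custom_field_count > 0:
        # Data outside the standard schema calls for custom fields.
        return ExtractorType.TAILORED_CUSTOM
    if wants_self_learning:
        return ExtractorType.TAILORED_STANDARD
    return ExtractorType.BASE


def exceeds_typical_custom_field_limit(custom_field_count: int) -> bool:
    """Flag requests that go beyond the typical custom-field maximum."""
    return custom_field_count > TYPICAL_MAX_CUSTOM_FIELDS
```

For example, `choose_extractor_type(True, True, custom_field_count=2)` selects the Tailored Base Extractor with Custom Fields, while a document type with no matching Base Extractor always maps to a Custom Extractor. Real decisions will also weigh complexity, accuracy over time and cost, which this sketch does not model.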
Configuring Extractors
The type of Extractor you choose for each Collection within your Organisation affects the complexity of the solution, the accuracy of data extraction over time, and the cost.
Please refer to the guides below to understand how tailored and custom models work and their applicable use cases: