Glossary of Terms

Platform Structure

Organization

An Organization is the central hub that contains all your Workspaces, Document Types, and uploaded documents, all of which are accessible when you log into the Affinda platform. Typically, the name of your Organization matches the company name provided during initial registration.

Within Affinda, an Organization functions as a collaborative space accommodating multiple users. The Organization Owner, who initially sets up the trial account, can manage the accounts of other users, assigning and adjusting their access to specific queues as necessary.

If your company’s Organization account already exists, create additional user accounts directly within user settings. Inviting colleagues this way gives them immediate access without having to complete the trial registration process.

Workspace

A Workspace enables you to organize related document processes efficiently. Each Workspace can handle one or several Document Types and is typically used as a broader organizational structure, representing either a specific client (useful for business process outsourcing companies) or a particular department within your organization.

Document Type

A Document Type defines a category of documents that you wish to classify and extract information from. Documents of the same type should have similar structural and semantic characteristics, and require extraction of the same fields. This grouping allows for consistent and efficient use of extraction rules or models. Common examples include invoices, purchase orders, and bank statements.
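The nesting described so far (an Organization contains Workspaces, and each Workspace handles one or more Document Types) can be pictured as a small data model. The sketch below is illustrative only: the class names, attributes, and sample values are assumptions made for this example and are not Affinda's API or schema.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentType:
    # Hypothetical model: a category of documents sharing the same fields.
    name: str
    fields: list[str]


@dataclass
class Workspace:
    # Hypothetical model: groups related Document Types, e.g. per client or department.
    name: str
    document_types: list[DocumentType] = field(default_factory=list)


@dataclass
class Organization:
    # Hypothetical model: the top-level container for all Workspaces.
    name: str
    workspaces: list[Workspace] = field(default_factory=list)

    def document_type_names(self) -> list[str]:
        """List every Document Type name across all Workspaces, in order."""
        return [dt.name for ws in self.workspaces for dt in ws.document_types]


# Sample data invented for illustration.
org = Organization(
    name="Example Pty Ltd",
    workspaces=[
        Workspace(
            "Accounts Payable",
            [
                DocumentType("Invoice", ["invoice_number", "total"]),
                DocumentType("Purchase Order", ["po_number", "total"]),
            ],
        ),
        Workspace("Treasury", [DocumentType("Bank Statement", ["account_number"])]),
    ],
)

print(org.document_type_names())
```

In this picture, a business process outsourcing company might create one Workspace per client, each with its own set of Document Types.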

Collection (Legacy)

Collection is a legacy term that was previously used to refer to what is now called a Document Type. If you encounter references to “Collection” in older documentation or system interfaces, it refers to the same concept as Document Type.

Concepts

OCR

OCR (Optical Character Recognition) is the technology that converts text in scanned documents or images into machine-readable text. In Affinda, OCR enables the platform to extract and process data from non-editable files like PDFs and images.

Fingerprinting

An advanced algorithm that identifies similar documents by analyzing unique textual and visual features. This creates a distinctive ‘fingerprint’ for each document, enabling precise matching and retrieval of relevant examples from Model Memory. These examples are then provided to the model to enhance accuracy and context awareness when processing newly uploaded documents.

Model Memory

Model Memory is a validated set of reference data and documents that Affinda’s models use to enhance accuracy over time. By leveraging Retrieval-Augmented Generation (RAG), Model Memory enables Affinda to dynamically reference previously validated documents, allowing the model to predict future documents more accurately without requiring constant retraining.
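To make the retrieval idea concrete, here is a minimal sketch of how a fingerprint could be used to pull the most similar validated example out of a memory store. It is a toy under stated assumptions, not Affinda's algorithm: the fingerprint here is just the set of word tokens (the real one is described as using both textual and visual features), similarity is plain Jaccard overlap, and the function names and sample records are invented for illustration.

```python
import re


def fingerprint(text: str) -> frozenset[str]:
    """Toy fingerprint (assumption): the set of lowercase word tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_examples(new_text: str, memory: list[dict], k: int = 1) -> list[dict]:
    """Return the k memory entries whose fingerprints best match new_text."""
    target = fingerprint(new_text)
    ranked = sorted(
        memory,
        key=lambda item: similarity(target, fingerprint(item["text"])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical validated documents standing in for Model Memory.
memory = [
    {"id": "inv-001", "text": "Invoice number total amount due ACME Supplies"},
    {"id": "stmt-001", "text": "Bank statement opening balance closing balance"},
]

best = retrieve_examples("ACME Supplies invoice total amount due", memory, k=1)
print(best[0]["id"])
```

In a retrieval-augmented setup, the retrieved examples would then be passed to the model alongside the new document, which is why no retraining step appears in the sketch.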

Human in the loop (HITL)

A process where human input is included in an AI-driven workflow to review, correct, or approve results. In Affinda, this typically occurs during the validation stage, where users verify and adjust extracted data to ensure accuracy before it’s used downstream. This approach combines the speed of automation with the accuracy and judgment of human oversight. Affinda offers a simple and intuitive interface for HITL.

Annotation

The manual labeling of data fields by drawing a box over the field in documents to help train or refine the AI model.

Confirming a Document

The process of reviewing and finalizing a document in the Affinda validation interface. When a document is confirmed, it signals that all extracted data has been reviewed and is accurate, with no further changes needed. Confirmed documents are used by the model for continuous learning to improve accuracy.

Affinda Functions

Splitting

The process of automatically separating a multi-document file (like a PDF with multiple invoices) into individual documents for more accurate processing.
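As a rough illustration of what splitting involves, the sketch below groups the pages of one file into separate documents whenever a page looks like the start of a new document. The boundary test used here (a page beginning with the word "INVOICE") is a deliberately naive assumption for the example; it does not describe how Affinda detects document boundaries.

```python
from typing import Callable


def split_pages(
    pages: list[str], starts_document: Callable[[str], bool]
) -> list[list[str]]:
    """Group consecutive pages into documents, opening a new group at each boundary."""
    documents: list[list[str]] = []
    for page in pages:
        # The very first page always opens a document, even if the test fails.
        if starts_document(page) or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents


def looks_like_first_page(page: str) -> bool:
    # Hypothetical heuristic, for illustration only.
    return page.lstrip().upper().startswith("INVOICE")


# A three-page file containing two invoices (invented sample text).
pages = ["INVOICE 1001\nACME Supplies", "Page 2 of 2", "INVOICE 1002\nGlobex"]
docs = split_pages(pages, looks_like_first_page)
print([len(d) for d in docs])
```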

Classify

The step where Affinda identifies and labels the type of each document (e.g., invoice, resume, contract) so that it can be routed to the correct workflow and extraction model.

Extraction

The process of identifying and pulling out specific data fields (such as names, dates, amounts) from a document into a structured and usable format.

Validation

Extracted data can be validated automatically using data mappings and rules to ensure accuracy, or manually reviewed through a human-in-the-loop process.
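The interplay between automatic rules and manual review might look something like the sketch below: a document that passes every rule can proceed, while one that fails a rule is sent to a person. Everything here is an assumption made for illustration, including the rule itself (line items must sum to the total), the confidence score, the thresholds, and all names; none of it is taken from Affinda's rule engine.

```python
from dataclasses import dataclass


@dataclass
class ExtractedInvoice:
    # Hypothetical extraction result for one invoice.
    line_amounts: list[float]
    total: float
    confidence: float  # assumed model confidence in the range [0, 1]


def rule_failures(doc: ExtractedInvoice, tolerance: float = 0.01) -> list[str]:
    """Run simple consistency rules and return a message for each one that fails."""
    failures = []
    if abs(sum(doc.line_amounts) - doc.total) > tolerance:
        failures.append("line items do not sum to total")
    if doc.total < 0:
        failures.append("total is negative")
    return failures


def needs_human_review(doc: ExtractedInvoice, min_confidence: float = 0.9) -> bool:
    """Route to a human when any rule fails or the confidence is below the threshold."""
    return bool(rule_failures(doc)) or doc.confidence < min_confidence


consistent = ExtractedInvoice([40.0, 60.0], 100.0, 0.97)
mismatched = ExtractedInvoice([40.0, 60.0], 110.0, 0.97)

print(needs_human_review(consistent), rule_failures(mismatched))
```

Separating the rule checks from the routing decision keeps each rule easy to test on its own, and lets the review threshold change without touching the rules.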

