Field Formatting

Field formatting settings allow users to customize how data is extracted, processed, and displayed in Affinda's platform. These settings ensure the extracted data meets specific requirements and can be easily integrated into downstream workflows.

Admins can access field formatting options from the top-right corner of the document validation UI, under Configure Fields.

Field Name

The field name represents the label for the extracted data in the validation UI. It is user-defined and helps in identifying the field in the document processing pipeline.

Field Description

Users may optionally improve the model's predictions by providing additional context, such as how the data is typically labelled and where it appears on the page.

📘

Adding more documents is the recommended way to improve model accuracy. However, a clear field description can also improve extraction results.

Data Type

The data type determines how extracted values are processed and standardized. Different data types are available to ensure that structured and unstructured data is correctly categorized. The selected data type influences the structure of the data and the post-processing logic applied to extracted values, ensuring consistency and accuracy.

Data Type | Formatting Applied
Text | Raw text is retained as-is with no special formatting. Suitable for general strings, labels, or descriptions.
Dropdown | Extracted values are mapped to a predefined set of labels or categories. See Data Mapping for more information.
Integer | Identifies whole numbers without decimal points (e.g., 15, 2024).
Number | Extracts numeric values that may include decimal precision (e.g., 123.45). Used for quantities, percentages, etc. By default, the number of decimal places is undefined.
Currency | Formats monetary values with 2 decimal places by default.
Date | Standardizes dates into ISO 8601 format (YYYY-MM-DD).
Date/Time | Standardizes date and time values into ISO 8601 format.
Date Range | Formats the value into a start date and an end date, each in ISO 8601 format (YYYY-MM-DD).
Location | Geocoding is applied to identify and structure the address into fields such as street, city, and country.
Phone Number | Structured to include the country code, international country code, formatted number, and national number in the response.
Link | Identifies the URL and its domain.
True/False | Boolean value that returns only True or False.
Image | The URL of the cropped image is returned in the response.
Group | Dynamic object that can contain any other data type as children.
Table | Child fields can be created for each row in a table.

Fields with the Group or Table data type require child fields to be defined.
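
To make the effect of a data type more concrete, here is a minimal Python sketch of the kind of standardization the table describes for Integer, Currency, and Date values. It is an illustration only, not Affinda's implementation: the function names, the handling of thousands separators and the `$` symbol, and the half-up rounding rule are all assumptions.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def format_integer(raw: str) -> int:
    """Whole number: strip whitespace and thousands separators (assumed ',')."""
    return int(raw.replace(",", "").strip())


def format_currency(raw: str, places: int = 2) -> Decimal:
    """Monetary value fixed to `places` decimal places (2 by default).

    Stripping '$' and rounding half-up are assumptions made for this sketch.
    """
    cleaned = raw.replace(",", "").replace("$", "").strip()
    quantum = Decimal(10) ** -places  # Decimal("0.01") when places == 2
    return Decimal(cleaned).quantize(quantum, rounding=ROUND_HALF_UP)


def format_date(year: int, month: int, day: int) -> str:
    """Date rendered in ISO 8601 form (YYYY-MM-DD)."""
    return date(year, month, day).isoformat()


print(format_integer("2,024"))
print(format_currency("$1,234.5"))
print(format_date(2024, 3, 5))
```

Using `Decimal` rather than `float` for the currency case keeps the two decimal places exact, which matters when the values feed accounting workflows downstream.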

📘

To learn more about extraction for the Image data type, please contact Affinda.

Data Type Settings

Located in the settings cog next to the Data Type dropdown, these settings allow users to control how post-processing is applied to extracted values. Examples of these settings include the number of decimal places to return on number fields and the format of ambiguous dates in a document (e.g. DMY or MDY).
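
The ambiguous-date setting matters because the same string can resolve to two different dates. The sketch below shows the idea in Python, assuming slash-separated numeric dates. The `DATE_ORDER_PATTERNS` mapping and the `parse_ambiguous_date` function are hypothetical names for this illustration and are not part of the Affinda platform or API.

```python
from datetime import datetime

# Hypothetical mapping from a date-order setting to a strptime pattern.
DATE_ORDER_PATTERNS = {
    "DMY": "%d/%m/%Y",
    "MDY": "%m/%d/%Y",
}


def parse_ambiguous_date(raw: str, order: str = "DMY") -> str:
    """Parse a numeric date using the chosen order and return ISO 8601."""
    pattern = DATE_ORDER_PATTERNS[order]
    return datetime.strptime(raw.strip(), pattern).date().isoformat()


# The same raw text yields different dates depending on the setting.
print(parse_ambiguous_date("03/04/2024", order="DMY"))  # 2024-04-03
print(parse_ambiguous_date("03/04/2024", order="MDY"))  # 2024-03-04
```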

Unparsed values

Sometimes the raw data extracted from a document cannot be parsed into a 'typed' or standardised value using the pre-built formatting options. When this happens, a warning explains the problem and the value is shown italicized in a lighter font. Users are encouraged to edit the annotation to improve extraction accuracy or, if the bounding box is already correct, to edit the value directly by typing the correct typed value.
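
Downstream code can mirror this behaviour by keeping the raw text alongside an optional parsed value. The Python sketch below is one way to model it, assuming a date field in day/month/year form. The `FieldValue` class and `parse_date_field` function are hypothetical and do not reflect Affinda's actual response schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FieldValue:
    raw: str               # text exactly as extracted from the document
    parsed: Optional[str]  # standardized value, or None if parsing failed

    @property
    def is_unparsed(self) -> bool:
        return self.parsed is None


def parse_date_field(raw: str) -> FieldValue:
    """Try to standardize a date; keep the raw text if parsing fails."""
    try:
        parsed = datetime.strptime(raw.strip(), "%d/%m/%Y").date().isoformat()
    except ValueError:
        parsed = None  # caller should flag this value for manual review
    return FieldValue(raw=raw, parsed=parsed)


for text in ("31/12/2024", "end of month"):
    value = parse_date_field(text)
    print(text, "->", "needs review" if value.is_unparsed else value.parsed)
```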


Text Transformations

Transformations allow users to refine extracted text by applying a natural language prompt. Users can specify how they want text to be cleaned, reformatted, or transformed for better usability.

Affinda processes transformations using either:

  • Large Language Models (LLMs) for dynamic text refinement
  • Code-based transformations where possible, ensuring minimal variability in standardized data

Transformations apply only to fields with the Text data type.
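
As an illustration of what a code-based transformation might look like, consider a prompt such as "Remove spaces and hyphens and convert to upper case". The prompt, the function name, and the regular expression below are assumptions for this sketch. How Affinda chooses between an LLM and generated code is not described here.

```python
import re


def apply_transformation(text: str) -> str:
    """Remove whitespace and hyphens, then upper-case the result."""
    return re.sub(r"[\s-]+", "", text).upper()


print(apply_transformation("ab-12 34c"))  # AB1234C
```

A deterministic function like this returns the same output for the same input every time, which is the "minimal variability" benefit mentioned above.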

Enable Field

This setting determines whether the field should be predicted and visible in the extracted output. Users can toggle this option on or off depending on whether they want the model to extract and display the field.

Allow Multiple Values

Enable this setting if a field might have multiple values. For example, in a loan application form, there may be multiple co-applicants, each requiring their own extracted entry.
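
Code that consumes extracted data may need to handle both single-value and multi-value fields. The helper below is a small Python sketch that normalizes either shape to a list. It assumes a multi-value field arrives as a list and a single-value field as a scalar or `None`. That shape is an assumption for illustration, not a statement about the API.

```python
from typing import Any, List


def as_list(value: Any) -> List[Any]:
    """Return the field value as a list, whatever shape it arrived in."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# Hypothetical co-applicant values from a loan application form.
print(as_list(["Alex Smith", "Jordan Lee"]))
print(as_list("Alex Smith"))
print(as_list(None))
```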

Advanced Settings

No Rectangle

Used when a field value does not explicitly appear in the document but can be inferred through reasoning.

Manual Entry Only

The field will not be predicted by the model and can only be entered manually.

Slug

Defines the label used for the field in the API response.
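
Because the slug is the key that integrations use to look up a field, a stable, descriptive slug keeps downstream code simple. The Python sketch below reads a field by slug from a sample payload. The payload structure (`data`, `raw`, `parsed`) and the `invoice_number` slug are hypothetical placeholders, so check the API reference for the actual response format.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical response payload, keyed by field slug.
sample_response: Dict[str, Any] = json.loads(
    '{"data": {"invoice_number": {"raw": "INV-001", "parsed": "INV-001"}}}'
)


def get_field(response: Dict[str, Any], slug: str) -> Optional[Dict[str, Any]]:
    """Look up a field by its slug; return None if it is absent."""
    return response.get("data", {}).get(slug)


print(get_field(sample_response, "invoice_number"))
print(get_field(sample_response, "due_date"))
```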