Field Formatting
Field formatting settings allow users to customize how data is extracted, processed, and displayed in Affinda's platform. These settings ensure the extracted data meets specific requirements and can be easily integrated into downstream workflows.
Field formatting options can be accessed by Admins at the top right corner of the document validation UI, under Configure Fields.
Field Name
The field name represents the label for the extracted data in the validation UI. It is user-defined and helps in identifying the field in the document processing pipeline.
Field Description
Users may optionally enhance the model's predictions by providing additional context, such as how the data is typically labelled and where it appears on the page.
Adding more documents is the recommended way to enhance model accuracy. However, a clear field description can also improve extraction results.
Data Type
The data type determines how extracted values are processed and standardized. Different data types are available to ensure that structured and unstructured data is correctly categorized. The selected data type influences the structure of the data and the post-processing logic applied to extracted values, ensuring consistency and accuracy.
Data Type | Formatting Applied |
---|---|
Text | Raw text is retained as-is with no special formatting. Suitable for general strings, labels, or descriptions. |
Dropdown | Extracted values are mapped to a predefined set of labels or categories. See Data Mapping for more information. |
Integer | Identifies whole numbers without decimal points (e.g., 15, 2024). |
Number | Extracted numeric values that may include decimal precision (e.g., 123.45). Used for quantities, percentages, etc. By default, the number of decimal places is undefined. |
Currency | Formats monetary values with 2 decimal places by default. |
Date | Standardizes dates into ISO 8601 format (YYYY-MM-DD). |
Date/Time | Standardizes date and time values into ISO 8601 format. |
Date Range | Formats the range as a start date and an end date, each in ISO 8601 format (YYYY-MM-DD). |
Location | Geocoding is applied to identify the address and structure it into fields such as street, city, and country. |
Phone Number | Structured to include the country code, international country code, formatted number, and national number in the response. |
Link | Identifies the URL and its domain. |
True/False | Boolean that returns True or False only. |
Image | The URL of the cropped image is returned in the response. |
Group | Dynamic object that can contain any other data type as children. |
Table | Child fields can be created for each row in a table. |
Group and Table data types require 'child' fields.
For image data type extraction, please contact Affinda to learn more.
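As a rough illustration of the kind of standardization the table describes, the sketch below normalizes raw strings for three of the data types. It is not Affinda's implementation: the function names, the handling of thousands separators and currency symbols, and the half-up rounding mode are all assumptions made for the example.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def format_integer(raw: str) -> int:
    """Parse a whole number, ignoring thousands separators (assumed behaviour)."""
    return int(raw.replace(",", "").strip())


def format_currency(raw: str) -> Decimal:
    """Strip a leading currency symbol and keep 2 decimal places.

    The set of symbols and the rounding mode are assumptions for this sketch.
    """
    cleaned = raw.strip().lstrip("$€£").replace(",", "")
    return Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def format_date(year: int, month: int, day: int) -> str:
    """Render a calendar date in ISO 8601 form (YYYY-MM-DD)."""
    return date(year, month, day).isoformat()


print(format_integer("2,024"))      # whole number
print(format_currency("$1,234.5"))  # two decimal places
print(format_date(2024, 3, 5))      # ISO 8601 date
```

Using `Decimal` rather than `float` keeps the two-decimal currency value exact, which matters when totals are compared downstream.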
Data Type Settings
Located in the settings cog next to the Data Type dropdown, these settings allow users to control how post-processing is applied to extracted values. Examples of these settings include the number of decimal places to return on number fields and the format of ambiguous dates in a document (e.g. DMY or MDY).
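To see why these two settings matter, consider the minimal sketch below. It assumes a hypothetical `order` setting for ambiguous dates and a hypothetical `decimal_places` setting for numbers; the names and the slash-separated input format are illustrative and do not reflect Affinda's actual configuration keys.

```python
from datetime import datetime
from decimal import Decimal
from typing import Optional

# Hypothetical mapping from a date-order setting to a parsing pattern.
DATE_ORDER_FORMATS = {"DMY": "%d/%m/%Y", "MDY": "%m/%d/%Y"}


def parse_ambiguous_date(raw: str, order: str = "DMY") -> str:
    """Interpret a slash-separated date according to the chosen order."""
    parsed = datetime.strptime(raw.strip(), DATE_ORDER_FORMATS[order])
    return parsed.date().isoformat()


def format_number(raw: str, decimal_places: Optional[int] = None) -> Decimal:
    """Return the number unchanged, or rounded when decimal_places is set."""
    value = Decimal(raw.strip())
    if decimal_places is None:
        return value
    # Decimal(1).scaleb(-2) is Decimal("0.01"), i.e. two decimal places.
    return value.quantize(Decimal(1).scaleb(-decimal_places))


# The same text yields two different dates depending on the setting.
print(parse_ambiguous_date("03/04/2024", "DMY"))
print(parse_ambiguous_date("03/04/2024", "MDY"))
print(format_number("123.456", 2))
```

The string `03/04/2024` resolves to 3 April under DMY and 4 March under MDY, which is why the order needs to be configured per field rather than guessed.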
Unparsed Values
Sometimes the raw data extracted from a document cannot be parsed into a 'typed' or standardised value using the pre-built formatting options. In these cases, a warning explains the problem and the value is shown italicized in a lighter font. Users are encouraged to either edit the annotation to improve extraction accuracy or, if the bounding box is correct, edit the value directly by typing the correct typed value.
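The idea of keeping the raw text while flagging a value that could not be parsed can be sketched as follows. The `FieldValue` class, its attribute names, and the list of accepted date patterns are hypothetical and chosen only for illustration; they are not the shape of Affinda's API response.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FieldValue:
    raw: str               # text exactly as extracted from the document
    parsed: Optional[str]  # standardized value, or None if parsing failed

    @property
    def is_unparsed(self) -> bool:
        return self.parsed is None


def parse_date_field(raw: str) -> FieldValue:
    """Try a few assumed date patterns; keep the raw text if none match."""
    for fmt in ("%Y-%m-%d", "%d %B %Y", "%d/%m/%Y"):
        try:
            parsed = datetime.strptime(raw.strip(), fmt).date().isoformat()
            return FieldValue(raw=raw, parsed=parsed)
        except ValueError:
            continue
    return FieldValue(raw=raw, parsed=None)


ok = parse_date_field("2024-03-05")
bad = parse_date_field("sometime next week")
print(ok.parsed, ok.is_unparsed)
print(bad.raw, bad.is_unparsed)
```

A reviewer seeing `is_unparsed` set would then either correct the annotation or type the intended value, as described above.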
Text Transformations
Transformations allow users to refine extracted text by applying a natural language prompt. Users can specify how they want text to be cleaned, reformatted, or transformed for better usability.
Affinda processes transformations using either:
- Large Language Models (LLMs) for dynamic text refinement
- Code-based transformations where possible, ensuring minimal variability in standardized data
Transformations apply only to fields with the Text data type.
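A code-based transformation is deterministic: the same input always produces the same output, which is what keeps variability low. The sketch below shows what such a transformation might look like for a prompt along the lines of "remove spaces and hyphens and convert to upper case". The function name and the prompt are hypothetical, and this is not how Affinda generates or runs transformations.

```python
import re


def normalize_reference(text: str) -> str:
    """Remove whitespace and hyphens, then upper-case the result."""
    return re.sub(r"[\s\-]+", "", text).upper()


print(normalize_reference(" inv - 00 123 "))
```

A rule like this suits identifiers with a predictable shape. Free-form clean-up that depends on the meaning of the text is the kind of task more likely to need an LLM.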
Enable Field
This setting determines whether the field should be predicted and visible in the extracted output. Users can toggle this option on or off depending on whether they want the model to extract and display the field.
Allow Multiple Values
Enable this setting if a field might have multiple values. For example, in a loan application form, there may be multiple co-applicants, each requiring their own extracted entry.
Advanced Settings
No Rectangle
Used when a field value does not explicitly appear in the document but can be inferred through reasoning.
Manual Entry Only
The field will not be predicted by the model and can only be entered manually.
Slug
Defines the label used for the field in the API response.
