Explore standard data types that can be extracted in Affinda.
Your field’s data type determines how extracted values are processed and standardized. Different data types are available to ensure that structured and unstructured data are correctly categorized. The selected data type influences the structure of the data and the post-processing logic applied to extracted values, ensuring consistency and accuracy.
Affinda stores two versions of every extracted field: raw (exact text from the document) and parsed (after data-type formatting and other transformations). Both are included in data exports. In the app’s Document Validation interface, only the parsed value is shown by default. To show raw values alongside it, open a document, click the three dots in the top-right, and select “Show Raw Values”. Once enabled, the raw value is shown in italics underneath the parsed value.
Raw text is retained as-is with no special formatting. Suitable for general strings, labels, or descriptions. Users can adjust the following text type options:
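To illustrate the raw/parsed distinction, the sketch below models one exported field as a plain dictionary. The key names `name`, `raw` and `parsed`, the sample values, and the `display_value` helper are assumptions made for illustration, not a confirmed export schema.

```python
# Hypothetical shape of one exported date field; key names are assumed.
field = {
    "name": "invoice_date",
    "raw": "3rd April 2024",   # exact text as it appears in the document
    "parsed": "2024-04-03",    # value after data-type formatting
}


def display_value(field: dict, show_raw: bool = False) -> str:
    """Return the parsed value, optionally followed by the raw text."""
    if show_raw:
        return f"{field['parsed']} (raw: {field['raw']})"
    return str(field["parsed"])


print(display_value(field))
print(display_value(field, show_raw=True))
```

Keeping both values side by side makes it possible to audit what the data-type formatting changed.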
Standardize bullets: Removes bullet point formatting from extracted values.
Include line break: Maintains the line breaks from the original text in the extracted data.
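The effect of the two text options can be approximated with a few lines of Python. This is a minimal sketch, not Affinda's implementation: the function name, the parameter names, and the set of bullet characters it recognizes are all assumptions.

```python
import re

# Leading bullet markers this sketch recognizes (an assumed, non-exhaustive set).
BULLET_PATTERN = re.compile(r"^\s*[•▪●◦·*-]\s+")


def clean_text(raw: str, standardize_bullets: bool = True,
               include_line_breaks: bool = True) -> str:
    """Apply the two hypothetical text options to an extracted string."""
    # Drop empty lines and surrounding whitespace first.
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if standardize_bullets:
        # Remove a leading bullet marker from each line.
        lines = [BULLET_PATTERN.sub("", line) for line in lines]
    # Keep the original line structure, or flatten to a single line.
    separator = "\n" if include_line_breaks else " "
    return separator.join(lines)


sample = "• Python\n• SQL\n"
print(clean_text(sample))
print(clean_text(sample, include_line_breaks=False))
```

With both options on, the sample yields two lines without bullet markers. With line breaks off, the same values are joined by a single space.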
Transformations allow users to refine extracted text by applying a natural language prompt. Users can specify how they want text to be cleaned, reformatted, or transformed for better usability. Affinda processes transformations using either:
Large Language Models (LLMs) for dynamic text refinement
Code-based transformations, where possible, ensuring minimal variability in standardized data
Date Format Disambiguation - Specify how the model should interpret ambiguous date formats (e.g., 03/04/2024). Choose the format that best matches your regional preference: DMY for UK-style dates or MDY for US-style dates.
Expected Tense - Improve predictions by indicating the expected tense of the date (Past or Future).
Default Day and Month - Select the default for when dates are missing a day or a month.
Structured to include country code, international country code, formatted number and national number in the response. Users can adjust the following Phone Number options:
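The ambiguity these date options resolve can be shown with Python's standard library. The sketch below is illustrative only: the function names and the `DMY`/`MDY` lookup table are assumptions, and it does not model the Expected Tense option.

```python
from datetime import date, datetime

# Map each disambiguation choice to a strptime format (assumed mapping).
FORMATS = {"DMY": "%d/%m/%Y", "MDY": "%m/%d/%Y"}


def parse_ambiguous_date(text: str, order: str = "DMY") -> date:
    """Interpret a slash-separated date using the chosen field order."""
    return datetime.strptime(text, FORMATS[order]).date()


def parse_month_year(text: str, default_day: int = 1) -> date:
    """Parse 'MM/YYYY', filling in the missing day with a default."""
    parsed = datetime.strptime(text, "%m/%Y")
    return parsed.replace(day=default_day).date()


print(parse_ambiguous_date("03/04/2024", "DMY"))  # 2024-04-03
print(parse_ambiguous_date("03/04/2024", "MDY"))  # 2024-03-04
print(parse_month_year("04/2024"))                # 2024-04-01
```

The same string resolves to 3 April under DMY and 4 March under MDY, which is why the setting should match the region your documents come from.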
Default country code: Specify the default country code when one isn’t found on the document.
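A rough sketch of how a default country code might be applied is shown below. Everything in it is hypothetical: the function name, the output keys, the three-entry calling-code table, and the assumption that a leading `0` is a trunk prefix to drop (true for some countries, not all). It is not Affinda's parsing logic or response schema.

```python
import re

# Tiny illustrative table of calling codes; a real system needs the full list.
KNOWN_CALLING_CODES = ("1", "44", "61")


def split_phone(raw: str, default_calling_code: str = "61") -> dict:
    """Split a phone string into a calling code and a national number."""
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if has_plus:
        # International form: look up the calling code in the small table.
        for code in KNOWN_CALLING_CODES:
            if digits.startswith(code):
                calling_code, national = code, digits[len(code):]
                break
        else:
            raise ValueError(f"unrecognized calling code in {raw!r}")
    else:
        # No code found on the document: fall back to the configured default.
        calling_code = default_calling_code
        # Assumption: one leading zero is a trunk prefix and is dropped.
        national = digits[1:] if digits.startswith("0") else digits
    return {
        "calling_code": calling_code,
        "national_number": national,
        "formatted_number": f"+{calling_code} {national}",
    }


print(split_phone("0412 345 678", default_calling_code="61"))
print(split_phone("+44 20 7946 0958"))
```

The default only takes effect for the first number, which has no `+` prefix. The second number carries its own code, so the default is ignored.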