Post-Processing

Affinda post-processes each type of extracted data point to ensure that the data is accurate and usable in downstream systems.

Date format

Dates are often written differently depending on the region. The most common difference is whether the month appears before or after the day. For example, dates in the United States are often written as MM/DD/YY, whereas in the United Kingdom they are often written as DD/MM/YY.

Because it can be ambiguous which format a given date uses, we have enabled two features to address this:

  1. Preferred date format

A preferred date format can be set for each customer in the user account settings. The default for new accounts is DD/MM/YY. To change it, navigate to your Collection settings.
Preferred date format options include:

  • DD/MM/YY
  • MM/DD/YY
  • YY/MM/DD

  2. Document-specific format

While each account has a 'preferred' date format that applies to the invoices submitted, not every invoice will follow it. Where a date in a document 'breaks' the preferred format, all dates in that document are parsed using a format that does fit.
For example, if the preferred date format is set to 'DD/MM/YY' and the document contains the date '06/17/2022', the date must actually be in 'MM/DD/YY' format, because there is no 17th month. As a result, all other dates in the document are parsed as 'MM/DD/YY', overriding the preferred format.

Note that all dates will be returned in the extracted data in the ISO 8601 format YYYY-MM-DD.
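
The override behaviour described above can be illustrated with a minimal Python sketch. This is not Affinda's implementation: the function names (try_parse, resolve_format, extract_dates), the format table, the slash-only input, and the rule that two-digit years fall in the 2000s are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical table: the order of fields for each preferred-format option.
FORMATS = {
    "DD/MM/YY": ("day", "month", "year"),
    "MM/DD/YY": ("month", "day", "year"),
    "YY/MM/DD": ("year", "month", "day"),
}


def try_parse(text, fmt):
    """Return a date if `text` is valid under `fmt`, otherwise None."""
    parts = text.split("/")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    fields = dict(zip(FORMATS[fmt], (int(p) for p in parts)))
    year = fields["year"]
    if year < 100:
        year += 2000  # assumption: two-digit years belong to the 2000s
    try:
        return date(year, fields["month"], fields["day"])
    except ValueError:
        return None  # e.g. month 17 does not exist


def resolve_format(dates, preferred):
    """Pick the first format under which every date in the document is valid."""
    candidates = [preferred] + [f for f in FORMATS if f != preferred]
    for fmt in candidates:
        if all(try_parse(d, fmt) is not None for d in dates):
            return fmt
    return preferred  # nothing fits everything; fall back to the preference


def extract_dates(dates, preferred="DD/MM/YY"):
    """Parse all dates with one document-wide format and return ISO 8601 strings."""
    fmt = resolve_format(dates, preferred)
    results = []
    for d in dates:
        parsed = try_parse(d, fmt)
        results.append(parsed.isoformat() if parsed else None)
    return results
```

In this sketch, a document containing '06/17/2022' and '03/04/2022' with a preferred format of 'DD/MM/YY' is resolved to 'MM/DD/YY', so the second date is read as 4 March rather than 3 April. A document containing only '03/04/2022' stays ambiguous and falls back to the preferred format.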

Decimal separator

The invoice extractor automatically identifies whether the decimal separator used in the document is a decimal point (e.g. $1,000.40) or a decimal comma (e.g. €1.000,40). No additional settings are required to enable this, as our AI model detects the separator from the document itself.
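
For downstream systems that need to normalise amounts themselves, the idea can be approximated with a simple rule-based sketch in Python. This is only a heuristic written for illustration, not the AI model Affinda uses; the function name parse_amount and the rule that a lone separator followed by exactly three digits is a thousands separator are assumptions.

```python
from decimal import Decimal


def parse_amount(text):
    """Convert an amount string to a Decimal, guessing the decimal separator."""
    # Drop currency symbols and spaces; keep digits, separators and the sign.
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch in ",.-")
    last = max(cleaned.rfind("."), cleaned.rfind(","))
    if last == -1:
        return Decimal(cleaned)  # no separator at all

    sep = cleaned[last]
    other = "," if sep == "." else "."
    digits_after = len(cleaned) - last - 1

    # When both separators appear, the last one is the decimal separator.
    # A lone separator counts as decimal unless exactly three digits follow it
    # (assumption: "1,000" means one thousand, "12,5" means twelve and a half).
    is_decimal = other in cleaned or (cleaned.count(sep) == 1 and digits_after != 3)

    if is_decimal:
        integer_part = cleaned[:last].replace(",", "").replace(".", "")
        return Decimal(f"{integer_part}.{cleaned[last + 1:]}")
    return Decimal(cleaned.replace(sep, ""))
```

Under these assumptions, both parse_amount("$1,000.40") and parse_amount("€1.000,40") produce the same value. Inputs such as "1,000" remain inherently ambiguous without document context, which is why a context-aware model is better suited to the task than a fixed rule.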