Confidence
Affinda provides confidence scores for extracted data to help you assess how reliable the extraction from your documents is. Rather than relying on a raw confidence value from the model, which does not strongly reflect the data that users have validated, our approach gives significantly higher weight to validated documents of a similar format to the uploaded document.
Benefits
- Confidence is calculated in the context of other validated documents, paying particular attention to the data from highly relevant documents
- The ceiling for how confident the model can be in its predictions is much higher (up to 99%)
Limitations
- Our method of calculating confidence delivers strong results when Affinda's platform is used at scale, but it takes 2-3 examples of the same document format before confidence is returned on fields
How Confidence is Calculated
- Fingerprint Matching
When a user uploads a document, our Fingerprinting algorithm identifies suitable reference document(s) from Model Memory, which are provided to the model to help guide the extraction.
- User Validation
Whenever someone validates data from a processed document, we store the validated results as our "ground truth" for confidence calculation. Accuracy is measured by comparing the model's predictions with the validated annotations.
- Confidence Determination
For each new document, we look at the accuracy results from up to the last five validated documents that used the same reference document. We calculate an overall accuracy percentage for each field.
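The three steps above can be illustrated with a minimal Python sketch. It is not Affinda's implementation: the function names, the dictionary data shapes, and the use of exact equality to decide whether a prediction matches its validated value are all assumptions made for illustration. It only shows the idea of averaging per-field accuracy over up to the last five validated documents that share a reference document.

```python
MAX_HISTORY = 5  # "up to the last five validated documents"


def field_matches(predicted: dict, validated: dict) -> dict:
    """Per-field correctness for one validated document.

    Assumption: a prediction counts as accurate only if it equals the
    validated value exactly. The real comparison rule is not documented here.
    """
    return {field: predicted.get(field) == value for field, value in validated.items()}


def field_confidence(history: list) -> dict:
    """Average per-field accuracy over the most recent validated documents.

    `history` holds the outputs of field_matches, oldest first, for documents
    that all used the same reference document (hypothetical structure).
    An empty history yields an empty dict, i.e. no confidence scores.
    """
    recent = history[-MAX_HISTORY:]
    fields = {field for doc in recent for field in doc}
    confidence = {}
    for field in fields:
        results = [doc[field] for doc in recent if field in doc]
        confidence[field] = sum(results) / len(results)
    return confidence


# Two validated documents that used the same reference document.
history = [
    field_matches({"total": "10", "date": "2024-01-01"},
                  {"total": "10", "date": "2024-01-02"}),
    field_matches({"total": "12", "date": "2024-02-01"},
                  {"total": "12", "date": "2024-02-01"}),
]
print(field_confidence(history))
```

In this sketch, "total" was predicted correctly in both validated documents and "date" in one of two, so their scores are 1.0 and 0.5 respectively. With no validated history, the function returns an empty result, which corresponds to annotations being shown without a confidence score.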
Common Questions
What if there's no previous validation for my reference document?
If there is no prior validated data for that document format, we provide annotations without a confidence score.
How many validations are required to see confidence scores?
Only one validated document per reference is required to begin showing confidence scores.