Document Redaction

While the most common export from the Affinda platform is structured data to be ingested into a downstream system, Affinda also offers document redaction capabilities on any document type. Redaction edits the PDF itself, so the original text is completely removed rather than just masked by an overlay.

Resume Redaction

Click here for information about our pre-configured Resume Redactor.

Configuring a Document Type for redaction

The steps to create a new Document Type that is suitable for redaction are very similar to those required for the typical extraction of structured data.

1. Follow this tutorial to create your new Document Type.
2. Edit every field to ensure ‘Allow Multiple Values’ (found in Advanced Settings) is enabled. This ensures that if a field is repeated within the document, every occurrence is redacted.
3. Upload documents to view the fields to be redacted in the document validation interface.
4. (Optional) Edit and update the model predictions.
5. Use the Get Redacted Document endpoint to return a redacted PDF version of the original document.

Get in contact with the Affinda team to discuss your redaction use case and to enable a ‘redaction’ setting on your Document Type that will optimise for this output.

Exporting redacted file via API

curl -L -X GET "https://api.affinda.com/v3/documents/<DOCUMENT_ID>/redacted" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -o redacted_<DOCUMENT_ID>.pdf
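The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the URL and the Bearer header are taken from the curl command above, while the function names `build_redacted_request` and `download_redacted`, the timeout value, and the output file name pattern are assumptions made for this example.

```python
import shutil
import urllib.parse
import urllib.request

# Base URL copied from the curl example above.
API_BASE = "https://api.affinda.com/v3"


def build_redacted_request(document_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for the redacted version of a document.

    Hypothetical helper for this sketch; not part of any Affinda library.
    """
    # Percent-encode the identifier so it is always a single path segment.
    safe_id = urllib.parse.quote(document_id, safe="")
    url = f"{API_BASE}/documents/{safe_id}/redacted"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def download_redacted(document_id: str, api_key: str, timeout: float = 60.0) -> str:
    """Save the response body to redacted_<document_id>.pdf and return the path."""
    out_path = f"redacted_{document_id}.pdf"
    request = build_redacted_request(document_id, api_key)
    # urlopen follows HTTP redirects by default (similar to curl -L) and
    # raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as out_file:
            # Stream the body to disk instead of loading it all into memory.
            shutil.copyfileobj(response, out_file)
    return out_path
```

Calling `download_redacted("<DOCUMENT_ID>", "<YOUR_API_KEY>")` with real values would write the PDF to the current directory. One difference from curl worth knowing: `urllib` forwards the request headers, including `Authorization`, when it follows a redirect, so if the endpoint redirects to a different host the behaviour may not match the curl command exactly.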

Frequently Asked Questions about Redaction

What if the model only redacted the first instance of a field?

To ensure the model redacts every occurrence of the field in your document, you need to enable “Allow Multiple Values” in the field's configuration. To do this, go to the Configure Document Type interface > locate your field > navigate to Advanced Settings > enable “Allow Multiple Values”.

For Redaction, we recommend setting all fields to “Allow Multiple Values” to avoid this issue.

How do I improve the performance of my redaction?

The performance of the redaction is determined by the performance of the underlying extraction model. To improve it, we recommend adding more example documents and validating correct example documents to build Model Memory. Follow the Improving Model Accuracy Tutorial for step-by-step instructions.
