Search & Match Algorithm

Normalised Scoring

Our search algorithm outputs a raw, non-normalised score for each field. These raw scores can be arbitrarily large, depending on the type of field. To make them usable, we normalise them into a percentage score that is easy to understand and compare.

To do so, we calculate the maximum possible score for each category by summing the highest score recorded for each field. That maximum is then used as the reference when calculating the overall percentage score for each resume.

Worked example

Search parameters: Job title and years of experience are specified

  • Candidate A:
    • Job title score = 0.8
    • Years experience = 1
    • Overall score = 1.8
  • Candidate B:
    • Job title score = 1.6
    • Years experience = 0
    • Overall score = 1.6
  • Maximum score:
    • Highest job title score = 1.6
    • Highest years experience score = 1
    • Overall maximum score = 2.6
      • Candidate A = 1.8/2.6 = 69%
      • Candidate B = 1.6/2.6 = 62%
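
The calculation in this worked example can be sketched in Python. This is a minimal illustration, not Affinda's implementation: the function name `normalise_scores`, the field names, and the dictionary layout are all assumptions made for the example.

```python
def normalise_scores(candidates):
    """Return each candidate's overall score as a percentage of the maximum.

    `candidates` maps a candidate name to a dict of raw field scores.
    The names and layout are hypothetical, chosen for illustration only.
    """
    # Collect every field that appears for any candidate.
    fields = {field for scores in candidates.values() for field in scores}

    # Maximum score: the sum of the highest score seen for each field.
    max_score = sum(
        max(scores.get(field, 0.0) for scores in candidates.values())
        for field in fields
    )
    if max_score == 0:
        return {name: 0.0 for name in candidates}

    # Each candidate's overall score is the sum of its field scores.
    return {
        name: 100.0 * sum(scores.values()) / max_score
        for name, scores in candidates.items()
    }


candidates = {
    "Candidate A": {"job_title": 0.8, "years_experience": 1.0},
    "Candidate B": {"job_title": 1.6, "years_experience": 0.0},
}
percentages = normalise_scores(candidates)
for name, pct in percentages.items():
    print(f"{name}: {pct:.0f}%")
```

With the figures from the worked example, this prints 69% for Candidate A and 62% for Candidate B.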

Category weights are applied to category scores to influence the overall percentage score.
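
This article does not specify how category weights are combined with category scores. One plausible reading, shown here purely as an assumption, is a weighted sum in which the same weight multiplies both a category's score and its maximum. The function name `weighted_percentage`, the category names, and the weight values are hypothetical.

```python
def weighted_percentage(category_scores, category_maxima, weights):
    """Overall percentage under an ASSUMED weighted-sum scheme.

    Each category's score and its maximum are multiplied by the same
    weight before the ratio is taken. This is an illustrative guess,
    not the documented behaviour of the search algorithm.
    """
    weighted_total = sum(weights[c] * category_scores[c] for c in category_scores)
    weighted_max = sum(weights[c] * category_maxima[c] for c in category_maxima)
    if weighted_max == 0:
        return 0.0
    return 100.0 * weighted_total / weighted_max


# Figures reused from the normalised scoring example; the weights are made up.
maxima = {"job_title": 1.6, "years_experience": 1.0}
weights = {"job_title": 2.0, "years_experience": 1.0}

pct_a = weighted_percentage({"job_title": 0.8, "years_experience": 1.0}, maxima, weights)
pct_b = weighted_percentage({"job_title": 1.6, "years_experience": 0.0}, maxima, weights)
print(f"Candidate A: {pct_a:.1f}%  Candidate B: {pct_b:.1f}%")
```

Under these made-up weights, doubling the importance of job title moves Candidate B ahead of Candidate A, which illustrates how weights can change the ranking as well as the percentages.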

Inverse Document Frequency

To determine the match for each category, Affinda’s search algorithm considers how frequently a search term appears across resumes or job descriptions. Category scores therefore reflect how rare a match on a given criterion is across the wider pool of candidates, and rarer matches are treated as more relevant.

As a result, very common and often generic skills or other criteria (such as management, leadership, and communication) have a much lower impact on the overall category score than more specialised criteria that are specific to the job or industry. See below for some worked examples.

Worked examples

Example 1:

Search terms - ‘python’, ‘communication’

Candidate A - matches on python but not communication

Candidate B - matches on communication but not python

Result - Candidate A scores higher than Candidate B because the term ‘python’ is less frequent than ‘communication’ across the entire pool of candidates searched.

Example 2:

Search term - job title: ‘reinsurance analyst’

Candidate A - job title is ‘reinsurance associate’

Candidate B - job title is ‘financial analyst’

Result - Candidate A scores higher than Candidate B. Both candidates match one of the terms in the search, but ‘reinsurance’ is far less frequent than ‘analyst’ in the candidate pool, so the match on ‘reinsurance’ carries more weight.
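
Both examples can be reproduced with a small Python sketch. It uses the textbook inverse document frequency formula, log(N / number of documents containing the term), which is an assumption: the exact formula used by the search algorithm is not stated here. The candidate pools, the helper names `idf` and `match_score`, and the whitespace tokenisation of job titles are all made up for illustration.

```python
import math


def idf(term, pool):
    """Textbook IDF: log(N / number of documents that contain the term)."""
    doc_freq = sum(1 for doc in pool if term in doc)
    if doc_freq == 0:
        return 0.0  # term appears nowhere, so it cannot contribute to a match
    return math.log(len(pool) / doc_freq)


def match_score(search_terms, candidate_terms, pool):
    """Sum the IDF of every search term that the candidate matches."""
    return sum(idf(term, pool) for term in search_terms if term in candidate_terms)


# Example 1: skills. Hypothetical pool where 'communication' is common
# and 'python' is rare.
skills_pool = [
    {"python", "sql"},                # Candidate A
    {"communication", "leadership"},  # Candidate B
    {"communication", "management"},
    {"communication", "sql"},
]
skills_search = {"python", "communication"}
skills_a = match_score(skills_search, skills_pool[0], skills_pool)
skills_b = match_score(skills_search, skills_pool[1], skills_pool)

# Example 2: job titles, split into words so partial matches still count.
titles = ["reinsurance associate", "financial analyst", "business analyst",
          "data analyst", "marketing associate"]
titles_pool = [set(title.split()) for title in titles]
title_search = set("reinsurance analyst".split())
title_a = match_score(title_search, titles_pool[0], titles_pool)
title_b = match_score(title_search, titles_pool[1], titles_pool)

print(skills_a > skills_b, title_a > title_b)
```

In both toy pools Candidate A outscores Candidate B, because the term Candidate A matches appears in fewer documents and so receives a larger IDF value.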