Model fairness

Automated decision-making

Financial institutions cannot simply accept every application for, say, a mortgage loan or a non-life insurance product. To avoid harmful commitments for both the bank and the client, acceptance is based on selection criteria, often with the help of mathematical models. In these processes, discrimination must be prevented: we consider it unethical, and it is also strictly prohibited by law. Until recently, preventing model discrimination was relatively straightforward: by omitting the variables mentioned in Article 1 of the Dutch Constitution from the model, discrimination could largely be avoided.

However, the use of advanced machine learning algorithms is becoming more widespread. While such models achieve high performance, for example as selection criteria, their complexity reduces their transparency. As the model becomes more of a ‘black box’, the chance of ‘latent’ discrimination increases: neural networks with multiple layers, for instance, can identify very complex patterns, including combinations of seemingly neutral variables that effectively act as a proxy for a protected characteristic.

Bias quantification

Anti-discrimination laws are clearly written, but clearly not by data scientists. Let's phrase fairness in the context of a binary decision: you either get a "yes", or you don't. Algorithmic fairness is the field of data science that gives us the tools to quantify fairness. Unfortunately, there are multiple 'definitions' of fairness, and they are mutually incompatible. In other words, there is a choice to be made. Here are three options, loosely based on the guidelines from the ICO, the UK data protection authority (a sketch of how each could be measured follows the list):

  1. Demographic parity: decisions should be representative of the population.
    1. The distribution of people that get a "yes" should look just like the distribution of all people. So, within statistical error, males and females should get a "yes" at the same rate.
  2. Error parity: decisions should be fair towards those who deserve a "yes".
    1. Does everyone who ought to get a "yes" have an equal chance of actually getting one? Are we not withholding a "yes" from males or females who deserve it?
  3. Equal calibration: decisions should be equally correct for everyone.
    1. For everyone who did get a "yes", was it equally likely to be deserved? Are we not giving males or females more "yes"es than we ought to?
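
To make these three notions concrete, here is a minimal sketch, assuming a hypothetical data set with, per applicant, the model's decision (y_hat), whether a "yes" was actually deserved (y), and a protected attribute (group). It illustrates the metrics behind the list above; it is not RiskQuest's actual tooling.

```python
import numpy as np

def fairness_report(y: np.ndarray, y_hat: np.ndarray, group: np.ndarray) -> dict:
    """Per-group rates behind demographic parity, error parity and equal calibration."""
    report = {}
    for g in np.unique(group):
        mask = group == g
        y_g, y_hat_g = y[mask], y_hat[mask]
        report[str(g)] = {
            # Demographic parity: how often does this group get a "yes"?
            "selection_rate": y_hat_g.mean(),
            # Error parity: of those who deserved a "yes", how many got one?
            "true_positive_rate": y_hat_g[y_g == 1].mean(),
            # Equal calibration: of those who got a "yes", how many deserved it?
            "precision": y_g[y_hat_g == 1].mean(),
        }
    return report

# Toy example with made-up numbers, purely for illustration.
rng = np.random.default_rng(0)
n = 10_000
group = rng.choice(["female", "male"], size=n)
y = rng.binomial(1, 0.6, size=n)                      # who deserves a "yes"
y_hat = rng.binomial(1, np.where(y == 1, 0.8, 0.2))   # an imperfect decision rule

for g, metrics in fairness_report(y, y_hat, group).items():
    print(g, {k: round(float(v), 3) for k, v in metrics.items()})
```

Each fairness notion then corresponds to requiring one of these per-group numbers to be (approximately) equal across groups; which number you pick, and how much deviation you tolerate, is exactly the choice discussed below.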

As mentioned, these options are mutually incompatible: when the share of applicants who actually merit a "yes" differs between groups, it is generally impossible to satisfy all three criteria at once. There is no single right choice; it depends on the context in which the machine learning model is applied. At RiskQuest we have plenty of experience with algorithmic fairness and bias quantification, whether it is identifying the right metrics for a given application or actually performing the bias quantification on the data. If you are interested in joining our team, feel free to reach out to us.