Financial institutions cannot simply accept every application for, say, a mortgage loan or a non-life insurance product. To avoid commitments that would harm both bank and client, acceptance is based on selection criteria, often with the help of mathematical models. In these processes, discrimination must be prevented: we consider it unethical, and it is also strictly prohibited by law. Until recently, preventing model discrimination was relatively easy: by omitting the variables mentioned in Article 1 of the Dutch Constitution from the model, discrimination could be avoided.
However, the use of advanced machine learning algorithms is becoming more widespread. While such models achieve high performance, for example as selection criteria, their complexity reduces their transparency. As a model becomes more of a ‘black box’, the chance of ‘latent’ discrimination increases: even when protected variables are omitted, a sufficiently flexible model can reconstruct them from correlated inputs such as a postal code. Neural networks with multiple layers, for example, can identify very complex patterns.
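To illustrate, here is a minimal sketch of latent discrimination through a proxy variable. Everything in it is a hypothetical assumption made up for the example (the features, the coefficients, the data), not an actual model: even though the protected attribute is never given to the model, the acceptance rates it produces still differ between groups.

```python
# Illustrative sketch only: synthetic data and a hypothetical proxy feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected attribute (e.g., a ground covered by Article 1); never given to the model.
protected = rng.integers(0, 2, size=n)

# A proxy feature that correlates strongly with the protected attribute
# (think: postal code, first name, employment sector).
proxy = protected + rng.normal(0, 0.5, size=n)

# A legitimate feature, independent of the protected attribute.
income = rng.normal(0, 1, size=n)

# Historical labels that carry a biased signal against one group.
logits = 1.0 * income - 1.5 * protected
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

# Train WITHOUT the protected attribute: only the proxy and income are used.
X = np.column_stack([proxy, income])
model = LogisticRegression().fit(X, y)
accepted = model.predict(X)

# Acceptance rates still differ by group: the model has recovered the
# protected attribute through the proxy.
for g in (0, 1):
    print(f"group {g}: acceptance rate {accepted[protected == g].mean():.2f}")
```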
Anti-discrimination laws are clearly written, but clearly not by data scientists. Let's phrase fairness in the context of a binary decision: you either get a "yes", or you don't. Algorithmic fairness is the field of data science that gives us the tools to quantify fairness. Unfortunately, there are multiple 'definitions' of fairness, and they are mutually incompatible. In other words, there’s a choice to be made. Here are three options (loosely based on the guidelines from the UK authority ICO), made concrete in the sketch below:

1. Demographic parity: every group receives a "yes" at the same rate, regardless of the actual outcomes.
2. Equalized odds: among the applicants who turn out to be good risks (and, separately, among the bad risks), every group receives a "yes" at the same rate.
3. Predictive parity: a "yes" means the same thing for every group, i.e. the fraction of accepted applicants who turn out to be good risks is the same across groups.
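Here is a minimal sketch in Python that computes all three metrics per group. Everything in it is an illustrative assumption (synthetic data, a score drawn from Beta distributions, a 0.5 acceptance threshold), not a real model or portfolio. Because the two groups have different base rates, it also shows the incompatibility in action: the decision satisfies equalized odds by construction, yet demographic parity and predictive parity fail.

```python
# Illustrative sketch only: synthetic data, hypothetical score and threshold.
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
group = rng.integers(0, 2, size=n)                # protected group membership

# The groups have different base rates of being a good risk; this is
# exactly the setting in which the three criteria cannot all hold at once.
base_rate = np.where(group == 0, 0.3, 0.5)
y_true = (rng.random(n) < base_rate).astype(int)  # 1 = good risk

# A score whose distribution depends only on the true outcome, thresholded
# into a yes/no decision; this makes TPR and FPR equal across groups.
score = np.where(y_true == 1, rng.beta(4, 2, n), rng.beta(2, 4, n))
y_pred = (score > 0.5).astype(int)                # 1 = "yes"

for g in (0, 1):
    m = group == g
    acc = y_pred[m].mean()                  # demographic parity: P(yes | group)
    tpr = y_pred[m & (y_true == 1)].mean()  # equalized odds: P(yes | good, group)
    fpr = y_pred[m & (y_true == 0)].mean()  #                 P(yes | bad, group)
    ppv = y_true[m & (y_pred == 1)].mean()  # predictive parity: P(good | yes, group)
    print(f"group {g}: acceptance={acc:.2f}, TPR={tpr:.2f}, "
          f"FPR={fpr:.2f}, PPV={ppv:.2f}")
```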
As mentioned, these options are mutually incompatible. There is no single right choice; it all depends on the context in which the machine learning model is applied. At RiskQuest we have plenty of experience with algorithmic fairness and bias quantification, whether it is identifying the right metrics for a given application or actually quantifying the bias on the data. If you are interested in joining our team, feel free to reach out to us.