Mitigating Bias in Machine Learning

Advances in deep learning, combined with the increased availability and decreased cost of computational resources in recent years, have led to an explosion in data-powered modelling. Machine learning is being used to solve problems across numerous domains, including finance, education, employment, housing, the justice system and more. Artificial intelligence undoubtedly offers a path to progress, making our lives easier and improving the efficiency and efficacy of the many industries we transact with day to day. But there are growing and legitimate concerns over how the benefits (and costs) of these efficiencies are distributed.

Of particular interest are sociotechnical systems. These are systems that use algorithms or models to manage people. They make (or inform) decisions based on incomplete and potentially erroneous representations of who we are – representations that can be stored and processed by machines. Such systems already play a significant role in determining what we can do, have and see, and where we go. The very purpose of codifying a decision policy is to deploy it at scale. But managing large numbers of people inevitably exerts a level of authority and control – the power to shape society. Sociotechnical systems codify societal values, often in unintended and opaque ways. As machine learning models become ubiquitous, the need for tools to better analyse, interrogate and manage them only becomes more urgent.

Mitigating Bias in Machine Learning aims to provide solutions. We start by examining the problem.

Part I looks at the problems and solutions from the widest perspective, zooming in on the problem over the chapters. Chapter one provides context. We take a variety of perspectives: philosophical, political, legal, technical and social. Chapter two is a practical resource for all those involved in the deployment of such systems – model developers and product and engineering managers, for example. We discuss how to develop, deploy and manage machine learning systems responsibly. We consider the life cycle of a machine learning model and present a taxonomy of pitfalls.

From part II onwards, where possible, we take a mathematically rigorous approach to answering questions about fairness.

Part II discusses how we quantify different notions of fairness. We cover a variety of models of fairness (model constraints and related metrics). We compare them to understand their similarities and differences, strengths and weaknesses. Where possible, we relate them to the philosophical ideologies of fairness discussed in chapter one.
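To give a flavour of what "quantifying fairness" means in practice, here is a minimal illustrative sketch (not taken from the book) of one widely used metric, the demographic parity difference: the gap in positive-prediction rates between two groups. The function name and the example data are our own, invented for illustration.

```python
def demographic_parity_difference(predictions, groups):
    """Absolute difference in positive-prediction rates between two groups.

    predictions: iterable of 0/1 model outputs.
    groups: iterable of group labels (exactly two distinct groups assumed).
    """
    rates = {}
    for g in set(groups):
        # Predictions for members of group g.
        members = [p for p, gi in zip(predictions, groups) if gi == g]
        rates[g] = sum(members) / len(members)
    a, b = rates.values()
    return abs(a - b)

# Made-up example: 8 applicants from groups "a" and "b".
preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(preds, groups))  # |0.75 - 0.25| = 0.5
```

A value of zero means both groups receive positive predictions at the same rate; Part II examines when that is, and is not, the right notion of fairness to demand.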

Part III analyses a variety of methods for mitigating bias through model interventions. We use the metrics discussed in part II to understand their impact.
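As a taste of the kind of intervention Part III analyses, below is a minimal sketch (our own illustration, not the book's implementation) of pre-processing reweighting in the style of Kamiran and Calders: each training example is weighted by the ratio of the expected to the observed frequency of its (group, label) pair, so that group membership and outcome become statistically independent in the weighted data.

```python
from collections import Counter

def reweighting(groups, labels):
    """Per-example weights making group and label independent when applied.

    weight(g, y) = P(g) * P(y) / P(g, y), estimated from counts.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    weights = []
    for g, y in zip(groups, labels):
        expected = group_counts[g] * label_counts[y] / n
        weights.append(expected / pair_counts[(g, y)])
    return weights

# Made-up example: group "a" is over-represented among positive labels.
print(reweighting(["a", "a", "a", "b"], [1, 1, 0, 0]))
# [0.75, 0.75, 1.5, 0.5] - positive "a" examples are down-weighted.
```

Most learning algorithms accept such per-example weights (e.g. a `sample_weight` argument), so this intervention requires no change to the model itself, which is one reason pre-processing methods are attractive in practice.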