In her book Weapons of Math Destruction, Cathy O’Neil discusses the impact of constructing and then (often indiscriminately) applying behaviour-focused predictive models in decision-making tasks. These days such models can influence who to hire, who to fire, how to administer and improve education or health care, who to provide social services to, how to target potential consumers and how to run political campaigns. She labels predictive models that have a widespread, pernicious effect Weapons of Math Destruction (WMDs).
O’Neil provides numerous specific examples of WMDs, describing in some detail their capabilities and negative impacts, which, as mentioned above, span many aspects of adult life. She convincingly argues that, because of access to ever-increasing amounts of data about individuals, along with the growing ability to automate decision making using these models, both the reach of these models and their potential negative impacts are swiftly expanding.
Over the course of the book, O’Neil uses these examples to draw out two overarching issues with WMDs, and with predictive models more generally. The first issue is that, for any predictive model, too much faith may be put in the recommendations or pronouncements of a construct which is, in fact, poorly made and thus functionally bad, in the sense that it is a bad fit for the system it is trying to model.
O’Neil notes that such a model might generate inaccurate predictions or explanations due to generally poor construction (e.g. it was trained on bad data and is drawing incorrect conclusions as a result, or it is constructed in such a way that there is too much variability in its predictions). However, she also points out one particularly problematic kind of bad modeling, which occurs when the model uses a proxy measure (e.g. race) to indirectly and incorrectly infer category membership (e.g. criminal). O’Neil provides numerous examples of this and of how it can lead to a vicious downward spiral for those so misclassified. She refers to this as the ‘birds of a feather’ problem, and it is connected to the second issue she raises as well.
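To make the proxy problem concrete, here is a minimal, purely illustrative sketch. The data and the use of a zip code as the proxy attribute are my own fabricated example, not taken from the book: a rule that infers risk from a proxy mislabels everyone who merely shares that proxy value with a historically over-represented group.

```python
# Illustrative only: a 'model' that infers category membership from a proxy.
# All names, zip codes, and outcomes below are fabricated for demonstration.

people = [
    # (name, zip_code, actually_high_risk)
    ("A", "10001", False),
    ("B", "10001", False),
    ("C", "10001", True),
    ("D", "20002", False),
]

def proxy_model(zip_code):
    # The proxy rule: anyone from zip 10001 is labelled high risk,
    # simply because most *past* high-risk cases came from there.
    return zip_code == "10001"

# Everyone who shares the proxy value is swept up, regardless of reality.
false_positives = [name for name, zc, actual in people
                   if proxy_model(zc) and not actual]
print(false_positives)  # A and B are misclassified purely by association
```

The point of the sketch is that the model never observes individual behaviour at all; its errors fall entirely, and systematically, on one group.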
The second issue, although O’Neil does not state it in precisely this way, is that models may be functional but not in line with our ideological goals. In this case, O’Neil argues that if an existing model is behaving in an unfair manner, then either it should not be used, or it should be altered to be more fair, even if its predictions are technically more accurate in its current version.
I think each of these two overarching issues can lead to some very interesting discussions, but I suggest that the first is at least a little more clear-cut, both with respect to a practical way forward and along the moral dimension.
The solution here would seem to involve, first, determining whether a model is broken and then, second, either finding a way to improve it or, if it can’t be fixed, stopping its use entirely. Granted, neither of these steps is simple to carry out in practice. Nor is the concept of ‘a bad model’ particularly clear-cut.
Adding to this challenge is the fact that there are circumstances, as noted by O’Neil, which may leave people with little motivation to take these required steps: for example, circumstances in which the presence of a known bad model might be tolerated because it allows people to avoid making difficult decisions themselves.
However, I think most people would agree that those who wish to continue using broken models for this or other reasons are not behaving appropriately. On top of this, there are at least some relatively clear strategies for identifying and evaluating potentially bad models. O’Neil highlights the importance of tracking actual outcomes relative to model predictions and using this feedback to increase the accuracy of the model.
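One such strategy, tracking actual outcomes against model predictions, can be as simple as logging each prediction alongside the result eventually observed and periodically computing the model's hit rate. A minimal sketch, where the record format is my own invention rather than anything O'Neil specifies:

```python
# A minimal feedback log: each entry pairs a model's prediction
# with the outcome that was eventually observed.
log = [
    {"predicted": True,  "observed": True},
    {"predicted": True,  "observed": False},
    {"predicted": False, "observed": False},
    {"predicted": True,  "observed": True},
]

def accuracy(entries):
    """Fraction of predictions that matched the observed outcome."""
    hits = sum(1 for e in entries if e["predicted"] == e["observed"])
    return hits / len(entries)

print(accuracy(log))  # 0.75 on this toy log
```

Even this trivial loop is more than many of the WMDs O'Neil describes have: their predictions are often never checked against outcomes at all, so errors can persist indefinitely.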
The second issue is, from my perspective, the far more challenging of the two, as is often the case where ethics and ideologies are involved. In this case, we might have a model that functions well, even extremely well. And yet, despite its efficiency and accuracy, it may contravene what we consider to be appropriate boundaries for decision making and other behaviors.
A fairly clear-cut example in this case is the harnessing of predictive models to exploit individuals. Most people would agree (in principle, if not in practice) that exploitation is not ideologically acceptable. O’Neil provides the example here of for-profit universities, which relentlessly comb the internet to gain information that can then be used to target vulnerable individuals and convince them to enroll in said universities, at very high cost to both the government and the enrollees, along with perhaps little benefit or chance of success.
And yet, even such an apparently clear-cut example proves challenging. At what point does the right of a company to sell to its customers cross the line into coercion and exploitation?
Similarly, O’Neil raises the disturbing specter of pre-crime, originally described in the haunting Philip K. Dick story ‘The Minority Report’, where individuals who have not yet committed crimes are targeted by police because models have predicted that they likely will commit crimes in the future. Once again, in this situation (currently becoming uncomfortably less science fiction and more science fact), the potentially high accuracy of any such model actually intensifies the moral dilemma.
Here, much more so than in the for-profit university example, a question is raised that cannot be easily avoided: at what point does ‘collateral damage’ become acceptable, if ever? If a model is highly accurate except on rare occasions, do the benefits outweigh the harm?
These questions are uncomfortable and difficult to consider, and veer quickly into deep philosophical territory. Fundamental ethical concepts like utilitarianism and deontology suddenly come into play. Their seeming esotericism can be difficult to reconcile with the real and harmful outcomes that may result from the misapplication of models in this way.
As a systems scientist, I might suggest taking, instead, a detour into systems theory, and considering the effect that these models are having on systems that were not designed with them in mind. More specifically, we might consider whether or not they are distorting, to an unacceptable degree, the original intent of these systems, and what can be done in response to keep those systems operating in a functionally desirable manner.
O’Neil does allude to this approach in two parts of her book. First she raises it when discussing insurance companies, which serve for her as an example of a system that has been functionally distorted away from its original purpose. She returns to it when proposing that we pay attention to how the feedback loops from (often connected) WMDs might be broken, to prevent the distortion of the systems involved, and to how what might be called ‘constructive’ feedback loops could be incorporated into systems instead. A minor quibble here is that she refers to these constructive loops as ‘positive’ feedback loops, which is somewhat confusing, since ‘positive feedback’ has a fairly specific technical meaning in systems theory. There, somewhat surprisingly, positive feedback loops can be destructive in effect as well as constructive, and negative feedback loops can be good, in the sense that they can stabilize a system. Her point is a good one nonetheless.
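To illustrate the terminological point: in systems theory, a positive feedback loop amplifies deviations while a negative loop counteracts them, independent of whether the outcome is good or bad. A toy simulation, with gains and setpoint chosen arbitrarily for illustration, makes the contrast visible:

```python
def step_positive(x, gain=0.5):
    # Positive feedback: the change is proportional to, and in the same
    # direction as, the current value -> runaway growth (or collapse).
    return x + gain * x

def step_negative(x, setpoint=0.0, gain=0.5):
    # Negative feedback: the change opposes the deviation from the
    # setpoint -> the system is pulled back toward stability.
    return x + gain * (setpoint - x)

pos, neg = 1.0, 1.0
for _ in range(10):
    pos = step_positive(pos)
    neg = step_negative(neg)

print(pos)  # grows without bound (about 57.7 after 10 steps)
print(neg)  # decays toward the setpoint (about 0.001)
```

A WMD whose predictions worsen the very outcomes it then re-learns from is, in this technical sense, a positive feedback loop, even though its effect is anything but positive.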
Thus, O’Neil leaves us with many examples to consider and some final questions to chew on. Her book could prove eye-opening to those who have not yet considered these issues in depth, and validating for those, like me, who have been looking for a way to better articulate their ongoing concerns and their efforts to be ethical data scientists.