“Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy” by Cathy O’Neil.
Crown, 2016, 259 pp., $26 cloth, ISBN 9780553418811
In her book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy,” Cathy O’Neil shows how businesses, governments and other organizations have recently come to depend on mathematical models to inform their choices, and she illustrates the consequent dangers to which we are vulnerable. A book about the uses of mathematical models runs the risk of being a little dry, but, fortunately, O’Neil is an engaging writer and a good storyteller. The book should be accessible and entertaining to a wide audience. Its lessons matter to all of us, whether we are the ones applying the mathematical models or the ones to whom they are applied.
O’Neil explores these issues from the perspectives of an expert and an insider. Having spent almost a decade as a math professor, she has the curiosity and skill to grapple with the mathematical models at their most complex. As a former data scientist, first for a large hedge fund and then for several tech startups, she has built and applied mathematical models herself. So she has seen for herself how these models stack up against the messy realities they are intended to represent. This background gives O’Neil an unusually nuanced sensitivity to the sundry applications of mathematical models.
The Idea of a “Weapon of Math Destruction”
O’Neil organizes her exposition and critique around the concept of a “weapon of math destruction,” or WMD. (The choice of this label and the familiar acronym is no doubt intended to emphasize the gravity of the dangers of weapons of math destruction, even though mathematical models do not typically incite the visceral fear that explosives do.) According to O’Neil, WMDs are mathematical models that have three characteristics: opacity, scale and damage. A brief look at each of these elements will be helpful.
An opaque model is one that is invisible, or at least inscrutable, to the people it affects. Although people might sometimes be aware that a mathematical model guides how they are treated, they cannot tell how it works. Worse, the model’s innards might not be understood even by the people who are applying it. As an example of opacity, consider the way machine learning systems sort us into “behavioral tribes.” O’Neil explains that these systems sort us on the basis of data they are fed about our behavior – for instance, our web browsing and our location history, as tracked and reported by our cellphones. Our tribe affiliation could determine what ads, entertainment, and news we are shown, or it might even affect our insurance premiums. The novelty here is that these so-called tribes may not correspond to any groupings we humans would recognize, but instead are due to patterns discovered by computers through data mining. So, even if we could examine the computers that sort us, it is unlikely we would be able to interpret their operations in terms of familiar concepts. Furthermore, the system designers themselves may not be able to fully interpret the system’s behavior.
The scale of a model is simply how pervasive and vast its influence is. A model that impacts the lives of thousands of people, several times a day, has greater scale than one that affects a few hundred people a few times a year. In the examples O’Neil discusses, the models scale so massively that they reshape widespread cultural norms and values. In a compelling discussion, she shows how the U.S. News rankings of colleges transformed not just the criteria by which colleges are judged, but also the institutional aims of many U.S. colleges.
Finally, a damaging model is one that works against the interests of the people it affects. O’Neil sometimes equates damage with unfairness, but the sorts of damage she describes are diverse. In short, there are as many sorts of damage as there are ways people’s interests can be harmed. One point worth mentioning, which becomes clear when perusing O’Neil’s examples, is that the damage is typically accidental. (This makes the term “weapons” a bit misleading, since the damage caused by ordinary weapons is typically intentional.) For example, generally well-intentioned, though opaque and ill-conceived, value-added models of teacher assessment have caused good teachers to be fired or driven away from school districts that need them, with the result that already-underprivileged students are further disadvantaged.
Diversity of Damage Beyond Unfairness and Discrimination
This opacity-scale-damage definition of WMDs circumscribes an interesting class of mathematical models, but the definition itself does not offer much insight. Specifically, the focus on opacity, scale and damage tells us little about why and how WMDs have come to exist in the first place. It does not identify any single cause of WMDs, nor does it identify just one kind of damage they do. However, these limitations of O’Neil’s definition are not problems with the definition, per se; rather, these limitations seem to be inherent to the subject matter itself. WMDs, as a group, resist any simple, unifying explanation, and they are more easily recognized by their effects than by their causes. What is it about a hidden mathematical model in widespread use that makes it a WMD? Simply, it is the fact that damage ensues.
Among the various sorts of damage caused by WMDs, one familiar variety is discrimination against historically disadvantaged groups. For instance, O’Neil offers a vivid and compelling indictment of some uses of mathematical models to inform police departments’ choices about where to patrol. Most of these “predictive policing” models prescribe intensifying patrols in areas with histories of frequent crime. Unfortunately, the effects of this seemingly common-sense strategy can be discriminatory, with geography serving as a proxy for race or class. Imagine we have two neighborhoods: a richer, mostly white neighborhood, and a poorer, mostly black neighborhood. And suppose that, historically, the poorer, black neighborhood has had a higher frequency of arrests and convictions for “nuisance” crimes, such as vagrancy, minor vandalism and marijuana possession. According to the model’s prescriptions, the police will be directed to patrol more heavily in that neighborhood. The result is that any sort of activity in that neighborhood – criminal or not – is more likely to attract police attention than anything that happens in the rich, white neighborhood. This heightened attention will likely lead to disproportionately more arrests and convictions in the poorer, black neighborhood, a result that will be fed back into the system, which then will prescribe even more patrolling there. Thus, the feedback loop increases discrimination, unfairly amplifying an already established pattern of disadvantage.
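The feedback loop described above can be made vivid with a toy simulation. Everything here is an illustrative assumption – the neighborhood names, the initial arrest counts, and the patrol-allocation rule are invented for this sketch, not taken from the book or from any real predictive-policing system:

```python
# Toy simulation of a predictive-policing feedback loop.
# All numbers and rules below are illustrative assumptions, not data
# from O'Neil's book or any real policing model.

def simulate(patrol_rounds=5):
    # Two neighborhoods with the SAME true rate of nuisance offenses...
    true_offense_rate = {"A": 0.05, "B": 0.05}
    # ...but neighborhood B starts with more *recorded* arrests.
    recorded_arrests = {"A": 100.0, "B": 150.0}
    population = 10_000
    total_patrols = 100

    for _ in range(patrol_rounds):
        total = sum(recorded_arrests.values())
        for hood in recorded_arrests:
            # The model allocates patrols in proportion to past arrests.
            patrols = total_patrols * recorded_arrests[hood] / total
            # More patrols -> more offenses observed -> more arrests recorded,
            # which feeds back into the next round's allocation.
            observed = true_offense_rate[hood] * population * (patrols / total_patrols)
            recorded_arrests[hood] += observed
    return recorded_arrests

result = simulate()
print(result)  # the recorded-arrest gap between A and B widens every round
```

Even though the two neighborhoods have identical true offense rates, the initial disparity in recorded arrests steers patrols toward neighborhood B, which produces more recorded arrests there, which steers still more patrols its way – the data ratify and amplify the original imbalance.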
O’Neil has not been alone in pointing out this kind of problem. Debates over the discriminatory effects of predictive policing are mainstream. In addition, government agencies as well as academic researchers and foundations have taken notice of big data’s more general tendencies to exacerbate discrimination. But what distinguishes O’Neil is that she is not determined to analyze just one sort of problem (important as it may be) or push just one favorite line of ethical critique. Instead, she presents the problems as she has found them. And the problems turn out to be a diverse lot.
The discussion, mentioned earlier, about the U.S. News college rankings is a case in point. Since U.S. News began its college rankings in 1983, they have become one of the primary tools that students and their parents use to compare their options for higher education. But the rankings have had an impact beyond students. Just as students compete for admission and scholarships, colleges compete for students. As a result of the tremendous influence the U.S. News rankings have on students, college administrators have put a great deal of effort into figuring out how to improve their schools’ rankings. This may seem like a terrific result: as the colleges compete, they all improve, which improves education across the board. Unfortunately, this rosy assessment ignores the crucial role of mathematical models in the story. The ranking of one school ahead of another is not just some given fact that U.S. News discovers and reports. Rather, U.S. News figures its rankings by selecting a slew of criteria, each measured and weighted differently. The criteria include surveys of university administrators, students’ SAT scores, student-teacher ratios, acceptance rates, graduation rates, and the percentage of alumni who donate to the school. Some of these criteria arguably measure something intrinsically valuable, but some are simply stand-ins for other, more genuinely important qualities. Either way, college administrators have an incentive to match their priorities to these criteria, instead of prioritizing the immediate exigencies of education itself. The result is what O’Neil describes as an arms race, in which colleges are not simply trying to out-educate each other, but seeking to out-rank each other, whatever that happens to entail.
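The general shape of such a ranking model is a weighted sum over chosen criteria. The sketch below illustrates the idea; the particular criteria names, weights, and scores are hypothetical stand-ins, not the magazine’s actual formula:

```python
# Hypothetical sketch of a weighted-criteria ranking of the kind the
# U.S. News rankings exemplify. The criteria, weights, and scores are
# invented for illustration; they are NOT the magazine's real formula.

WEIGHTS = {
    "peer_survey": 0.25,      # reputation among administrators
    "graduation_rate": 0.30,
    "sat_score": 0.20,        # normalized to 0..1
    "alumni_giving": 0.10,    # fraction of alumni who donate
    "selectivity": 0.15,      # 1 - acceptance rate
}

def composite_score(school):
    """Weighted sum of normalized criteria (all assumed to lie in 0..1)."""
    return sum(WEIGHTS[c] * school[c] for c in WEIGHTS)

schools = {
    "College X": {"peer_survey": 0.8, "graduation_rate": 0.7,
                  "sat_score": 0.6, "alumni_giving": 0.9, "selectivity": 0.5},
    "College Y": {"peer_survey": 0.6, "graduation_rate": 0.9,
                  "sat_score": 0.7, "alumni_giving": 0.4, "selectivity": 0.6},
}

# Rank schools by composite score, highest first.
ranking = sorted(schools, key=lambda s: composite_score(schools[s]), reverse=True)
print(ranking)
```

Because the weights are editorial choices rather than facts about educational quality, an administrator can raise a school’s rank by optimizing whichever weighted criterion is cheapest to move – the dynamic behind the arms race O’Neil describes.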
Now, how might we give a general characterization of the damage done by the college rankings model? Maybe by drawing this conclusion: Pressure exerted by the model distorts the values and priorities of colleges, resulting in drastically increased costs for students, without commensurate improvements in the actual quality of education. If that is an accurate description, then the college rankings model is indeed a serious problem, but it is not a problem that can be easily analyzed in terms of discrimination or fairness.
Another of O’Neil’s chapters discusses the use of mathematical models to tailor the behavior of computer systems to match the profiles of particular users. For instance, social media sites such as Facebook can send individual users particular news stories they are likely to click, and political campaigns can contact the particular voters they might easily influence. At first blush, this customization sounds like a good thing. After all, delivering to individuals the information, offers, and products that are especially relevant to them seems desirable. But, in practice, the widespread use of these models is dangerous. Highly curated news feeds may expose people to only the ideas and information that already appeal to their micro-targeted demographic. O’Neil’s discussion of these trends is particularly illuminating, especially in light of reports that “fake news” stories have appeared in some Facebook users’ news feeds, but not in others. The discussion of political micro-targeting is just as timely. With models carefully classifying citizens according to their political proclivities — what O’Neil describes as a “merging of politics and consumer marketing” — the danger is that politicians can present themselves differently to different groups, in ways that approach outright duplicity.
The damage done by this acutely customized information-delivery and finely fitted user-interaction is difficult to categorize. The complaint is not that any one group receives unfair or discriminatory treatment, but rather that these systems deprive people of an important social good: exposure to competing arguments and new perspectives. This general type of problem has been coming into public focus, especially since the recent U.S. Presidential election, but O’Neil had already spotted it.
As should be evident from the brief sketches here, there is no simple explanation encompassing all WMDs. They operate differently, and they cause different sorts of damage. To recognize this is to see the value of O’Neil’s book. It is not one of those books that consists of a couple of novel ideas repeatedly exemplified in case after case. Instead, it comprises a rich set of thematically related cases, each of which teaches us something new. Any mere digest or excerpt of the book falls far short of capturing the whole.
The most straightforward takeaway from “Weapons of Math Destruction” is that there is an awful lot that can go wrong with the use of large-scale, opaque mathematical models. The best way to understand and come to terms with this bewildering truth is to become immersed in the multifarious examples of things going wrong. An opportunity to do just that is what this book offers its readers.
Finally, it is worth reflecting on the place of O’Neil’s book in the rapidly evolving field of data and technology ethics. The many examples O’Neil describes, because of their breadth and the issues they highlight, should become canonical case studies as data ethics takes shape as a distinctive area of study. O’Neil’s distinctive approach also deserves commendation. Although theory-driven ethical critique is invaluable, there is such a thing as premature theorization. After all, our present conceptual repertoire might not be sensitive to the complexities of new technologies. Therefore, sometimes what we need most is just to see more clearly what kinds of problems are arising. And, for this, we need technologists who understand all of the relevant complexities and can explain them to others who are not already immersed in them. O’Neil’s work is an admirable model, one that we should hope practitioners and students of new computing and information technologies will aim to imitate.