One key issue with pretrial RATs is that they predict “risk” in ways their designers claim are fair. But risk, and what counts as fair and to whom, are subjective judgments, not objective measures.
RATs lend a veneer of scientific objectivity to the concept of predicting “dangerousness,” an outcome that is both rare and hard to predict. Some argue that treating dangerousness as quantifiable could lead to increased preventive detention.1John L. Koepke and David G. Robinson: Danger Ahead: Risk Assessment and the Future of Bail Reform, Washington Law Review
Given that we are all innocent until proven guilty and that no one can perfectly predict whether someone will commit a crime, Sandra Mayson asks: “What statistical risk that a person will commit future crime justifies short-term detention — if any does?”2Sandra G Mayson: Dangerous Defendants, Yale Law Journal
As one paper argues, even if RATs’ predictions were better than judges’ and magistrates’ decisions — a claim that, the authors note, research does not verify — using these assessments could still be unfair.3Brandon Buskey and Andrea Woods: Making Sense of Pretrial Risk Assessments, The Champion
Which factors are included in the algorithm and how heavily they are weighted, as well as the cutoff points between low, medium, and high risk, may be grounded in statistics. But at the end of the day, especially when tied to a decision-making framework, they reflect how much risk a particular jurisdiction’s criminal legal system decision-makers consider acceptable.
To dive into this subjectivity further:
“The designation of risk bins is somewhat arbitrary. What qualifies as low or high depends on the thresholds set by tool designers, and merely denotes the risk a group presents relative to other risk bins.
To illustrate, somewhere between 8.6-11% of those flagged for “new violent criminal activity” (“NVCA”) by the PSA were rearrested for a violent charge within six months of their release. That means 89-91% of people flagged were not arrested for a violent crime.”4Brandon Buskey and Andrea Woods: Making Sense of Pretrial Risk Assessments, The Champion
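The arithmetic driving that kind of result is worth making explicit: when an outcome is as rare as violent rearrest, even a tool with reasonable-sounding accuracy will mostly flag people who are never rearrested. The sketch below uses hypothetical numbers (a 5% base rate, 50% sensitivity, 80% specificity), not values taken from the PSA:

```python
# Hypothetical illustration of why most people flagged as "high risk"
# are never rearrested for violence: when the outcome is rare, even a
# predictor with decent sensitivity and specificity flags mostly
# false positives.

def positive_predictive_value(base_rate, sensitivity, specificity):
    """Fraction of flagged people who actually have the outcome."""
    true_pos = base_rate * sensitivity          # flagged AND rearrested
    false_pos = (1 - base_rate) * (1 - specificity)  # flagged, NOT rearrested
    return true_pos / (true_pos + false_pos)

# Illustrative numbers only: 5% of released people rearrested for
# violence, a tool that catches half of them and correctly clears
# 80% of everyone else.
ppv = positive_predictive_value(base_rate=0.05, sensitivity=0.50, specificity=0.80)
print(f"{ppv:.1%} of flagged people rearrested; {1 - ppv:.1%} flagged in error")
```

Under these assumed numbers only about 12% of flagged people would actually be rearrested for violence, roughly the range the quoted study reports for the PSA.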
Our current push towards criminal legal reform and ending pretrial incarceration is an opportunity to redefine what kind of risk we really care about. Defining this risk cannot be left to the data scientists creating these tools to work out on their own.
As legal scholar Bernard Harcourt argues: “The fact is, risk today has collapsed into prior criminal history, and prior criminal history has become a proxy for race. The combination of these two trends means that using risk‐assessment tools is going to significantly aggravate the unacceptable racial disparities in our criminal justice system.”5Bernard E Harcourt: Risk as a Proxy for Race, University of Chicago Law School
Factors we know to be unfair form the basis for the way that the tools measure risk.
Most pretrial RATs try to predict whether someone will return to court if released pretrial. That has little to do with what risk they pose to society, and a lot to do with whether an accused person has access to reminders to come to court on the right date, transportation to court, childcare, or a job that allows them to take time off.6Six Things to Know About Algorithm-Based Decision-Making Tools, Leadership Conference on Civil and Human Rights
RATs also try to predict whether someone will be arrested again, which makes them sound “dangerous” — even though the vast majority of arrests are for nonviolent crimes.7Alice Speri: Police make more than 10 million arrests a year, but that doesn’t mean they’re solving crimes, The Intercept
A new arrest demonstrates the behavior of those doing the arresting, not those who are arrested. If someone is from a neighborhood that is more heavily surveilled, they are already more likely to be arrested just based on where they live.
Researchers investigating New York City’s supervised release risk assessment tool questioned whether felony rearrest, a seemingly fair measure for dangerousness, was actually a fair thing to measure at all,8Kristian Lum and Tarak Shah: Measures of Fairness for New York City’s Supervised Release Risk Assessment Tool, Human Rights Data Analysis Group given racial disparities in arrests.
A tool might be called “accurate” because it is well calibrated: it correctly predicts what percentage of white defendants and what percentage of Black defendants will be rearrested. By that measure, its designers can claim the tool is equally accurate across races, and therefore not racially biased.
However, that “correct” prediction builds in the expectation that Black defendants will be arrested more often, which may be true in practice because of systemic racism in policing, but is not actually fair.
Developers may try to balance different statistical metrics of fairness, but researchers have even argued that “bias in criminal risk scores is mathematically inevitable.”9Julia Angwin and Jeff Larson: Bias in Criminal Risk Scores is Mathematically Inevitable, Researchers Say, ProPublica
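That impossibility can be seen in a small numeric sketch. The distributions below are hypothetical: each group's risk scores are assumed to be perfectly calibrated (a score of 0.6 really does mean a 60% chance of rearrest in either group), yet because the groups' underlying arrest rates differ, flagging everyone above the same cutoff produces very different false positive rates:

```python
# Hypothetical sketch of the impossibility result: a risk score can be
# perfectly calibrated in every group and still produce very different
# false positive rates when the groups' underlying arrest rates differ.

def false_positive_rate(score_distribution, threshold):
    """score_distribution maps a calibrated score to the fraction of the
    group holding it; scores are taken as true rearrest probabilities.
    Returns P(flagged | not rearrested)."""
    false_pos = sum(frac * (1 - score)
                    for score, frac in score_distribution.items()
                    if score >= threshold)
    negatives = sum(frac * (1 - score)
                    for score, frac in score_distribution.items())
    return false_pos / negatives

# Illustrative numbers only. Group A is arrested less often than group B:
group_a = {0.2: 0.7, 0.6: 0.3}   # overall arrest rate: 32%
group_b = {0.2: 0.3, 0.6: 0.7}   # overall arrest rate: 48%

fpr_a = false_positive_rate(group_a, threshold=0.5)   # about 17.6%
fpr_b = false_positive_rate(group_b, threshold=0.5)   # about 53.8%
```

With identical, perfectly calibrated scores and a single cutoff, people in the higher-base-rate group who would never be rearrested are flagged roughly three times as often, which is the pattern ProPublica found in COMPAS.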
Here are a few examples of how risk assessments have led to different kinds of unfair outcomes in practice:
ProPublica’s analysis of COMPAS in Broward County, Florida, found that the COMPAS algorithm overstates the risk of Black defendants being rearrested and mislabels white defendants as low risk more often than Black defendants.10Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner: Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks, ProPublica
A study of the PSA in Mecklenburg, North Carolina, found that despite no clear impact of the PSA on racial disparities in their jails, Black defendants were still more likely to be assessed as high risk than others.11Cindy Redcross, Brit Henderson, Luke Miratrix, and Erin Valentine: Evaluation of Pretrial Justice System Reforms that use the Public Safety Assessment: Effects in Mecklenburg County, North Carolina, MDRC Center for Criminal Justice Research
In the study of New York City’s supervised release risk assessment tool, researchers found that the tool did not meet many common metrics of fairness,12Kristian Lum and Tarak Shah: Measures of Fairness for New York City’s Supervised Release Risk Assessment Tool, Human Rights Data Analysis Group such as limiting false positives (for example, when someone is labeled high risk but is not rearrested), especially for Black defendants.
RATs can’t fix systemic issues or remove race and class bias in police, judges, courts, or society.
Risk assessment tools could try to maximize “fairness,” but the way they are designed and the realistic limitations of actual data mean they often ignore biases in favor of usability or simplification.13Partnership on AI: Report on Algorithmic Risk Assessment Tools in the US Criminal Justice System
These tradeoffs are a built-in part of creating a predictive tool, but for an individual accused person detained because group-based data places them in one risk category instead of another,14Brandon Buskey and Andrea Woods: Making Sense of Pretrial Risk Assessments, The Champion this can feel especially unfair.
Until we address these issues, our determination of risk and fairness and what tradeoffs are acceptable will remain bound up in our history of racially and economically biased understandings of who is “dangerous” and who should be protected.
Explore the complexities of determining fairness in algorithms through an interactive tool built by MIT Technology Review.15Karen Hao and Jonathan Stray: Can you make AI fairer than a judge? Play our courtroom algorithm game, MIT Technology Review