A recent Wall Street Journal article makes the case that—in regulating artificial intelligence, including those applications used to aid the criminal justice system—we should emphasize accountability, rather than prescriptions to make every algorithm completely transparent. While authors Curt Levey and Ryan Hagemann make important points, the article misses key details about the state of machine learning and the fundamental differences between requirements demanded by government procurement agents and regulations that would affect the broader market.

Levey and Hagemann argue that calls for algorithmic transparency in areas like criminal justice risk assessment are misguided because they fail to account for the opaque nature of advanced machine-learning techniques. Furthermore, they believe transparency requirements—for both training data and source code—would unfairly undermine trade secrets and competitiveness in the market for such software.

Their argument about artificial intelligence regulation thus has three components:

  1. Transparency requirements will not be effective with machine-learning techniques, because each is a “black box.”
  2. Transparency requirements are undesirable because they undermine intellectual property and market competition.
  3. It is not appropriate for government to impose transparency requirements even on itself, including for risk assessments in the criminal-justice system.

To be sure, there are good reasons to avoid broad-based algorithmic transparency requirements for every AI application. As I discussed at greater length at Cato Unbound earlier this year, such rules would stifle competitiveness and innovation.

But the criminal justice system is not an ordinary market, and the government is not an ordinary firm. Just as ordinary firms may and often do decide to use open source software, it is entirely appropriate for the government to make determinations about what it will require in contracts with its vendors.

Unlike ordinary firms, government also has constitutional obligations to be transparent, such as in upholding citizens’ rights to due process and equal protection under the law. Statutory obligations like the Freedom of Information Act and other “sunshine” laws; the jurisprudence of criminal procedure; 51 federal and state constitutions; and myriad court precedents all set out additional rules and protections. Notions of equity, predictability and, yes, transparency are at the heart of what our justice system strives to provide.

I’ve argued before that we should err on the side of transparency in the criminal justice system. This could be done by requiring, as part of procurement processes, that all algorithms that inform judicial decisionmaking in sentencing be built and operated on an open source software platform. Everything from the source code to the variable weights to the (anonymized) training data would be available for public scrutiny and could be audited for bias.

The government would likely have to pay more upfront for a transparent open source system, as it would essentially be buying the algorithms outright rather than renting them, and continued investment would be needed for their development. However, with a more open ecosystem, there are good reasons to think the costs to taxpayers could be offset by philanthropic investment and engagement from civil society.

There may indeed be mechanisms to validate risk-assessment software in criminal justice that stop short of disclosing training data, continuous outcome data or the underlying code. But such an approach requires taking unnecessary risks that the system will be abused, in addition to fomenting public backlash against the technology. Even setting aside civil liberties considerations, opting against transparency in criminal justice AI would keep a developing ecosystem opaque, when it could benefit from broad-based collaboration and the input of diverse stakeholders.

Thus, it seems that transparency would be desirable if feasible. Let’s address some of the feasibility concerns and proposed harms of transparency more specifically.

It’s first worth noting that the algorithms used in the criminal justice system today are relatively simple from a technical perspective, and do not rely on advanced neural networks. Their set algorithmic weights can be discovered via transparency and do not yet suffer from the concerns about “black box” machine learning. Most of these systems, like the Public Safety Assessment (PSA) tool used in New Jersey, can be calculated by hand in a short period of time if you have the relevant background and criminal history.
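To make "calculated by hand" concrete, here is a minimal sketch of how a point-based risk score of this general kind works. The factor names and weights are hypothetical, invented purely for illustration; they are not the published PSA weights.

```python
# Hypothetical point-based risk score. The factors and weights below are
# invented for illustration; they are NOT the published PSA weights.
HYPOTHETICAL_WEIGHTS = {
    "pending_charge": 1,
    "prior_conviction": 1,
    "prior_failure_to_appear": 2,
    "under_23_at_arrest": 2,
}


def risk_points(defendant: dict) -> int:
    """Sum the weights of every factor that applies to the defendant."""
    return sum(
        weight
        for factor, weight in HYPOTHETICAL_WEIGHTS.items()
        if defendant.get(factor, False)
    )


# A made-up defendant with two applicable factors: 1 + 2 points.
example = {"pending_charge": True, "prior_failure_to_appear": True}
print(risk_points(example))  # prints 3
```

When the weights are public, anyone with the defendant's record can redo this arithmetic and check the score.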

Systems like the COMPAS algorithm used in Wisconsin are proprietary, which makes it difficult to know exactly how they operate. However, based on sample tests obtained by ProPublica, they still seem to be within the realm of pen and paper.

Below is a part of the published PSA, which illustrates how simple it really is.


There are structural limits as to the kinds of variables in play in a risk assessment. While a judge can consider any number of extraneous factors, a computer system must rely on a uniform dataset. This might include such variables as age, ZIP code, a defendant’s first contact with the juvenile courts or any past jail time. How these are weighted may be opaque in a machine-learning context, but would nonetheless be possible to analyze. What would be prohibited from consideration are variables such as race or national origin, as well as any false data. If these were used in sentencing—or potentially, even if other factors were used that might be a close proxy for these prohibited variables—it would open a conviction to appeal. That’s why transparency is important for due process.
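One simple way an outside auditor might check whether a permitted variable acts as a close proxy for a prohibited one is to ask how well the first predicts the second. The sketch below is only an illustration: the records are synthetic, and the field names `zip` and `group` and the function `proxy_strength` are my own assumptions, not part of any real risk-assessment tool.

```python
from collections import Counter, defaultdict


def proxy_strength(records, feature, protected):
    """Return (baseline, by_feature) accuracy for guessing `protected`.

    baseline:   always guess the most common protected value overall.
    by_feature: guess the most common protected value within each value
                of `feature`. A large gap between the two suggests that
                `feature` carries information about `protected`.
    """
    overall = Counter(r[protected] for r in records)
    baseline = overall.most_common(1)[0][1] / len(records)

    per_value = defaultdict(Counter)
    for r in records:
        per_value[r[feature]][r[protected]] += 1
    correct = sum(c.most_common(1)[0][1] for c in per_value.values())
    return baseline, correct / len(records)


# Synthetic data: two ZIP codes, each dominated by one demographic group.
records = (
    [{"zip": "A", "group": "x"}] * 45
    + [{"zip": "A", "group": "y"}] * 5
    + [{"zip": "B", "group": "x"}] * 5
    + [{"zip": "B", "group": "y"}] * 45
)

baseline, by_zip = proxy_strength(records, "zip", "group")
print(baseline, by_zip)  # prints 0.5 0.9
```

In this made-up example, knowing the ZIP code raises the accuracy of guessing group membership from 50 percent to 90 percent, which is the kind of signal that would warrant closer scrutiny. A check like this is only possible if the input variables and data are open to inspection.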

The future of risk-assessment algorithms likely will include greater and greater uses of machine-learning techniques, so it’s worth thinking about potential transparency and accountability trade-offs. A recent National Bureau of Economic Research paper, led by Cornell University’s Jon Kleinberg, showcased the incredible gains we can make in the pretrial system with more accurate risk predictions. In a policy simulation, the authors showed that their algorithm, trained through machine learning, could cut the jail population by 42 percent with no increase in the crime rate.

As Levey and Hagemann point out, the greater the degree of complexity in machine learning, the harder it is to peer into the inner workings of the algorithm and understand how it makes decisions. But with access to the training data and the specific machine-learning methods used, it would be entirely possible to replicate the model again and again to make sure there are no anomalies, or to create proxy models to test for different kinds of machine bias or common errors. Furthermore, we are quickly developing new methods of machine learning that are more amenable to transparency, explicability and interpretability.
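As a rough sketch of what replication and bias testing could look like, the example below retrains a tiny model from the same data and method, confirms the two runs agree, and then compares false positive rates across groups. Everything here is an assumption for illustration: the data are synthetic, the single input stands in for something like a count of prior arrests, and a small logistic regression stands in for whatever method a real vendor might use.

```python
import math


def train(rows, epochs=200, lr=0.1):
    """Fit a tiny logistic regression by batch gradient descent.

    Deterministic by construction: zero initialization, fixed data order
    and no random numbers, so retraining on the same data with the same
    settings reproduces the same weights.
    """
    w = [0.0] * (len(rows[0][0]) + 1)  # last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    """Flag as high risk when the predicted probability exceeds 0.5."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0


def false_positive_rates(w, rows, groups):
    """Share of people who did NOT reoffend but were flagged, per group."""
    flagged, negatives = {}, {}
    for (x, y), g in zip(rows, groups):
        if y == 0:
            negatives[g] = negatives.get(g, 0) + 1
            flagged[g] = flagged.get(g, 0) + int(predict(w, x))
    return {g: flagged[g] / negatives[g] for g in negatives}


# Synthetic training data: ([hypothetical prior-arrest count], reoffended).
rows = [([0.0], 0), ([1.0], 0), ([2.0], 1), ([3.0], 1),
        ([0.0], 0), ([2.0], 0), ([3.0], 1), ([1.0], 1)]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

original = train(rows)
replicated = train(rows)
assert original == replicated  # same data and method, same model

print(false_positive_rates(original, rows, groups))
```

Real systems would be far larger, and training that involves randomness would need fixed seeds to replicate exactly, but the principle is the same: with the data and method in hand, an outside party can rebuild the model and measure how its errors fall on different groups.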

Levey and Hagemann’s stated goal of accountability does not have to stand in opposition to the goal of transparency. Transparency is one method to achieve algorithmic accountability. In the context of criminal justice, it is a most worthwhile mechanism. More advanced machine learning is able to help only insofar as models are based on externally valid data. And even explainability protocols and internal diagnostic tools will not be able to alert the operator about invalid data, because a neural network has no concept of validity outside the dataset it has been trained on.

Risk assessment systems also must be calibrated to societal norms. For instance, we want to be more averse to releasing individuals who are likely to commit a murder than we are to releasing a nonviolent drug offender. But if a particular jurisdiction wants to take a hard line on marijuana use, there would be a public interest in knowing about it.

This brings us back to a larger point about the difference between government regulation and requirements built into the procurement process. Government procurement lacks access to profit and loss signals, and it is rife with rent-seeking, as private companies compete on the basis of political connections rather than the quality of the goods they are selling. As such, it is entirely appropriate for government to set procurement specifications to ensure that certain needs are met. We should not conflate this with more general forms of government regulation.

The application of AI in risk assessments in the justice system won’t be perfect, especially at first. Even if these tools have an overall positive effect, they may introduce hard questions about differing notions of fairness. As these technologies advance and become more opaque, complete openness is the best way to protect our civil liberties, ensure public trust and root out flaws. In the long term, with an open ecosystem, we can produce far better outcomes than the status quo.

While I largely agree with Levey and Hagemann about whether it’s wise to impose broad transparency mandates on private sector algorithms, we shouldn’t carelessly extend this thinking when it comes to the application of state power. In high-stakes realms where the government can keep you locked up or otherwise take away your liberties, we should make our mantra: “Trust, but verify.”

Image by Phonlamai Photo

