America is in the middle of a roiling debate about the effectiveness and scope of our criminal justice system. Do we incarcerate too many people? Are sentences too long? Are prosecutors too aggressive? Do probation and parole actually work?

Overall, Americans are asking: Are there better ways of doing this?

These questions are hard to answer because they implicate deeply held beliefs about crime, justice, public safety, and rehabilitation. But on a practical level, they’re hard to answer because we lack reliable data. As it stands, we don’t have the numbers needed to answer some fundamental questions.

For example, reformers argue that probation sentences are too long, and that people who make it through a year or two of probation without incident are very unlikely to re-offend. One piece of data that could help test this claim: How many people were re-arrested for a new felony within one, two, or three years of their release from prison?

Seems simple. But only 13 states currently report such information. In many states, the data is probably available, but is buried in scattered probation office records, in unlinked spreadsheets or even on paper, and is never reported to any centralized data collection agency.

It turns out that the same is true of many data points that are critical to making informed criminal justice policy decisions. In the absence of useful and reliable data, many policy makers revert to the most cautious position, despite the terrible fiscal and human consequences of involving too many people in the criminal justice system for too long.

The good news: there’s momentum to solve the problem of getting clear, actionable and timely justice data into the hands of policy makers and the public.

The bad news: the challenges are substantial.

We believe there are four main obstacles to better criminal justice data.


“Garbage in, garbage out” is a foundational principle of social science. Analysis can only be as good as the data it relies on. There are many problems with the way criminal justice agencies collect data, including not using unique identifiers to link persons across data systems, not collecting data on race and ethnicity in a consistent way, and using different definitions of key terms such as “recidivism.”

Then there are problems with inaccurate or misleading data due to limited entry options or simple human error. The good news is that data collection actually turns out to be the most addressable part of this multi-faceted problem.

Most criminal justice agencies are already collecting a lot of data—far more than they’re able to use effectively (because of the challenges below).


Once data is collected it must be stored to be usable. But currently, most data storage solutions are built by for-profit vendors who make it hard or expensive to extract data.

Many data systems overwrite information when new data becomes available, making it impossible to analyze outcomes over time. Many agencies store data on paper, including parole case plans or participation in treatment programs, or in marooned spreadsheets that aren’t connected to other digitized data.
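To make the overwriting problem concrete, here is a minimal sketch in Python. Both classes and the sample statuses are hypothetical and do not describe any particular vendor's system. The first store keeps only the latest status for each person, while the second appends every change so that earlier states can still be queried.

```python
from datetime import date


class OverwritingStore:
    """Hypothetical store that keeps only the most recent status per person."""

    def __init__(self):
        self.current = {}

    def update(self, person_id, status, when):
        # The previous status is lost as soon as a new one arrives.
        self.current[person_id] = (status, when)


class HistoryStore:
    """Hypothetical append-only store that keeps every status change."""

    def __init__(self):
        self.rows = []

    def update(self, person_id, status, when):
        self.rows.append((person_id, status, when))

    def status_on(self, person_id, day):
        # Return the latest status recorded on or before the given day.
        matches = [
            (when, status)
            for pid, status, when in self.rows
            if pid == person_id and when <= day
        ]
        return max(matches)[1] if matches else None


overwriting = OverwritingStore()
history = HistoryStore()

# Illustrative sequence of status changes for one made-up person.
changes = [
    ("A", "prison", date(2015, 1, 1)),
    ("A", "probation", date(2016, 1, 1)),
    ("A", "discharged", date(2017, 1, 1)),
]
for person_id, status, when in changes:
    overwriting.update(person_id, status, when)
    history.update(person_id, status, when)

print(overwriting.current["A"][0])
print(history.status_on("A", date(2016, 6, 1)))
```

With the overwriting store, only the final status survives, so a question such as "what was this person's status in mid-2016?" cannot be answered. The append-only store can answer it because no earlier row was discarded.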


Comparable, reliable data that is easily accessed is required before any useful analysis can occur. But even if you have data, you need analytical capacity, and that turns out to be a serious problem in many state and local agencies.

Across the 50 state prison systems, which collectively incarcerate more than 1.5 million people at a cost of nearly $50 billion annually, and supervise another 2.3 million people in the community, there are only roughly 200 people charged with analyzing the data collected and turning it into actionable information.

In addition, data is frequently siloed between different agencies: law enforcement, courts, corrections, and supervision each have their own stores of data, which makes it difficult to analyze outcomes as people move between different facets of the criminal justice system. Often, agency staff only have access to their own silo.

Finally, agencies use different definitions of key terms such as recidivism: does it mean re-arrest, re-conviction, or re-incarceration? Over what time period? Are we calculating event- or person-based statistics? The lack of guidance and standard definitions nationally makes it hard to compare results across jurisdictions or roll up data into national findings.
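A small worked example shows how much the choice of definition can change the headline number. The sketch below is in Python and uses an invented cohort of four people. The data, function names, and follow-up windows are illustrative assumptions, not figures from any agency.

```python
from datetime import date

# Hypothetical release cohort: person_id -> release date.
releases = {
    "A": date(2015, 1, 1),
    "B": date(2015, 1, 1),
    "C": date(2015, 1, 1),
    "D": date(2015, 1, 1),
}

# Hypothetical post-release events: (person_id, event_type, event_date).
events = [
    ("A", "arrest", date(2015, 6, 1)),
    ("A", "arrest", date(2016, 3, 1)),
    ("A", "conviction", date(2016, 9, 1)),
    ("B", "arrest", date(2017, 2, 1)),
    ("C", "arrest", date(2015, 11, 1)),
    ("C", "incarceration", date(2016, 1, 15)),
]


def add_years(day, years):
    """Shift a date forward by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def in_window(person_id, event_date, years):
    """True if the event falls within `years` of that person's release."""
    start = releases[person_id]
    return start < event_date <= add_years(start, years)


def person_based_rate(event_type, years):
    """Share of released people with at least one qualifying event."""
    people = {
        pid
        for pid, kind, when in events
        if kind == event_type and in_window(pid, when, years)
    }
    return len(people) / len(releases)


def event_based_rate(event_type, years):
    """Qualifying events per released person (repeat events all count)."""
    count = sum(
        1
        for pid, kind, when in events
        if kind == event_type and in_window(pid, when, years)
    )
    return count / len(releases)


for years in (1, 2, 3):
    print(years, person_based_rate("arrest", years), event_based_rate("arrest", years))
print(person_based_rate("conviction", 3), person_based_rate("incarceration", 3))
```

In this made-up cohort, the three-year person-based re-arrest rate is 75 percent, the event-based figure is one arrest per released person, and the three-year re-conviction and re-incarceration rates are each 25 percent. The same four people produce very different "recidivism" numbers depending on the definition and the follow-up period, which is why results from different jurisdictions are hard to compare without a shared standard.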


The goal of analysis is to inform and shape decisions. But in order to do that, the data needs to be actionable. If you don’t report the right information to the right people at the right time, none of the work to improve data collection, storage and analysis is going to have much impact. Solid policy making is often hampered by old data.

For example, in April 2019, the U.S. Bureau of Justice Statistics released its most current data on state prison populations—from 2017. Stale data increases risk and uncertainty and can paralyze decision makers.

Additionally, data that is presented in a dense, complex, or misleading fashion won’t have much impact. The way data is presented should match the user’s needs, goals, and authority.

So, the challenges of getting good criminal justice data are substantial. However, we believe they can be met.

Model state legislation can be drafted to direct agencies to report standardized, usable data, and provide funding for collection and storage efforts. Private grant making to expand storage and analytical capacity in states and localities can be ramped up. Experts are looking at best practices for data analysis and presentation.

There is a lot to be done. But voters and policy makers need answers as they sort through complex criminal justice issues, and reformers should be the ones working to provide them.
