Jeffrey Liebman, a former Obama administration official who is now at the Harvard Kennedy School of Government, has authored a paper that brings together some of the disparate strands of evidence-based policy and performance management.
The paper, Using Data to Make More Rapid Progress in Addressing Difficult U.S. Social Problems, includes a fairly explicit criticism of evidence-based policy as it is commonly understood. But it also proposes a practical solution to some of its limitations, based on his work at Harvard’s Government Performance Lab.
Liebman begins by acknowledging recent advances in evidence-based policy:
In our [Harvard Government Performance Lab] work, we have found that in every state and local government social service agency with which we have worked there are multiple officials who understand that some interventions in their field are “evidence based” and others are not. Moreover, although I am not aware of any comprehensive time series on the number of rigorous impact evaluations of U.S. social policy interventions completed per year, it certainly appears that the pace at which evaluation evidence is being developed is increasing.
But he quickly shifts to some practical criticisms of this approach:
And yet all of this momentum does not seem to be enough to move the dial on challenging social problems. Part of the problem is that we need a lot more innovation, experimentation, and evidence — at least 20 times what we are currently producing. Most evaluations of social programs find disappointing results, and a large portion of programs that look successful in an initial evaluation fail in replication. So we need to innovate and test at a much more rapid pace. Another part of the problem is that even when successful interventions are discovered governments don’t fund them at scale. Yet another part of the problem is that evidence becomes stale very quickly. … Finally, even the best models can fail when delivered on a large scale if staff quality and other implementation details are not sustained at the level of the original experiment.
He then proposes a broader approach:
Impact evaluations, while extremely valuable, are a relatively small portion of the hard work that needs to be done with data and data analysis if we are going to move the dial on difficult social problems. Human service agencies need to be making greater use of data and analysis throughout their operations. Moreover, the rhetoric about using evidence to find out “what works” orients policy makers incorrectly toward thinking that program effectiveness is a static concept and that the budget process is the primary way to achieve greater effectiveness.
Instead, political leaders should be reviewing data on whether programs are doing better this month than last month (or this year than last year) and holding agencies accountable for re-engineering their processes and those of their contractors to produce continually rising performance trends over time.
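To see what that kind of review might look like in practice, here is a minimal Python sketch of a month-over-month comparison. The metric (job placements) and the numbers are invented for illustration; they do not come from Liebman’s paper.

```python
# A hypothetical month-over-month performance review: is the program doing
# better this month than last month? Metric and figures are invented.

monthly_placements = {
    "2023-01": 118,   # e.g., job placements achieved by a provider that month
    "2023-02": 121,
    "2023-03": 134,
}

months = sorted(monthly_placements)
for prev, curr in zip(months, months[1:]):
    change = monthly_placements[curr] - monthly_placements[prev]
    direction = "up" if change > 0 else "down" if change < 0 else "flat"
    print(f"{curr}: {monthly_placements[curr]} placements "
          f"({direction} {abs(change)} vs. {prev})")
```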
To do this, Liebman proposes a solution he calls “active contract management,” which combines “high frequency review of data and regular collaborative meetings between government agency staff and service providers.” The process includes the following four steps, which he proposes for state governments:
- Identifying a target population and hypotheses about policy interventions that could potentially affect the policy issue in question;
- Using data on risk levels and intervention cost effectiveness to refer the right people to the right services (a rough sketch of this step follows the list);
- Tracking services in real time and collaborating with service providers; and
- Annually comparing outcomes for individuals referred to different services to make decisions about how to allocate resources and adjust referrals.
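The second step is where the data work becomes concrete. Here is a minimal Python sketch of what risk-based referral might look like; the services, costs, effect sizes, and threshold are all invented for illustration and are not drawn from Liebman’s paper.

```python
# Hypothetical referral logic: route people to services based on an assessed
# risk score and the estimated cost-effectiveness of each intervention.
# All names, numbers, and thresholds are invented.

from dataclasses import dataclass

@dataclass
class Service:
    name: str
    cost_per_person: float
    estimated_effect: float   # e.g., percentage-point improvement in outcomes

    @property
    def effect_per_dollar(self) -> float:
        return self.estimated_effect / self.cost_per_person

SERVICES = [
    Service("intensive case management", cost_per_person=6000, estimated_effect=9.0),
    Service("standard counseling", cost_per_person=1500, estimated_effect=3.0),
]

def refer(risk_score: float) -> Service:
    """Send higher-risk individuals to the more intensive service and
    lower-risk individuals to the most cost-effective lighter option."""
    if risk_score >= 0.7:   # hypothetical risk threshold
        return SERVICES[0]
    return max(SERVICES[1:], key=lambda s: s.effect_per_dollar)

print(refer(0.85).name)   # -> intensive case management
print(refer(0.40).name)   # -> standard counseling
```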
The core of the process is data-driven, performance-based management, a technique promoted by his Harvard colleague Bob Behn and by Harry Hatry at the Urban Institute, among others.
Liebman acknowledges some of the challenges in data-driven decision making, but believes they can be resolved.
These sorts of comparisons are not always straightforward. Results need to be adjusted to account for differences in the populations being served by different providers. Otherwise, providers who target the most difficult cases will be penalized. And short of randomization, there is no way to adjust for differences that are not measured in available data.
But quite often there are opportunities to use regression-discontinuity strategies to compare outcomes for people just above and below thresholds for referral to services, and there are opportunities to replace idiosyncratic referral processes with deliberate ones that involve randomization so as to facilitate comparisons of relative effectiveness. Moreover, even when only unadjusted outcomes by service type can be calculated, they can be quite revealing.
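The regression-discontinuity idea in that passage can be illustrated with a toy example: compare outcomes for people whose risk scores fall just below and just above the referral cutoff. The cutoff, bandwidth, and records below are invented, and a real analysis would use local regression and check for manipulation around the threshold; this is only a sketch of the comparison Liebman has in mind.

```python
# Toy regression-discontinuity-style comparison: outcomes just above vs.
# just below a referral threshold. All values are invented for illustration.

THRESHOLD = 0.70   # hypothetical risk score at which people are referred
BANDWIDTH = 0.05   # only compare people close to the cutoff

# (risk_score, outcome) pairs; outcome = 1 might mean stable housing at 12 months
records = [
    (0.66, 1), (0.68, 0), (0.69, 1), (0.695, 1),   # just below: not referred
    (0.705, 1), (0.71, 1), (0.73, 1), (0.74, 1),   # just above: referred
]

below = [y for r, y in records if THRESHOLD - BANDWIDTH <= r < THRESHOLD]
above = [y for r, y in records if THRESHOLD <= r <= THRESHOLD + BANDWIDTH]

rate_below = sum(below) / len(below)
rate_above = sum(above) / len(above)
print(f"Outcome rate just below cutoff: {rate_below:.2f}")
print(f"Outcome rate just above cutoff: {rate_above:.2f}")
print(f"Naive discontinuity estimate:   {rate_above - rate_below:+.2f}")
```

Even this crude comparison illustrates the point of the passage: when referral follows a clear rule, the rule itself creates a natural comparison group.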
Liebman goes on to propose a variety of ways to address human capital and other barriers to the approach, including providing technical assistance and developing an initial cadre of high-performing human service agencies that could serve as models. He also proposes expanding the technique to break down silos and address social issues more holistically, citing the Strive Partnership as a model.
Will the technique work? Liebman is not certain, but he believes it could. He concludes:
[O]ur current mechanisms for funding and evaluating social programs do not produce a culture of continuous learning and improvement, nor do they generate opportunities for fundamental reengineering of systems to produce better results. … My conjecture is that if we take advantage of the great expansion in the availability of data and analysis tools to actually try to move the dial on social problems in a data-driven, outcomes-focused way, we might find that we succeed.