Program Evaluation of US Social Programs: Policymakers, Foundations

Highlights:

As discussed in our previous Straight Talk report, the pattern of disappointing effects in most rigorous program evaluations is compelling and needs to be taken seriously if we hope to make progress in solving social problems.
We suggest that policymakers, foundations, and researchers view this as the central challenge in social policy, and re-deploy program and research funding toward addressing it.
A specific approach we have promoted to address this challenge is a tiered-evidence strategy, which includes:
(i) Expanding the implementation of programs backed by strong (“top tier”) evidence of sizable, sustained effects on important life outcomes;
(ii) Funding and/or conducting rigorous evaluations of programs backed by highly-promising (“middle tier”) evidence, with the goal of moving them into the top tier; and
(iii) Building the pipeline of promising programs through modest investments in the development and initial testing of many diverse approaches (as part of a “lower tier”).

This Straight Talk report—the second in a two-part series—starts where part one left off:

The bottom line is that it is harder to make progress [in solving social problems] than commonly appreciated…. The pattern of disappointing effects for most rigorously-evaluated programs—along with findings of important positive effects for a few—is compelling and transcends multiple fields. It needs to be taken seriously.

This, we believe, is the proverbial 800-pound gorilla. If policymakers, philanthropies, and researchers choose to ignore it, it is hard to see how they can make meaningful progress in solving social problems. Instead, we are likely to see more of what we described in the last report: (i) large public expenditures on programs that, when rigorously evaluated, are often found not to produce better outcomes than existing community services; and (ii) government and philanthropic investments in expensive evaluation studies that generate largely disappointing findings and contribute little to the body of evidence about what is effective in addressing key social problems.

A far better alternative to ignoring the gorilla, we believe, is for policymakers, philanthropies, and researchers to accept it as the central challenge that must be addressed to make progress in social policy, and to re-deploy program and research funding toward doing so. One specific approach to addressing this challenge, which we have adopted in our team’s grantmaking efforts at the Laura and John Arnold Foundation, is a tiered-evidence strategy for building and using credible evidence about program effectiveness. We have also promoted this strategy to policymakers in various iterations since 2009 (e.g., [1], [2]), and it has been incorporated into law and policy in a number of areas. Recent examples include the Education Innovation and Research (EIR) program, enacted in 2015, and the Family First Prevention Services Act for children at risk of entering the foster care system, enacted earlier this year.

Here are the key elements of a tiered-evidence strategy:

Expand programs and practices backed by strong (“top tier”) evidence of effectiveness. There are some programs and practices (“interventions”) that have been shown in multiple well-conducted randomized controlled trials (RCTs), or a large, well-conducted multisite RCT, to produce sizable, sustained effects on important life outcomes. Illustrative examples of interventions in this top tier category include the Nurse-Family Partnership, a nurse visitation program for low-income, first-time mothers during pregnancy and their children’s infancy (shown to reduce child abuse/neglect and injuries by 20 to 50 percent over follow-up periods of two to 15 years, versus the control group); and Career Academies, small learning communities in low-income high schools that offer academic and technical/career courses as well as workplace opportunities (shown to increase average earnings by $2,500 per year over the eight years following high school graduation, versus the control group).

Relatively few interventions with such strong evidence currently exist (others are shown here under the top tier category). But where they do exist, we believe there is every reason to expand them without delay. If done effectively and on a large scale, they could improve the lives of millions of Americans. An example of a philanthropic initiative in this area is our Foundation’s Moving the Needle initiative, which helps fund the expansion of interventions with top tier evidence (or close to it), coupled with a new RCT evaluation to make sure that the sizable effects found in prior studies can be reproduced when the intervention is delivered on a larger scale.

Focus other funding for program services and research/evaluation studies on a central goal: growing the number of interventions in the top tier. We see this as the great imperative in social policy for reasons related to the aforementioned gorilla: (i) most interventions, when rigorously evaluated, are found to produce disappointing effects compared to the control group; and (ii) this is true even of interventions whose evidence is promising but not yet conclusive. Too often, the findings for such interventions do not hold up in subsequent, more definitive studies, as we have discussed in previous Straight Talk reports (e.g., [1], [2]).

In other words, until an intervention has been rigorously demonstrated to produce meaningful positive effects across multiple studies or study sites (the top tier category), it is hard to have confidence that faithful delivery of the intervention in new settings would actually move the needle on key social problems. In medicine, this is the reason that the Food and Drug Administration has, since the 1960s, generally required at least two well-conducted RCTs showing effects on important health outcomes before it will allow a pharmaceutical drug to be licensed for market (link, see page 4).

As the fastest way to grow the top tier in the near term, fund and/or conduct RCTs of interventions that are backed by highly-promising (i.e., “middle tier”) evidence of effectiveness. Specifically, there are a number of interventions whose evidence base is especially promising but not yet strong—for example, because the prior study or studies only have short-term follow-up, were conducted in a single site, or used high-quality comparison-group designs but not random assignment. Illustrative examples at the high end of this tier include the Child FIRST home visitation program for low-income families with young children and Teen Options to Prevent Pregnancy (TOPP). Experience shows that some of these findings would successfully reproduce in a replication RCT (as described here), whereas others would not (as described here). But for policymakers, philanthropic foundations, and researchers seeking to grow the top tier in fairly short order, these highly-promising interventions offer the best odds. For that reason, they are the main focus of our team’s grant funding for RCTs.

Build the pipeline of promising interventions through modest investments in the development and initial testing of many diverse approaches (as part of a “lower tier”). Candidates for the lower tier may include interventions that are already in use but have never been rigorously evaluated. They may also include new interventions that have compelling logic or circumstantial evidence to suggest they may work (e.g., for a delinquency prevention program, evidence that delinquent peers are a key risk-factor in youth crime, and a plausible strategy for preventing the formation of delinquent peer groups).

As the gorilla knows, the large majority of these interventions, if fully developed and then rigorously evaluated, would be found to produce little or no improvement over existing practice. But to quote physicist Leonard Mlodinow, even a coin weighted toward failure will sometimes land on success. That is why we suggest testing many different approaches, initially with a modest investment (e.g., a pilot study to see if a new intervention can be implemented successfully in adherence to its key elements, and whether it produces the hoped-for short-term impacts), followed by greater investment in development and testing conditioned on initial positive findings.
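The arithmetic behind testing many approaches can be sketched as follows. This is a minimal illustration, not a calculation from the report: it assumes each tested intervention succeeds independently and with the same probability, and both the 10 percent success rate and the function name are hypothetical, chosen only to show the shape of the relationship.

```python
def chance_of_at_least_one_success(n_interventions: int, p_success: float) -> float:
    """Probability that at least one of n tested interventions succeeds.

    Assumes (for illustration only) that each test is independent and
    has the same success probability p_success.
    """
    if n_interventions < 0:
        raise ValueError("n_interventions must be non-negative")
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    # The chance that every test fails is (1 - p)^n; success is the complement.
    return 1.0 - (1.0 - p_success) ** n_interventions


if __name__ == "__main__":
    # Hypothetical success rate, not a figure taken from the report.
    assumed_rate = 0.10
    for n in (1, 5, 20, 50):
        chance = chance_of_at_least_one_success(n, assumed_rate)
        print(f"{n:>2} interventions tested: {chance:.1%} chance of at least one success")
```

Under these assumptions, the chance of finding at least one effective intervention rises steadily with the number of approaches tested, which is the logic behind spreading modest initial investments across many candidates rather than betting heavily on a few.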

There are many ways that the policy and research community can advance a tiered-evidence strategy. Policymakers can embed such a strategy in new government programs or retrofit existing programs to incorporate the approach (here is a template). In addition, individual government and philanthropic funders, program providers, and researchers can focus their efforts (as our team does) on advancing particular tiers within the strategy. These steps, if implemented widely in adherence to rigorous evidence standards, offer a path to progress in solving social problems that spending-as-usual cannot.


How to solve U.S. social problems when most rigorous program evaluations find disappointing effects (part two – a proposed solution)
