How “official” evidence reviews can make ineffective programs appear effective (part one in a series)

Highlights:

A recent Department of Health and Human Services (HHS) report reviews existing evidence on the long-term effects of four home visiting program models for families with young children. Report findings include: (i) “all four… models have found improvements in child development and school performance among children up to age 7” and (ii) for one of the models—Parents as Teachers (PAT)—“the best estimate is that lifetime benefits exceed costs by 244 percent.”
We believe the report’s findings of long-term effects for several of the models are erroneous, and we use the PAT findings to illustrate how this occurred.
The highest-quality randomized controlled trials (RCTs) of PAT, as identified by HHS in a separate initiative, found no meaningful effects on any child or family outcomes at the age 3 follow-up, when the families had graduated from the program. This makes it very unlikely that longer-term effects (school age or lifetime) will materialize.
The new HHS report does not mention these disappointing RCT results and bases its conclusions on much weaker studies that likely overstate PAT’s true effects (e.g., non-RCTs with self-selection into the program).
Such selective reporting is common in government and nonprofit evidence reviews. The resulting exaggeration of effects for many programs diverts attention and funding from those that do have credible evidence of effectiveness.
The response from the lead author of the HHS review, and our brief rejoinder, follow the main report.

This is the first in a series of Straight Talk reports that seeks to illustrate how “official” evidence reviews—in this case, by the U.S. Department of Health and Human Services (HHS)—sometimes show certain programs as effective when the evidence suggests exactly the opposite.

In September, HHS issued a report—Evidence on the Long-Term Effects of Home Visiting Programs[i]—that summarizes research findings on the long-term effects of home visiting programs for families with young children. The report specifically examines the effects of four widely implemented home visiting program models—Early Head Start’s Home-Based Option, Healthy Families America, Nurse-Family Partnership, and Parents as Teachers (PAT). Here are two of the report’s key findings:

“All four evidence-based models have found improvements in child development and school performance among children up to age 7.”

“Home visiting’s lifetime benefits generally exceed its costs.”

We did a double take when we read these findings because we previously reviewed the main studies that are the basis for this report and reached very different conclusions. Although we agree on the evidence of sizable, sustained effects for the Nurse-Family Partnership, our careful review of studies of the other three program models found a pattern of weak or no sustained effects for participating children and families.

Are we missing something? You be the judge. Taking one of these program models—Parents as Teachers (PAT)—as an illustrative case study, here’s how we believe the HHS evidence review made disappointing findings look positive.

First, let’s examine the disappointing findings. HHS has an ongoing process, which pre-dates this report, for assessing the quality of studies of home visiting effectiveness. As part of that process, HHS has screened a large number of studies of PAT and rated two of these studies as “high” in study quality. We have reviewed both studies and agree that they are credible. Here is what the studies found:

Wagner et al. 1999[ii] was a randomized controlled trial (RCT) with a sample of 497 families with an infant age 0 to 6 months. At the study’s longest follow-up, when the children reached age 3, it measured a total of 49 outcomes across a range of areas (e.g., parent behaviors, household economic status, and child development). The study found statistically significant effects on only four of these outcomes: one was a positive effect (i.e., favored the PAT group), and the other three were adverse effects (i.e., favored the control group).

Drotar et al. 2009[iii] was an RCT with a sample of 527 families with an infant age 0 to 9 months. At the study’s longest follow-up, when the children reached age 3, the study measured a total of 12 outcomes across a range of areas (e.g., child cognitive development, behavior, and school readiness). The study found a statistically significant effect on only one of these outcomes (a positive effect on child competence in playing with a new toy). The other 11 effects were not statistically significant and showed no pattern of superior outcomes for the PAT group.

In summary, the two strongest studies measured 61 outcomes at the age 3 follow-up and found two statistically significant positive effects and three statistically significant adverse effects. This is the pattern of results one would expect to see if PAT has no true effects, because tests of statistical significance at the conventional 5 percent level produce a false positive or adverse finding for about one in 20 outcomes when a program’s true impact is zero. Across 61 outcomes, that amounts to about three such chance findings.
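The arithmetic behind this point can be sketched in a few lines of Python. This is only a rough illustration, not an analysis from either study: it assumes each outcome is tested at the 5 percent level and, for simplicity, that the 61 tests are independent (outcomes measured on the same families are usually correlated, so the exact probability would differ). The function and variable names are ours.

```python
from math import comb

def prob_at_least(k, n, p=0.05):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    'significant' results among n tests when the true effect is zero."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_outcomes = 49 + 12      # Wagner et al. 1999 + Drotar et al. 2009
n_significant = 2 + 3     # two positive and three adverse findings
alpha = 0.05              # assumed significance level ("one in 20")

# Expected number of chance findings if PAT has no true effect.
expected_by_chance = n_outcomes * alpha

# Probability of seeing at least as many significant findings as observed,
# under the simplifying assumption that the tests are independent.
p_five_or_more = prob_at_least(n_significant, n_outcomes, alpha)

print(f"Expected chance findings: {expected_by_chance:.2f}")
print(f"P(at least {n_significant} by chance): {p_five_or_more:.2f}")
```

Under these assumptions, roughly three of the 61 tests would come out significant by chance alone, so five significant findings split between positive and adverse directions is unremarkable.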

Given these disappointing findings from the two most rigorous evaluations of PAT, how did the new HHS report conclude that PAT produces “improvements in child development and school performance” and that PAT’s “lifetime benefits exceed costs by 244 percent”? The report reached these conclusions by overlooking (i.e., not mentioning) these disappointing rigorous findings and instead focusing on findings from lower-quality studies that are likely to overstate the program’s effects, as follows:

The HHS report cites findings from Drazen et al. 1993,[iv] which compared 20 children who had graduated from PAT to a similar group of 20 children who had not participated in PAT. In addition to having a very small sample, this study was not an RCT (i.e., the children and their families were not randomly assigned to the PAT or a control group). Instead, this was a comparison-group study that could easily overstate PAT’s effects because the children in the PAT group came from families that, in most cases,[v] had volunteered for PAT and had all graduated from the program, indicating a certain degree of family motivation and functional capacity. By contrast, the children in the comparison group came from families that did not similarly volunteer or graduate, suggesting a lower average level of family motivation/functionality. The study cannot rule out the possibility this difference in family characteristics, rather than the program, accounts for the superior outcomes observed for the PAT group.
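To make the self-selection concern concrete, here is a small simulation in Python. It is a stylized sketch with entirely hypothetical numbers, not a model of the Drazen et al. data: we assume a single "motivation" trait that raises child outcomes, a program whose true effect is exactly zero, and, in the self-selected scenario, that only above-average-motivation families enroll and graduate.

```python
import random
from statistics import mean

def simulate_gap(randomized, n_families=10_000, true_effect=0.0, seed=0):
    """Return the program-minus-comparison difference in mean outcomes.

    All quantities are hypothetical. Outcomes depend on family motivation
    plus noise; the program itself adds `true_effect` (zero by default).
    """
    rng = random.Random(seed)
    program, comparison = [], []
    for _ in range(n_families):
        motivation = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        if randomized:
            in_program = rng.random() < 0.5   # coin-flip assignment (RCT)
        else:
            in_program = motivation > 0       # motivated families self-select
        outcome = motivation + noise + (true_effect if in_program else 0.0)
        (program if in_program else comparison).append(outcome)
    return mean(program) - mean(comparison)

self_selected_gap = simulate_gap(randomized=False)
randomized_gap = simulate_gap(randomized=True)

print(f"Gap with self-selection:   {self_selected_gap:.2f}")
print(f"Gap with random assignment: {randomized_gap:.2f}")
```

In this sketch the self-selected comparison shows a large gap favoring the program group even though the program does nothing, while random assignment yields a gap near zero. That is the reason a comparison-group design of this kind cannot separate program effects from differences in family characteristics.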

For the benefit-cost finding, the HHS report cites only an analysis from the Washington State Institute for Public Policy (WSIPP), which shows PAT’s lifetime benefits exceeding costs by 244 percent. When we looked up the WSIPP analysis, we found that it cited five studies as the basis for the analysis: two RCTs that, as we discuss above, found no meaningful effects (Wagner et al. 1999 and Drotar et al. 2009); a third RCT that found no meaningful effects;[vi] and two other studies with significant flaws (a comparison-group study that the authors themselves describe as having “a weak, quasi-experimental design”[vii] and an RCT with sample attrition of about 60 percent[viii]). WSIPP apparently recognized that its benefit-cost estimate was flawed and, in August 2017, published an updated analysis. WSIPP’s new analysis shows an unfavorable benefit-cost ratio for PAT, with program costs exceeding lifetime benefits by $3,926 per participant.

Our bottom line: The HHS report erroneously shows program effectiveness by (i) overlooking credible disappointing findings; and (ii) focusing instead on findings from much weaker studies that are likely to be overstated. Unfortunately, we believe this is not an isolated example but rather an all-too-common practice in government and nonprofit evidence review initiatives. Our next two Straight Talk reports will examine variations on this problem in other well-known evidence review efforts.

Why does this matter? In the case of home visiting, there are program models with credible evidence of important effects on participants’ lives, such as the Nurse-Family Partnership and Child First. But if HHS labels programs with weak or disappointing findings as evidence based, the Department makes it very difficult for state and local officials charged with implementing home visiting programs to distinguish the subset of program models with true evidence of effectiveness from all the others. And HHS gives programs like PAT no incentive to revise and re-test their program model with the aim of generating true evidence of impact.

Response provided by Charles Michalopoulos, lead author of the HHS report

The Straight Talk review makes some excellent points about our recent brief on evidence of long-term effects of home visiting programs, and we generally agree. But it’s important to understand the brief’s purpose and how the findings will be used.

We are designing a long-term follow-up study of the federal Maternal, Infant, and Early Childhood Home Visiting (MIECHV) program, which has provided more than $2 billion to states to fund home visiting programs since 2010. The evaluation includes four national models of home visiting: Early Head Start-Home-based option (EHS), Healthy Families America (HFA), Nurse-Family Partnership (NFP), and Parents as Teachers (PAT). These models are included in the study because that’s where states invested most of their MIECHV funds when the evaluation began in 2011.

We compiled the evidence included in the brief to help us decide where to focus the new follow-up study. Prior evidence can suggest where we are most likely to find effects, where there are gaps in knowledge, and what we should collect to do a strong benefit-cost analysis.

The first part of the brief included studies rated moderate or high quality by an HHS-commissioned evidence review and that examined effects of the four models for families with children age 5 and older. The earliest follow-up timepoint we are considering for our new study is kindergarten; therefore, we did not include the high-quality PAT studies mentioned in the Straight Talk review since they use data collected prior to kindergarten. The Straight Talk review rightly raises concerns about the PAT analysis we did include, but our conclusions are not based on any single study or model. For example, results for PAT are consistent with larger-scale studies for the other three models, which suggest that our long-term cross-model study should examine child development and school performance when children are in kindergarten or early elementary school.

We agree that NFP has the strongest record of sustained impacts, but that is because there is little evidence on the long-term effects of the other models. NFP has followed children through age 21, but PAT has no studies that met our criteria past age 5 and HFA has none past age 9. We simply don’t know whether those models benefit families when children are 10, 15, or 20 years old.

We also agree that the benefit-cost findings are most credible for NFP because NFP has collected more data on which a strong benefit-cost analysis can be based. The brief notes that the short follow-up periods for some studies introduce considerable uncertainty in projected benefit-cost ratios. For our design, however, the main lessons are that the benefits of home visiting may accrue over a long time and can come from outcomes such as public assistance and earnings.

We’re excited about the study we’re planning. It presents an opportunity to study multiple models at scale on an equal footing, both regarding how long families are followed and the outcomes that are examined. There are substantial gaps in past research that our new study can address.

Rejoinder by the LJAF Evidence-Based Policy team

We appreciate the lead author’s thoughtful response, and agree that HHS’s review of the existing evidence can be valuable for generating hypotheses to inform the design of the long-term follow-up study of the federal home visiting program (MIECHV). We note, however, that the HHS report goes beyond identifying hypotheses. The report makes claims about the long-term effectiveness of home visiting program models that, as the author’s response acknowledges for the PAT program, are not supported by credible evidence. These claims include:

Page 2: “… all four national models included in this summary [i.e., PAT, NFP, EHS, and HFA] have had sustained effects as children get older.”

Page 3: “All four evidence-based models have found improvements in child development and school performance among children up to age 7.”

Page 9: “For Parents as Teachers, the best estimate is that lifetime benefits exceed costs by 244 percent, and there is a 67 percent chance that benefits exceed costs.”

A straightforward reading of the report, in other words, would lead a reader to believe that existing evidence supports PAT’s long-term effectiveness when in fact that is not true.

As to the HHS report’s omission of the two high-quality RCTs showing no meaningful effects for PAT, the author’s response explains that these studies measured outcomes at age 3, and so fell outside the report’s focus on outcomes in kindergarten and beyond. The response goes on to state, “We simply don’t know whether those models [including PAT] benefit families when children are 10, 15, or 20 years old.” It is true we don’t know for certain, but the fact that two high-quality RCTs of PAT found no meaningful effects at age 3, when the families in these studies had graduated from the program, makes it very unlikely that longer-term effects will materialize (based on the experience of social program RCTs over the past half-century). For that reason, we believe these studies are relevant to the report’s focus and should have been included as evidence that long-term effects are improbable.

We conclude by noting that we share the author’s enthusiasm about the ongoing study of MIECHV, and very much look forward to the findings.

References

[i] Michalopoulos, Charles, Kristen Faucetta, Anne Warren, and Robert Mitchell. Evidence on the Long-Term Effects of Home Visiting Programs: Laying the Groundwork for Long-Term Follow-Up in the Mother and Infant Home Visiting Program Evaluation (MIHOPE). OPRE Report 2017-73. Washington, DC: Office of Planning, Research and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.

[ii] Wagner, M., Clayton, S., Gerlach-Downie, S., & McElroy, M. (1999). An evaluation of the northern California Parents as Teachers demonstration. Menlo Park, CA: SRI International.

[iii] Drotar, D., Robinson, J., Jeavons, L., & Lester Kirchner, H. (2009). A randomized, controlled evaluation of early intervention: The Born to Learn curriculum. Child: Care, Health & Development, 35(5), 643–649.

[iv] Drazen, Shelly, and Mary Haust (1993). “Raising Reading Readiness in Low-Income Children by Parent Education.” American Psychological Association, August 23. Toronto, Ontario, Canada.

[v] Most of the families in the PAT group enrolled voluntarily, but six enrolled under duress—e.g., threat that their children would be placed in foster care if they did not enroll in a parenting program.

[vi] This is the Teen PAT RCT reported in Wagner, Mary and Serena Clayton (1999). “The Parents as Teachers Program: Results from Two Demonstrations.” The Future of Children 9(1): 91-115. Of the 19 effects that the study measured for the full sample, only one was statistically significant, and it favored the control group.

[vii] Pfannenstiel, J.C., & Seltzer, D.A. (1989). New parents as teachers: Evaluation of an early parent education program. Early Childhood Research Quarterly, 4(1), 1-18.

[viii] Wagner, M., Spiker, D. (with Hernandez, F., Song, J., & Gerlach-Downie, S.). (2001). Multisite Parents as Teachers evaluation: Experiences and outcomes for children and families (SRI Project P07283). Menlo Park, CA: SRI International.
