Estimating Treatment Effects with the Explanatory Item Response Model

How Much do Scoring Methods Matter for Causal Inference?

The way student outcomes are measured, scored, and analyzed often receives too little attention in randomized experiments. In this study, we explored the consequences of different scoring approaches for causal inference on test score data. We compared the performance of four methods: Classical Test Theory (CTT) sum scores, CTT mean scores, item response theory (IRT) scores, and the Explanatory Item Response Model (EIRM). In contrast to the CTT- and IRT-based approaches, which score the test and estimate treatment effects in two separate steps, the EIRM is a latent variable model that estimates student ability and the treatment effect simultaneously. The EIRM has a long history in psychometric research, but applications to empirical causal inference settings are rare. Our results show that which model performs best depends on the context.
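
The contrast between the two-step approaches and the EIRM can be written out as a sketch. It assumes a Rasch (1PL) measurement model, which the summary does not spell out, and every symbol is introduced here for illustration: Y_ij is the response of student j to item i, T_j is the treatment indicator, b_i is the item difficulty, and β_1 is the treatment effect.

```latex
% Two-step approach: score first, then regress the estimated score on treatment
\hat{\theta}_j = \alpha + \beta_1 T_j + e_j

% EIRM: measurement and treatment effect estimated in one model
\operatorname{logit} \Pr(Y_{ij} = 1) = \theta_j - b_i, \qquad
\theta_j = \beta_1 T_j + \varepsilon_j, \qquad
\varepsilon_j \sim \mathcal{N}(0, \sigma^2)
```

In the first line the score \hat{\theta}_j (a sum score, mean score, or IRT estimate) is treated as if it were observed. In the second, student ability stays latent and β_1 is estimated directly from the item responses.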

How to Read this Chart: Statistical power (y-axis) by missing item response rate (x-axis) and estimation method (color and shape) shows that the relative performance of each approach depends on the context. The EIRM and IRT-based scores are more robust to missing data and provide the most benefits to power when the latent trait is heteroskedastic. Legend: skew = latent trait is skewed, het = latent trait is heteroskedastic, mar = item responses are missing at random, sum = sum score, mean = mean score, 1PL = IRT theta score, EIRM = explanatory item response model.

Comparative Model Performance

Reading Aloud to Children, Social Inequalities and Vocabulary Development: Evidence from a Randomized Controlled Trial

The shared book reading intervention

We designed a four-month intervention that combined a school-based book loan with information on the benefits of shared book reading (SBR) for children and tips for effective reading practices. We delivered this through weekly flyers, a short phone call, and six text messages sent to parents. The intervention aimed to foster children’s language skills by increasing the frequency and quality of parent-child interactions around books. To assess its impact, we ran a randomized experiment involving a large, random sample of 4-year-olds (N=1880) who attended 47 pre-primary schools in the city of Paris. This evaluation design marks a significant improvement over previous studies in that the results apply to a much larger population, thanks to our large sample size, sampling design, and high participation rates of schools and families.

Important features of the SBR

Three features of this intervention are especially important. First, it focuses on making the information messages accessible to families with low education and an immigrant background. Second, it has an intensive and sustained format, aimed at fostering a persistent change in parenting routines. Third, it focuses on parent-child interactions around books and on making this activity enjoyable for both parents and children.

The Impact of a Standards-based Grading Intervention on Ninth Graders’ Mathematics Learning

What is Standards-based Grading?

Typically, U.S. classroom teachers use some tests and assignments purely for summative purposes, recording scores indelibly to be used in a weighted average that determines final grades. In contrast, under a standards-based grading system the teacher uses such assessment evidence both to evaluate the extent to which a student is proficient in each of the course’s learning outcomes at that particular moment in time (summative assessment), and then to provide students with personalized feedback designed to guide further learning (formative assessment). A key feature of standards-based grading is that students are then given opportunities to do further work, at home or in school, and to be reassessed for full credit. In other words, summative assessments become formative tools designed to promote further learning, not just markers of how much students have learned already.

How did we conduct this study?

We conducted a cluster randomized controlled trial, recruiting 29 schools and randomly assigning 14 schools to a Treatment condition and 15 schools to a Control condition. Treatment schools implemented the standards-based grading program, called PARLO, in their ninth-grade algebra and geometry classrooms, and Control schools proceeded with business-as-usual. In our participating districts, instruction to learning standards and formative assessment were already common. Consequently, the PARLO program focused on implementing two necessary components of standards-based grading. The first was Mastery: students were rated as not-yet-proficient, proficient, or high-performance on each learning outcome, and final grades were computed using a formula based on the number of proficient and the number of high-performance learning outcomes. The second was Reassessment: after providing evidence that they had done further studying, any student could be reassessed for full credit on any learning outcome.
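
For readers less familiar with cluster randomization, the following is a minimal sketch of assigning whole schools, rather than individual students, to conditions. The counts (29 schools, 14 to Treatment) come from the description above. The school labels, the seed, and the function name are hypothetical, and the study's actual randomization procedure (for example, any blocking by district) is not described here.

```python
import random


def assign_clusters(school_ids, n_treatment, seed):
    """Randomly split whole schools (clusters) into treatment and control."""
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control


# Hypothetical school labels; only the counts 29 and 14 come from the text.
schools = [f"school_{i:02d}" for i in range(1, 30)]
treatment, control = assign_clusters(schools, n_treatment=14, seed=2024)
print(len(treatment), len(control))  # 14 15
```

Because every student in a school shares the same condition, the school (not the student) is the unit of assignment, which is what makes the trial a cluster design.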

Can nudging mentors weaken student support? Experimental evidence from a virtual communication intervention

Does reminding mentors to reach out support virtual communication?

Not necessarily. This study tested the effect of a low-cost, light-touch intervention (mentor reminders) designed to bolster virtual outreach. The study examined impacts on the frequency of student-mentor communication and subsequent impacts on mentoring relationship quality. Unexpectedly, although students did not report receiving more (or less) outreach from mentors who received reminders than from mentors who did not, treated mentors reported that their students initiated less outreach. In addition, students of treated mentors reported that they were less likely to respond to messages they received from their mentors.


The Uncertain Role of Educational Software in Remediating Student Learning

What is the potential of educational software for remediation?

Educators must balance the need to remediate students who are performing behind grade level with their obligation to teach grade-appropriate content to all. Educational software programs could help them strike this balance by incorporating below-grade-level content into an existing curriculum, allowing students to learn at their own pace while remaining in the same classroom as their peers. If effective, this practice could save school systems the high costs of more intensive remedial interventions like high-dosage tutoring, summer school, extra coursework, and grade retention.

How did this study examine the effectiveness of educational software for remediating below-grade-level students?

This study estimates the causal effects of providing low-performing students in grades 3-6 with below-grade-level math content via an online software program. Students who scored below a designated cutoff on a prior-year math assessment were assigned a modified version of the software program. The modified software included below-grade-level content before the grade-level material. Students who scored above the cutoff received only the grade-level curriculum. We examined whether receiving the modified curriculum affected students’ completion of grade-level learning objectives, pre- and post-objective quiz scores, and math test scores.

Supporting Teachers in Argument Writing Instruction at Scale: A Replication Study of the College, Career, and Community Writers Program (C3WP)

This large-scale randomized experiment found that the National Writing Project’s (NWP’s) College, Career, and Community Writers Program (C3WP) improved secondary students’ ability to write arguments drawing from nonfiction texts.

What impacts did C3WP have on student achievement?

The study team collected and scored student writing from an on-demand argument writing task similar to those in some state assessments. At the end of the year, students in C3WP districts outscored students in comparison districts by about 0.24 on a 1- to 6-point scale on each of the four measured attributes (see graph). On average, these effects are equivalent to moving a student from the 50th percentile of achievement to the 58th percentile of achievement.
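
The percentile statement can be related to a standardized effect size with a short calculation. This sketch assumes scores are normally distributed. The summary does not report the effect in standard deviation units, so the value printed below is implied by the quoted percentiles under that assumption, not a figure from the study. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def implied_effect_size(start_percentile, end_percentile):
    """Shift in standard deviations that moves a student between percentiles."""
    z_start = STANDARD_NORMAL.inv_cdf(start_percentile / 100)
    z_end = STANDARD_NORMAL.inv_cdf(end_percentile / 100)
    return z_end - z_start


def percentile_after_shift(effect_size_sd, start_percentile=50.0):
    """Percentile reached after shifting a student by `effect_size_sd` SDs."""
    z_start = STANDARD_NORMAL.inv_cdf(start_percentile / 100)
    return 100 * STANDARD_NORMAL.cdf(z_start + effect_size_sd)


print(round(implied_effect_size(50, 58), 2))  # 0.2
```

Under the normality assumption, a move from the 50th to the 58th percentile corresponds to roughly one fifth of a standard deviation.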

Which Students Benefit from Computer-Based Individualized Instruction? Experimental Evidence from Public Schools in India

Does computer-based individualized instruction boost math learning?

Yes. In public schools in Rajasthan, India, students who scored in the bottom 25% of their class improved by 22% of a standard deviation in math test scores (top chart). However, the average student in grades 6-8 who had access to individualized instruction did not outperform those who did not over nine months. Our results suggest that computer-based individualized instruction is most beneficial for low performers.

What is computer-based individualized instruction?

We provided all students with a computer-adaptive math learning software called “Mindspark.” When students first log in, they take a diagnostic test, which identifies what they know and can do, and the areas in which they can improve. Then, the software presents them with exercises appropriate for their preparation level based on the diagnostic test. The difficulty and topic covered by subsequent exercises dynamically adjust to each student’s progress.

Effect of Active Learning Professional Development Training on College Student Outcomes

Is there an effect of participating in Active Learning Professional Development (ALPD) training on student performance?

Students who took a course with an ALPD instructor were three percentage points more likely to take additional classes in the same subject area compared to students who were taught by a non-participant. Students of non-participants persisted at a rate of about 68%, so a three percentage point increase represents a 5% improvement. Importantly, ALPD training is related to a higher likelihood of implementing active learning instructional practices in the classroom. We do not find any differences in students’ current course grade or performance in the next class.
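
The difference between a percentage-point gain and a relative improvement can be made concrete with the rounded figures quoted above. This is a minimal sketch with a hypothetical function name. With these rounded inputs the ratio comes to about 4.4%, in the neighborhood of the 5% reported. The gap presumably reflects rounding in the quoted figures.

```python
def relative_improvement(baseline_rate_pct, gain_pct_points):
    """Relative change implied by a percentage-point gain over a baseline rate."""
    return gain_pct_points / baseline_rate_pct


# Rounded inputs from the text: 68% baseline persistence, 3-point gain.
print(f"{relative_improvement(68, 3):.1%}")  # 4.4%
```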


How to read this chart: This figure shows that students who took a course with an ALPD-trained instructor were three percentage points more likely to take another course in the same field of study in the immediate next term (p<0.05). No clear difference in course grades was evident either in the ALPD-instructed course, or in the next course taken.

The Impact of a Virtual Coaching Program to Improve Instructional Alignment to State Standards

What is the virtual coaching program tested in this study?

Feedback on Alignment and Support for Teachers (FAST) is a virtual coaching program designed to help teachers better align their instruction to state standards and foster student learning. Key components of this 2-year program include collaborative meetings with grade-level teams, individual coaching sessions, instructional logs and video recordings of teachers’ own instruction, and models of aligned instruction provided by an online library of instructional resources. During the collaborative meetings and coaching sessions, teachers and coaches use the logs, video recordings, and models of aligned instruction to discuss ways of improving alignment of their instruction to state standards. Teachers were expected to complete 5 collaborative meetings, 5 individual coaching sessions, 5 video recordings of their instruction, and 5 instructional logs per year.

How did we assess the impact of the virtual coaching program?

We assessed the impact of the FAST program on teachers’ instructional alignment and students’ achievement through a multisite school-level randomized controlled trial, which took place in 56 elementary schools spanning five districts and three states. We randomly assigned 29 of the 56 schools to the treatment group and 27 to the control group. The study focused on Grade 4 math and Grade 5 English language arts (ELA) and used the respective state test scores as student achievement outcomes. We used an instructional survey to measure teachers’ instructional alignment. Teacher attendance, FAST coaching logs, teachers’ instructional logs, and video recordings of teachers’ instruction were collected to describe the implementation of the FAST program.

What did we find?

ICUE Intervention Improves Children’s Understanding of Mathematical Equivalence

Jodi L. Davenport, Yvonne Kao, Kristen Johannes, Caroline Byrd Hornburg, and Nicole M. McNeil

Does the ICUE intervention improve math learning?

Yes, second grade students in classrooms using the Improving Children’s Understanding of Equivalence (ICUE) materials and lessons scored higher on measures related to mathematical equivalence, including equation solving and conceptual problem solving. These higher scores came with no observable trade-offs in computational fluency.

A Cautionary Tale of Tutoring Hard-to-Reach Students in Kenya

Beth Schueler, Daniel Rodriguez-Segura

What was this study about?

Covid-19 school closures have generated significant interest in tutoring to make up for lost learning time. Tutoring is backed by rigorous research, but it is unclear whether it can be delivered effectively remotely. We study the effect of teacher-student phone calls in Kenya when schools were closed. Schools (j=105) serving 3rd, 5th and 6th graders (n=8,319) were randomly assigned to one of two versions of a 7-week weekly math intervention (5-minute accountability checks or 15-minute mini-tutoring sessions) or to the control group.

How Do the Impacts of Healthcare Training Vary with Credential Length? Evidence from the Health Profession Opportunity Grants Program

Daniel Litwok, Laura R. Peck, and Douglas Walton

How do the earnings impacts of healthcare training vary?

This article explores how earnings impacts vary in an experimental evaluation of a sectoral job training program. We find that over the first two years in the study, those who completed long-term credentials (defined as college degrees or certificates that require a year or more of classes to earn) had program impacts that were about $2,000 larger per year than those who did not complete long-term credentials (whether they completed a short-term credential or no credential at all). A possible explanation for this finding is that those who earned a long-term credential had different experiences in the program, including more engagement with support services, and different post-program outcomes, such as greater employment in high-wage healthcare occupations like registered nurse.

Effects of Cross-Age Peer Mentoring Program Within a Randomized Controlled Trial

Eric Jenner, Katherine Lass, Sarah Walsh, Hilary Demby, Rebekah Leger, and Gretchen Falk

How does a cross-age peer mentoring program affect ninth-grade outcomes?

Ninth-grade students who were offered Peer Group Connection High School (PGC-HS) were less likely to receive a suspension or disciplinary referral and self-reported higher levels of school engagement and postsecondary expectations. However, offering the program had no effect on academics (credit attainment, attendance at school, GPA) and other non-cognitive skills (e.g., decision-making skills).

Examining the Impact of a First Grade Whole Number Intervention by Group Size

Ben Clarke, Christian Doabler, Marah Sutherland, Derek Kosty, Jessica Turtura, and Keith Smolkowski

The importance of early mathematics

A successful start to learning mathematics has been a national priority for several decades. Mounting evidence indicates that trajectories of mathematics performance are established early and remain relatively stable across time. This may in part be due to substantial disparities in young students’ access to early mathematics experiences and instruction, with preschool-aged students from upper- and middle-class backgrounds already outperforming their economically disadvantaged peers.

Raising Teacher Retention in Online Courses through Personalized Support. Evidence from a Cross-national Randomized Controlled Trial

Davide Azzolini, Sonia Marzadro, Enrico Rettore, Katja Engelhardt, Benjamin Hertz, Patricia Wastiau

Does providing teachers with personalized support help them complete online training courses?

Yes, but not for all and not everywhere. The TeachUP policy experimentation found large effects of personalized support on course completion in nine European Union Member States among professional (i.e., in-service) teachers (+10.6 percentage points), but not among student teachers. Moreover, no effects are found in Turkey. More studies are needed to investigate the contextual and learner characteristics that drive the heterogeneous effects.

Using a Factorial Design to Maximize the Effectiveness of a Parental Text Messaging Intervention

Catherine Armstrong Asher, Ethan Scherer, James S. Kim

What features of text messaging campaigns for early elementary families might increase their effectiveness?

Text messaging interventions are an increasingly popular way to support students and their families. We compared how three features of text messages, sent to parents, affect the reading behavior and test scores of their early elementary school children:

Evaluation of a state-wide mathematics support program for at-risk students in Grades 1 and 2 in Germany

Ann-Katrin van den Ham and Aiso Heinze

Is an early mathematics support program based on formative assessment effective?

Yes, it is, according to a study conducted with 135 elementary school classes from 40 schools in Germany. The study shows that students at risk for mathematical difficulties benefited from the two-year "Mathe macht stark (MMS) - Grundschule" (Maths makes you strong - primary school) implementation in Grades 1 and 2. This effect is maintained one year after the intervention ends and without providing Grade 3 formative assessment material. Moreover, students not at risk for mathematical difficulties also benefited from the program, despite not being its target. Hence, the formative assessment elements the teachers used in the mathematics classrooms for at-risk students were also beneficial for the other students. Interestingly, an enhanced version of the program, which included two extra teacher working hours per week, did not add value for at-risk students in the follow-up test at the end of Grade 3.

Does Principal Professional Development Improve Schooling Outcomes? Evidence from Pennsylvania’s Inspired Leadership Induction Program

Matthew P. Steinberg and Haisheng Yang

Is principal induction effective at raising student achievement?

Yes, according to a study of Pennsylvania’s Inspired Leadership (PIL) induction program. In schools where principals completed the PIL induction program, teachers became more effective, resulting in modest improvements in student achievement of approximately 1-2 weeks of additional schooling. These benefits were concentrated in schools that served the most economically disadvantaged and minority students in urban districts.

KIPP Middle Schools Increase Students’ College Enrollment Rates

Ira Nichols-Barrer, Maria Bartlett, Thomas Coen, & Phil Gleason

Do KIPP Middle Schools Boost Long Run Student Outcomes?

Yes, they do, according to a rigorous national study of 13 KIPP middle schools. Building on prior studies showing that KIPP middle schools have strong positive effects on students’ middle school achievement, this study found that KIPP middle schools also improve longer-term rates of enrollment in four-year college programs. Winning a lottery-based admissions offer to a KIPP middle school increased a student’s probability of enrolling in college by 7 percentage points, even though a third of these students never enrolled at KIPP. Adjusting for enrollment, attending KIPP increased college enrollment rates by 13 percentage points. This boost is similar in size to nationwide disparities in college enrollment across racial groups, a relevant benchmark since nearly all KIPP students are Black or Latinx.
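
The step from the effect of winning an offer to the effect of attending can be illustrated with the usual ratio adjustment: divide the offer effect by the difference in KIPP enrollment rates between lottery winners and lottery losers. The sketch below is an assumption about the general technique, not the study's actual estimator, and the function name is hypothetical. The summary gives the winners' enrollment rate (about two-thirds) but not the losers', so the default of zero is a placeholder.

```python
def effect_of_attending(offer_effect, enroll_rate_winners, enroll_rate_losers=0.0):
    """Scale an offer (intent-to-treat) effect up to an effect of attending.

    Divides by the gap in enrollment rates between lottery winners and
    lottery losers. `enroll_rate_losers` defaults to 0.0 only as a
    placeholder; the summary does not report it.
    """
    enrollment_gap = enroll_rate_winners - enroll_rate_losers
    if enrollment_gap <= 0:
        raise ValueError("winners must enroll at a higher rate than losers")
    return offer_effect / enrollment_gap


# Figures from the text: 7-point offer effect, two-thirds of winners enrolled.
print(round(effect_of_attending(7, 2 / 3), 1))  # 10.5

# Enrollment gap that would reproduce the reported 13-point effect.
print(round(7 / 13, 2))  # 0.54
```

With no enrollment among lottery losers the adjustment gives 10.5 points. The reported 13 points corresponds to an enrollment gap of about 0.54, which would be consistent with some lottery losers also enrolling at KIPP. That reading is an inference from the quoted figures rather than something stated in the summary.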

We Have Skills, Effective and Efficient Social Skills Instruction for Early Elementary

Keith Smolkowski, Hill Walker, Brion Marquez, Derek Kosty, Claudia Vincent, Carey Black, Gulcan Cil, & Lisa A. Strycker

Can Social Skills Instruction be Efficient and Effective?

Yes. A rigorous study shows that the We Have Skills program efficiently and effectively taught the academically related social skills needed for early elementary students to succeed in school. We Have Skills appealed to children, and teachers quickly mastered and readily implemented the program in their classrooms.
