Filtered by tag: Measurement

Design and Analytic Features for Reducing Biases in Skill-Building Intervention Impact Forecasts

Daniela Alvarez-Vargas, Sirui Wan, Lynn S. Fuchs, Alice Klein, & Drew H. Bailey

Despite their policy relevance, long-term evaluations of educational interventions are rare relative to the number of end-of-treatment evaluations. A common approach to this problem is to use statistical models to forecast the long-term effects of an intervention based on the estimated shorter-term effects. Such forecasts typically rely on the correlation between children’s early skills (e.g., preschool numeracy) and medium-term outcomes (e.g., 1st grade math achievement), calculated from longitudinal data available outside the evaluation. This approach sometimes over- or under-predicts the longer-term effects of early academic interventions, raising concerns about how best to forecast the long-term effects of such interventions. The present paper provides a methodological approach to assessing the types of research design and analysis specifications that may reduce biases in such forecasts.
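The core forecasting logic can be sketched in a few lines. This is a minimal illustration, not the paper's method: it assumes a standardized end-of-treatment effect and a single benchmark persistence coefficient (e.g., the standardized slope of 1st-grade math on preschool numeracy from external longitudinal data), and all numbers are invented.

```python
# Sketch of forecasting a long-term effect from an end-of-treatment effect.
# The persistence coefficient stands in for a correlation/standardized slope
# estimated from longitudinal data outside the evaluation. Numbers are illustrative.

def forecast_long_term_effect(short_term_effect_sd, persistence_coef):
    """Project an end-of-treatment effect (in SD units) forward by
    multiplying it by a benchmark persistence coefficient."""
    return short_term_effect_sd * persistence_coef

# Example: a 0.30 SD end-of-treatment effect and a 0.50 persistence coefficient
projected = forecast_long_term_effect(0.30, 0.50)
print(projected)  # 0.15 SD projected medium-term effect
```

The bias the paper is concerned with arises precisely when this multiplication is wrong, i.e., when treatment-induced gains persist more or less than the naturally occurring skill differences used to estimate the coefficient.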

What did we do?

A Framework for Addressing Instrumentation Biases When Using Observation Systems as Outcome Measures in Instructional Interventions

Mark White, Bridget Maher, Brian Rowan

Many educational interventions seek to directly shift instructional practice. Observation systems are used to measure changes in instructional practice resulting from such interventions. However, the complexity of observation systems creates the risk of instrumentation bias: bias resulting from changes in how an instrument functions across conditions (e.g., from pre-test to post-test, or between control and intervention conditions). For example, teachers could intentionally show off intervention-specific practices whenever they are observed, but not otherwise use those practices. Alternatively, an instructional intervention could shift instruction in ways that increase observation scores without impacting the underlying instructional dynamics that support student learning.

This conceptual paper with a case study exemplar provides a validity framework for using observation systems to evaluate the impact of interventions. Inferences about an intervention’s impact generally involve determining whether a teaching practice has changed within some setting. Observation scores, the evidence for these conclusions, are specific raters’ views of how a rubric would describe observed lessons. The conclusions are far more generalized than the observation scores. The framework (see Figure below) systematically breaks down the processes necessary to operationalize an aspect of teaching practice and sample from a setting to obtain observation scores that can be generalized to draw conclusions.

How to measure quality of delivery: Focus on teaching practices that help students to develop proximal outcomes

Diego Catalán Molina, Tenelle Porter, Catherine Oberle, Misha Haghighat, Afiya Fredericks, Kristen Budd, Sylvia Roberts, Lisa Blackwell, and Kali H. Trzesniewski

How much students benefit from a school intervention depends on how well the intervention is delivered

When a new curriculum is introduced at a school, the quality of its implementation will vary across teachers. Does this matter? In this study, teachers varied widely in how well they implemented a 20-lesson social and emotional blended-learning curriculum. Teachers who delivered the program at higher quality, for example, encouraged student reflection and participation and provided feedback to students on how to improve skills. Their students ended the program with higher levels of motivation (growth mindset, effort beliefs, and learning goals) than students of teachers who delivered at lower quality.

Modeling and Comparing Seasonal Trends in Interim Achievement Data

James Soland & Yeow Meng Thum

Introduction

Interim achievement tests are often used to monitor student and school performance over time. Unlike end-of-year achievement tests used for accountability, interim tests are administered multiple times per year (e.g., Fall, Winter, and Spring) and vary across schools in terms of when in the school year students take them. As a result, scores reflect seasonal patterns in achievement, including summer learning loss. Despite the prevalence of interim tests, few statistical models are designed to answer questions commonly asked with interim test data (e.g., Do students whose achievement grows the most over several years tend to experience below-average summer loss?). In this study, we compare the properties of three growth models that can be used to examine interim test data.
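One simple way to see how seasonal patterns can be separated is a piecewise growth model with distinct school-year and summer slopes, fit using two "clocks" (cumulative months in school and cumulative summer months). The sketch below is illustrative only, with invented scores for a single student, and does not reproduce the authors' three models.

```python
import numpy as np

# Piecewise growth sketch:
#   score = intercept + b_school * school_months + b_summer * summer_months
# Invented Fall/Spring scores for one student across two school years.
school_months = np.array([0, 9, 9, 18])   # cumulative months of school exposure
summer_months = np.array([0, 0, 3, 3])    # cumulative months of summer elapsed
scores        = np.array([200., 218., 215., 233.])

# Ordinary least squares on the two clocks plus an intercept.
X = np.column_stack([np.ones(4), school_months, summer_months])
intercept, b_school, b_summer = np.linalg.lstsq(X, scores, rcond=None)[0]
print(round(b_school, 2), round(b_summer, 2))  # prints 2.0 -1.0
```

Here the student gains 2 score points per school month and loses 1 point per summer month, i.e., the same design matrix trick lets the model describe both within-year growth and summer loss.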

Performance Evaluations as a Measure of Teacher Effectiveness When Implementation Differs

James Cowan, Dan Goldhaber, Roddy Theobald

Overview

We use statewide data from Massachusetts to investigate the school role in teacher evaluation. Schools classify most teachers as proficient but differ substantially in how frequently they assign other ratings. We show these patterns are driven by differences in the application of standards across schools, not by differences in the distribution of teacher quality.

Gather-Narrow-Extract: A Framework for Studying Local Policy Variation Using Web-Scraping and Natural Language Processing

Kylie L. Anglin

Many education policy decisions are made at the local level. School districts make policies regarding hiring, resource allocation, and day-to-day operations. However, collecting data on local policy decisions has traditionally been expensive and time-consuming, sometimes leading researchers to leave important research questions unanswered.

This paper presents a framework for efficiently identifying and processing local policy documents posted online – documents like staff manuals, union contracts, and school improvement plans – using web-scraping and natural language processing.
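The narrow and extract steps can be sketched with keyword filtering and a regular expression over already-gathered document text. This is a toy illustration, not the paper's pipeline: the gather (web-scraping) step is omitted, and the district documents and the signing-bonus policy below are entirely invented.

```python
import re

# Invented documents standing in for scraped district web pages.
documents = {
    "district_a_manual.txt": "Staff handbook. New teachers receive a $3,000 signing bonus.",
    "district_b_minutes.txt": "Board minutes discussing bus routes and lunch schedules.",
    "district_c_contract.txt": "Union contract: a signing bonus of $1,500 for hard-to-staff roles.",
}

# Narrow: keep only documents mentioning the policy of interest.
relevant = {name: text for name, text in documents.items()
            if "signing bonus" in text.lower()}

# Extract: pull the dollar amount with a regular expression.
bonus_amounts = {name: re.search(r"\$([\d,]+)", text).group(1).replace(",", "")
                 for name, text in relevant.items()}
print(bonus_amounts)  # {'district_a_manual.txt': '3000', 'district_c_contract.txt': '1500'}
```

In practice the narrowing step would use richer natural language processing than a keyword match, but the division of labor (cheaply discard irrelevant pages, then extract structured policy variables from the remainder) is the same.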

Mitigating Illusory Results through Preregistration in Education

Summary by: Claire Chuter

Good researchers thoroughly analyze their data, right? Practices like testing the right covariates, running analyses in multiple ways to find the best-fitting model, screening for outliers, and testing for mediation or moderation effects are indeed important practices… but with a massive caveat. The aggregation of many of these rigorous research practices (as well as some more dubious ones) can lead to what the authors call “illusory results” – results that seem real but are unlikely to be reproduced. In other words, implementation of these common practices (see Figure 1 in the article) often leads researchers to run multiple analytic tests, which may unwittingly inflate their chances of stumbling upon a significant finding by chance.
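The inflation is easy to quantify. With k independent tests at significance level α and no true effects, the chance of at least one "significant" result is 1 − (1 − α)^k. A quick illustration (this textbook formula is not from the article itself):

```python
# Probability of at least one false positive across k independent
# null-effect tests, each conducted at alpha = 0.05.
alpha = 0.05
for k in (1, 5, 10, 20):
    p_any = 1 - (1 - alpha) ** k
    print(k, round(p_any, 3))
# 1  -> 0.05
# 20 -> 0.642
```

Twenty analytic looks at null data yield a roughly 64% chance of at least one spurious "finding", which is why preregistering the analysis plan, rather than abandoning these practices, is the proposed remedy.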

Potential Solutions

The Methodological Challenges of Measuring Institutional Value-added in Higher Education

Tatiana Melguizo, Gema Zamarro, Tatiana Velasco, and Fabio J. Sanchez

Assessing the quality of higher education is hard, but there is growing pressure on governments to create a ranking system for institutions that can be used for assessment and funding allocations. Such a system, however, would require a reliable methodology to fairly assess colleges using a wide variety of indicators. Centralized governance structures in some countries have motivated researchers to develop “value-added” metrics of colleges’ contributions to student outcomes that can be used for summative assessment (Coates, 2009; Melguizo & Wainer, 2016; Shavelson et al., 2016). Estimating the “value-added” of colleges and programs, however, is methodologically challenging: first, high- and low-achieving students tend to self-select into different colleges, a behavior that, if not accounted for, may yield estimates that capture students’ prior achievement rather than colleges’ effectiveness at raising achievement; second, measures of gains in student learning outcomes (SLOs) at the higher-education level are scant. In our paper, we study these challenges and compare the methods used for obtaining value-added metrics in the context of higher education in Colombia.
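The self-selection problem can be made concrete with a small simulation. This is a generic sketch of the idea, not the paper's estimation strategy: the data are simulated, and the adjustment is a plain regression on an entry score.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulate selection: college 1 admits higher-achieving entrants.
college = rng.integers(0, 2, n)                   # college 0 or college 1
entry = rng.normal(0, 1, n) + 1.0 * college       # prior achievement differs by college
true_va = 0.2                                     # college 1's true value-added
exit_score = 0.8 * entry + true_va * college + rng.normal(0, 0.5, n)

# Naive comparison of exit scores vs. a regression that controls for entry scores.
naive = exit_score[college == 1].mean() - exit_score[college == 0].mean()
X = np.column_stack([np.ones(n), entry, college])
beta = np.linalg.lstsq(X, exit_score, rcond=None)[0]
print(round(naive, 2), round(beta[2], 2))  # naive gap ~1.0, adjusted ~0.2
```

The naive gap mostly reflects who enrolls where; only after conditioning on prior achievement does the estimate approach the college's actual contribution, which is the logic behind value-added modeling.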

How can value-added models best be estimated in higher education?

Latent Profiles of Reading and Language and Their Association with Standardized Reading Outcomes in K-10th Grade

Barbara R Foorman, Yaacov Petscher, Christopher Stanley, & Adrea Truckenmiller

Differentiated instruction involves tailoring instruction to individual students’ learning needs. While critical to effective teaching, an understudied first step in differentiated instruction is understanding students’ learning profiles – that is, their strengths and weaknesses in knowledge and skills. It is only after a student’s learning profile is understood that a teacher can individualize instruction. But how can educators best measure learning profiles to facilitate differentiated instruction?

Descriptive approaches such as informal reading inventories lack the psychometric rigor required for classification, placement, and monitoring growth. However, quantitative approaches that classify and cluster (i.e., group) students by skill classes, and validate the clusters by relating them to standardized tests, are a reliable tool for creating profiles. The objective of this study was twofold: first, to determine the profiles of reading and language skills that characterized 7,752 students in kindergarten through 10th grade; second, to relate the profiles to standardized reading outcomes.
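The cluster-then-validate idea can be sketched in miniature. The study used latent profile analysis; as a stand-in, the toy example below runs plain k-means on two invented skill scores (decoding and vocabulary) and then compares a standardized reading outcome across the resulting profiles. Everything here, including the data, is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented skill scores for two clearly separated profiles:
# low decoding/low vocabulary vs. high decoding/high vocabulary.
low  = rng.normal([-1.0, -1.0], 0.2, size=(50, 2))
high = rng.normal([ 1.0,  1.0], 0.2, size=(50, 2))
skills = np.vstack([low, high])
reading_outcome = skills.mean(axis=1) + rng.normal(0, 0.1, 100)

# Plain k-means (k = 2) as a stand-in for latent profile analysis.
centers = skills[[0, 50]]                        # one seed point from each region
for _ in range(10):
    d = ((skills[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)                    # assign each student to nearest center
    centers = np.array([skills[labels == k].mean(axis=0) for k in range(2)])

# Validate: relate profiles to the standardized outcome.
means = [reading_outcome[labels == k].mean() for k in range(2)]
print(round(means[0], 2), round(means[1], 2))  # low profile near -1, high profile near 1
```

The validation step is the key move: profiles are only useful for placement and progress monitoring if they predict meaningful differences on an external standardized measure.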
