Wednesday February 28, 12:00 PM - 3:00 PM
Workshop A
Stan: A Flexible Open-Source Platform for Bayesian Analysis

Andrew Gelman, Columbia University

Bayesian data analysis offers many advantages, most notably the ability to include prior information and partially pooled data from multiple sources via hierarchical modeling, and an automatic accounting for uncertainty by which inferences may be piped directly into decision analysis. We shall briefly explore the benefits of Bayesian analysis and then demonstrate and explain how Bayesian models may be fit using Stan, an open-source package written in C++ and runnable from R, Stata, Python, and other environments. We will then address questions and discuss the utility of Stan and Bayesian inference and computation more generally.

Wednesday February 28, 12:00 PM - 3:00 PM
Workshop B
The Stanford Education Data Archive: Using Big Data to Study Academic Performance

Sean F. Reardon, Stanford University
Andrew D. Ho, Harvard University
Benjamin R. Shear, University of Colorado - Boulder
Erin M. Fahle, Stanford University


The Stanford Education Data Archive (SEDA) is a publicly available dataset based on roughly 300 million standardized test scores from students in U.S. public schools. SEDA now contains average test scores by grade (grades 3-8), year (2009-2015), subject (math and reading/language arts), and subgroup (gender and race/ethnicity) for all school districts in the U.S. Scores from different states, grades, and years are linked to a common national scale, allowing comparisons of student performance across states and over time. SEDA was constructed by Sean Reardon and Andrew Ho.

This workshop will provide a detailed description of SEDA’s contents and construction. It will include a description of how test scores are linked to a common scale, the sources and magnitude of uncertainty in the estimates, and appropriate use in descriptive and causal research. The workshop will include code, activities, and examples using Stata and R. Participants should bring a laptop with R or Stata, or be prepared to work from raw data using their preferred statistical program.
More information about SEDA is available at

Wednesday February 28, 12:00 PM - 3:00 PM
Workshop C
Practical Measurement in Improvement Science
Sola Takahashi, WestEd
Jonathan Dolle, WestEd


In the day-to-day work of educational systems, educators and administrators require data to help them determine if program changes they introduce are leading to improvements. Data designed and collected for accountability or research do not typically have the specificity, frequency, or contextual information necessary to routinely guide improvement efforts occurring in classrooms, schools, or districts. Practical measurement is an approach to collecting, analyzing, and using data to inform such efforts.

In this workshop, we will explore the central principles of improvement science, a continuous improvement method that relies on diverse actors including educators and administrators. Participants thinking about applied work in education will learn to articulate a shared aim and a theory for achieving that aim, and to learn from and refine that theory of improvement through successive cycles of measurement and testing. Through illustrative cases, we will discuss how practical measures are developed and used in the application of improvement science methods.

Wednesday February 28, 12:00 PM - 3:00 PM
Workshop D
New Matching Methods for Causal Inference

Jose R. Zubizarreta, Harvard University

In observational studies of causal effects, matching methods are extensively used to approximate the ideal study that would be conducted if controlled experimentation were possible. In this workshop, we will explore new advancements in matching methods that overcome three limitations of standard matching approaches. These methods (1) directly obtain flexible forms of covariate balance, ranging from mean balance to balance of entire joint distributions; (2) produce self-weighting matched samples that are representative of target populations by design; and (3) handle multiple treatment doses without resorting to a generalization of the propensity score, instead balancing the original covariates. We will discuss extensions to matching with instrumental variables, in discontinuity designs, and for matching before randomization in experiments. The methods discussed build upon recent advancements in computation and optimization for big data. We will use the statistical software package 'designmatch' for R.

Participants will gain a clear picture of the role of matching in causal inference, and of its pros and cons. They will learn how to construct balanced and representative matched samples, improving on each aspect in relation to traditional matching methods on the estimated propensity score. The target audience of the workshop is applied researchers with quantitative training and familiarity with traditional regression methods. Facility with R is ideal, but not strictly necessary, as well-documented step-by-step code will be provided.
