Wednesday, February 28, 12:00 PM - 3:00 PM
Workshop B
The Stanford Education Data Archive: Using Big Data to Study Academic Performance

Sean F. Reardon, Stanford University
Andrew D. Ho, Harvard University
Benjamin R. Shear, University of Colorado - Boulder
Erin M. Fahle, Stanford University


The Stanford Education Data Archive (SEDA) is a publicly available dataset based on roughly 300 million standardized test scores from students in U.S. public schools. SEDA now contains average test scores by grade (grades 3-8), year (2009-2015), subject (math and reading/language arts), and subgroup (gender and race/ethnicity) for all school districts in the U.S. Scores from different states, grades, and years are linked to a common national scale, allowing comparisons of student performance across states and over time. SEDA was constructed by Sean Reardon and Andrew Ho.

This workshop will provide a detailed description of SEDA’s contents and construction. It will include a description of how test scores are linked to a common scale, the sources and magnitude of uncertainty in the estimates, and appropriate use in descriptive and causal research. The workshop will include code, activities, and examples using Stata and R. Participants should bring a laptop with R or Stata, or be prepared to work from raw data using their preferred statistical program.
More information about SEDA is available at
