Many studies in education, human development, sociology, public health, and allied fields are longitudinal, multilevel, or both. In longitudinal studies, participants are observed repeatedly, which allows the assessment of growth in academic achievement or change in mental health status. Multilevel data arise because participants are clustered within social settings such as classrooms, schools, and neighborhoods. These settings often form a strict hierarchy, as when classrooms are nested within schools, which are in turn nested within districts. In other cases the settings form a cross-classified structure, as when schools draw students from multiple neighborhoods and neighborhoods send students to multiple schools. Nested and cross-classified structures call for different analytic approaches.

Data that are both longitudinal and multilevel arise in studies of school effects on student academic growth and of neighborhood and family effects on changes in mental health. In some cases the participants will migrate across social settings over time: children may experience a sequence of classrooms during a school year, and families may move to a new neighborhood. In other cases the participants will remain in place, but the character of a school or neighborhood may change. This short course will consider the issues of analysis and, to a limited extent, design that arise in longitudinal and multilevel research settings.

The starting point for our study will be the axiom that a statistical model represents a tentative conceptual model about the sources of variation in an outcome. The model is based on assumptions that must be made explicit and, when possible, verified. The model should reflect the measurement scale of key explanatory and outcome variables. The model guides not only the summary of quantitative evidence, but also the design of future research.

This short course in hierarchical linear models (HLM) will begin by considering two-level studies in which persons (level-1 units) are nested within organizations (level-2 units) such as schools. We will then consider two-level studies of individual change, in which time-series data (level-1) are viewed as nested within persons (level-2). The level-1 model specifies how an individual changes over time as a function of person-specific "micro-parameters." The level-2 model describes the population distribution of these micro-parameters as a function of macro-parameters.
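
As an illustration of the two-level model of individual change described above, one common way to write it is a linear growth model. The notation here is an assumption chosen for the sketch, not taken from the course materials: Y_{ti} is the outcome for person i at occasion t, a_{ti} is a time variable such as age, X_i is a person-level covariate, the \pi terms are the micro-parameters, and the \beta terms are the macro-parameters.

```latex
\begin{align*}
% Level 1 (within person): individual trajectory
Y_{ti} &= \pi_{0i} + \pi_{1i}\, a_{ti} + e_{ti},
  & e_{ti} &\sim N(0, \sigma^2) \\
% Level 2 (between persons): distribution of the micro-parameters
\pi_{0i} &= \beta_{00} + \beta_{01} X_i + r_{0i} \\
\pi_{1i} &= \beta_{10} + \beta_{11} X_i + r_{1i},
  & (r_{0i}, r_{1i})^{\top} &\sim N(\mathbf{0}, \mathbf{T})
\end{align*}
```

In this sketch \pi_{0i} is the expected status of person i when a_{ti} = 0 and \pi_{1i} is that person's growth rate; other functional forms for change are equally possible.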

The next phase will examine three-level models. Our initial focus will be the case in which repeated measures (level-1) are nested within individuals (level-2), who are themselves nested in organizations (level-3). Not all multilevel data involve pure nesting. In many important cases, observations are cross-classified by two higher-level sources of random variation: (a) individuals may be nested in "cells" defined by the cross-classification of schools and neighborhoods, and (b) time-series observations may be cross-classified by the child and the classroom when repeated measures are collected on children who change classrooms during the elementary years. We will explore these cases, as well as situations that involve both nesting and crossing of random factors.

All of the studies considered to this point involve nearly continuous outcomes, for which the normal distribution is at least plausible. The next step will be to generalize two- and three-level models to other types of outcomes: binary outcomes, counts, ordered outcomes, and multinomial data. All of these cases fall into the framework of the hierarchical generalized linear model.
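
To make the generalization concrete, a two-level model for a binary outcome might be sketched as follows. This is one possible specification under assumed notation, not necessarily the one used in the course: Y_{ij} is a 0/1 outcome for person i in organization j, \varphi_{ij} is the probability that Y_{ij} = 1, X_{ij} is a person-level predictor, and the logit link is assumed.

```latex
\begin{align*}
% Level-1 sampling model and link function
Y_{ij} \mid \varphi_{ij} &\sim \mathrm{Bernoulli}(\varphi_{ij}), \\
\eta_{ij} &= \log\!\left(\frac{\varphi_{ij}}{1 - \varphi_{ij}}\right)
           = \beta_{0j} + \beta_{1j} X_{ij} \\
% Level-2 model for the organization-specific intercept
\beta_{0j} &= \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}) \\
\beta_{1j} &= \gamma_{10}
\end{align*}
```

Counts, ordered outcomes, and multinomial data follow the same pattern with a different level-1 sampling model and link function.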

Throughout the course we will analyze statistical issues that cut across applications, including: (1) efficiency and robustness of inferences, (2) Bayes and empirical Bayes shrinkage estimation of random effects, (3) exploratory analyses and model checking, (4) univariate and multivariate hypothesis tests and confidence sets, and (5) optimal research design. The course will conclude by addressing methods to estimate hierarchical linear models from incomplete data. Software for the efficient analysis of two-level models in the presence of missing data will be demonstrated.
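
The shrinkage idea in item (2) can be illustrated with a minimal Python sketch for the simplest case, a one-way random-effects model with no predictors. The function name, the example numbers, and the assumption that the variance components are already known are all hypothetical; in empirical Bayes practice, sigma2, tau2, and the grand mean would be replaced by estimates from the data.

```python
def shrink_group_means(group_means, group_sizes, sigma2, tau2, grand_mean):
    """Shrink each group's sample mean toward the grand mean.

    sigma2: within-group (level-1) variance, assumed known here
    tau2:   between-group (level-2) variance, assumed known here
    """
    estimates = []
    for ybar, n in zip(group_means, group_sizes):
        # Reliability of the sample mean: share of its variance that
        # reflects true between-group differences rather than noise.
        reliability = tau2 / (tau2 + sigma2 / n)
        # Weighted average of the group's own mean and the grand mean.
        estimates.append(reliability * ybar + (1 - reliability) * grand_mean)
    return estimates


# Hypothetical example: a small group and a large group
means = [40.0, 60.0]
sizes = [5, 50]
shrunk = shrink_group_means(means, sizes, sigma2=100.0, tau2=20.0,
                            grand_mean=50.0)
print(shrunk)
```

The small group's mean is pulled halfway toward the grand mean because its reliability is 0.5, while the large group's mean moves only slightly. This is the sense in which less precisely estimated random effects are shrunk more.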