Detailed 2-level model with covariance

This lab will walk you through the estimation of a detailed 2-level model using brms, including an LKJ prior on a covariance matrix.

Start by setting up the environment:

Pre-processing the data

The model we will be using is the same one from class. We will be predicting the scores on a reading exam for students in grade one in Tenessee in the early 1980s.

Since we will only be modeling Black and white students, keep only rows that are one or the other. (*Note: this is very bad statistical practice. We're doing it here just to keep the model a little bit cleaner*)

Finally, we will standardize age and center class size and teacher experience to help with estimation and interpetation:

Specify the model

The first step in specifying the model in R is to convert from the 'standard' hierarchical notation to the 'expanded' single-equation notation. Recall from the lecture that the model we are estimating looks like this (priors omitted … for now).

$$ \begin{aligned} S_{ik} &\sim \mathrm{Norm}(\mu_{ik},\sigma)\\ \mu_{ik} &= \beta_{0k} + \beta_{1k}Age_i + \beta_{2k}Female_i + \beta_{3k}Black_i\\ \\ \beta_{0k} &= \gamma_{00} + \gamma_{01}Size_k + \gamma_{02}Exp_i +\\ &\phantom{{}={}}\gamma_{03}TeacherFemale_k+\\ &\phantom{{}={}}\gamma_{04}TeacherBlack_k+\\ &\phantom{{}={}}\gamma_{05}PropBlack_k+\eta_{0k}\\ \beta_{1k} &= \gamma_{10} + \gamma_{11}Size_k + \eta_{1k}\\ \beta_{2k} &= \gamma_{20} + \gamma_{21}TeacherFemale_k + \eta_{2k}\\ \beta_{3k} &= \gamma_{30} + \gamma_{31}TeacherBlack_k + \\ &\phantom{{}={}}\gamma_{32}PropBlack_k+\eta_{3k} \end{aligned} $$

When we substitute all the $\beta$ terms into the main equation, and replace $\sigma$ with an $\epsilon_i$ term, we have the following:

$$\begin{aligned} S_{ik} &= \\ &\phantom{{}={}} \gamma_{00} + \gamma_{01}Size_k + \gamma_{02}Exp_i + TeacherFemale_k +\\ &\phantom{{}={}} \gamma_{04}TeacherBlack_k+\gamma_{05}PropBlack_k + \gamma_{10}Age_i\\ &\phantom{{}={}} \gamma_{11}Age_iSize_k+\gamma_{20}Female_i + \gamma_{21}Female_iTeacherFemale_k+\\ &\phantom{{}={}} \gamma_{30}Black_i+\gamma_{31}Black_iTeacherBlack_k + \gamma_{32}Black_iPropBlack_k+\\ \\ &\phantom{{}={}}\eta_{0k} + \eta_{1k}Age_i + \eta_{2k}Female_i + \eta_{3k}Black_i + \epsilon_i \end{aligned}$$

The last line (separated slightly from the rest for visual clarity) collects all of the "random-effect" terms with error terms in them.

This allows us to specify the model as an R formula object:

But what about the priors? We can use get_prior() to have brms tell us all the priors it expects to see for a moodel like this:

Each row of this output shows either a specific parameter that the model is estimating (e.g. class_prop_black) or a class of related parameter (e.g. b, which refers to all of the model coefficients).

The first column of the table shows any default priors that brms would use if we didn't specify. We can see that it will default to an LKJ prior with a concentration paramter of 1 for the correlation matrix (cor) and student-t distribtutions for the intercept (Intercept) and the standard deviations (sd for within class and sigma for between class). Importantly, there is no prior set by default for the coefficients (b), meaning it will use an improper flat prior over all real numbers unless we tell it otherwise.

We want to specify priors of $\mathrm{Norm}(500,100)$ for $\gamma_{00}$, $\mathrm{Norm}(0,20)$ for the rest of the coefficicients, a (half-)Cauchy prior $\mathrm{Cauchy}(0,40)$ for the standard deviation on the $\eta$ terms, and and $\mathrm{LKJ}(2)$ prior for the correlation matrix on the $\eta$ terms.

Estimating

Estimation is now pretty simple (if slow):

You might have gotten a warning about "divergent transitions after warmup" with the previous output. This is something that should not be ignored. The usual way to deal with just a few of these is to follow the instructions in the warning and increase adapt_delta. While we are at it, we can increase the number of iterations to get a bigger effective sample size. Both of these changes will slow down sampling!

That looks better!

Now, to interpret the estimates, it will be important to be able to map the output from the summary() to the $\gamma$ terms in our model. R formulas use the : character to signify interaction between two variables, which will help us link the estimates to our formal model.

As an example, consider the effect of having a Black teacher on the racial gap in test scores for a classroom. This is capture in the last equation of our hierarchical model:

$$ \beta_{3k} = \gamma_{30} + \gamma_{31}TeacherBlack_k + \gamma_{32}PropBlack_k+\eta_{3k} $$

This means that $\gamma_{31}$ is the coefficient we are interested in. In the 'flattened' model, this is represented by the term $\gamma_{31}Black_iTeacherBlack_k$, which interacts $Black_i$ with $TeacherBlack_k$. We find this in the output above labeled student_re_blackTRUE:teacher_re_blackTRUE.