Due Thu, Feb 18

In this assignment you will investigate the fatality of police shootings for Black victims in the United States. You will use data collected by reporters at VICE News on police shootings in the 50 largest police departments in the US between 2010 and 2016 for the article "Shot by cops and forgotten". The data was made available on VICE News' GitHub repository.

  1. First, grab and clean up the data. (3 points)

    1. Download the data directly from GitHub (https://raw.githubusercontent.com/vicenews/shot-by-cops/master/incident_data.csv).
    2. To simplify the analysis, limit the observations to those with only one officer and one victim (‘subject’ in the variable names).
    3. Create two indicator variables: one indicating whether the shooting was fatal for the victim, and one indicating whether the victim is Black. (Note: normally, you would include indicators for all but one of the racial categories, but we will create a simple dichotomous categorization here.)
  2. Create, specify, and estimate a logistic regression model predicting fatality in a police shooting by whether the victim is Black. According to this model, what is the estimated probability that a non-Black victim will die after being shot by the police? What is the 95% credible interval on that probability? What is the estimated probability and 90% credible interval for Black victims? Discuss the results. (6 points)

  3. Curious to investigate the dynamics of the disparity you observed in this simple model, you decide to include a moderator variable for whether the victim was armed. To do so, create, specify, and estimate a model that predicts fatality in police shootings based on whether the victim was Black, whether the victim was armed, and the interaction between these variables. You will need to create an indicator variable for whether the victim was armed. Report the expected probability of death (with credible intervals) for unarmed non-Black, unarmed Black, armed non-Black, and armed Black victims. What do you conclude about the moderating effect of being armed? (7 points)

  4. Finally, you want to look more closely at the role of the interaction term you included in the previous model. Specify and estimate a final model that includes the same predictor variables as before (whether the victim was Black and whether the victim was armed), but do not include the interaction term. Use the Widely Applicable Information Criterion (WAIC) to discuss the relative predictive power of the models with and without the interaction. (Note: you will need to include the extra argument log_lik=TRUE in your ulam() command for any model whose WAIC you want to calculate; you can add it to the code for your previous responses.) What do you conclude about the interaction term? Should you include it in the model? Why or why not? (4 points)
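
Questions 2 and 3 ask for probabilities, while a logistic regression estimates its parameters on the log-odds scale. The sketch below shows one way to turn posterior draws into probabilities with credible intervals, using base R only. The draws are simulated placeholders, not results from the VICE data, and the names a, bB, and summarise_draws are hypothetical. With a fitted ulam() model, the draws would typically come from extract.samples() instead.

```r
# Placeholder posterior draws on the log-odds scale (NOT real results).
# With a fitted model these would come from extract.samples(model).
set.seed(1)
n_draws <- 4000
a  <- rnorm(n_draws, mean = -0.5, sd = 0.10)  # hypothetical intercept (non-Black victims)
bB <- rnorm(n_draws, mean =  0.0, sd = 0.15)  # hypothetical coefficient on the Black indicator

# plogis() is the inverse logit: it maps log-odds to probabilities.
# Transform each draw first, then summarise, so that the interval
# reflects uncertainty on the probability scale.
p_nonblack <- plogis(a)
p_black    <- plogis(a + bB)

# Posterior mean and a central (percentile) interval of width `prob`.
summarise_draws <- function(p, prob = 0.95) {
  tail <- (1 - prob) / 2
  c(mean = mean(p), quantile(p, probs = c(tail, 1 - tail)))
}

summarise_draws(p_nonblack, prob = 0.95)
summarise_draws(p_black, prob = 0.90)
```

The same pattern extends to the interaction model in question 3: add the relevant coefficients for each of the four groups on the log-odds scale, apply plogis() to the sum, and summarise the resulting draws.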