Assignment 1, SOCI 620, Winter 2021

Due Thurs, Jan 21

In response to recent reports that the COVID-19 test positivity rates in Alberta are greater than 10%, the Explorers and Producers Association of Canada (EPAC) publishes a press release claiming otherwise. In the press release, they claim to have surveyed a number of patients who received the test, and performed a sophisticated Bayesian analysis that shows with 95% confidence that the positivity rate is below 10%. They note that they included prior information from a panel of experts to help bolster their analysis, and they include the following depiction of the posterior probability density of positivity in their publication:

In light of the study’s conflict of interest (the oil industry does not want to shut down workplaces), you decide to investigate. The study gives links to the raw survey data and you are able to extract data representing a grid approximation of the posterior shown in the figure, but there is no information about what prior they used other than stating it was based on “expert input.” You believe you can use the information they did provide to reconstruct the prior.

Load the data in R using the following commands:
surv_data <- read.csv('')
post_grid <- read.csv('')
  1. Inspect the data in surv_data and calculate the sample size and the absolute number of respondents who tested positive. Report both of these numbers. Does the claim made in the study seem reasonable in light of the survey results? (2 points)
  2. Use the survey results calculated in the last step to construct a likelihood on the same grid provided in post_grid (you will probably want to use the dbinom() function). Plot the likelihood. Comparing the raw likelihood to the posterior published in the study, what do you suspect about the prior they might have used? (2 points)
  3. Use the likelihood you just calculated and the posterior provided in post_grid to calculate values that are proportional to the prior used in the study. (Hint: if A ∝ B × C, then C ∝ A/B.) Plot this prior. Does this look like a “reasonable” prior to you? Why or why not? (3 points)
  4. Use a uniform prior over the proportion of people testing positive to construct your own approximate (grid-normalized) posterior. Take 1,000 random samples from that posterior. What is the approximate mean of this new posterior? What is the posterior probability that more than 10% of the population tested positive? What do you conclude about the true positivity rate? (3 points)
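
The mechanics of the steps above can be sketched on invented numbers. This is a minimal illustration, not a solution: the counts (200 tests, 30 positive), the 1,001-point grid, and the Beta(2, 40) "expert" prior are all assumptions made up for the example, and every variable name is hypothetical. In the assignment itself, the counts come from surv_data and the grid and published posterior come from post_grid.

```r
# Hypothetical survey summary (NOT the assignment data)
n_tests <- 200
n_pos   <- 30

# Grid of candidate positivity rates (assumed; the real grid is in post_grid)
p_grid <- seq(0, 1, length.out = 1001)

# Binomial likelihood of the observed count at each grid value
likelihood <- dbinom(n_pos, size = n_tests, prob = p_grid)

# Stand-in for a published posterior, built from an invented Beta(2, 40) prior
toy_prior      <- dbeta(p_grid, 2, 40)
published_post <- likelihood * toy_prior
published_post <- published_post / sum(published_post)

# Recover values proportional to the prior: posterior divided by likelihood.
# Grid points where the likelihood is exactly zero would give 0/0, so mark them NA.
implied_prior <- ifelse(likelihood > 0, published_post / likelihood, NA)
implied_prior <- implied_prior / sum(implied_prior, na.rm = TRUE)

# Grid-normalized posterior under a uniform prior
uniform_prior <- rep(1, length(p_grid))
my_post <- likelihood * uniform_prior
my_post <- my_post / sum(my_post)

# Draw 1,000 samples from the grid posterior and summarize them
set.seed(620)
post_samples <- sample(p_grid, size = 1000, replace = TRUE, prob = my_post)
post_mean  <- mean(post_samples)          # approximate posterior mean
prob_above <- mean(post_samples > 0.10)   # approximate P(rate > 10%)
```

Calling plot(p_grid, implied_prior, type = "l") would display the recovered prior. Far out in the tails the likelihood underflows toward zero, so the recovered values there are numerically unreliable and are best read only where the likelihood is non-negligible.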