---
title: "Assignment 1, SOCI 620, Winter 2022"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
_Due Tues, Jan 25_
In response to recent reports that the [COVID-19 test positivity rates in Alberta are well over 30%](https://www.cbc.ca/news/canada/calgary/alberta-covid-coronavirus-january-17-1.6317482), we will imagine that the Explorers and Producers Association of Canada (EPAC) publishes a press release claiming otherwise. In this imagined press release, they claim to have surveyed a number of patients who received the test, and to have performed a sophisticated Bayesian analysis that shows with 95% confidence that the positivity rate is _below_ 30%. They note that they included prior information from a panel of experts to help bolster their analysis, and they include the following depiction of the posterior probability density of positivity in their publication:
![](https://soci620.netlify.app/problem_sets/img/positivity.png)
In light of the study's conflict of interest (the oil industry does not want to shut down workplaces), you decide to investigate. The study gives links to the raw survey data, and you are able to extract data representing a grid approximation of the posterior shown in the figure, but there is no information about what prior they used other than a statement that it was based on "expert input." You believe you can use the information they did provide to reconstruct the prior.
## 1. Load the data
The raw survey data is at , and the 'reconstructed' posterior is at . Use the `read.csv()` command to load these, storing them in variables `surv_data` and `post_grid`, respectively.
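
As a rough illustration of the loading step, the sketch below writes a tiny stand-in CSV to a temporary file and reads it back with `read.csv()`. The file contents, the column name `positive`, and the variable name `surv_demo` are all assumptions made for this example; the real files may be laid out differently, so inspect them before relying on any column name.

```r
# Hypothetical stand-in for the survey file: the real location and column
# names are not shown here, so this example creates its own small CSV.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(positive = c(1, 0, 0, 1, 0)), tmp, row.names = FALSE)

# read.csv() accepts either a local path or a URL given as a string
surv_demo <- read.csv(tmp)

str(surv_demo)   # column names and types
head(surv_demo)  # first few rows
```

The same call pattern should work for both files, with the results stored in `surv_data` and `post_grid`.
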
## 2. Inspect the data
Inspect the data in `surv_data` and calculate the sample size and absolute number of respondents who tested positive. Report both of these numbers. Does the claim made in the study seem reasonable in light of the survey results?
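
One possible way to get these two counts is sketched below on toy data. It assumes a single column, here called `positive`, coded 1 for a positive test and 0 for a negative one; that name and coding are guesses for illustration only, and the real `surv_data` may need a different expression.

```r
# Toy data, not the real survey: 'positive' is an assumed column name,
# coded 1 = tested positive, 0 = tested negative.
toy <- data.frame(positive = c(1, 0, 0, 1, 0, 0, 0, 1))

n <- nrow(toy)          # sample size
k <- sum(toy$positive)  # number of positive tests

n
k
k / n  # raw sample proportion, useful for a quick sanity check
```
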
## 3. Likelihood
Use the survey results calculated in the last step to construct a likelihood on the same grid provided in `post_grid` (you will probably want to use the `dbinom()` function). Plot the likelihood. Comparing the raw likelihood to the posterior published in the study, what do you suspect about the prior they might have used?
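
A minimal sketch of a grid likelihood, assuming toy counts and a made-up grid, might look like the following. Here `p_grid`, `k`, and `n` are placeholders; in the assignment the grid values should come from `post_grid` and the counts from the survey data.

```r
# Placeholder grid and counts (assumptions for illustration only)
p_grid <- seq(0, 1, length.out = 101)  # candidate positivity rates
k <- 3                                 # toy number of positive tests
n <- 8                                 # toy sample size

# dbinom() is vectorized over prob, so this returns one likelihood
# value per grid point: P(k positives out of n | p)
lik <- dbinom(k, size = n, prob = p_grid)

plot(p_grid, lik, type = "l",
     xlab = "positivity rate", ylab = "likelihood")
```

Because `dbinom()` is vectorized, no explicit loop over grid points is needed.
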
## 4. Reconstruct the prior
Use the likelihood you just calculated and the posterior provided in `post_grid` to calculate values that are proportional to the prior used in the study. (Hint: if $A\propto B\times C$, then $C\propto A/B$.) Plot this prior. Does this look like a "reasonable" prior to you? Why or why not?
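
To see the hint in action, the sketch below builds a posterior from a known prior on toy data and then recovers that prior by dividing the posterior by the likelihood. Everything here is an assumption made for the demonstration: the Beta(2, 5) prior, the counts, and the grid, which deliberately excludes 0 and 1 so the likelihood is never zero in the denominator.

```r
# Toy setup (not the study's values)
p_grid <- seq(0.01, 0.99, by = 0.01)  # endpoints left out to avoid dividing by 0
k <- 3
n <- 8

true_prior <- dbeta(p_grid, 2, 5)          # the prior we pretend not to know
lik <- dbinom(k, size = n, prob = p_grid)  # binomial likelihood on the grid

# Forward direction: posterior is proportional to likelihood times prior
post <- lik * true_prior
post <- post / sum(post)

# Reverse direction: prior is proportional to posterior over likelihood
recovered <- post / lik
recovered <- recovered / sum(recovered)  # normalize over the grid

# Should match the grid-normalized true prior up to floating-point error
all.equal(recovered, true_prior / sum(true_prior))

plot(p_grid, recovered, type = "l",
     xlab = "positivity rate", ylab = "recovered prior (normalized)")
```

If the real grid includes points where the likelihood is exactly zero, the division produces `NaN` or `Inf` at those points, and they would need to be handled separately.
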
## 5. Building a better posterior
Use a uniform prior over the proportion of people testing positive to construct your own approximate (grid-normalized) posterior. Draw 10,000 random samples from that posterior. What is the approximate mean of this new posterior? What is the posterior probability that more than 30% of the population tested positive? What do you conclude about the true positivity rate? Can you put some bounds on what you think the real positivity rate is, given the data?
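
The general pattern for a grid posterior with a uniform prior, followed by sampling, could be sketched as below. The counts `k` and `n`, the grid resolution, and the seed are arbitrary choices for illustration, and the 95% interval is only one of several reasonable ways to express bounds.

```r
set.seed(620)  # arbitrary seed so the draws are reproducible

# Placeholder grid and toy counts (assumptions, not the survey values)
p_grid <- seq(0, 1, length.out = 1001)
k <- 3
n <- 8

prior <- rep(1, length(p_grid))                    # uniform prior
post <- dbinom(k, size = n, prob = p_grid) * prior
post <- post / sum(post)                           # normalize over the grid

# Sample grid values with probability given by the posterior
samples <- sample(p_grid, size = 10000, replace = TRUE, prob = post)

mean(samples)                       # approximate posterior mean
mean(samples > 0.30)                # approximate P(rate > 0.30 | data)
quantile(samples, c(0.025, 0.975))  # central 95% interval from the samples
```

Summaries such as the mean and tail probability are computed directly from the samples, so any other quantity of interest can be estimated the same way.
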