SOCI 620: Quantitative methods 2

Agenda

Probability models of social processes

  1. Administrative
  2. Probability of unemployment
  3. Bayes’ rule
  4. Hands on: random samples and grid approximations in R

Slides are licensed under CC BY-NC-SA 4.0

Administrative

Software check in

Getting started with R

  • Norm Matloff’s “fasteR” introduction is simple and good:
    https://github.com/matloff/fasteR
  • For this class, completing (and understanding!) the first eight lessons will give you a good foundation

Labs

  • Mondays 10-11am look like they’ll work, but not in this room
  • Next week (Jan 13): Leacock 808
  • Subsequent weeks: TBD

Probability of unemployment

Unemployment in Newfoundland and Labrador

  • How do we learn something about the risk of unemployment for adult residents of NL?
  • Ignoring (for now) contributing factors, we can ask:
    What is the probability that a randomly chosen adult is unemployed?
    i.e. unemployment rate (frequentist) – 9.6% according to StatCan

Probability model

  • Strategy: model the process with a parametric probability distribution, and estimate the model with a sample
  • Assuming we sample $n$ adults, $k$ of whom report being unemployed, we’ll model $k$ with a binomial distribution
  • (Details on what this means shortly, but first, a demonstration)
Photo of misty coast with a few small buildings and many small fishing boats

Probability of unemployment

Unfortunately, our grant has run out, so we can only afford to sample 10 people:

(Ten emoji faces, one per respondent, where a happy face with a dollar sign is employed and a frustrated face is unemployed: employed, employed, employed, unemployed, unemployed, employed, employed, employed, unemployed, employed)

Data: $n = 10$ respondents, $k = 3$ unemployed

We’ll use this data to estimate the probability of unemployment in two ways

Probability of unemployment

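Maximum-likelihood (frequentist) estimation:

  1. Pick an estimator (such as the sample proportion) of the probability $p$
  2. Generate a point estimate of $p$ (here $k = 3$ unemployed out of $n = 10$ sampled):

     $$\hat{p} = \frac{k}{n} = \frac{3}{10} = 0.3$$

  3. Use an approximation of the sampling distribution to quantify uncertainty (here, assuming the usual normal approximation):

     $$\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \approx 0.145$$

  4. Generate a confidence interval (e.g. $\hat{p} \pm 1.96\,\mathrm{SE}(\hat{p})$ for a standard 95% CI)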
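The four steps above can be worked through numerically. This is a minimal sketch in Python (the course hands-on uses R), assuming the common normal (Wald) approximation for step 3; the variable names are illustrative and the counts come from the ten-person sample.

```python
import math

n = 10  # sample size from the slides
k = 3   # respondents who reported being unemployed

# Steps 1-2: the sample proportion as a point estimate of p
p_hat = k / n

# Step 3: standard error under the normal approximation (an assumption here)
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Step 4: 95% interval, using 1.96 as the approximate normal critical value
z = 1.96
ci_low = p_hat - z * se
ci_high = p_hat + z * se

print(f"p_hat = {p_hat:.3f}, SE = {se:.3f}")
print(f"95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

With only ten respondents the interval is very wide, and the normal approximation itself is rough at this sample size.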

Probability of unemployment

Posterior (Bayesian) estimation:

  1. Pick a prior (such as a uniform distribution) for the probability $p$
  2. Update the prior with data (one observation at a time or all at once)
  3. The posterior distribution describes the relative posterior probability of different values of $p$

Sample after each of the first five respondents (E = employed, U = unemployed):

(E)
(E, E)
(E, E, E)
(E, E, E, U)
(E, E, E, U, U)

After all ten respondents:

(E, E, E, U, U, E, E, E, U, E)
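Step 2 can be sketched as a grid approximation. The Python below (a sketch, not the course's R code) starts from a uniform prior over a hypothetical grid of 101 candidate values of the unemployment probability, updates once per respondent, and then repeats the update with all ten responses at once to check that both routes give the same posterior.

```python
from math import isclose

data = "EEEUUEEEUE"                    # the ten responses in order (U = unemployed)
grid = [i / 100 for i in range(101)]   # candidate values of p (grid size is arbitrary)


def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# One at a time: the posterior after each response is the prior for the next
posterior = normalize([1.0] * len(grid))  # uniform prior
for obs in data:
    like = [p if obs == "U" else 1 - p for p in grid]
    posterior = normalize([w * l for w, l in zip(posterior, like)])

# All at once: uniform prior times p^k (1 - p)^(n - k)
k = data.count("U")
n = len(data)
batch = normalize([p**k * (1 - p) ** (n - k) for p in grid])

same = all(isclose(a, b, abs_tol=1e-12) for a, b in zip(posterior, batch))
mode = grid[max(range(len(grid)), key=posterior.__getitem__)]
print("same posterior either way:", same)
print("most probable grid value of p:", mode)
```

The order of the responses does not matter here: only the counts of E and U enter the final posterior.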

Comparing estimates

Maximum likelihood:

Posterior:

Bayesian updating

500 samples; uniform prior

Bayesian updating

500 samples; “informative” prior

Conditional probability

The “posterior” is represented as a conditional probability distribution: the probability of different values of the unemployment probability $p$, conditional on the observed number unemployed $k$, written $\Pr(p \mid k)$.

Bayes’ rule

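Bayes’ rule is a simple formula that allows us to ‘flip’ a conditional probability:

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\,\Pr(A)}{\Pr(B)}$$

For our unemployment model, with $p$ the probability of unemployment and $k$ the number unemployed in our sample, this becomes:

$$\Pr(p \mid k) = \frac{\Pr(k \mid p)\,\Pr(p)}{\Pr(k)}$$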

Bayes’ rule

Posterior probability: $\Pr(p \mid k)$

The posterior probability is our answer. It tells us everything we know about the probability of unemployment ($p$) given what we’ve learned from our sample ($k$).

Bayes’ rule

Prior probability: $\Pr(p)$

The prior probability is everything we claim to know about the probability of unemployment ($p$) before we ask anybody about their employment. It is the unconditional distribution of $p$.

Bayes’ rule

Evidence: $\Pr(k)$

The evidence is just the average probability of seeing our sample across all possible values of $p$, weighted by the prior. Dividing by it normalizes the posterior. It is often the hardest part of a posterior to calculate.

Fortunately we can almost always ignore it.
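For this small example the evidence can still be computed directly, which may make the "average over all values" idea concrete. The sketch below (Python, assuming a uniform prior and a binomial likelihood for 3 unemployed out of 10) approximates the integral with a simple midpoint rule; the number of steps is an arbitrary choice.

```python
from math import comb

n, k = 10, 3  # sample size and number unemployed from the slides


def likelihood(p):
    """Binomial probability of k unemployed out of n, given p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


steps = 10_000
width = 1 / steps
prior_density = 1.0  # uniform prior on [0, 1] (an assumption of this sketch)

# Midpoint rule: average the likelihood over p, weighted by the prior
evidence = sum(
    likelihood((i + 0.5) * width) * prior_density * width for i in range(steps)
)
print(f"evidence = {evidence:.6f}")
```

Under a uniform prior this integral has the known value 1/(n + 1), so the result should land close to 1/11.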

Bayes’ rule

Likelihood: $\Pr(k \mid p)$

The likelihood is where our model lives.

Building a parametric model

How to build a parametric model

  • Pretend that we already know the probability of being unemployed ($p$)
  • Tell a story about what our sample (the number unemployed, $k$) might look like, assuming we already know $p$

Reverse the logic of your question

In reality we know $k$ and want to learn about $p$: $\Pr(p \mid k)$

In our model we know $p$ and want to describe $k$: $\Pr(k \mid p)$

Building a parametric model

Binomial distribution

The probability of getting $k$ ‘successes’ in $n$ trials if the probability of success is $p$:

$$\Pr(k \mid p) = \binom{n}{k}\, p^{k} (1 - p)^{n - k}$$
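As a quick illustration, the binomial probability can be written in a few lines of Python. The function name `binomial_pmf` is just a label for this sketch; the first call plugs in the 9.6% StatCan rate quoted earlier as if it were the known probability of unemployment.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of seeing 3 unemployed out of 10 under two candidate values of p
print(binomial_pmf(3, 10, 0.096))  # p set to the StatCan rate
print(binomial_pmf(3, 10, 0.3))    # p set to the sample proportion

# Sanity check: probabilities over all possible k sum to 1
total = sum(binomial_pmf(j, 10, 0.3) for j in range(11))
print(total)
```

Comparing the two printed probabilities shows how much more plausible the observed sample is under one value of the parameter than the other.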

Bayes’ rule

Likelihood: $\Pr(k \mid p)$

The likelihood is where our model lives.
In this case, a binomial distribution is a good choice. Given a particular probability of unemployment $p$ (and a sample size $n$), $\Pr(k \mid p)$ tells us how likely our sample is.

Bayes’ rule

Proportional posterior

In practice, we rarely need to calculate the “evidence” (the denominator) in Bayes’ formula:

$$\Pr(p \mid k) \propto \Pr(k \mid p)\,\Pr(p)$$

The posterior probability is proportional to ($\propto$) the likelihood times the prior
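One way to see why the denominator can be skipped: on a grid, any constant factor disappears once the weights are rescaled to sum to 1. The Python sketch below assumes a uniform prior and a hypothetical grid of 1001 values, and compares the full binomial likelihood against a version with the binomial coefficient dropped.

```python
from math import comb, isclose

n, k = 10, 3
grid = [i / 1000 for i in range(1001)]  # candidate values of p
prior = [1.0] * len(grid)               # uniform prior (an assumption)


def normalize(weights):
    """Rescale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Full likelihood times prior, never divided by the evidence
full = [comb(n, k) * p**k * (1 - p) ** (n - k) * pr for p, pr in zip(grid, prior)]

# Same thing with the constant binomial coefficient dropped
kernel = [p**k * (1 - p) ** (n - k) * pr for p, pr in zip(grid, prior)]

post_full = normalize(full)
post_kernel = normalize(kernel)

identical = all(isclose(a, b, abs_tol=1e-12) for a, b in zip(post_full, post_kernel))
post_mean = sum(p * w for p, w in zip(grid, post_kernel))
print("constants cancel:", identical)
print(f"posterior mean of p = {post_mean:.3f}")
```

The posterior mean comes out near 1/3, slightly above the sample proportion of 0.3, because the uniform prior pulls the estimate toward 0.5.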

Proportional posterior

Hands on: R and RMarkdown

Sample R script

Sample RMarkdown document

Image credit

Figures by Peter McMahan (source code)

Photo of misty coast with a few small buildings and many small fishing boats

Derrick Mercer, CC BY-SA 2.0, via Wikimedia Commons