SOCI 620: Quantitative methods 2

Agenda

Parsimony & overfitting

  1. Administrative
  2. Cocaine use among adolescents
  3. The inverse logit transformation
  4. Starting simple: intercept-only logistic regression
  5. Hands on: Estimating logistic regression using MCMC in R

Slides are licensed under CC BY-NC-SA 4.0

Cocaine use among adolescents
(The trouble with binary outcomes)

A still of a young James Spader in a white suit and a shirt open to expose his whole chest, hair feathered in true mid-1980s wealthy youth style.

Why not use a standard linear regression?

Gaussian model for binary data?

Wrong support

The normal distribution has support (-∞, ∞), but the outcome variable takes only two values.

Bad intuitive “fit”

Interpretation

Under some circumstances, results can be interpreted as proportions or probabilities, but this can lead to predicted values less than zero or more than one.
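The out-of-range problem can be seen in a small sketch. The data below are made up for illustration (they are not the adolescent survey data), and the fit is plain least squares computed by hand rather than anything shown on the slides.

```python
# Ordinary least squares on a binary outcome, using only the standard library.
# The data are hypothetical: x is an arbitrary predictor and y is a 0/1
# outcome that switches from 0 to 1 as x increases.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form OLS estimates for a single predictor.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# Fitted values are meant to be read as probabilities, so flag impossible ones.
fitted = [intercept + slope * x for x in xs]
for x, f in zip(xs, fitted):
    flag = "" if 0 <= f <= 1 else "  <- outside [0, 1]"
    print(f"x = {x}: fitted value = {f:.3f}{flag}")
```

With these data the straight line dips below zero at the low end of x and rises above one at the high end, which is the interpretation problem described above.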

Gaussian vs. Bernoulli

Gaussian (normal) distribution

Bernoulli distribution

Logistic regression model
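The formulas for these three headings did not survive in the text, so the block below gives the standard textbook forms as a reference sketch. The symbols μ, σ, p, β and x are assumptions chosen to match common notation; only α appears elsewhere in these slides.

```latex
% Gaussian (normal) density: continuous outcome on the whole real line
y_i \sim \mathrm{Normal}(\mu, \sigma), \qquad
f(y_i \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right),
  \quad y_i \in (-\infty, \infty)

% Bernoulli mass function: binary outcome
y_i \sim \mathrm{Bernoulli}(p), \qquad
\Pr(y_i \mid p) = p^{y_i} (1 - p)^{1 - y_i},
  \quad y_i \in \{0, 1\}

% Logistic regression: Bernoulli likelihood with a logit link
y_i \sim \mathrm{Bernoulli}(p_i), \qquad
\mathrm{logit}(p_i) = \alpha + \beta x_i
```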


The inverse logit transformation

Inverse logit transformation

Logit function

Takes values between 0 and 1, and turns them into values between -∞ and ∞.

Inverse logit function (aka ‘logistic’)

Takes values between -∞ and ∞, and turns them into values between 0 and 1.
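As a minimal sketch, the two functions can be written directly from their definitions, logit(p) = log(p / (1 - p)) and logit⁻¹(x) = 1 / (1 + e⁻ˣ). The function names `logit` and `inv_logit` are illustrative choices, and the code is Python rather than the R used in the hands-on part of the course.

```python
import math


def logit(p: float) -> float:
    """Map a probability in (0, 1) to the log-odds scale (-inf, inf)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map any real number to a probability in (0, 1)."""
    # Branch on the sign so exp() is never called on a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Round trip: applying logit to inv_logit(x) should give x back.
for x in (-2, -0.5, 0, 0.5, 2):
    p = inv_logit(x)
    print(f"x = {x:>4}: inv_logit(x) = {p:.3f}, logit(inv_logit(x)) = {logit(p):.3f}")
```

The sign branch in `inv_logit` is a design choice for numerical safety; for moderate inputs the one-line formula gives the same answer.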

Inverse logit transformation

x       logit⁻¹(x)
-2      0.119
-0.5    0.378
0       0.500
0.5     0.622
2       0.881

Intercept-only logistic model

Why this model instead of the model we built in the first week of class?

Logistic regression allows us to include explanatory covariates.
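Written out, the intercept-only model is a Bernoulli likelihood whose single probability is set through the logit link. The sketch below uses standard notation; p, β and x are assumed symbols, and the prior on α is left unspecified because the slide text does not state it.

```latex
% Intercept-only logistic model: one shared probability p
y_i \sim \mathrm{Bernoulli}(p), \qquad
\mathrm{logit}(p) = \alpha
\;\Longleftrightarrow\;
p = \mathrm{logit}^{-1}(\alpha) = \frac{1}{1 + e^{-\alpha}}

% Adding an explanatory covariate only changes the linear predictor
y_i \sim \mathrm{Bernoulli}(p_i), \qquad
\mathrm{logit}(p_i) = \alpha + \beta x_i
```

Without covariates the model describes a single proportion; the payoff of the logit form is that the second line is a one-term extension of the first.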

Priors in logistic regression

[Figure: prior distributions for the intercept; panels labeled Pr(α) and logit⁻¹(Pr(α))]
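One way to see why the prior on α deserves attention is to simulate from it and push the draws through the inverse logit. The sketch below assumes normal priors centered at zero with two hypothetical widths (1.5 and 10); neither value is taken from the slides.

```python
import math
import random

random.seed(620)  # arbitrary seed so the sketch is reproducible


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability, avoiding overflow in exp()."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def extreme_share(prior_sd: float, draws: int = 10_000) -> float:
    """Share of prior draws whose implied probability is below 0.01 or above 0.99."""
    probs = [inv_logit(random.gauss(0.0, prior_sd)) for _ in range(draws)]
    return sum(p < 0.01 or p > 0.99 for p in probs) / draws


# Hypothetical prior widths on the log-odds scale (assumptions, not from the slides).
narrow = extreme_share(1.5)
wide = extreme_share(10.0)
print(f"alpha ~ Normal(0, 1.5): {narrow:.1%} of draws imply p < 0.01 or p > 0.99")
print(f"alpha ~ Normal(0, 10):  {wide:.1%} of draws imply p < 0.01 or p > 0.99")
```

A prior that looks vague on the log-odds scale places most of its mass on probabilities very close to 0 or 1, so "flat" on α is far from flat on p.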

Intercept-only logistic regression

                            Median   95% C.I.
α (log-odds)                -3.34    (-3.48, -3.20)
exp(α) (odds)               0.036    (0.031, 0.041)
logit⁻¹(α) (probability)    0.034    (0.030, 0.039)
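To show how a summary of this shape can be produced, here is a hand-rolled random-walk Metropolis sketch for the intercept-only model. It is not the R workflow used in class. The data (35 "yes" responses out of 1000) and the Normal(0, 1.5) prior are hypothetical assumptions, so the printed numbers will not reproduce the results above.

```python
import math
import random
import statistics

random.seed(620)  # arbitrary seed for reproducibility

# Hypothetical data: 35 "yes" responses out of 1000 (NOT the survey counts).
n_obs, n_yes = 1000, 35
PRIOR_SD = 1.5  # assumed Normal(0, 1.5) prior on alpha


def log_posterior(alpha: float) -> float:
    """Unnormalized log posterior of alpha on the log-odds scale."""
    # Bernoulli log-likelihood with p = inv_logit(alpha):
    # n_yes * alpha - n_obs * log(1 + exp(alpha)).
    log_lik = n_yes * alpha - n_obs * math.log1p(math.exp(alpha))
    log_prior = -0.5 * (alpha / PRIOR_SD) ** 2
    return log_lik + log_prior


def metropolis(n_iter: int = 6000, burn_in: int = 1000, step_sd: float = 0.3):
    """Random-walk Metropolis; returns the draws kept after burn-in."""
    alpha, current = 0.0, log_posterior(0.0)
    samples = []
    for i in range(n_iter):
        proposal = alpha + random.gauss(0.0, step_sd)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if random.random() < math.exp(min(0.0, candidate - current)):
            alpha, current = proposal, candidate
        if i >= burn_in:
            samples.append(alpha)
    return samples


def summarize(values):
    """Median and central 95% interval of a list of draws."""
    ordered = sorted(values)
    lo = ordered[int(0.025 * len(ordered))]
    hi = ordered[int(0.975 * len(ordered))]
    return statistics.median(ordered), lo, hi


draws = metropolis()
scales = {
    "alpha (log-odds)": draws,
    "exp(alpha) (odds)": [math.exp(a) for a in draws],
    "inv_logit(alpha) (probability)": [1 / (1 + math.exp(-a)) for a in draws],
}
for name, values in scales.items():
    med, lo, hi = summarize(values)
    print(f"{name:32s} {med:.3f} ({lo:.3f}, {hi:.3f})")
```

Because exp and the inverse logit are monotone, the median and interval endpoints on the odds and probability scales are simply the transformed endpoints from the log-odds scale, which is why the same draws can be summarized three ways.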

James Spader in Pretty in Pink