SOCI 620: Quantitative Methods 2

Location Leacock 917
Time Winter 2025, Mon and Wed 8:35–9:55am
Instructor Peter McMahan
TA Chris Borst
Office hours Mondays 11:00–12:00 (Leacock 727 or online by appointment)
Work sessions Mondays 10:00–11:00am (Location TBD)
Syllabus https://soci620.netlify.app/

Description

As the second of two courses in the quantitative methods sequence, this class will build insight into the underlying logic of standard multivariate regression models and introduce students to statistical techniques that extend and diverge from those models. To this end, the course will have two main goals. First, students will become familiar with a range of quantitative methods common in social science research. Methodological topics will include generalized linear models for predicting categorical, ordered, and count data, multilevel/hierarchical linear models, and strategies for analyzing time-series and panel data. Students will learn to critically interpret these methods as they are used in the literature, and to use the methods in their own research.

The second (and perhaps the more central) goal of the course will be to provide students with an overarching framework to understand not just the methods we discuss, but the models and techniques they may encounter elsewhere in the literature. Instruction will therefore focus on a probabilistic interpretation of statistical models. In addition to fostering a strong understanding of statistical dependence and parametric estimation, this approach will unify the methods we cover and enable students to build interpretable, theory-driven models of their own.

Requirements

Students are expected to be familiar with the readings, engage during class (in-class and/or online), complete assignments, and prepare an independent research project. Students must have taken SOCI 514 or a similar class, as a basic understanding of multivariate regressions is assumed. While knowledge of the R statistical language is not assumed, familiarity with at least one advanced statistical or general programming language (e.g. R, Stata, Python, Matlab, …) will be very helpful.

Class

The scheduled classes for the course will be hybrid lectures, discussions, and instructor-led lab sessions. It is vital that students attend class regularly, having completed the readings and prepared to engage with the topics covered.

Slides and code will be made available before class. Lectures will be live-streamed on Teams (when necessary), and recordings will be posted shortly after class.

Work sessions

In addition to the scheduled classes, we will have a weekly work session (location to be determined). These sessions will be led by the T.A. as a space to practice techniques, raise questions and concerns, and discuss course content with one another. Attendance at these sessions is optional but strongly encouraged.

Equipment and software

We will be working with data and learning analysis and visualization in class, so students must bring a laptop computer with them. Mobile devices such as tablets and phones, even with an external keyboard, will not be sufficient. If you do not have access to a laptop, please talk to me as soon as possible so we can work out a way for you to participate.

We will be using the R statistical language and software for data processing, statistical estimation, and visualization in this class. It is recommended that students install the RStudio graphical interface, which I will be using for demonstration in class.

Readings

We will use the second edition of the textbook Statistical Rethinking by Richard McElreath for the course (McElreath 2020). The book is available as an ebook through the library website. If you are unable to access the textbook, please let me know as soon as possible.

Worksheets

There will be five worksheets due throughout the semester. These are intended to help you learn to use the methods we discuss in R and to give you practice in interpreting statistical models.

Each worksheet will be structured with a provided R Markdown document. R Markdown provides a way to mix code (in the R statistical language) and prose into a single document. Worksheets will be distributed ahead of Monday labs, and will be due Wednesday of the following week. Students can (and are encouraged to!) work together and consult one another on assignments, but each student will be responsible for completing their own worksheet by the due date.

Worksheets will be evaluated using peer assessment. After the deadline for a worksheet, each student will be responsible for evaluating an anonymized version of one of their classmates’ worksheets. The peer assessment is intended to expose students to different programming styles and interpretations of the data, and to encourage the production of readable R code.

Independent research project

Each student will finish an independent research project by the end of the semester. These projects will be empirical, scholarly analyses, including a source of data, a well-formed research question, a motivated statistical analysis, and a thorough interpretation of the results. Ideally, the projects will be related to work students are doing outside of class. Projects that represent a piece of a student’s broader research agenda are encouraged. The projects will be graded on the basis of four required assessments:

  1. Précis (Due Thu, Mar 9): This will be a short (no more than one page) description of the research project. It should include a specific research question, a brief description of the data that will be used, and an outline of the analytical strategy that will be employed. The purpose of the précis is to motivate the project and to establish its feasibility, not to perform any analyses or to answer any research questions.
  2. Proposal (Due Thu, Mar 23): Based on the feedback received on the précis, the project proposal will give a more detailed account of the research project. A good proposal will give a thorough account of the data being used, including some preliminary summaries and analyses. It will also articulate the research question in terms of statistical models and will specify those models formally.
  3. Presentation (Due Thu, Apr 6): Each student will give a brief, PechaKucha-style presentation of their final project in class, consisting of twenty slides that will automatically advance every twenty seconds. The presentation should describe your research question succinctly, give a clear account of the statistical model(s) used, and briefly interpret the results in light of the research question. (Details on final presentation format)
  4. Project write-up (Due Fri, Apr 21): The write-up for the final project will take the form of a formal scholarly paper. This should go into careful detail about the project, including a full description of the data, exposition and motivation of the statistical models used, a summary of the estimation of the model parameters, and a careful, thorough interpretation of the results. It should include tables and figures to illustrate your analysis.

Each student should arrange a brief meeting with me early in the term to discuss ideas for their research project and their appropriateness for the course.

Evaluation

The evaluation components and due dates for this course are strict. If outside circumstances will make it difficult to meet a requirement, please raise the issue with me as soon as possible so we can find a solution. Regular absences will affect your ability to do well on assignments and the final project.

Note: In the event of extraordinary circumstances beyond the University’s control, the content and/or evaluation scheme in this course is subject to change.

Component               Due             Share of final grade
Worksheet 1             Wed, Jan 22     6%
WS1 peer assessment     Mon, Jan 27     3%
Worksheet 2             Wed, Feb 5      6%
WS2 peer assessment     Mon, Feb 10     3%
Worksheet 3             Wed, Feb 19     6%
WS3 peer assessment     Mon, Feb 24     3%
Worksheet 4             Wed, Mar 12     6%
WS4 peer assessment     Wed, Mar 19     3%
Worksheet 5             Wed, Mar 26     6%
WS5 peer assessment     Mon, Mar 31     3%
Project précis          Thu, Mar 9      5%
Project proposal        Thu, Mar 23     10%
Project presentation    Thu, Apr 6      15%
Project write-up        Fri, Apr 21     25%

Policies

Accessibility

Students who need accommodation or who are having trouble accessing any aspect of the course may contact me directly. I will make every effort to accommodate individual situations, including religious, medical, or other personal circumstances.

Students with disabilities or otherwise in need of formal accommodation are encouraged to contact the Office for Student Accessibility & Achievement (formerly the Office for Students with Disabilities; https://www.mcgill.ca/access-achieve/, phone 514-398-6009).

Les étudiants qui ont besoin d’un accommodement ou qui ont des difficultés à accéder à un aspect du cours peuvent me contacter directement. Je ferai tout mon possible pour tenir compte des circonstances individuelles, y compris des circonstances religieuses, médicales ou autres.

Les étudiants handicapés ou ayant besoin d’un aménagement formel sont encouragés à contacter le Service étudiant d’accessibilité et d’aide à la réussite (https://www.mcgill.ca/access-achieve/fr, téléphone 514-398-6009).

Academic integrity

McGill University values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Disciplinary Procedures (see http://www.mcgill.ca/students/srr/honest/ for more information). (approved by Senate on 29 January 2003)

L’université McGill attache une haute importance à l’honnêteté académique. Il incombe par conséquent à tous les étudiants de comprendre ce que l’on entend par tricherie, plagiat et autres infractions académiques, ainsi que les conséquences que peuvent avoir de telles actions, selon le Code de conduite de l’étudiant et des procédures disciplinaires (pour de plus amples renseignements, veuillez consulter le site http://www.mcgill.ca/students/srr/honest/).

Language of evaluation

In accord with McGill University’s Charter of Students’ Rights, students in this course have the right to submit in English or in French any written work that is to be graded. (approved by Senate on 21 January 2009)

Conformément à la Charte des droits de l’étudiant de l’Université McGill, chaque étudiant a le droit de soumettre en français ou en anglais tout travail écrit devant être noté (sauf dans le cas des cours dont l’un des objets est la maîtrise d’une langue).

Generative AI

The use of generative artificial intelligence tools or apps for assignments in this course, including tools like ChatGPT, Apple Intelligence, Gemini, Claude, Microsoft Copilot and other AI writing or coding assistants, is prohibited. While the use of grammar- and spell-checking software is permitted, products and services that rewrite, summarize, paraphrase, or otherwise substantially change input text, including Grammarly’s “rewrite” and “paraphrase” features and Apple’s “writing tools”, are prohibited.

Late submissions

Assignments that are submitted late (without prior approval for an extension) will be assessed with the following penalties:

  1. 15 percentage points deducted from submissions up to 24 hours late
  2. 10 percentage points for each additional 24 hours (or portion thereof) late
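
As an illustration only, the late penalties above can be worked through with a short sketch. The function names `late_penalty` and `penalized_mark` are hypothetical, and the cap at 100 points and the floor at a mark of 0 are assumptions that the policy does not state; the instructor's application of the policy takes precedence over this sketch.

```python
import math


def late_penalty(hours_late: float) -> int:
    """Percentage points deducted for a submission `hours_late` hours past the deadline."""
    if hours_late <= 0:
        return 0  # on time: no penalty
    # Count 24-hour periods, treating any partial period as a full one
    # ("or portion thereof").
    periods = math.ceil(hours_late / 24)
    # 15 points for the first period, 10 for each additional period.
    # ASSUMPTION: the total deduction is capped at 100 points.
    return min(100, 15 + 10 * (periods - 1))


def penalized_mark(mark: float, hours_late: float) -> float:
    """Apply the late penalty to a mark given in percentage points."""
    # ASSUMPTION: a mark cannot fall below 0.
    return max(0.0, mark - late_penalty(hours_late))


if __name__ == "__main__":
    for hours in (2, 24, 30, 49):
        print(hours, "hours late:", late_penalty(hours), "points deducted")
    print("80% worksheet, 30 hours late:", penalized_mark(80, 30))
```

Under these assumptions, a worksheet submitted 30 hours late falls in the second 24-hour period, so it loses 15 + 10 = 25 percentage points.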

In addition to the above penalties, late work with a peer assessment component will be assessed solely by the instructor and teaching assistants. In these cases, students who submit late may also be unable to provide assessments to peers, which may further affect their grade.

Grade appeals

Instructors and teaching assistants take the marking of assignments very seriously, and we work diligently to be fair, consistent, and accurate. Nonetheless, mistakes and oversights occasionally happen. If you believe that to be the case, you must adhere to the following rules:

  • If it is a mathematical error, simply alert the instructor to the error.
  • In the case of more substantive appeals, you must:
    1. Wait at least 24 hours after receiving your mark.
    2. Carefully re-read your assignment, all guidelines and marking schemes, and the grader’s comments.
    3. If you wish to appeal, you must submit to the instructor a written explanation of why you think your mark should be altered. Please note that upon re-grade your mark may go down, stay the same, or go up.

Schedule

Background: parametric probability models

Mon, Jan 6
Lecture topics:
  • Introductions, course structure, syllabus
Required:
  • (McElreath 2020, Ch. 1)

Wed, Jan 8
Lecture topics:
  • Probability models of social processes
In-class lab:
Required:
  • (McElreath 2020, Ch. 2)

Mon, Jan 13
Lecture topics:
  • Probability distributions and random samples
Required:
  • (McElreath 2020, Ch. 3)

Wed, Jan 15
Lecture topics:
  • Estimating multiple parameters
Required:
  • (McElreath 2020, secs. 4.1–4.3)

Linear models and model checking

Mon, Jan 20
Lecture topics:
  • Linear regressions as probability models
In-class lab:
Required:
  • (McElreath 2020, secs. 4.4–4.7)

Supplementary:
  • (McElreath 2020, Ch. 5)

Wed, Jan 22
Lecture topics:
  • Covariates for causal analysis
In-class lab:
  • Creating indicators and transforming variables

Required:
  • (McElreath 2020, Ch. 6)

Due:
  • Worksheet 1

Mon, Jan 27
Lecture topics:
  • Checking models and estimates
In-class lab:
  • Prior and posterior predictive plots

Due:
  • WS1 peer assessment

Wed, Jan 29
Lecture topics:
  • Parsimony and overfitting
In-class lab:
  • Deviance and information criteria

Required:
  • (McElreath 2020, Ch. 7)

Generalized linear models

Mon, Feb 3
Lecture topics:
  • Logistic regression and the logit link function
In-class lab:
  • Intercept-only logistic regression

Required:
  • (McElreath 2020, Ch. 10 and sec. 11.1)

Wed, Feb 5
Lecture topics:
  • Logistic regression: methods and interpretation
In-class lab:
  • Prior-predictive simulation

Due:
  • Worksheet 2

Mon, Feb 10
Lecture topics:
  • Counts and rates
In-class lab:
  • Poisson regression in R using glm and brm

Required:
  • (McElreath 2020, sec. 11.2)

Due:
  • WS2 peer assessment

Wed, Feb 12
Lecture topics:
  • Expanding on Poisson models
In-class lab:
  • Overdispersed and zero-inflated Poisson regressions in R

Required:
  • (McElreath 2020, secs. 12.1–12.2)

Mon, Feb 17
Lecture topics:
  • Categorical outcomes
In-class lab:
  • Multinomial regression in R

Required:
  • (McElreath 2020, secs. 11.3–11.5)

Wed, Feb 19
Lecture topics:
  • Cumulative probability and ordinal outcomes
In-class lab:
  • Ordered logistic regression in R

Required:
  • (McElreath 2020, secs. 12.2–12.5)

Supplementary:
  • Ordinal regression (Betancourt 2019)

Due:
  • Worksheet 3

Complications in data and estimation

Mon, Feb 24
Lecture topics:
  • Assessing convergence in estimation
In-class lab:
  • Common convergence issues with lme4 and brms

Required:
  • (McElreath 2020, Ch. 9)

Due:
  • WS3 peer assessment

Wed, Feb 26
Lecture topics:
  • Missing data
In-class lab:
  • Imputing missing data

Required:
  • (McElreath 2020, Ch. 15)

Mon, Mar 10
Lecture topics:
  • Non-uniform samples
In-class lab:
  • Incorporating weights

Multilevel models

Wed, Mar 12
Lecture topics:
  • Nested data and partial pooling
In-class lab:
  • Partial pooling of averages

Required:
  • (McElreath 2020, sec. 13.1)

Due:
  • Worksheet 4

Mon, Mar 17
Lecture topics:
  • Random intercept models
In-class lab:
  • Random intercepts in R

Required:
  • (McElreath 2020, sec. 13.2)

Wed, Mar 19
Lecture topics:
  • Introduction to random slopes
In-class lab:
  • Simple random slopes in R

Required:
  • (McElreath 2020, sec. 13.4)

Due:
  • WS4 peer assessment

Mon, Mar 24
Lecture topics:
  • Covariance of coefficients and the LKJ prior
In-class lab:
  • Specifying LKJ priors in brms

Required:
  • (McElreath 2020, secs. 14.1–14.2)

Wed, Mar 26
Lecture topics:
  • Two-level models in detail
In-class lab:
  • Two-level model with lme4 and brms

Required:
  • (McElreath 2020, secs. 14.3–14.4)

Due:
  • Worksheet 5

Mon, Mar 31
Lecture topics:
  • Multilevel GLM and R formula specification
In-class lab:
  • GMLM with lme4 and brms

Due:
  • WS5 peer assessment

Building more complex models

Wed, Apr 2
Lecture topics:
  • Time series, three-level, and non-nested models
Required:
  • (McElreath 2020, sec. 16.4)

Mon, Apr 7
Lecture topics:
  • Wildcard (students’ choice)

Presentations

Wed, Apr 9

Student presentations

References

McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2nd ed. Boca Raton: Chapman & Hall/CRC.
Betancourt, Michael. 2019. “Ordinal Regression.” May 2019. https://betanalpha.github.io/assets/case_studies/ordinal_regression.html.