Incorporating Causal Inference in Statistics Courses

Agenda

  1. The team
  2. Why is the topic important?
  3. Initial list of potential outputs
  4. Potential overall structure of course content
  5. Challenges ahead
  6. Reference material

1. The team

2. Why is the topic important?

“The goal of causal inference is to estimate the effect in a population of intervening on one variable, the treatment, on another variable, the outcome.” (Cummiskey et al. 2020)

  • In introductory courses, we must go beyond the usual warning that “correlation does not imply causation”
  • What is the underlying process that is generating the data?

2. Why is the topic important?

  • Carver et al. (2016) indicate that students should understand that statistics is a problem-solving and decision-making process that is fundamental to scientific inquiry and essential for making sound decisions
  • Most importantly, students should be skeptical when drawing causal conclusions from observational data (Cummiskey et al. 2020)

The Directed Acyclic Graph (DAG)

  • DAGs are handy tools to depict causality
  • Statisticians and subject-matter experts can work together to build them!

DAGs from Cummiskey et al. (2020).
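
As a minimal sketch of how a DAG can be written down and checked, the Python snippet below stores each variable together with its direct causes and uses the standard-library `graphlib` module to confirm that the graph has no cycles. The variable names (`confounder`, `treatment`, `outcome`) and the helper `causal_order` are illustrative assumptions; they are not the DAGs shown in Cummiskey et al. (2020).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical three-node DAG: each variable maps to the set of its
# direct causes (its parents in the graph).
dag = {
    "confounder": set(),
    "treatment": {"confounder"},
    "outcome": {"treatment", "confounder"},
}


def causal_order(parents):
    """Return the variables ordered so that causes precede effects.

    Raises ValueError if the graph contains a cycle, i.e. it is not a DAG.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as err:
        # The second element of args holds the nodes forming the cycle.
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err


order = causal_order(dag)
print(order)
```

A graph in which two variables cause each other, such as `{"a": {"b"}, "b": {"a"}}`, makes `causal_order` raise `ValueError`, which is one way to show students why the "acyclic" requirement matters.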

3. Initial list of potential outputs

  • Outputs are geared toward developing an undergraduate course (3rd- or 4th-year level) spread across 13 weeks:
    • Overall structure of course content
    • Lesson plan
    • Learning objectives
  • Two initial fundamental pillars: experimentation and quasi-experimentation
  • A third and more advanced pillar: observational studies

4. Potential overall structure of course content

  • There are four blocks that start with an introduction to DAGs
  • Then, we go from full control to no control over the study treatments
  • There is a stronger emphasis on data science-related applications such as A/B testing
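
To illustrate the "full control" end of that spectrum, here is a small simulated A/B test, offered as a sketch under stated assumptions rather than course material. The function name `simulate_ab_test`, the conversion rates, and the sample size are all hypothetical. Because users are assigned to an arm at random, the plain difference in conversion rates estimates the causal effect of variant B.

```python
import random
from statistics import mean


def simulate_ab_test(n_users=100_000, base_rate=0.10, lift=0.02, seed=42):
    """Simulate a randomized A/B test with a binary conversion outcome.

    base_rate and lift are assumed values: variant A converts at
    base_rate, variant B at base_rate + lift.
    """
    rng = random.Random(seed)
    outcomes = {"A": [], "B": []}
    for _ in range(n_users):
        # Randomization: assignment does not depend on any user trait.
        arm = rng.choice(["A", "B"])
        p = base_rate + (lift if arm == "B" else 0.0)
        outcomes[arm].append(1 if rng.random() < p else 0)
    return outcomes


outcomes = simulate_ab_test()
# Difference in mean conversion between arms estimates the true lift.
estimate = mean(outcomes["B"]) - mean(outcomes["A"])
print(f"estimated lift: {estimate:.4f} (true lift: 0.02)")
```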

A mind map going clockwise!

mindmap
  root(Course in Causal Inference)
    {{BLOCK 3:<br/>Quasi experimentation}}
    {{BLOCK 2:<br/>Experimentation}}
    {{BLOCK 4:<br/>Observational studies}}
    {{BLOCK 1:<br/>Introduction to DAGs}}

5. Challenges ahead

  • Is there room within the course to include Bayesian thinking?
  • How smooth would it be to introduce the fundamentals of causality via DAGs and Bayesian networks? (Lu, Zheng, and Quinn 2023)

5. Challenges ahead

  • What are the pros and cons of having regression analysis as a prerequisite?
  • How theoretical should the course be? Lübke et al. (2020) provide a fair simulation-based starting point via generative modelling and regression analysis
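
A simulation-based exercise in this spirit could look like the sketch below. It is not taken from Lübke et al. (2020); the data-generating process, the coefficients, and the helper names (`slope`, `residuals`, `simulate`) are assumptions made for illustration. A confounder Z drives both X and Y while X has no effect on Y, so a naive regression of Y on X finds a spurious slope. Adjusting for Z removes it; the adjustment uses the Frisch-Waugh-Lovell result that the coefficient on X in a regression of Y on X and Z equals the slope between the two sets of residuals after regressing each on Z.

```python
import random


def slope(x, y):
    """Least-squares slope from a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=20_000, true_effect=0.0, seed=1):
    """Generate data from an assumed DAG: Z -> X, Z -> Y, X -> Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]             # confounder
    x = [zi + rng.gauss(0, 1) for zi in z]              # treatment
    y = [true_effect * xi + 2 * zi + rng.gauss(0, 1)    # outcome
         for xi, zi in zip(x, z)]
    return z, x, y


z, x, y = simulate()  # true_effect=0.0: X does not cause Y
naive = slope(x, y)   # ignores Z, so it absorbs the confounding
adjusted = slope(residuals(z, x), residuals(z, y))  # controls for Z
print(f"naive slope: {naive:.2f}, adjusted slope: {adjusted:.2f}")
```

Under these assumed coefficients the naive slope should land near 1 and the adjusted slope near 0, which gives students a concrete case where the regression output and the true causal effect disagree.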

5. Challenges ahead

  • Using different textbooks to develop course material such as:
    • “Causal Inference: What If” by Hernán (2024), a review of causal inference with (i.e., parametric) or without (i.e., non-parametric) models
    • “A First Course in Causal Inference” by Ding (2024), a review of randomized and observational studies
    • “Causal Inference: The Mixtape” by Cunningham (2021), which provides insights on regression discontinuity for quasi-experimentation

Questions?

6. Reference material

Carver, R., M. Everson, J. Gabrosek, N. Horton, R. Lock, M. Mocko, A. Rossman, et al. 2016. Guidelines for Assessment and Instruction in Statistics Education: College Report 2016. American Statistical Association.
Cummiskey, Kevin, Bryan Adams, James Pleuss, Dusty Turner, Nicholas Clark, and Krista Watts. 2020. “Causal Inference in Introductory Statistics Courses.” Journal of Statistics Education 28 (1): 2–8. https://doi.org/10.1080/10691898.2020.1713936.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. Yale University Press. https://mixtape.scunning.com/.
Ding, P. 2024. A First Course in Causal Inference. Chapman and Hall/CRC. https://doi.org/10.1201/9781003484080.
Hernán, Miguel A. 2024. Causal Inference: What If. Edited by James M. Robins. Boca Raton.
Lu, Yonggang, Qiujie Zheng, and Daniel Quinn. 2023. “Introducing Causal Inference Using Bayesian Networks and Do-Calculus.” Journal of Statistics and Data Science Education 31 (1): 3–17. https://doi.org/10.1080/26939169.2022.2128118.
Lübke, Karsten, Matthias Gehrke, Jörg Horst, and Gero Szepannek. 2020. “Why We Should Teach Causal Inference: Examples in Linear Regression with Simulated Data.” Journal of Statistics Education 28 (2): 133–39. https://doi.org/10.1080/10691898.2020.1752859.