The Regression Cookbook (in development)

Authors

G. Alexi Rodríguez-Arelis

Andy Tai

Ben Chen

Published

October 21, 2025

Abstract
This book aims to establish common ground between machine learning and statistics on linear regression techniques, using Python and R, from two perspectives: inference and prediction.

Preface

Let the regression cooking begin!

Data science is a field in which we become aware of the fascinating overlap between machine learning and statistics. Data science students often come across everyday concepts in machine learning and statistics that differ only in name. For instance, even a simple term such as weights in supervised learning (whose statistical counterpart is regression coefficients) can be misleading for students starting their data science training. On the other hand, from the perspective of an instructor in a data science program that splits its courses between machine learning in Python and statistics in R, regression courses taught in R also call for Python packages as alternative tools. In a graduate program such as the Master of Data Science (MDS) at the University of British Columbia (UBC), this is especially critical for students whose career plans lean towards the industry job market, where Python is more heavily used.

Image by Manfred Stege via Pixabay.

That said, data science is a substantial synergy between machine learning and statistics. Nevertheless, many gaps between the two disciplines still need to be addressed, and closing them is imperative in a domain growing as quickly as data science. In this regard, the MDS Stat-ML dictionary has inspired us to write this textbook, which lays out common ground between foundational supervised learning models from machine learning and regression models commonly used in statistics. As a first step, we explore linear modelling approaches while explaining the different terminology found in both fields. This discussion goes beyond a simple conceptual exploration, so the second step is hands-on practice with the corresponding Python packages for machine learning and R packages for statistics.

Fun fact!

While thinking about possible names for this work, we planned to call it “Machine Learning and Statistics: A Common Ground.” However, that title was quite plain and boring! The whole textbook idea sounded analogous to a cookbook [1], given its heavily applied focus with theoretical sparks.

Hence, the cookbook name idea!

Acknowledgments

This textbook is shared as an Open Educational Resource (OER) using Quarto and is hosted on GitHub. One of its core goals is to motivate global academic communities to access, reuse, and customize its content to meet their specific needs. In the field of data science education, where openness, reproducibility, and collaboration are key, OER materials serve as a powerful means of enhancing learning and promoting equity. By making this statistical resource freely available and modifiable, we support a flexible, student-centered approach that reflects the open-source spirit of the field. Learners can engage with real code, adapt examples to their interests, and contribute back to the community.

The development of this resource is funded by the UBC Vancouver (UBCV) OER Fund, which was launched in 2019 through the UBC Academic Excellence Fund. This initiative aims to support more affordable and inclusive education by promoting the creation and integration of open resources in UBCV credit courses. With this support, we are able to design a resource that reduces textbook costs while enhancing pedagogical freedom and adaptability. Both instructors and students benefit from materials that can evolve alongside curriculum changes, disciplinary innovations, and learner diversity. This cookbook is part of a growing movement toward open, collaborative, and responsive teaching in higher education.

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.



  1. Special thanks to Jonathan Graves, who mentioned the term "cookbook" when this textbook was being conceptualized in its very early stages.