Statistics cheat sheet for data scientists pdf

A helpful 5-page data science cheatsheet to assist with exam reviews, interview prep, and anything in-between. It covers over a semester of introductory machine learning, and is based on MIT's Machine Learning courses 6.867 and 15.072. The reader should have at least a basic understanding of statistics and linear algebra, though beginners may find this resource helpful as well.

Inspired by Maverick's Data Science Cheatsheet (hence the 2.0 in the name), located here.

Topics covered:

  • Linear and Logistic Regression
  • Decision Trees and Random Forest
  • SVM
  • K-Nearest Neighbors
  • Clustering
  • Boosting
  • Dimension Reduction (PCA, LDA, Factor Analysis)
  • Natural Language Processing
  • Neural Networks
  • Recommender Systems
  • Reinforcement Learning
  • Anomaly Detection
  • Time Series
  • A/B Testing

This cheatsheet will be occasionally updated with new/improved info, so consider a follow or star to stay up to date.

Future additions (ideas welcome):

  • Time Series Added!
  • Statistics and Probability Added!
  • Data Imputation
  • Generative Adversarial Networks
  • Graph Neural Networks

Links

  • Data Science Cheatsheet 2.0 PDF

Screenshots

Here are screenshots of a couple pages - the link to the full cheatsheet is above!

[Screenshots of two cheatsheet pages]

Why is Python/SQL not covered in this cheatsheet?

I planned for this resource to cover mainly algorithms, models, and concepts, as these rarely change and are common throughout industries. Technical languages and data structures often vary by job function, and refreshing these skills may make more sense on keyboard than on paper.

License

Feel free to share this resource in classes, review sessions, or to anyone who might find it helpful :)

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

List of Data Science Cheatsheets to rule the world.

Table of Contents


Datacamp

Dataquest

Others

Datacamp

- xts (PDF)

RStudio

From @afshinea, @stat110 and @wzchen:

Python

R

Python

R

- H2O (PDF)

Supervised Learning

From @afshinea:

Unsupervised Learning

From @afshinea:

Hacks, tricks and tips

From @afshinea:

Choosing the right model

Neural Nets

R

Python

- Keras (PDF)

From @afshinea:

Python

  • Comprehensive Guide to Data Visualization in Python

R

By @ml874

What is a data science cheat sheet?

A helpful 5-page data science cheatsheet to assist with exam reviews, interview prep, and anything in-between. It covers over a semester of introductory machine learning, and is based on MIT's Machine Learning courses 6.867 and 15.072.

What are the 4 basic elements of statistics?

Sample size, variables required, numerical summary tools, and conclusions are the four elements of a descriptive statistics problem.

What does μ0 mean?

μ0 is the hypothesized value of the population mean in a one-sample t-test, which tests whether the mean of a normally distributed population differs from a specified value. The null hypothesis (H0) states that the population mean equals that value (μ0).
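
To make the role of μ0 concrete, here is a minimal sketch of the one-sample t statistic, assuming the standard form t = (x̄ − μ0) / (s / √n). The function name `one_sample_t_statistic` and the sample values are made up for illustration; they do not come from the cheatsheet.

```python
import math
import statistics


def one_sample_t_statistic(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("sample has zero variance; t is undefined")
    # How many standard errors the sample mean lies from mu0
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Illustrative data only (assumed values)
sample = [5.1, 4.9, 5.3, 5.0, 5.2]
t_stat, df = one_sample_t_statistic(sample, mu0=5.0)
print(f"t = {t_stat:.4f}, df = {df}")
```

A large |t| relative to the t distribution with `df` degrees of freedom is evidence against H0. Turning t into a p-value needs the t distribution's CDF, which this stdlib-only sketch leaves out.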

What are the most important topics in statistics?

Common discrete and continuous distributions, bivariate distributions, conditional probability, and random variables with their expectation and variance.
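
As a small illustration of expectation and variance for a discrete random variable, the sketch below uses a fair six-sided die. The die is an assumed example, not one taken from the source. It applies the definitions E[X] = Σ x·p(x) and Var(X) = Σ (x − E[X])²·p(x) with exact fractions.

```python
from fractions import Fraction

# Fair six-sided die: each outcome has probability 1/6 (assumed example)
outcomes = range(1, 7)
p = Fraction(1, 6)

# E[X] = sum of x * p(x)
expectation = sum(x * p for x in outcomes)

# Var(X) = sum of (x - E[X])^2 * p(x)
variance = sum((x - expectation) ** 2 * p for x in outcomes)

print(expectation, variance)
```

Using `Fraction` keeps the results exact, which makes them easy to check by hand.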