My notes on statistics
Hi! Thank you for your visit. Below, you will find an index of my notes on statistics. This is how I’ve learned and keep learning. If you learn the same way as I do, maybe you will find that they save you time. They are the product of my self-study of statistics. First, a little about these pages:
Purposes
The main purposes behind my notes on statistics:
Study platform: I expect my notes on statistics to be the platform for learning and improving until I get more “fluent” in R programming.
Platform for practical cases: Over the next months, I will be working out practical exercises with step-by-step solutions. From these, I will be linking to ‘My notes on statistics’ for more details on the respective statistical disciplines.
Platform for dialogue: I hope that ‘My notes on statistics’ can be a platform for dialogue with you, and I invite you to use the comment fields provided on each page. Thank you for any constructive comment.
Self-promotion: The intention behind my self-study of statistics is to be able to make a living from it in the long run. One of the aims of my site, dataZ4s.com, is to support this intention.
Learning center: Maybe my notes can be a learning platform for other self-learners. My plan is to gather a larger collection of exercises that I will solve and upload with step-by-step solutions. Maybe this material will one day grow and improve enough to help others save time and contribute to learning from their hubs.
‘Learning statistics’ paragraph
Hopefully, these pages can help other students and professionals save time. Most pages have a ‘Learning statistics’ paragraph, which is a collection of learning material that I have found helpful. The lists are mainly composed of shorter videos and, as you will see, my top learning places have been Khan Academy, JBstatistics, Statisticsfun and the Technical University of Denmark (DTU), but many other sources have been greatly helpful too and are listed in ‘Learning statistics’.
‘Subject in Excel’ paragraph
I had hoped to start out in R programming. But I soon found out that, at my beginner’s level, I was spending more time studying R than statistics. Instead, and until I improve my R skills (soon!), I have worked out a temporary Excel environment for all the statistical calculations. You will see it on almost all pages, in the paragraph “Subject in Excel”. For example: “Confidence intervals in Excel”.
Sorry about LaTeX and graphs
Formulas and expressions are written with LaTeX. I did consider GitHub and Hugo as a web platform, and maybe I should have chosen that. I really wished I had when I found out that LaTeX is not supported by my current WordPress theme.
I am using RMarkdown for mathematical and statistical formulas and expressions. I then copy/paste them into PowerPoint, edit them, save them as JPEG and then upload them to my page (oh, dear!). And here I am, promoting myself on workflow! But soon R will get me out of the mud. Same story for my graphs. The ones that I’ve not been able to create in R or Excel, I have drawn by hand in PowerPoint. I think most of them serve their illustrative purpose well, though. What do you think?
Summer 2020
Over the coming months (May-August), I will be adding the relevant R code to the relevant pages. I will do this alongside working out batches of exercises with, as mentioned, step-by-step solutions. They will be attached to the respective note pages. That way, each page will have:
- Text on theory explained through examples
- Videos on theory explained through examples
- Exercises with worked step-by-step solutions (by hand, Excel and R)
Also, I will be adding subjects like:
- sample size calculation
- multiple regression
- time series analysis
- multiple comparison in ANOVA
Thanks for commenting
Your comments will go a long way and I highly value each of them. Each page has a comments field below. Criticize, comment, or just say hello. Thank you so much!
Table of contents
Probability
- Sample space, events and probabilities
- Complement of an event
- Independent events
- Dependent events
- Mutually exclusive events
- Mutually inclusive events
- Permutations
- Combinations
- Conditional probability
- Law of total probability
- Bayes’ Theorem
Summarizing quantitative data
- Mean, median and mode
- Interquartile range (IQR)
- Variance and standard deviation of a population
- Variance and standard deviation of a sample
Discrete Probability Distribution
- Discrete vs. continuous random variables
- Discrete probability distributions
- Mean, variance and standard deviation
- Mean of sum & difference
- The binomial distribution
- Poisson distribution
- The geometric distribution
- Hypergeometric distribution
Modelling data distributions
- Continuous vs. discrete data
- Density curves
- Significance level
- Critical value
- Z-score
- The p-value
- The Central Limit Theorem
- Skewness and kurtosis
The Normal Distribution
Study design
Confidence intervals
Hypothesis Testing
- Hypothesis testing
- One-tailed tests
- Two-tailed tests
- Proportion hypothesis testing
- Hypothesis test for a mean
- Statistical power
- Power of test calculation
- Chi-square Goodness of Fit Test
Simple linear regression, fundamentals
- Scatter plots
- Correlation coefficient
- Regression line
- Squared errors of line
- Coefficient of determination, r²
Simple linear regression, Inference
- Inference about regression
- The LINER model
- Residual plots
- Standard error of the slope
- Confidence interval for the slope
- Hypothesis test for the slope
- Mean and single response intervals
- Influential points
- Precautions in simple linear regression
- Transformation of data
Two-sample inference for the dif. between groups
ANOVA and the F-distribution

Carsten Grube
Freelance Data Analyst

+34 616 71 29 85
Call me

Spain: Ctra. 404, km 2, 29100 Coín, Malaga
Denmark: c/o Musvitvej 4, 3660 Stenløse
Drop me a line
What are you working on just now? Can I help you, and can you help me?
About me
Learning statistics. Doing statistics. Freelance since 2005. Dane. Living in Spain. With my Spanish wife and two children.
What they say
20 years in sales, analysis, journalism and startups. See what my customers and partners say about me.