**Statistics for Data Science and Business Analysis**

What you’ll learn

- Understand the fundamentals of statistics
- Learn how to work with different types of data
- Plot different types of data
- Calculate the measures of central tendency, asymmetry, and variability
- Calculate correlation and covariance
- Distinguish and work with different types of distributions
- Estimate confidence intervals
- Perform hypothesis testing
- Make data-driven decisions
- Understand the mechanics of regression analysis
- Carry out regression analysis
- Use and understand dummy variables
- Understand the statistical concepts that data science requires, even when working in Python and R

**Introduction**

1 What does the course cover?

2 Download all resources

**Sample or population data?**

3 Understanding the difference between a population and a sample

**The fundamentals of descriptive statistics**

4 The various types of data we can work with

5 Levels of measurement

6 Categorical variables. Visualization techniques for categorical variables

7 Categorical variables. Visualization techniques. Exercise

8 Numerical variables. Using a frequency distribution table

9 Numerical variables. Using a frequency distribution table. Exercise

10 Histogram charts

11 Histogram charts. Exercise

12 Cross tables and scatter plots

13 Cross tables and scatter plots. Exercise

**Measures of central tendency, asymmetry, and variability**

14 The main measures of central tendency: mean, median and mode

15 Mean, median and mode. Exercise

16 Measuring skewness

17 Skewness. Exercise

18 Measuring how data is spread out: calculating variance

19 Variance. Exercise

20 Standard deviation and coefficient of variation

21 Standard deviation and coefficient of variation. Exercise

22 Calculating and understanding covariance

23 Covariance. Exercise

24 The correlation coefficient

25 Correlation coefficient

**Practical example: descriptive statistics**

26 Practical example

27 Practical example: descriptive statistics

**Distributions**

28 Introduction to inferential statistics

29 What is a distribution?

30 The Normal distribution

31 The standard normal distribution

32 Standard normal distribution. Exercise

33 Understanding the central limit theorem

34 Standard error

**Estimators and estimates**

35 Working with estimators and estimates

36 Confidence intervals – an invaluable tool for decision making

37 Calculating confidence intervals within a population with a known variance

38 Confidence intervals. Population variance known. Exercise

39 Confidence interval clarifications

40 Student’s t-distribution

41 Calculating confidence intervals within a population with an unknown variance

42 Population variance unknown. T-score. Exercise

43 What is a margin of error and why is it important in statistics?

**Confidence intervals: advanced topics**

44 Calculating confidence intervals for two means with dependent samples

45 Confidence intervals. Two means. Dependent samples. Exercise

46 Calculating confidence intervals for two means with independent samples (Part 1)

47 Confidence intervals. Two means. Independent samples (Part 1). Exercise

48 Calculating confidence intervals for two means with independent samples (Part 2)

49 Confidence intervals. Two means. Independent samples (Part 2). Exercise

50 Calculating confidence intervals for two means with independent samples (Part 3)

**Practical example: inferential statistics**

51 Practical example: inferential statistics

52 Practical example: inferential statistics

**Hypothesis testing: Introduction**

53 The null and the alternative hypothesis

54 Further reading on null and alternative hypotheses

55 Establishing a rejection region and a significance level

56 Type I error vs Type II error

**Hypothesis testing: Let’s start testing!**

57 Test for the mean. Population variance known

58 Test for the mean. Population variance known. Exercise

59 What is the p-value and why is it one of the most useful tools for statisticians?

60 Test for the mean. Population variance unknown

61 Test for the mean. Population variance unknown. Exercise

62 Test for the mean. Dependent samples

63 Test for the mean. Dependent samples. Exercise

64 Test for the mean. Independent samples (Part 1)

65 Test for the mean. Independent samples (Part 1)

66 Test for the mean. Independent samples (Part 2)

67 Test for the mean. Independent samples (Part 2). Exercise

**Practical example: hypothesis testing**

68 Practical example: hypothesis testing

**The fundamentals of regression analysis**

69 Introduction to regression analysis

70 Correlation and causation

71 The linear regression model made easy

72 What is the difference between correlation and regression?

73 A geometrical representation of the linear regression model

74 A practical example – Reinforced learning

**Subtleties of regression analysis**

75 Decomposing the linear regression model – understanding its nuts and bolts

76 What is R-squared and how does it help us?

77 The ordinary least squares setting and its practical applications

78 Studying regression tables

79 Regression tables. Exercise

80 The multiple linear regression model

81 The adjusted R-squared

82 The adjusted R-squared

83 What does the F-statistic show us and why do we need to understand it?

**Assumptions for linear regression analysis**

84 OLS assumptions

85 A1. Linearity

86 A2. No endogeneity

87 A3. Normality and homoscedasticity

88 A4. No autocorrelation

89 A5. No multicollinearity

**Dealing with categorical data**

90 Dummy variables

**Practical example: regression analysis**

91 Practical example: regression analysis

**Bonus lecture**

92 Bonus lecture: Next steps

