**Data Science and Machine Learning with Python**

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 9 Hours | 2.52 GB

Perform data mining and machine learning efficiently using Python and Spark

Data scientist is one of the most lucrative jobs today. The work involves analyzing large amounts of data and drawing actionable business insights from it using a variety of tools. This course will help you take your first steps in data science and equip you to conduct data analysis and perform efficient machine learning using Python. Gain value from your data using data mining and data analysis techniques in Python, and develop predictive models to forecast future results. You will also learn how to perform large-scale machine learning on Big Data using Apache Spark. You don’t have to be an expert Python coder to get the most out of this course; basic Python programming knowledge is sufficient.

What You Will Learn

- Clean your data and prepare it for analysis
- Implement the popular clustering and regression methods in Python
- Train efficient machine learning models using Decision Trees and Random Forests
- Visualize the results of your analysis using Python’s Matplotlib library

**Getting Started**

01 Introduction

02 Getting What You Need

03 Activity Installing Enthought Canopy

04 Python Basics, Part 1

05 Activity Python Basics, Part 2

06 Running Python Scripts

Code.zip

**Statistics and Probability Refresher, and Python Practice**

07 Types of Data

08 Mean, Median, Mode

09 Activity Using mean, median, and mode in Python

10 Activity Variation and Standard Deviation

11 Probability Density Function; Probability Mass Function

12 Common Data Distributions

13 Activity Percentiles and Moments

14 Activity A Crash Course in matplotlib

15 Activity Covariance and Correlation

16 Exercise Conditional Probability

17 Exercise Solution Conditional Probability of Purchase by Age

18 Bayes’ Theorem

**Predictive Models**

19 Activity Linear Regression

20 Activity Polynomial Regression

21 Activity Multivariate Regression, and Predicting Car Prices

22 Multi-Level Models

**Machine Learning with Python**

23 Supervised vs. Unsupervised Learning, and Train/Test

24 Activity Using Train/Test to Prevent Overfitting a Polynomial Regression

25 Bayesian Methods Concepts

26 Activity Implementing a Spam Classifier with Naive Bayes

27 K-Means Clustering

28 Activity Clustering people based on income and age

29 Measuring Entropy

31 Decision Trees Concepts

32 Activity Decision Trees Predicting Hiring Decisions

33 Ensemble Learning

34 Support Vector Machines (SVM) Overview

35 Activity Using SVM to cluster people using scikit-learn

**Recommender Systems**

36 User-Based Collaborative Filtering

37 Item-Based Collaborative Filtering

38 Activity Finding Movie Similarities

39 Activity Improving the Results of Movie Similarities

40 Activity Making Movie Recommendations to People

41 Exercise Improve the recommender’s results

**More Data Mining and Machine Learning Techniques**

42 K-Nearest-Neighbors Concepts

43 Activity Using KNN to predict a rating for a movie

44 Dimensionality Reduction; Principal Component Analysis

45 Activity PCA Example with the Iris data set

46 Data Warehousing Overview: ETL and ELT

47 Reinforcement Learning

**Dealing with Real-World Data**

48 Activity K-Fold Cross-Validation to avoid overfitting

49 Bias/Variance Tradeoff

50 Data Cleaning and Normalization

51 Activity Cleaning web log data

52 Normalizing numerical data

53 Activity Detecting outliers

**Apache Spark Machine Learning on Big Data**

55 Activity Installing Spark – Part 2

56 Spark Introduction

57 Spark and the Resilient Distributed Dataset (RDD)

58 Introducing MLlib

59 Activity Decision Trees in Spark

60 Activity K-Means Clustering in Spark

61 TF-IDF

62 Activity Searching Wikipedia with Spark

**Experimental Design**

63 A/B Testing Concepts

64 T-Tests and P-Values

65 Activity Hands-on With T-Tests

66 Determining How Long to Run an Experiment

67 A/B Test Gotchas

**You made it**

68 More to Explore

70 Bonus Lecture Discounts on Focused MapReduce and Spark Courses
