
# Machine Learning

This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs, VC theory, large margins); reinforcement learning and adaptive control. The course also discusses recent applications of machine learning, such as robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.

**Prerequisites**

Students are expected to have the following background: knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program; familiarity with probability theory; familiarity with linear algebra.

**What you'll learn**

**01: Introduction**

- Introduction to the course

- What is machine learning?

- Supervised learning - introduction

- Unsupervised learning - introduction

**02: Regression Analysis and Gradient Descent**

- Linear Regression

- Linear regression - implementation (cost function)

- A deeper insight into the cost function - simplified cost function

- Gradient descent algorithm

- Fixed learning rate: no need to change alpha over time

- Linear regression with gradient descent
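As a rough illustration of the gradient descent items above, here is a minimal sketch of batch gradient descent for single-variable linear regression with a squared-error cost. The function name `gradient_descent`, the toy data, and the values chosen for `alpha` and `iterations` are assumptions made for this example, not material from the course. Alpha stays fixed throughout: the gradient shrinks as the parameters approach the minimum, so the steps shrink on their own.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent.

    Minimizes the squared-error cost J = (1 / (2m)) * sum((h(x) - y)^2).
    alpha is held constant for every iteration.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction error for every training example
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed before either change
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

On this toy data the fitted parameters should approach an intercept of 1 and a slope of 2. A much larger alpha would make the updates diverge instead.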

**03: Linear Algebra - review**

- Matrices - overview

- Vectors - overview

- Matrix manipulation

- Implementation/use

- Matrix multiplication properties

- Inverse and transpose operations

**04: Linear Regression with Multiple Variables**

- Linear regression with multiple features

- Gradient descent for multiple variables

- Gradient descent in practice 1: Feature scaling

- Learning rate α

- Features and polynomial regression

- Normal equation

**05: Logistic Regression**

- Classification

- Hypothesis representation

- Decision boundary

- Non-linear decision boundaries

- Cost function for logistic regression

- Simplified cost function and gradient descent

- Advanced optimization

- Multiclass classification problems

**06: Regularization**

- The problem of overfitting

- Cost function optimization for regularization

- Regularized linear regression

- Regularization with the normal equation

- Advanced optimization of regularized linear regression

**07: Neural Networks - Representation**

- Neural networks - overview and summary

- Model representation I

- Model representation II

- Neural network example

- Computing a complex, nonlinear function of the input

- Multiclass classification

**08: Neural Networks - Learning**

- Neural network cost function

- Summary of what's about to go down

- Back propagation algorithm

- Back propagation intuition

- Implementation notes

- Unrolling parameters (matrices)

- Gradient checking

- Random initialization

- Putting it all together

**09: Advice for applying machine learning techniques**

- Deciding what to try next

- Evaluating a hypothesis

- Model selection and training/validation/test sets

- Diagnosis

- Bias vs. variance

- Regularization and bias/variance

- Learning curves

**10: Machine Learning System Design**

- Machine learning systems design

- Prioritizing what to work on

- Spam classification example

- Error metrics for skewed classes

- Trading off precision and recall

- Data for machine learning

**11: Support Vector Machines**

- Support Vector Machine (SVM)

- Optimization objective

- Large margin intuition

- Large margin classification mathematics (optional)

- Kernels I: Adapting SVM to non-linear classifiers

- Kernels II

**12: Clustering**

- Unsupervised learning - introduction

- K-means algorithm

- K-means optimization objective

- How do we choose the number of clusters?

**13: Dimensionality Reduction**

- Motivation 1: Data compression

- Motivation 2: Visualization

- Principal Component Analysis (PCA): Problem Formulation

- PCA Algorithm

- Reconstruction from Compressed Representation

- Choosing the number of Principal Components

- Advice for Applying PCA

**14: Anomaly Detection**

- Anomaly detection - problem motivation

- The Gaussian distribution (optional)

- Anomaly detection algorithm

- Developing and evaluating an anomaly detection system

- Anomaly detection vs. supervised learning

- Choosing features to use

- Multivariate Gaussian distribution

- Applying multivariate Gaussian distribution to anomaly detection

**15: Recommender Systems**

- Recommender systems - introduction

- Content based recommendation

- Collaborative filtering - overview

- Collaborative filtering Algorithm

- Vectorization: Low rank matrix factorization

- Implementation detail: Mean Normalization

**16: Large Scale Machine Learning**

- Learning with large datasets

- Stochastic Gradient Descent

- Mini Batch Gradient Descent

- Stochastic gradient descent convergence

- Online learning

- MapReduce and data parallelism

**17: Application Example - Photo OCR**

- Problem description and pipeline

- Sliding window image analysis

**18: Course Summary**