
# Data Science

Data science, also known as data-driven science, is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from structured or unstructured data, much as data mining does.

## 1. What is Data Science

1. Demand of Data Science

2. Venn Diagram

3. Pipeline

4. Roles

5. Team

6. Knowledge Check

## 2. Field of study

1. Big Data overview

2. Programming involvement in Data Science

3. Statistics

4. Knowledge check

## 3. Ethics

1. Ethical issues

2. Knowledge check

## 4. Data Sources (Getting Data)

1. Data Metrics

2. Existing data

3. APIs

4. Scraping

5. Creating Data

6. Knowledge check

## 5. Data Exploration (Cleaning Data)

1. Exploratory graphs

2. Exploratory statistics

3. Knowledge check

## 6. Programming

2. R programming

3. Python

4. SQL

5. Web formats

6. Knowledge check

## 7. Mathematics

1. Algebra

2. Systems of equations

3. Calculus

4. Big O

5. Bayes probability

6. Knowledge check

## 8. Applied Statistics

1. Hypothesis

2. Confidence

3. Problems

4. Validating

5. Knowledge check

## 9. Machine Learning

1. Linear Regression with one and multiple variables.

Linear regression predicts a real-valued output based on an input value. We discuss the application of linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning.

1. Cost function

3. Normal Equations
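The three ideas above (cost function, gradient descent, normal equations) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it uses a single input feature, a tiny made-up housing data set, and hypothetical function names (`cost`, `gradient_descent`, `normal_equation`) that do not come from the course material. The learning rate and iteration count are also arbitrary choices.

```python
# Illustrative data only (not real prices): house size -> price,
# generated from price = 1 + 2 * size so the answer is easy to check.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]


def predict(theta0, theta1, x):
    """Hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum((h(x) - y)^2)."""
    m = len(xs)
    return sum((predict(theta0, theta1, x) - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent on the squared-error cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [predict(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


def normal_equation(xs, ys):
    """Closed-form least-squares fit; with one feature the normal
    equations reduce to slope = Sxy / Sxx."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


gd_theta0, gd_theta1 = gradient_descent(sizes, prices)
ne_theta0, ne_theta1 = normal_equation(sizes, prices)
```

On this toy data both approaches should agree closely: gradient descent approaches the minimum step by step, while the closed-form solution computes it directly without choosing a learning rate.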

What if your input has more than one value? Linear regression can also be extended to accommodate multiple input features.

1. Logistic regression. Logistic regression is a method for classifying data into discrete outcomes.

1. Cost Function
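As a rough illustration of the logistic regression cost function, the sketch below fits a one-feature classifier with gradient descent on the cross-entropy (log-loss) cost. Everything specific here is an assumption made for the example: the hours-studied data is invented, the function names (`sigmoid`, `predict_proba`, `cost`, `fit`) are hypothetical, and the learning rate and iteration count are arbitrary.

```python
import math

# Invented data: hours studied -> passed (1) or failed (0).
# The classes overlap slightly, so the fitted parameters stay finite.
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta0, theta1, x):
    """Estimated probability that the label is 1 for input x."""
    return sigmoid(theta0 + theta1 * x)


def cost(theta0, theta1, xs, ys):
    """Cross-entropy cost: mean of -[y*log(p) + (1-y)*log(1-p)]."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict_proba(theta0, theta1, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / m


def fit(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent on the cross-entropy cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [predict_proba(theta0, theta1, x) - y
                  for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


initial_cost = cost(0.0, 0.0, hours, passed)
theta0, theta1 = fit(hours, passed)
final_cost = cost(theta0, theta1, hours, passed)
```

The gradient has the same form as in linear regression, with the sigmoid applied to the prediction. A probability above 0.5 from `predict_proba` would then be read as the positive class.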

1. Neural Networks. Neural networks are a model inspired by how the brain works. They are widely used today in many applications: when your phone interprets and understands your voice commands, it is likely that a neural network is helping to understand your speech.

1. Backpropagation

2. Applications of neural networks

1. Support Vector Machines (SVM). Support vector machines, or SVMs, are a machine learning algorithm for classification. We introduce the idea and intuitions behind SVMs and discuss how to use them in practice.

1. Large Margin classification

2. Kernels

1. Unsupervised learning

1. Clustering

2. Gaussian Mixture Models

3. Hidden Markov Models (HMM)

## 10. R Programming

1. Writing code and setting your working directory

2. Getting started and R nuts and bolts

1. R console Input and evaluation

2. Data types – R Objects and attributes

3. Data types – Vectors and Lists

4. Data types – Matrices

5. Data types – Factors

6. Data types – Missing values

7. Data types – Data frames

8. Data types – Names attribute

9. Data types – Summary

12. Textual data formats

13. Connections: Interfaces to the outside world

14. Subsetting – Basics

15. Subsetting – Lists

16. Subsetting – Matrices

17. Subsetting – Partial matching

18. Subsetting – Removing missing values

19. Vectorized Operations

## 11. Communicating

1. Interpretability

2. Actionable insights

3. Visualization for presentation

4. Reproducible research

5. Knowledge check

Conclusion and final test