Tuesday, August 25, 2020

Module 2 Machine Learning

Module – 2 Decision Tree Learning: Decision tree representation, Appropriate problems for decision tree learning, Basic decision tree learning algorithm, Hypothesis space search in decision tree learning, Inductive bias in decision tree learning, Issues in decision tree learning.

Text Book 1, Sections: 3.1-3.7

Module 2 Notes

Module 2 PPT

Module 2 Question Bank 


Tuesday, August 18, 2020

PPT for Module 1

 

Module – 1 Introduction: Well posed learning problems, Designing a Learning system, Perspectives and Issues in Machine Learning. Concept Learning: Concept learning task, Concept learning as search, Find-S algorithm, Version space, Candidate Elimination algorithm, Inductive Bias.

Text Book 1, Sections: 1.1-1.3, 2.1-2.5, 2.7. 10 Hours

Module 1 Notes

Module 1 PPT

Module 1 Question Bank

Sunday, August 9, 2020

MACHINE LEARNING

[As per Choice Based Credit System (CBCS) scheme]

(Effective from the academic year 2017 - 2018)

SEMESTER – VII
Subject Code: 17CS73                  IA Marks: 40
Number of Lecture Hours/Week: 03      Exam Marks: 60
Total Number of Lecture Hours: 50     Exam Hours: 03
CREDITS – 04

Module – 1 Introduction: Well posed learning problems, Designing a Learning system, Perspectives and Issues in Machine Learning. Concept Learning: Concept learning task, Concept learning as search, Find-S algorithm, Version space, Candidate Elimination algorithm, Inductive Bias.

Text Book 1, Sections: 1.1-1.3, 2.1-2.5, 2.7. 10 Hours


Module – 2 Decision Tree Learning: Decision tree representation, Appropriate problems for decision tree learning, Basic decision tree learning algorithm, Hypothesis space search in decision tree learning, Inductive bias in decision tree learning, Issues in decision tree learning.

Text Book 1, Sections: 3.1-3.7. 10 Hours


Module – 3 Artificial Neural Networks: Introduction, Neural Network representation, Appropriate problems, Perceptrons, Backpropagation algorithm.

Text Book 1, Sections: 4.1-4.6. 08 Hours


Module – 4 Bayesian Learning: Introduction, Bayes theorem, Bayes theorem and concept learning, Maximum likelihood and least-squared error hypotheses, Maximum likelihood for predicting probabilities, Minimum description length principle, Naive Bayes classifier, Bayesian belief networks, EM algorithm

Text Book 1, Sections: 6.1-6.6, 6.9, 6.11, 6.12. 10 Hours


Module – 5 Evaluating Hypotheses: Motivation, Estimating hypothesis accuracy, Basics of sampling theory, General approach for deriving confidence intervals, Difference in error of two hypotheses, Comparing learning algorithms. Instance Based Learning: Introduction, k-nearest neighbor learning, locally weighted regression, radial basis functions, case-based reasoning. Reinforcement Learning: Introduction, Learning task, Q learning

Text Book 1, Sections: 5.1-5.6, 8.1-8.5, 13.1-13.3. 12 Hours


Text Books:

1. Tom M. Mitchell, Machine Learning, India Edition 2013, McGraw Hill Education.

Reference Books:

1. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, 2nd edition, Springer Series in Statistics.

2. Ethem Alpaydın, Introduction to Machine Learning, 2nd edition, MIT Press.


Prerequisites for Machine Learning


Coding Capabilities: 

The ability to translate logical statements into code goes a long way for an ML practitioner.

Most of the open-source libraries, especially the data science libraries, are available in Python and R.

A good knowledge of Python can therefore shorten the learning curve.

Some of the important Python packages are:

  1. TensorFlow (parallel and distributed computation for machine learning and deep learning)

  2. NumPy (efficient matrix computations)

  3. OpenCV (computer vision and image processing library with Python bindings)

Useful IDEs and development environments include:

  1. RStudio (for R)

  2. PyCharm

  3. IPython/Jupyter Notebook

  4. Spyder

  5. Anaconda

  6. Rodeo

  7. Google Colab
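For a taste of the vectorized style these packages enable, here is a minimal NumPy sketch (the array contents are invented purely for illustration):

```python
import numpy as np

# Two pure-Python lists of a million numbers each
a = list(range(1_000_000))
b = list(range(1_000_000))

# Element-wise sum with an explicit Python loop
slow = [x + y for x, y in zip(a, b)]

# The same computation, vectorized: one call into optimized C code
fast = np.asarray(a) + np.asarray(b)

print(fast[:3])  # [0 2 4]
```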


Online environments to learn Python:

  1. FOSS, IIT Bombay

  2. Spoken Tutorials, IIT Bombay

  3. MIT's open course on Python (free, on MIT OCW)

  4. Python Documentation


Online environments for deep learning:

  1. Google Colab

  2. AWS

  3. IBM Bluemix

  4. Microsoft Azure

  5. MLflow


Programming languages commonly used for machine learning:

  1. Python

  2. R

  3. MATLAB

  4. Octave

  5. Julia

  6. C++

  7. C

Algorithms and Data Structures:

Data structures are models for organizing data, designed for efficiency in both memory and running time. Knowing how to organize and handle data can speed up processing, and it helps in designing better, faster algorithms, whether for pre-processing the data or for the learning algorithm itself.
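As a small illustration (the record IDs are synthetic and the timings machine-dependent), compare membership lookups in a list against a hash-based set:

```python
import random
import time

# 100,000 synthetic record IDs, as might appear while cleaning a dataset
ids = [random.randrange(10_000_000) for _ in range(100_000)]
queries = random.sample(ids, 1_000)

# Membership via list: every lookup is a linear O(n) scan
t0 = time.perf_counter()
hits = sum(q in ids for q in queries)
list_time = time.perf_counter() - t0

# Membership via set: every lookup is an O(1) average hash probe
id_set = set(ids)
t0 = time.perf_counter()
hits = sum(q in id_set for q in queries)
set_time = time.perf_counter() - t0

print(f"list: {list_time:.3f}s  set: {set_time:.6f}s  hits: {hits}")
```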

Calculus:

The heart of neural networks is the backpropagation algorithm, which rests entirely on differentiation, in particular the chain rule. A basic grounding in calculus therefore helps in understanding the training process.
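A minimal sketch of that idea, using a single sigmoid neuron with made-up numbers; the chain rule carries the error back to each parameter:

```python
import math

# One sigmoid neuron: y = sigmoid(w*x + b), squared-error loss L = (y - t)^2
w, b = 0.5, 0.1          # illustrative initial parameters
x, t = 2.0, 1.0          # one training example: input and target

# Forward pass
z = w * x + b
y = 1.0 / (1.0 + math.exp(-z))

# Backward pass: chain rule, dL/dw = (dL/dy) * (dy/dz) * (dz/dw)
dL_dy = 2.0 * (y - t)
dy_dz = y * (1.0 - y)    # derivative of the sigmoid
dL_dw = dL_dy * dy_dz * x
dL_db = dL_dy * dy_dz

# One gradient-descent update
lr = 0.1
w -= lr * dL_dw
b -= lr * dL_db
print(f"updated w = {w:.4f}, b = {b:.4f}")
```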

Linear Algebra:

Linear algebra is important because the data we deal with is multi-dimensional. For example, when predicting the price of a house, the dimensions include location, area, available facilities, and so on, and matrices are the natural way to handle such higher-dimensional data. Gilbert Strang teaches the subject in a wonderfully fluid manner, so we strongly recommend his MIT OCW course, available on YouTube.
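A small sketch of the house example, with invented features and weights, showing how one matrix-vector product scores every house at once:

```python
import numpy as np

# Three houses, two numeric features each: area (sq. ft.) and room count
X = np.array([[1200.0, 3.0],
              [1500.0, 4.0],
              [ 900.0, 2.0]])

# Invented weights for a linear model: price = X @ w
w = np.array([150.0, 10_000.0])

prices = X @ w           # one matrix-vector product handles all houses
print(prices)            # [210000. 265000. 155000.]
```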

Statistics:

A basic understanding of the mean, median, and mode, and of common probability distributions, especially the Gaussian distribution, is useful: most real-world data can be modelled by these distributions, which simplifies the data down to a small number of parameters.
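A brief sketch, using synthetic data, of how a Gaussian model compresses many observations into just two parameters:

```python
import numpy as np

# 10,000 synthetic measurements standing in for real-world data
data = np.random.normal(loc=170.0, scale=8.0, size=10_000)

# Modelling the sample as Gaussian reduces 10,000 numbers to 2 parameters
mu, sigma = data.mean(), data.std()
print(f"mean ~ {mu:.1f}, std ~ {sigma:.1f}")  # close to 170 and 8
```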

Vector Algebra:

Code of Ethics