MACHINE LEARNING
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2017 - 2018)
SEMESTER – VII
Subject Code: 17CS73                 IA Marks: 40
Number of Lecture Hours/Week: 03     Exam Marks: 60
Total Number of Lecture Hours: 50    Exam Hours: 03
CREDITS – 04
Module – 1
Introduction: Well-posed learning problems, Designing a learning system, Perspectives and issues in machine learning.
Concept Learning: Concept learning task, Concept learning as search, Find-S algorithm, Version spaces, Candidate Elimination algorithm, Inductive bias.
Text Book 1, Sections: 1.1 – 1.3, 2.1 – 2.5, 2.7    10 Hours
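To make the module concrete, here is a minimal Python sketch of the Find-S algorithm. The attribute tuples and labels below are illustrative, in the spirit of the EnjoySport example in Text Book 1; they are not prescribed by the syllabus.

# Minimal sketch of the Find-S algorithm (Text Book 1, Section 2.4).
# '?' matches any attribute value.

def find_s(examples):
    """Return the most specific hypothesis consistent with the
    positive training examples."""
    h = None                        # most specific hypothesis: matches nothing
    for attributes, label in examples:
        if label != "yes":          # Find-S ignores negative examples
            continue
        if h is None:               # first positive example: copy it
            h = list(attributes)
        else:                       # minimally generalize each attribute
            h = [hi if hi == ai else "?" for hi, ai in zip(h, attributes)]
    return h

examples = [
    (("sunny", "warm", "normal", "strong"), "yes"),
    (("sunny", "warm", "high",   "strong"), "yes"),
    (("rainy", "cold", "high",   "strong"), "no"),
]
print(find_s(examples))   # ['sunny', 'warm', '?', 'strong']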
Module – 2
Decision Tree Learning: Decision tree representation, Appropriate problems for decision tree learning, Basic decision tree learning algorithm, Hypothesis space search in decision tree learning, Inductive bias in decision tree learning, Issues in decision tree learning.
Text Book 1, Sections: 3.1 – 3.7    10 Hours
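As orientation for this module, the sketch below computes entropy and information gain, the attribute-selection measure used by the basic (ID3) decision tree algorithm. The toy rows and labels are illustrative.

import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attr_index):
    """Expected reduction in entropy from splitting on one attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr_index] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

rows   = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "hot")]
labels = ["no", "no", "yes", "yes"]
print(information_gain(rows, labels, 0))  # 1.0: outlook perfectly splits the labels
print(information_gain(rows, labels, 1))  # 0.0: temperature carries no information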
Module – 3
Artificial Neural Networks: Introduction, Neural network representation, Appropriate problems for neural network learning, Perceptrons, Multilayer networks and the Backpropagation algorithm.
Text Book 1, Sections: 4.1 – 4.6    08 Hours
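A minimal sketch of the perceptron training rule from this module is given below; the learning rate, epoch count, and AND-function data are illustrative choices, not part of the syllabus.

# Perceptron training rule (Text Book 1, Section 4.4): weights are
# nudged whenever the unit misclassifies an example.

def train_perceptron(data, lr=0.1, epochs=20):
    w = [0.0, 0.0]          # weights
    b = 0.0                 # bias (threshold)
    for _ in range(epochs):
        for (x1, x2), target in data:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1       # perceptron update rule
            w[1] += lr * err * x2
            b    += lr * err
    return w, b

and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
for (x1, x2), t in and_data:
    print((x1, x2), 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0)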
Module – 4
Bayesian Learning: Introduction, Bayes theorem, Bayes theorem and concept learning, Maximum likelihood and least-squared error hypotheses, Maximum likelihood for predicting probabilities, Minimum Description Length principle, Naive Bayes classifier, Bayesian belief networks, The EM algorithm.
Text Book 1, Sections: 6.1 – 6.6, 6.9, 6.11, 6.12    10 Hours
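The sketch below works Bayes theorem through the medical-diagnosis example of Text Book 1, Section 6.2 (a rare disease and an imperfect lab test).

p_cancer        = 0.008   # prior P(cancer)
p_pos_cancer    = 0.98    # P(+ | cancer)
p_pos_no_cancer = 0.03    # P(+ | not cancer)

# Total probability of a positive test result.
p_pos = p_pos_cancer * p_cancer + p_pos_no_cancer * (1 - p_cancer)

# Posterior P(cancer | +) by Bayes theorem.
posterior = p_pos_cancer * p_cancer / p_pos
print(round(posterior, 3))   # ~0.21: a positive test is still weak evidence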
Module – 5
Evaluating Hypotheses: Motivation, Estimating hypothesis accuracy, Basics of sampling theory, A general approach for deriving confidence intervals, Difference in error of two hypotheses, Comparing learning algorithms.
Instance-Based Learning: Introduction, k-nearest neighbour learning, Locally weighted regression, Radial basis functions, Case-based reasoning.
Reinforcement Learning: Introduction, The learning task, Q learning.
Text Book 1, Sections: 5.1 – 5.6, 8.1 – 8.5, 13.1 – 13.3    12 Hours
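To illustrate instance-based learning, here is a minimal k-nearest neighbour classifier; the training points, query, and choice of k are illustrative.

import math
from collections import Counter

# Predict the majority label among the k closest training points.

def knn_predict(train, query, k=3):
    dists = sorted(train, key=lambda p: math.dist(p[0], query))
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]

train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.8), "b"), ((4.9, 5.1), "b")]
print(knn_predict(train, (1.1, 0.9)))   # 'a'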
Text Books:
1. Tom M. Mitchell, Machine Learning, India Edition 2013, McGraw Hill Education.
Reference Books:
1. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, 2nd edition, Springer Series in Statistics.
2. Ethem Alpaydın, Introduction to Machine Learning, second edition, MIT Press.
Prerequisites for Machine Learning

Coding Capabilities:
The ease of converting logical statements into code goes a long way when becoming an ML practitioner. Most of the open-source libraries, especially the data science libraries, are available in Python and R. A good knowledge of Python can accelerate the learning curve.
Some of the important Python packages and development environments are:
- TensorFlow (parallel and distributed computation for machine learning and deep learning)
- NumPy (efficient matrix computations; a minimal sketch follows this list)
- OpenCV (Python's image processing toolbox)
- R Studio
- PyCharm
- IPython/Jupyter Notebook
- Julia
- Spyder
- Anaconda
- Rodeo
- Google Colab
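As a small illustration of why NumPy matters, the sketch below replaces a Python loop over samples with one vectorized matrix-vector product; the array sizes and weights are arbitrary.

import numpy as np

X = np.random.rand(1000, 3)      # 1000 samples, 3 features
w = np.array([0.5, -1.0, 2.0])   # a weight vector

scores = X @ w                   # one matrix-vector product for all samples
print(scores.shape)              # (1000,)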
Online environments to learn Python:
- FOSS, IIT Bombay
- Spoken Tutorials, IIT Bombay
- MIT's open course on Python (free, via MIT OCW)
- Python documentation
Online environments for deep learning:
- Google Colab
- AWS
- IBM Bluemix
- Microsoft Azure
- MLflow
Languages commonly used for machine learning:
- Python
- R
- MATLAB
- Octave
- Julia
- C++
- C
Algorithms and Data Structures:
Data structures are models for organizing data, designed for efficiency in the memory and time consumed. Knowing how to handle data can speed up processing, and it also helps in designing better and faster algorithms, whether for pre-processing the data or for the learning algorithm itself.
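As a small illustration (container size and lookup value are arbitrary), the sketch below times membership tests on a Python list, which scans linearly, against a set, which hashes:

import timeit

# Membership tests are O(n) on a list but O(1) on average for a set.
items = list(range(100_000))
as_list = items
as_set = set(items)

print(timeit.timeit(lambda: 99_999 in as_list, number=100))  # linear scan
print(timeit.timeit(lambda: 99_999 in as_set, number=100))   # hash lookup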
Calculus:
The heart of neural network training is the backpropagation algorithm, which is based entirely on differentiation. A basic grounding in calculus therefore helps in understanding the training process.
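A one-weight sketch of this idea: gradient descent on the squared error of a single sigmoid unit, with the gradient obtained by the chain rule. The input, target, learning rate, and step count are illustrative.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.5, 0.0
w = 2.0                       # initial weight
lr = 0.5

for step in range(50):
    out = sigmoid(w * x)
    # Chain rule: dE/dw = dE/dout * dout/dz * dz/dw
    grad = (out - target) * out * (1 - out) * x
    w -= lr * grad            # move against the gradient

print(round(sigmoid(w * x), 3))   # output pushed toward the target 0.0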
Linear Algebra:
Linear algebra is important because the data we deal with is multi-dimensional. For example, when we try to predict the price of a house, the various dimensions are location, area, facilities available, and so on, and matrices are the ideal way to deal with higher dimensions. Gilbert Strang teaches the subject in a very fluid manner, so we strongly suggest his MIT OCW course, which is available on YouTube.
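Continuing the house-price example in matrix form, the sketch below fits a linear model by least squares with NumPy; every number in it is made up purely for illustration.

import numpy as np

# Each row of X is one house (area in sq. ft., number of facilities);
# y is its price.
X = np.array([[1000.0, 2.0],
              [1500.0, 3.0],
              [2000.0, 3.0],
              [2500.0, 4.0]])
y = np.array([50.0, 72.0, 91.0, 115.0])   # price in lakhs

# Fit y ~ X w + b via least squares.
A = np.hstack([X, np.ones((len(X), 1))])  # append a column for the bias
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(A @ w)                              # predictions for the four houses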
Statistics:
A basic understanding of the mean, median, and mode of various probability distributions, especially the Gaussian distribution, is useful: most data found in the real world can be modelled by these distributions, which simplifies the data to a small number of parameters.
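For instance, the sketch below reduces a small sample (values are illustrative) to the mean and standard deviation, the two parameters of a Gaussian:

import statistics

data = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.95]

print(statistics.mean(data))     # central tendency
print(statistics.median(data))   # middle value
print(statistics.stdev(data))    # spread (sample standard deviation)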
Vector Algebra: