Introduction to Machine Learning (Feb 2024)

Undergraduate course - GATE Data Science and AI syllabus, Jadavpur University, Information Technology (Online), 2024

This was a rigorous four-month course (2 sessions per week) introducing undergraduates and AI enthusiasts to machine learning. It consisted of 24 sessions of instruction (totalling 54 hours) and 4 assignments (12 hours). In the first 12 sessions, we built the foundations needed to understand machine learning: Probability and Statistics, and Linear Algebra. These topics were covered rigorously, and the GATE syllabus for these subjects was addressed completely. In the next 12 sessions, we covered foundational Machine Learning and Optimization for undergraduates and beginners. The course is particularly useful for people preparing for the Graduate Aptitude Test in Engineering (GATE) Data Science and AI paper. We also showed how to code several ML algorithms in Python from scratch, and the assignments reinforce the concepts through theory and coding in Python. The course covers all the AI and mathematics topics in the syllabus:

PART A (14 hours): Probability and Statistics (2 hours/session)

Session 1: Introduction to counting, Probability Space, Conditional Probability.

Session 2-3: Discrete and Continuous random variables and distributions - Bernoulli, Binomial, Poisson, Normal, Exponential and Uniform distribution.

Session 4-5: LOTUS, Moment Generating Functions, Central Limit Theorem, Other distributions – Geometric, Cauchy, Joint distributions, Conditional expectation, Law of Large Numbers, Inequalities in Probability.

Session 6-7: Advanced Distributions – Chi-squared, Student's t-distribution, Statistical tests – binomial test, z-test, t-test, and chi-squared test, with all relevant proofs.

Assignment 1: Probability and Statistics (25 points, 3 hours)

PART B (10 hours): Linear Algebra (2 hours/session)

Session 8-9: Systems of linear equations, Gauss-Jordan Elimination, Vector Spaces, Subspaces, Basis, Dimension, the four fundamental subspaces, types of solutions to linear systems.

Session 10: Some matrix identities, Special matrices – symmetric, orthogonal, positive semidefinite, Projection and Projection Matrices.

Session 11-12: Eigenvalues and Eigenvectors, the characteristic equation and its properties, Singular Value Decomposition.

Assignment 2: Linear Algebra (25 points, 3 hours)

PART C (30 hours): Machine Learning and Optimization (2.5 hours/session)

Session 13: What is Machine Learning? Supervised Learning - Linear and Ridge Regression, Gradient Descent, Train/Cross-Validation/Test splits.

Session 14: Gradient Descent, Bayesian Learning Theory, Error, and Risk.

Session 15: Discriminant Functions, Maximum Likelihood Estimate, MAP estimate, Applications in Linear regression.

Session 16: Classification problems: Logistic regression, softmax regression and K-nearest neighbours.

Session 17: Decision Trees, Intro to Support Vector Machines, constrained Optimization problems, Lagrange multipliers.

Session 18-19: Solving constrained Optimization, Lagrange multipliers, Duality, soft SVM, KKT condition, Primal solution, Sequential Minimal Optimization (complete solution), SVR.

Session 20: Perceptron Learning Algorithm, Convergence Proof, Multilayer Perceptron.

Assignment 3: Supervised Learning – Theory + Coding (30 points, 4 hours)

Session 21: Backpropagation, Regularization of NNs – Dropout, Bias-Variance Tradeoff.

Session 22: Introduction to unsupervised learning, clustering algorithms - k-means, k-medoids, Hierarchical methods.

Session 23: Dimensionality Reduction using Principal Component Analysis and Linear Discriminant Analysis.

Session 24: Mixture Models, Expectation Maximization, GMMs, k-means as a special case of GMMs.

Assignment 4: Unsupervised Learning – Theory + Coding (20 points, 2 hours)
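
The from-scratch coding style mentioned in the course description can be illustrated with a short sketch. The following is not taken from the course materials; it is a minimal example, assuming NumPy and a mean-squared-error objective with an L2 penalty, of ridge regression trained by batch gradient descent (both Session 13 topics). The function names, the regularization strength `lam`, the learning rate `lr`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def ridge_gradient_descent(X, y, lam=0.1, lr=0.1, n_iters=2000):
    """Fit ridge regression weights by batch gradient descent.

    Minimizes (1/n) * ||X w - y||^2 + lam * ||w||^2 (assumed objective).
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_iters):
        residual = X @ w - y
        # Gradient of the data term plus gradient of the L2 penalty.
        grad = (2.0 / n_samples) * (X.T @ residual) + 2.0 * lam * w
        w -= lr * grad
    return w


def ridge_closed_form(X, y, lam=0.1):
    """Closed-form minimizer of the same objective, used as a sanity check."""
    n_samples, n_features = X.shape
    A = X.T @ X + n_samples * lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


# Synthetic data (illustrative only): a noisy linear relationship.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w + 0.1 * rng.normal(size=200)

w_gd = ridge_gradient_descent(X_demo, y_demo)
w_cf = ridge_closed_form(X_demo, y_demo)
print("gradient descent:", w_gd)
print("closed form:     ", w_cf)
```

Comparing the iterative result with the closed-form solution is a simple way to check that the gradient and step size are correct before moving on to models that have no closed form.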

The session notes are available here, and the video lectures are at this YouTube link. Certificate