Foundation of Mathematics for Machine Learning (July 2023)

Undergraduate course, Jadavpur University, Information Technology (Online), 2023

This was a rigorous month-long mathematics course (2-3 sessions per week) aimed at building the mathematical foundations needed for machine learning. It consisted of 10 sessions (27 hours of interaction in total) and 3 assignments (5 hours). It covered three components - Probability and Statistics (11 hours), Linear Algebra (11 hours), and Introduction to Optimization and Machine Learning (5 hours). The details of the course are listed below:

PART A (11 hours): Probability and Statistics

Session 1: Introduction to counting, story proofs, the probability triple, random variables, PMF, PDF, CDF (2.5 hrs)

Session 2: Discrete random variables and distributions - Bernoulli, Binomial, Negative Binomial, and Poisson distributions; expectation, LOTUS, variance (3 hrs)

Session 3: Continuous random variables and distributions - Exponential, Normal, and Uniform distributions; transformation of random variables; inequalities - Markov, Chebyshev, Chernoff (2.5 hrs)
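
The tail inequalities from this session are easy to sanity-check numerically. A minimal NumPy sketch (the Exponential(1) distribution is an illustrative choice, not prescribed by the course):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)  # Exponential(1): mean 1, variance 1

# Markov: P(X >= a) <= E[X] / a  (for non-negative X)
a = 3.0
p_tail = (x >= a).mean()
assert p_tail <= x.mean() / a

# Chebyshev: P(|X - mu| >= k*sigma) <= 1 / k^2
k = 2.0
mu, sigma = x.mean(), x.std()
p_dev = (np.abs(x - mu) >= k * sigma).mean()
assert p_dev <= 1 / k**2
```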

Session 4: Moment generating functions and applications - sums of Gaussians, finding the k-th moment, the Central Limit Theorem; random vectors, joint distributions, conditional distributions; introduction to random processes - counting processes, random walks, and discrete-time Markov chains (3 hrs)
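
The Central Limit Theorem from this session can be demonstrated in a few lines; a sketch using Exponential(1) summands (an illustrative choice - any i.i.d. distribution with finite variance works):

```python
import numpy as np

rng = np.random.default_rng(0)
# CLT: standardized means of i.i.d. Exponential(1) samples approach N(0, 1)
n, trials = 1000, 20_000
samples = rng.exponential(scale=1.0, size=(trials, n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))  # Exponential(1) has mean 1, sd 1

# The standardized sample means should be approximately standard normal
assert abs(z.mean()) < 0.05
assert abs(z.std() - 1.0) < 0.05
```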

Assignment 1: Probability and Statistics (35 points)

PART B (11 hours): Linear Algebra

Session 5: Introduction to matrices, vector spaces, subspaces (span and basis), the four fundamental subspaces, Gaussian elimination (rank, nullity, pivots, row-reduced echelon form), solution of systems of linear equations (3 hrs)

Session 6: Identities of subspaces - e.g. rank(A+B), rank(AB); projection matrices and the least-squares solution from a linear-algebra view; introduction to eigenvalues and eigenvectors - motivation via the Fibonacci sequence and its closed-form (quadratic) solution, similar matrices, algebraic and geometric multiplicity (3 hrs)
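
The projection view of least squares from this session can be sketched with NumPy (the matrix A and vector b below are made up for illustration):

```python
import numpy as np

# Overdetermined system A x = b: project b onto the column space of A
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Projection matrix P = A (A^T A)^{-1} A^T; P b is the closest point to b in col(A)
P = A @ np.linalg.inv(A.T @ A) @ A.T
p = P @ b

# The residual b - p is orthogonal to the column space of A
assert np.allclose(A.T @ (b - p), 0)
# A x_hat lands exactly on the projection
assert np.allclose(A @ x_hat, p)
```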

Session 7: The Cayley-Hamilton theorem, eigendecomposition, Schur triangularization, unitary matrices (Gram-Schmidt), normal matrices, and applications - principal component analysis, singular value decomposition (2.5 hrs)
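
Two results from this session, checked numerically on a small symmetric matrix (the matrix is chosen purely for illustration):

```python
import numpy as np

# Eigendecomposition of a symmetric (hence normal) matrix: A = Q diag(lam) Q^T
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, Q = np.linalg.eigh(A)
assert np.allclose(Q @ np.diag(lam) @ Q.T, A)

# Cayley-Hamilton for a 2x2 matrix: A^2 - tr(A) A + det(A) I = 0
tr, det = np.trace(A), np.linalg.det(A)
assert np.allclose(A @ A - tr * A + det * np.eye(2), 0)
```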

Session 8: Inner products, norms, norms from inner products; vector norms - Lp norms (p = 1, 2, infinity); dual spaces and dual norms - Lp-Lq dual norms (geometric interpretation of L1-Linf and L2-L2); introduction to matrix norms - the Frobenius norm and induced matrix norms, e.g. the induced L1 norm (2.5 hrs)
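
The norms listed here are straightforward to compute from their definitions and cross-check against NumPy's built-ins (the vector and matrix are illustrative):

```python
import numpy as np

v = np.array([3.0, -4.0, 0.0])
# Lp vector norms for p = 1, 2, infinity
l1 = np.abs(v).sum()          # sum of absolute values -> 7.0
l2 = np.sqrt((v**2).sum())    # Euclidean length -> 5.0
linf = np.abs(v).max()        # largest absolute entry -> 4.0
assert np.isclose(l1, np.linalg.norm(v, 1))
assert np.isclose(l2, np.linalg.norm(v, 2))
assert np.isclose(linf, np.linalg.norm(v, np.inf))

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])
# Frobenius norm: square root of the sum of squared entries
fro = np.sqrt((A**2).sum())
assert np.isclose(fro, np.linalg.norm(A, 'fro'))
# Induced L1 matrix norm: maximum absolute column sum
ind_l1 = np.abs(A).sum(axis=0).max()
assert np.isclose(ind_l1, np.linalg.norm(A, 1))
```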

Assignment 2: Linear Algebra (35 points)

PART C (5 hours): Introduction to optimization and Machine learning

Session 9: Introduction to convex sets (e.g. norm balls), using epigraphs to define convex functions, first- and second-derivative characterizations of convex functions, Jensen's inequality, Hölder's and Minkowski's inequalities, constrained optimization problems - Lagrange multipliers (3 hrs)
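
Jensen's inequality can be verified empirically; a sketch using exp (convex) and log (concave) on uniform samples (the distribution and functions are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 5.0, size=10_000)

# Jensen for convex f: E[f(X)] >= f(E[X]); here f(t) = exp(t)
assert np.exp(x).mean() >= np.exp(x.mean())

# The inequality flips for concave f: E[log(X)] <= log(E[X])
assert np.log(x).mean() <= np.log(x.mean())
```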

Session 10: Fundamentals of the gradient-descent approach and why it works; maximum likelihood estimation for linear and logistic regression; Python demonstration of gradient descent for linear and logistic regression (2 hrs)
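
A minimal sketch of such a gradient-descent demo for linear regression (the synthetic data and learning rate are my assumptions, not the course's actual notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])  # bias column + one feature
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=n)

# Gradient descent on the mean-squared error J(w) = (1/2n) ||X w - y||^2
w = np.zeros(2)
lr = 0.5  # learning rate (illustrative)
for _ in range(500):
    grad = X.T @ (X @ w - y) / n  # gradient of J at w
    w -= lr * grad

# Compare against the closed-form least-squares solution
w_star = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(w, w_star, atol=1e-3)
```

Under maximum likelihood with Gaussian noise, minimizing this squared error is exactly the MLE for the weights, which is the link the session draws between the two topics.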

Assignment 3: Programming assignment - Lasso and Ridge regression (20 points), plus 10 points for total attendance
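
For the ridge part, the closed-form solution can be sketched as follows (the data and regularization strength are illustrative, not the assignment's):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.0, -2.0]) + 0.1 * rng.normal(size=50)

lam = 0.5  # regularization strength (illustrative)
d = X.shape[1]
# Ridge: w = (X^T X + lam I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Shrinkage check: ridge weights have no larger norm than plain least squares
w_ls = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.linalg.norm(w_ridge) <= np.linalg.norm(w_ls)
```

Lasso has no closed form because of the non-differentiable L1 penalty, which is why that part of the assignment calls for an iterative solver.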

You can get the session notes here, and the video lectures at this YouTube link. Certificate