Posts by Collection

portfolio

projects_acad

Social Networking Site for Sports Lovers

Java-Based Social Media Website, Web Development, B.E., JU, 2020

Developed a Java-based web application that lets users follow their favourite teams and players and predict the results of upcoming matches. Users are ranked based on their predictions, and a win predictor is included. A WebSocket-based chat feature allows users to create groups, chat with other users, and even participate in public chatrooms pertaining to current sports events.

You can find the implementation of the website here.

Natural Language Inference

NLP, Deep Learning, E0-270, ML, IISc, 2022

In this NLP problem, the algorithm must identify whether a hypothesis derived from a sentence (the premise) is an entailment (follows from it), a contradiction (opposes it), or neutral (the premise and hypothesis are independent).
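A tiny illustration of the three labels, using invented example pairs and a naive lexical-overlap score as a stand-in for a real learned model:

```python
# Illustrative premise-hypothesis pairs for the three NLI labels
# (invented examples, not from any benchmark dataset).
examples = [
    ("A man is playing a guitar on stage.",
     "A person is performing music.", "entailment"),
    ("A man is playing a guitar on stage.",
     "The stage is empty.", "contradiction"),
    ("A man is playing a guitar on stage.",
     "The concert is sold out.", "neutral"),
]

def word_overlap(premise, hypothesis):
    """Naive baseline: fraction of hypothesis words found in the premise.
    Real NLI systems use learned sentence encoders instead."""
    p = set(premise.lower().rstrip(".").split())
    h = set(hypothesis.lower().rstrip(".").split())
    return len(p & h) / len(h)

for premise, hypothesis, label in examples:
    print(f"{label}: overlap = {word_overlap(premise, hypothesis):.2f}")
```

The overlap heuristic illustrates why the task is hard: a contradiction can share many words with its premise, so surface statistics alone cannot separate the three labels.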

Safe RL with Curriculum Learning

Safe Reinforcement Learning, student-teacher based RL, E1-277, RL, IISc, 2022

Traditional reinforcement learning algorithms learn about dangerous states only after the agent has visited them often enough to affect the value function. For safety-critical systems, this kind of learning is itself dangerous: we do not want a car to learn to drive safely only after it has been in an accident.
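One common student-teacher pattern can be sketched on a toy problem: a teacher that knows the unsafe states overrides any action that would enter them, so the student never has to experience the danger to learn around it. The corridor environment, reward values, and hyperparameters below are all invented for illustration:

```python
import random

# Hypothetical 1-D corridor: states 0..5, state 0 is a cliff (unsafe),
# state 5 is the goal. Actions move left (-1) or right (+1).
UNSAFE, GOAL, N_STATES = 0, 5, 6
ACTIONS = (-1, +1)

def teacher_filter(state, action):
    """Teacher intervention: redirect any action that would enter the cliff."""
    if state + action == UNSAFE:
        return +1
    return action

def train(use_teacher, episodes=200, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    unsafe_visits = 0
    for _ in range(episodes):
        s = 2
        for _ in range(20):
            # Epsilon-greedy action selection.
            if rng.random() < 0.2:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            if use_teacher:
                a = teacher_filter(s, a)
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == GOAL else (-10.0 if s2 == UNSAFE else -0.1)
            Q[(s, a)] += 0.5 * (r + 0.9 * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
            if s2 == UNSAFE:
                unsafe_visits += 1
            if s2 in (UNSAFE, GOAL):
                break
            s = s2
    return unsafe_visits

print("unsafe visits without teacher:", train(False))
print("unsafe visits with teacher:   ", train(True))  # 0 by construction
```

With the teacher active, the cliff is never entered, yet the student still learns a policy for the rest of the corridor; a curriculum would then gradually withdraw the teacher.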

VQ-VAEs, DC-GANs

Generative Models, Vector Quantized VAEs, GANs, E9-333, ADRL, IISc, 2022

VQ-VAEs and DC-GANs are widely used generative models today. The Vector Quantized VAE builds on the intuition that the images we see are effectively discrete and quantized, and that most signals can be represented by combinations of vectors from a fixed-length codebook. The encoder's output is therefore quantized to the nearest vectors in the codebook (which are themselves learnt), and the decoder reconstructs the image from this quantized representation.
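The quantization step itself is just a nearest-neighbour lookup into the codebook. A minimal sketch with random stand-ins for the learned encoder outputs and codebook (shapes and sizes are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
K, D = 8, 4                            # codebook size, embedding dimension
codebook = rng.normal(size=(K, D))     # learnable codebook vectors
z_e = rng.normal(size=(5, D))          # encoder outputs for 5 latent positions

# Quantize: replace each encoder output with its nearest codebook vector.
dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (5, K)
codes = dists.argmin(axis=1)           # discrete code indices
z_q = codebook[codes]                  # quantized latents fed to the decoder

# In training, the non-differentiable argmin is bypassed with the
# straight-through estimator: z_q = z_e + stop_gradient(z_q - z_e).
print("codes:", codes)
```

The discrete `codes` are what make the latent space quantized; the decoder only ever sees codebook vectors.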

Diffusion Models, Conditional Diffusion Models

Generative Models, Deep Representation Learning, E9-333, ADRL, IISc, 2022

Diffusion models take data from a distribution and gradually add Gaussian noise until the data maps to an isotropic Gaussian. For small mixing parameters, the reverse process is also Markov. This assumption lets us train a model to learn the backward process: starting from isotropic Gaussian noise, it runs steps of Langevin dynamics (the backward, or denoising, process) to generate images from the training distribution.
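The forward (noising) process has a convenient closed form: any noised sample x_t can be drawn directly from x_0 without simulating the intermediate steps. A sketch with a standard linear noise schedule (the schedule endpoints and toy data are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # small per-step mixing parameters
alphas_bar = np.cumprod(1.0 - betas)    # cumulative signal retention

def q_sample(x0, t, rng):
    """q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = rng.normal(size=(16,))             # a toy "image"
x_mid = q_sample(x0, 200, rng)          # partially noised
x_end = q_sample(x0, T - 1, rng)        # nearly pure Gaussian noise
print("alpha_bar at t=T-1:", float(alphas_bar[-1]))
```

By t = T-1, alpha_bar is almost zero, so x_T carries essentially no signal; the learned reverse model then denoises step by step to generate new samples.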

Domain Adversarial Neural Networks

Domain Adaptation, DANN, E9-333, ADRL, IISc, 2022

Most machine learning algorithms assume that the training and test data come from the same distribution. For example, a classifier trained on handwritten digits from the USPS dataset performs much worse on the MNIST dataset. DANN addresses this adversarially: a discriminator network (which tries to tell source-domain data from target-domain data) forces the feature extractor to produce similar features for both domains.
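The adversarial coupling is usually implemented with a gradient reversal layer (GRL): an identity in the forward pass that flips and scales gradients in the backward pass, so the same domain loss trains the discriminator normally while pushing the feature extractor to confuse it. A minimal sketch with hand-written forward/backward methods (the class name and toy numbers are illustrative, not a real autograd API):

```python
import numpy as np

class GradientReversal:
    """Identity forward; multiplies incoming gradients by -lambda backward."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # flip the domain-loss gradient

grl = GradientReversal(lam=0.5)
features = np.array([1.0, 2.0, 3.0])
assert np.array_equal(grl.forward(features), features)

# Gradient of the domain loss w.r.t. features, as the extractor sees it:
grad_from_discriminator = np.array([0.2, -0.4, 0.6])
print(grl.backward(grad_from_discriminator))
```

Because the sign is flipped only below the GRL, the discriminator still descends its own loss while the feature extractor ascends it, which is exactly the adversarial objective.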

Few-shot learning using MAML and Protonets

Meta-learning, MAML, Prototype learning, E9-333, ADRL, IISc, 2022

What happens when you have a huge number of classes but only a few data points per class (as in Omniglot)? You need few-shot/meta-learning, where weight initializations are learnt across tasks. Each task consists of N classes with k examples each (the support set), and the model must perform well with a small k for any combination of N classes, potentially including new ones.
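Prototypical Networks make the N-way, k-shot decision rule very concrete: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. A sketch using random vectors as stand-ins for a learned encoder's embeddings (all shapes and offsets are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, k, D = 5, 3, 16                       # N-way, k-shot, embedding dim
# Fake "embeddings": class i clusters around 5*i in every dimension.
support = rng.normal(size=(N, k, D)) + 5 * np.arange(N)[:, None, None]

prototypes = support.mean(axis=1)        # (N, D): one prototype per class
query = support[2, 0] + 0.1 * rng.normal(size=D)   # a query near class 2

# Classify the query by squared Euclidean distance to each prototype.
dists = ((prototypes - query) ** 2).sum(-1)
pred = int(dists.argmin())
print("predicted class:", pred)
```

MAML instead meta-learns an initialization that adapts to each such task in a few gradient steps; the prototype rule above needs no inner-loop adaptation at all.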

In this project, I implemented MAML and Prototype Learning, two popular few-shot learning algorithms.

projects_research

Text Normalization Using WFSTs

NLP, weighted finite state transducers, Voice Intelligence, SRIB, 2020

Text normalization plays an important role in every NLP pipeline: it converts the different surface representations of an entity into a single canonical form. Weighted finite state transducers (WFSTs) were used for this problem. This describes the technical details.
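The core idea can be caricatured in a few lines: each rewrite behaves like a weighted arc mapping an input token to an output token, and normalization picks the lowest-weight rewrite. This toy stand-in ignores composition and real transducer machinery (the rules and weights below are invented; the actual system used proper WFSTs):

```python
# Invented (input, output, weight) arcs; lower weight = preferred rewrite.
RULES = {
    "dr.":   [("doctor", 0.5), ("drive", 1.5)],
    "st.":   [("street", 0.5), ("saint", 1.0)],
    "2nite": [("tonight", 0.2)],
}

def normalize(tokens):
    """Map each token through its cheapest arc; unknown tokens pass through."""
    out, total_weight = [], 0.0
    for tok in tokens:
        arcs = RULES.get(tok.lower(), [(tok, 0.0)])  # identity arc by default
        best_out, w = min(arcs, key=lambda arc: arc[1])
        out.append(best_out)
        total_weight += w
    return " ".join(out), total_weight

text, weight = normalize(["meet", "me", "2nite", "on", "main", "st."])
print(text)  # meet me tonight on main street
```

A real WFST generalizes this: ambiguous rewrites stay as competing paths, context rules are composed as further transducers, and a shortest-path search selects the best normalization over the whole sentence rather than token by token.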

Spoken Language Identification

SLID, Speech Signal Processing, Deep Learning, Jadavpur University, 2020

This project aims at identifying the language spoken in speech data. Many Indian languages share similar phonemes, which makes them challenging to separate. The project uses MFCC features to develop different algorithms to tackle this problem.
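The MFCC pipeline (frame, window, power spectrum, mel filterbank, log, DCT) can be sketched from scratch with NumPy. This is a simplified version (no pre-emphasis, a rough filterbank; practical work would typically use a library such as librosa):

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    # 1. Frame the signal and apply a Hann window.
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    frames = frames * np.hanning(n_fft)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3. Triangular mel filterbank (mel scale: 2595 * log10(1 + f/700)).
    mel_pts = np.linspace(0, 2595 * np.log10(1 + sr / 2 / 700), n_mels + 2)
    hz_pts = 700 * (10 ** (mel_pts / 2595) - 1)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # 4. Log mel energies, then DCT-II to decorrelate (the cepstrum).
    logmel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), n + 0.5) / n_mels)
    return logmel @ dct.T

sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of 440 Hz
feats = mfcc(sig)
print(feats.shape)  # (frames, 13)
```

The resulting per-frame 13-dimensional vectors are the features that downstream language-identification classifiers consume.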

Decoding Attention Signatures from EEG Data

Attention, Deep Learning, Cognition Lab, IISc, 2022

EEG data is extremely noisy and heterogeneous across humans, and variability increases further when the task design of a psychophysical experiment changes. This project aims at answering the following:

Predicting brain age with diffusion models

Generative Modeling, Brain age prediction, Cognition Lab, IISc, 2023

Diffusion Magnetic Resonance Imaging (dMRI) reveals the structural connectivity of the brain in the form of an undirected graph, so these graphs can be used as markers of brain age. But data scarcity makes it challenging for naïve deep-learning algorithms to succeed at this problem; even larger datasets, such as the Rush Alzheimer's Disease Center (RADC) dataset, have only around 750 scans. Data augmentation is therefore one of the most viable solutions. The recent success of conditional stable (latent) diffusion models is not limited to generating realistic natural images from input text (DALLE-2); they have also been used successfully for stimulus reconstruction conditioned on input fMRI activity. We propose the use of latent diffusion models to generate connectivity matrices conditioned on age. A realistic augmentation of the training set can reduce overfitting and help build a robust brain-age decoder from connectivity matrices.

Predicting brain age with Supervised Domain Adaptation

Transfer Learning, Domain Adaptation, Brain age prediction, Cognition Lab, IISc, 2024

Increasing life expectancy and a rising global median age make the population more susceptible to age-related neurological and cognitive disorders such as mild cognitive impairment (MCI) and Alzheimer's disease (AD). These disorders are known to greatly degrade quality of life, so detecting their early onset is critical for a timely prognosis.

Rescuing referral failures due to domain shift

Selective Classification, Domain Adaptation, Medical Imaging, Cognition Lab, IISc, 2024

Here, we address a major challenge with domain generalization – selective classification during automated medical image diagnosis. During selective classification, models must abstain from making predictions when label confidence is low, especially for samples that deviate from the training set (covariate shift). Such uncertain samples are referred to the clinician for further evaluation (“referral”). Yet, we see that even state-of-the-art deep learning models fail dramatically during referral when tested on medical images acquired from a different demographic or with different technology.
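The referral mechanism itself is simple to state: abstain on the least-confident fraction of cases and measure accuracy on what is retained. A toy sketch (the confidence model below is invented so that higher confidence correlates with correctness, which is exactly the assumption that breaks under covariate shift):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
confidence = rng.uniform(0.5, 1.0, n)
# Toy calibration: a prediction is correct with probability = confidence.
correct = rng.uniform(0, 1, n) < confidence

def selective_accuracy(confidence, correct, referral_rate):
    """Refer the lowest-confidence fraction to the clinician; score the rest."""
    cutoff = np.quantile(confidence, referral_rate)
    retained = confidence >= cutoff
    return correct[retained].mean(), retained.mean()

acc_all, _ = selective_accuracy(confidence, correct, 0.0)
acc_ref, kept = selective_accuracy(confidence, correct, 0.5)
print(f"accuracy at 0% referral: {acc_all:.3f}; at 50% referral: {acc_ref:.3f}")
```

In the well-calibrated toy setting, accuracy rises monotonically with the referral rate; the failure mode studied here is that under domain shift confidence becomes miscalibrated on out-of-distribution samples and this curve collapses.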

CRAFT: A framework for Deep transfer and semi-sup. learning

Transfer Learning, semi-supervised learning, Cognition Lab, IISc, 2024

State-of-the-art deep learning models are seldom as effective in neuroscience as in domains with access to big data. A major reason is the rarity of large labeled datasets, which makes pre-trained models (transfer learning) and unlabeled data (semi-supervised learning) important. We propose Contradistinguisher Regularized Adaptive FineTuning (CRAFT) of neural networks, a method for efficient transfer and semi-supervised learning.

publications

talks

talks_own

How to open the correct Gate?

GATE Preparations, Jadavpur University, Kolkata (Online), November 2022

At the freshers' orientation programme, 2022, at Jadavpur University, Kolkata, I spoke about the preparation required to secure a good GATE score. I also touched upon the foundations needed during undergraduate studies to pursue higher education and research.

Mathematics – The key to the GATE?

GATE Preparations, Mathematics, Jadavpur University, Kolkata (Online), October 2023

In October 2023, I spoke about pursuing a career in academia at the first-year UG students' orientation programme in the Department of Information Technology, Jadavpur University, Kolkata. The talk covered self-preparation strategies for the GATE examination (Computer Science, and Data Science and AI papers) and preparing for higher education abroad.

teaching

Natural Language Processing (Jan 2022)

Undergraduate course, Jadavpur University, Information Technology (Online), 2022

This contains the details of the tutorial sessions (2 hours each) conducted during this course. Besides these tutorials, class content was prepared in the form of slides and assignments, which I also evaluated.

Introduction to Machine Learning (Aug 2022)

cs-97, NPTEL, 2022

This contains the details of the tutorial sessions (1.5 hours each) conducted during this course. Coding tutorials (in Python) related to the topics covered in class were also conducted in most sessions.

Basic Calculus 1 (Jan 2023)

ma-13, NPTEL, 2023

This contains the details of the tutorial sessions (2 hours each) conducted during this course.

Foundation of Mathematics for Machine Learning (July 2023)

Undergraduate course, Jadavpur University, Information Technology (Online), 2023

This was a month-long, rigorous (2-3 sessions per week) course building the mathematical foundations for learning machine learning. It consisted of 10 sessions (27 hours of interaction in total) and 3 assignments (5 hours), and primarily covered 3 components: Probability and Statistics (11 hours), Linear Algebra (11 hours), and Introduction to Optimization and Machine Learning (5 hours). The details of the course are listed below:

Introduction to Machine Learning (Feb 2024)

Undergraduate course - GATE Data Science and AI syllabus, Jadavpur University, Information Technology (Online), 2024

This was a 4-month-long, rigorous (2 sessions per week) course introducing undergraduates and AI enthusiasts to machine learning. It consisted of 24 sessions of instruction (totalling 54 hours) and 4 assignments (12 hours). In the first 12 sessions, we built the foundations needed to understand machine learning: Probability and Statistics, and Linear Algebra. These topics were covered rigorously, and the GATE syllabus for them was addressed completely. In the next 12 sessions, we covered foundational Machine Learning and Optimization for undergraduates and beginners. The course is particularly useful for people who want to write the Graduate Aptitude Test in Engineering (GATE) Data Science and AI paper. We also showed how to code several ML algorithms in Python from scratch, and the assignments further help in grasping the concepts through theory and coding in Python. This course covers all the AI and mathematics topics in the syllabus: