Tutorials

  • Prior-Data Fitted Networks

      [ Notebook (Soln.) ]

    Publication: Müller et al. Transformers Can Do Bayesian Inference. ICLR, 2022
    Task: Using Transformers to estimate Posterior Predictive Distributions
    Libraries: PyTorch
    Learning objectives:
    • Understand the principles of Prior-Data Fitted Networks (PFNs) and how they integrate prior knowledge to approximate the Posterior Predictive Distribution (PPD).
    • Acquire practical skills in defining priors, building dataset loaders for synthetic data generation, developing transformer models for PPD approximation, formulating loss functions for regression tasks, and evaluating model output quality (a minimal training sketch follows below).
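
    To make the pipeline concrete, here is a minimal, self-contained sketch of the PFN training idea, assuming a toy Bayesian linear-regression prior and a discretized ("bucketed") predictive distribution; the names, architecture, and hyperparameters are illustrative, not the notebook's exact code.

    ```python
    import torch
    import torch.nn as nn

    def sample_prior_dataset(n_points=20, dim=1):
        """Draw one dataset from a toy Bayesian linear-regression prior."""
        w = torch.randn(dim)                      # weights ~ N(0, I)
        x = torch.randn(n_points, dim)
        y = x @ w + 0.1 * torch.randn(n_points)   # noisy observations
        return x, y

    class TinyPFN(nn.Module):
        """Transformer mapping (context pairs, query inputs) to logits over
        discretized query targets. A real PFN also masks queries from
        attending to each other; omitted here for brevity."""
        def __init__(self, dim=1, d_model=64, n_bins=32):
            super().__init__()
            self.embed_ctx = nn.Linear(dim + 1, d_model)   # (x, y) pairs
            self.embed_qry = nn.Linear(dim, d_model)       # x only (y unknown)
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.backbone = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, n_bins)         # logits over y-buckets

        def forward(self, ctx_x, ctx_y, qry_x):
            ctx = self.embed_ctx(torch.cat([ctx_x, ctx_y.unsqueeze(-1)], dim=-1))
            qry = self.embed_qry(qry_x)
            h = self.backbone(torch.cat([ctx, qry], dim=1))
            return self.head(h[:, ctx.shape[1]:])          # query positions only

    # One training step: sample a fresh dataset from the prior, split it into
    # context and query, and fit the binned PPD with cross-entropy.
    model = TinyPFN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = sample_prior_dataset()
    logits = model(x[:10][None], y[:10][None], x[10:][None])
    targets = torch.bucketize(y[10:], torch.linspace(-3, 3, 31))
    loss = nn.functional.cross_entropy(logits.squeeze(0), targets)
    loss.backward(); opt.step(); opt.zero_grad()
    ```
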
  • Data Generating Priors for Prior-Data Fitted Networks

      [ Notebook (Soln.) ]

    Publication: Müller et al. Transformers Can Do Bayesian Inference. ICLR, 2022
    Hollmann et al. TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second. ICLR, 2023
    Task: Generating data according to different priors for PFNs
    Libraries: PyTorch
    Learning objectives: Learn how to generate synthetic data from specified priors and use it to train neural networks that approximate Bayesian inference; two toy priors are sketched below.
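
    As a flavor of what "defining a prior" means in code, here are two toy data-generating priors, one linear and one driven by a randomly initialized neural network (loosely in the spirit of the TabPFN prior); both are illustrative simplifications, not the notebook's exact priors.

    ```python
    import torch
    import torch.nn as nn

    def linear_prior(n=100, dim=3, noise=0.1):
        """Datasets whose targets come from a random linear function."""
        w = torch.randn(dim)
        x = torch.randn(n, dim)
        return x, x @ w + noise * torch.randn(n)

    def random_mlp_prior(n=100, dim=3, hidden=16):
        """Datasets whose targets come from a randomly initialized MLP,
        giving a richer, nonlinear hypothesis class."""
        net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                            nn.Linear(hidden, 1))
        x = torch.randn(n, dim)
        with torch.no_grad():
            return x, net(x).squeeze(-1)

    # Each call draws a new "task"; PFN training iterates over many such draws.
    x, y = random_mlp_prior()
    ```
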
  • LLMTime - Zero-shot prompting LLMs for time series forecasting

      [ Notebook (DIY) ] [ Notebook (Soln.) ]

    Publication: Gruver et al. Large Language Models Are Zero-Shot Time Series Forecasters. NeurIPS, 2023
    Task: Weather forecasting using LLMs
    Libraries: openai, tiktoken, jax

    Learning objectives: Explore zero-shot prompting with Large Language Models (LLMs) for time series forecasting. In this tutorial, we aim to:
    • Acquaint you with the application of machine learning techniques using Large Language Models (LLMs).
    • Enhance your understanding of LLMs and the parameters influencing their behavior.
    • Guide you through the essentials for successful time series prediction with LLMs.
    • Transfer knowledge from transformers to the realm of LLMs (the serialization step at the heart of the method is sketched below).
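
    The core trick in LLMTime is serializing numbers so the tokenizer sees individual digits; below is a minimal sketch of such an encoding and its inverse (the precision, separators, and lack of rescaling are illustrative simplifications of the authors' scheme).

    ```python
    def serialize(values, precision=1):
        """Encode a series as digit-separated tokens, e.g. [12.3, 4.5] ->
        '1 2 3 , 4 5', so each digit becomes its own token."""
        parts = []
        for v in values:
            digits = f"{abs(v) * 10**precision:.0f}"
            sign = "-" if v < 0 else ""
            parts.append(sign + " ".join(digits))
        return " , ".join(parts)

    def deserialize(text, precision=1):
        """Invert serialize() on the model's sampled continuation."""
        values = []
        for chunk in text.split(","):
            digits = chunk.strip().replace(" ", "")
            if digits:
                values.append(int(digits) / 10**precision)
        return values

    history = [20.1, 20.7, 21.3, 22.0]      # e.g. daily temperatures
    prompt = serialize(history) + " ,"      # ask the LLM to continue the digits
    ```
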
  • Attention is all you need

      [ Notebook (DIY) ] [ Notebook (Soln.) ]

    Publication: Vaswani et al. Attention Is All You Need. NeurIPS, 2017
    Task: Neural Machine Translation (e.g., German-English)
    Dataset: Multi30k
    Libraries: PyTorch, NLTK, Spacy, torchtext

    Learning objectives:
    • Build a transformer model for neural machine translation
    • Train the model using the proposed label smoothing loss and learning rate scheduler
    • Use the trained model to infer likely translations using either of the following (greedy decoding is sketched after this list):
      • Greedy Decoding
      • Beam Search
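
    As a preview of the inference step, here is a greedy-decoding sketch; model.encode / model.decode and the special-token indices are assumptions about the notebook's interface, not PyTorch or torchtext built-ins.

    ```python
    import torch

    @torch.no_grad()
    def greedy_decode(model, src, bos_idx, eos_idx, max_len=50):
        """Emit the argmax token at each step until EOS is produced."""
        memory = model.encode(src)                    # encode the source once
        ys = torch.full((1, 1), bos_idx, dtype=torch.long, device=src.device)
        for _ in range(max_len):
            logits = model.decode(ys, memory)         # (1, t, vocab_size)
            next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)
            ys = torch.cat([ys, next_tok], dim=1)
            if next_tok.item() == eos_idx:
                break
        return ys.squeeze(0)                          # token ids incl. BOS/EOS
    ```

    Beam search generalizes this loop by carrying the k highest-scoring partial translations at each step instead of a single argmax.
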
  • Critical Exploration of Transformer Models

      [ Notebook (DIY) ] [ Notebook (Soln.) ]

    Learning objectives: Delve into the inner workings of transformer models beyond basic applications
    Key Areas:
    • Adversarial Inputs: Crafting inputs to challenge language models
    • Attention Visualization: Understanding focus mechanisms in transformers
    • Fine-Tuning with LoRA: Implementing Low-Rank Adaptation for model refinement (sketched below)
    • Bias Detection: Investigating biases in model responses
    Note: This tutorial is an introductory exploration of transformers. It’s a starting point for more advanced study.
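
    Of these areas, LoRA is the easiest to capture in a few lines; here is a minimal sketch of the idea, wrapping a frozen pretrained linear layer with a trainable low-rank update (names and hyperparameters are illustrative).

    ```python
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """y = Wx + (alpha/r) * B A x, with W frozen; only A and B train."""
        def __init__(self, base: nn.Linear, r=8, alpha=16):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False               # freeze pretrained W
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))
            self.scale = alpha / r                    # B = 0 => no-op at init

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    # Adapting one 512x512 projection costs only 2*r*512 trainable parameters.
    layer = LoRALinear(nn.Linear(512, 512))
    out = layer(torch.randn(4, 512))
    ```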

  • Molecule Attention Transformer

      [ Notebook (Soln.) ]

    Publication: Maziarka et al. Molecule Attention Transformer. arXiv, 2020
    Task: Binary classification to predict blood-brain barrier permeability (BBBP)
    Dataset: BBBP
    Libraries: PyTorch, DeepChem, RDKit

    Learning objectives:
    • Learn key concepts required to work with molecules
    • Perform critical data preprocessing tasks, such as feature extraction, graph formation, and scaffold splitting (a scaffold-split sketch follows this list)
    • Explore challenges of drug discovery, particularly designing drugs that can cross the blood-brain barrier and enter the central nervous system
    • Implement the Molecule Attention Transformer (MAT) proposed by Maziarka et al. (2020) in a deep learning pipeline
    • Train and evaluate the model on molecular datasets
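
    Scaffold splitting is the preprocessing step most specific to this tutorial; here is a sketch using DeepChem's MoleculeNet loader (the ECFP featurizer is an illustrative choice; MAT itself consumes graph and atom features).

    ```python
    import deepchem as dc
    from rdkit.Chem.Scaffolds import MurckoScaffold

    # Load BBBP with a scaffold split so structurally similar molecules
    # never leak across the train/valid/test boundaries.
    tasks, (train, valid, test), transformers = dc.molnet.load_bbbp(
        featurizer="ECFP", splitter="scaffold")
    print(train.X.shape, valid.X.shape, test.X.shape)

    # The grouping key: the Bemis-Murcko scaffold of a molecule (aspirin here).
    print(MurckoScaffold.MurckoScaffoldSmiles(smiles="CC(=O)Oc1ccccc1C(=O)O"))
    ```
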
  • Accessing Research Data for Social Science [Oxford Internet Institute, MT 2022]

      [ Notebook (DIY) ]

    Deepnote: DIY notebooks on a hosted Jupyter notebook service.
    GitHub: repository for working on your local machine.
    Programming Language: Python
    Libraries: Pandas, feedparser, newscatcherapi, psaw, requests, twarc (Twitter API), requests-html
    Learning objectives:
    • Use Python to collect research data from the social web
    • Give due consideration to the ethics of data collection
    • The following topics are covered (an RSS example is sketched after this list):
      • Accessing RSS feeds
      • Accessing Reddit and Wikipedia through their APIs
      • Accessing Twitter using the Twitter API (via twarc)
      • Web crawling
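
    As a taste of the first topic, here is a minimal RSS example with feedparser (the feed URL is just an example):

    ```python
    import feedparser

    # Parse a public RSS feed and inspect the most recent items.
    feed = feedparser.parse("http://feeds.bbci.co.uk/news/rss.xml")
    for entry in feed.entries[:5]:
        print(entry.title, "->", entry.link)
    ```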

Courses

  • Attention, Transformers, LLMs
    [Topics] [Practical]
  • Advanced Convolutional Neural Network Architectures (2012-2018)
    [Topics] [Practical]
  • Network Pruning
    [Topics] [Practical]
  • Deep Autoencoders and Variational Autoencoders
    [Topics] [Practical]
  • Noise Reduction in Machine Learning
    [Topics] [Practical]

Hackathons

  • Prediction of COVID-19 Infection Using Reported Symptoms

      [ Notebook (DIY) ] [ Notebook (Soln.) ]

    Based on: Zoabi et al. Machine Learning-Based Prediction of COVID-19 Diagnosis Based on Symptoms. npj Digital Medicine, 2021
    Task: Predict COVID-19 infection from reported symptoms
    Dataset: English translation of the COVID-19 case records published by the Israeli Ministry of Health
    Learning objectives:
    • Explore a realistic dataset and prepare it for building a practical machine learning system
    • Think through practical issues that arise when deploying such a system (e.g., class imbalance, data collection bias); one way to handle imbalance is sketched below
    • Build and train such an ML system addressing the issues discovered above
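
    One of the practical issues above, class imbalance, can be addressed by reweighting the loss; here is a self-contained sketch on synthetic stand-in data (scikit-learn is an illustrative choice, not necessarily what the notebook uses).

    ```python
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import train_test_split

    # Imbalanced toy data: ~90% negatives, mimicking rare positive tests.
    X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # class_weight="balanced" upweights the minority class in the loss.
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(X_tr, y_tr)
    print("balanced accuracy:", balanced_accuracy_score(y_te, clf.predict(X_te)))
    ```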