"The current and future potential of music technology is both exciting and, given its cross-disciplinary applications, imperative to explore."


Machine Learning Research Projects at the Georgia Institute of Technology (2020-2021)

  1. Master's Project - Representation Learning for Automatic Indexing of Commercial Sound Effects Libraries (2021)

    • Collaborators: Alison Ma

    • Description: Implemented experiments for sound event classification and few-shot metric learning
      of commercial sound effects libraries with SVM, Random Forest, Convolutional Neural Network, and Siamese Neural Network architectures in Python and PyTorch

  2. Adapting Phoneme Classification of Speech to Singing Voice (2021)

    • Collaborators: Nikhil Ravi Krishnan, Akhil Shukla, Alison Ma

    • Description: Adapted phoneme classification from speech to singing voice using HMMs, GMMs, and RNNs with BiLSTMs on the TIMIT and DALI datasets

  3. Audio versus MIDI-based Genre Classification (2021)

    • Collaborators: Alison Ma

    • Description: Conducted ablation study experiments on the LAKH MIDI Dataset v0.1, the Million Song Dataset, and the Top-MAGD (MSD Allmusic Genre Dataset) to compare MIDI- and audio-based genre classification with Random Forest, MLP, and CNN architectures

  4. Deep Learning Approaches to Symbolic Sequential Music Generation and Musical In-painting (2021)

    • Collaborators: Yilun Zha, Alison Ma, Iman Haque, Yufei Xu, Bowen Ran

    • Description: Surveyed deep learning approaches to symbolic sequential music generation and musical in-painting for ABC format, employing LSTMs with attention and Transformer architectures on the folk-rnn data_v2_worepeats dataset

  5. Automated Image Captioning (2020)

    • Collaborators: Aryan Pariani, Ishaan Mehra, Alison Ma, Jun Chen, Max Rivera

    • Description: Utilized attention-based Mask R-CNNs and LSTMs in Keras on the Flickr30k Kaggle dataset, achieving a BLEU score of 0.795 on the best caption from the test set

  6. The Relationship Between Stem Combinations of Features and Popularity from 1925 through the 2010s (2020)

    • Collaborators: Alison Ma

    • Description: Executed a feature analysis study and conducted statistical analysis using Billboard Hot 100 metadata and SigSep Open-Unmix extracted audio stems for songs in the Million Song Dataset

Max for Live Human-Computer Interaction

Google Chrome Speech-to-Text Performance System (2019)

Collaborators: Alison Ma

Description: Designed a real-time performance system using JavaScript, Node.js, and Socket.IO to integrate Google Chrome's Speech-to-Text engine with Max for Live devices at the Berklee College of Music

MaxMSP 10.2-Channel Surround Sound OSC Panner: Digital Forest Multimedia Installation

Collaborators: Joshua Williams, Alison Ma

Description: Co-developed MaxMSP and Max for Live real-time panning tools for performers, for use with Ableton Live and Audinate Dante at the Berklee College of Music's Digital Forest Multimedia Installation