"The current and future potential of music technology is both exciting and imperative for its cross-disciplinary applications."

 
Machine Learning Research Projects at the Georgia Institute of Technology (2020-2021)

  1. Audio versus MIDI-based Genre Classification (2021)

    • Collaborators: Alison Ma

    • Description: A machine/deep learning genre classification project in PyTorch using LMD-aligned from the Lakh MIDI Dataset v0.1, the Million Song Dataset, and top-MAGD from the MSD Allmusic Genre Dataset.

  2. Deep Learning Approaches to Symbolic Sequential Music Generation and Musical In-painting (2021)

    • Collaborators: Yilun Zha, Alison Ma, Iman Haque, Yufei Xu, Bowen Ran

    • Description: An analysis of deep learning approaches to symbolic sequential music generation and musical in-painting in ABC format, implemented in PyTorch, featuring LSTMs with attention and Transformer architectures on the folk-rnn data_v2_worepeats dataset.

  3. The Relationship Between Stem Combinations of Features and Popularity through the 1925-2010s (2020)

    • Collaborators: Alison Ma

    • Description: A feature analysis study using the Million Song Dataset and Billboard Hot 100 metadata.

  4. Automated Image Captioning (2020)

    • Collaborators: Aryan Pariani, Ishaan Mehra, Alison Ma, Jun Chen, Max Rivera

    • Description: Automated image captioning for computer vision and natural language processing applications, using attention-based Mask R-CNNs and LSTMs on the Flickr30k Kaggle dataset.

Max for Live Human-Computer Interaction

Google Chrome Speech-to-Text Performance System (2019)

Collaborators: Alison Ma

Description: A locally hosted server provides access to Chrome's speech recognition. The recognized text is parsed in JavaScript and matched against a trigger word bank in MaxMSP, which outputs UDP messages to a cue list and Max for Live devices.
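The trigger-word step described above can be sketched roughly as follows. This is an illustration only, not the original JavaScript/MaxMSP implementation: it is written in Python, the names `find_triggers`, `osc_message`, and `send_cues` are hypothetical, and the `/cue` address, host, and port 7400 are assumptions. The description only says "UDP messages"; packing them as OSC messages is also an assumption, made here because Max's `udpreceive` object reads OSC-formatted packets.

```python
import re
import socket


def find_triggers(transcript, word_bank):
    """Return the cue for every trigger word found in the transcript, in order."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [word_bank[w] for w in words if w in word_bank]


def osc_pad(data):
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 strings require."""
    return data + b"\x00" * (4 - len(data) % 4)


def osc_message(address, text):
    """Build a minimal OSC message carrying a single string argument."""
    return (
        osc_pad(address.encode("ascii"))
        + osc_pad(b",s")
        + osc_pad(text.encode("ascii"))
    )


def send_cues(cues, host="127.0.0.1", port=7400, address="/cue"):
    """Send each cue as its own UDP datagram (host, port, address are assumed values)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for cue in cues:
            sock.sendto(osc_message(address, cue), (host, port))
```

For example, `send_cues(find_triggers("let the thunder roll", {"thunder": "cue_1"}))` would send one datagram for `cue_1`. Matching whole words after lowercasing keeps the lookup simple; a real performance system might also need to handle multi-word phrases and repeated triggers.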

 
 

MaxMSP 10.2 Surround Sound OSC Panner: Digital Forest Multimedia Installation (2019)

Collaborators: Joshua Williams, Alison Ma

Description: "Digital Forest" multimedia installation featuring Berklee College of Music guest artists Nona Hendryx and Will Calhoun. An interdisciplinary performance project combining live performance and dancers, OSC, networked audience-reactive audio, projection-mapped visuals, and visuals from a game developed in Unity.

My Role:

Programmed a MaxMSP OSC-controlled mono/stereo panner to work with a 32-channel 10.2 mixer alongside team leader Joshua Williams. The OSC panner gives the user control over rate, depth, mono/stereo lock and width, re-center/re-lock, and panning shapes. Each user/performer is given a total of 4 instances of the mono or stereo OSC panner and selects the channel that corresponds to their audio input to the room. The OSC panner information from all users/performers is sent to a master 10.2 mixer patch that processes the panning information and audio, then sends the result back to each OSC panner for visualization. The position of each speaker and the corresponding amplitudes for panning were modeled and calculated to represent Berklee College of Music's performance room at 22 Fenway, which uses Audinate's Dante media networking technology. The mixer used was a Korg nanoKONTROL2, a MIDI controller programmed to provide 4 banks of 8 faders, covering all 32 channels.

- Live mixer

- Sound designer, live vocal processing
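To give a sense of how rate, depth, and panning shape can drive a multi-speaker panner like the one described above, here is a minimal Python sketch. It is not the MaxMSP patch: the function names, the evenly spaced ring of 10 speakers, the cosine falloff, and the `spread_deg` parameter are all assumptions for illustration, and the two subwoofer channels and the actual room geometry are left out.

```python
import math


def lfo(shape, phase):
    """Return a value in [-1, 1] for the given panning shape and phase in cycles."""
    phase = phase % 1.0
    if shape == "sine":
        return math.sin(2 * math.pi * phase)
    if shape == "triangle":
        return 4 * abs(phase - 0.5) - 1
    if shape == "saw":
        return 2 * phase - 1
    raise ValueError(f"unknown shape: {shape}")


def pan_angle(t, rate_hz, depth_deg, center_deg=0.0, shape="sine"):
    """Pan position in degrees at time t: the center offset by a depth-scaled LFO."""
    return center_deg + depth_deg * lfo(shape, t * rate_hz)


def speaker_gains(angle_deg, speaker_angles_deg, spread_deg=60.0):
    """Per-speaker gains with a cosine falloff, normalized to constant total power."""
    raw = []
    for spk in speaker_angles_deg:
        # Smallest angular distance between the source and this speaker.
        diff = abs((angle_deg - spk + 180.0) % 360.0 - 180.0)
        if diff < spread_deg:
            raw.append(math.cos(0.5 * math.pi * diff / spread_deg))
        else:
            raw.append(0.0)
    norm = math.sqrt(sum(g * g for g in raw))
    return [g / norm for g in raw] if norm > 0 else raw


# Hypothetical layout: 10 main speakers spaced evenly around the room.
SPEAKERS = [i * 36.0 for i in range(10)]
```

Calling `speaker_gains(pan_angle(t, 0.5, 90.0), SPEAKERS)` at successive times `t` would sweep a source back and forth across the front of this assumed ring. Normalizing the squared gains to sum to 1 keeps perceived loudness roughly steady as the source moves.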
