Hello! I am Rithesh Kumar, an AI researcher with expertise in deep learning and generative modeling. Currently, I am a Research Scientist and member of the Audio Research Group at Adobe Research.

Previously, I was the Technical Lead for the Overdub Research team at Descript Inc. During that time, I built and shipped 4+ text-to-speech models behind the flagship Overdub feature, which is capable of ultra-realistic voice cloning and of correcting recordings through text. More recently, I also led the development of the Regenerate feature, which leverages instant voice cloning technology to make bad edits sound seamless and natural.

Currently, I live in Toronto, Ontario 🇨🇦.

Education

I completed my MSc in Computer Science (specializing in Artificial Intelligence) at the Mila lab at Université de Montréal, supervised by Yoshua Bengio. During my MSc, I had the excellent opportunity to intern at Lyrebird and Microsoft Research - Montréal.

Earlier, I graduated from SSN College of Engineering (affiliated to Anna University) with a Bachelor's in Computer Science and Engineering. I spent the final 2 years of my undergrad learning about deep learning, spending a summer at the Serre Lab at Brown University and collaborating with Prof. Yoshua Bengio at the Mila lab.

Publications

High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar*, Prem Seetharaman*, Alejandro Luebs, Ishaan Kumar, Kundan Kumar
Poster Presentation (Spotlight) - NeurIPS 2023
VampNet: Music Generation via Masked Acoustic Token Modeling
Hugo Flores Garcia, Prem Seetharaman, Rithesh Kumar, Bryan Pardo
Poster Presentation - ISMIR 2023
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio
Poster Presentation - ICLR 2022
NU-GAN: High Resolution Neural Upsampling With GANs
Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar*, Rithesh Kumar*, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brébisson, Yoshua Bengio, Aaron Courville
Poster Presentation - NeurIPS 2019
Maximum Entropy Generators for Energy-based Models
Rithesh Kumar, Sherjil Ozair, Anirudh Goyal, Aaron Courville, Yoshua Bengio
Master's Thesis
Harmonic Recomposition using Conditional Autoregressive Modeling
Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville
Poster Presentation - Joint Workshop on Machine Learning for Music (ICML 2018)
ObamaNet: Photo-realistic lip-sync from text
Rithesh Kumar, Jose Sotelo, Kundan Kumar, Alexandre de Brébisson, Yoshua Bengio
Oral Presentation - Machine Learning for Creativity and Design Workshop (NeurIPS 2017)
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Aaron Courville, Yoshua Bengio
Poster Presentation - ICLR 2017

Select Projects

Reproducing Neural Discrete Representation Learning
Rithesh Kumar, Tristan Deleu, Evan Racah
Final project - Representation Learning
Reproducing Handwriting Synthesis and Prediction
Rithesh Kumar
Open source project
Reproducing What You Get Is What You See: Visual Markup Decompiler
Rithesh Kumar, Rithesh Rohan, U. Sivashanmugam
Undergraduate Thesis