Rithesh Kumar

Member of Technical Staff

OpenAI

San Francisco, CA 🇺🇸

ritheshkumar.95@gmail.com

About

I work on building audio interfaces to AGI at OpenAI. We are entering an era of increasingly intelligent, agentic models that can gather context at enormous scale asynchronously, while still needing to listen, perceive, and respond in real time and present information in a human-like way.

Before OpenAI, I was a Senior Research Scientist at Adobe Research, where I led speech generation efforts across controllable text-to-speech, automatic dubbing, and speech editing. My work centered on scaling diffusion models and developing efficient distillation algorithms for multilingual audio generation.

Before Adobe, I was Technical Lead for Audio Research at Descript Inc., where I built and shipped 4+ text-to-speech models powering the flagship Overdub and Regenerate features, which enable ultra-realistic voice cloning and text-based audio corrections.

My work has been rooted in the fundamentals of generative modeling and deep learning since 2016. I had the privilege of completing my M.Sc. in Computer Science (2017–2019) at the Mila lab at Université de Montréal under the supervision of Prof. Yoshua Bengio.

Experience

OpenAI

Member of Technical Staff, building the next generation of voice interfaces to AGI.

Adobe Research

Research on controllable text-to-speech synthesis, automatic dubbing, and speech editing.

Descript (prev. Lyrebird)

Technical Lead for Audio Research. Overdub, Regenerate, and AI Voices.

Shipped Products

Selected Publications

Other Publications
  1. 2026

    TAC: Timestamped Audio Captioning

    Sonal Kumar, Prem Seetharaman, Ke Chen, Oriol Nieto, Jiaqi Su, Zhepei Wang, Rithesh Kumar, Dinesh Manocha, Nicholas J. Bryan, Zeyu Jin, Justin Salamon.