Posted: Nov 14, 2024 Weekly Hours: 35 Role Number: 200579035
Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or Apple Store experience we deliver is the result of us making each other’s ideas stronger. That happens because every one of us shares a belief that we can make something wonderful and share it with the world, changing lives for the better. It’s the diversity of our people and their thinking that inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something — you’ll add something.
On the Siri Attention and Invocation team, we act as the front door to our users’ interactions with Siri on almost every shipping Apple device. We work hard to make sure that Siri responds only when intended, in an efficient and privacy-preserving manner.
Description
We are looking for an intern to explore speech synthesis and audio generation techniques. The ideal candidate will be very familiar with audio generation or text-to-speech synthesis.
Key responsibilities:
* Develop audio generation and speech synthesis methods
* Build automated evaluation pipelines to assess the quality of the synthetic data
* Optimize developed models for efficient inference
Minimum Qualifications
Bachelor’s degree in Computer Science or equivalent
Demonstrable experience training deep learning systems on multiple GPUs in PyTorch
Demonstrable experience in audio, text-to-speech, and speech-to-text technologies
Knowledge of the state of the art in audio generation, e.g. autoregressive vs. non-autoregressive systems
Preferred Qualifications
Demonstrable experience with diffusion and/or autoregressive audio generation models
Publications in audio generation at well-known conferences