
Product Manager, Model Behavior

Company: Cartesia
Location: San Francisco, California, United States
Type: Onsite

About Cartesia


Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text (1B text tokens, 10B audio tokens and 1T video tokens), let alone do this on-device. We're pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models.

Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering team to build and ship cutting-edge models and experiences. We're funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We're fortunate to have the support of many amazing advisors and 90+ angels across many industries, including the world's foremost experts in AI.

About the Role


We're seeking an exceptional Product Manager to drive model quality and behavior excellence for our text-to-speech and speech-to-text products at Cartesia. As our Model Behavior PM, you'll be the bridge between our customers' needs and our model development teams, defining what world-class TTS and STT models should sound like, perform like, and feel like. This role combines deep analytical rigor with customer empathy to continuously elevate our model quality and establish Cartesia as the gold standard in voice AI.

Your Impact


  • Define and evolve comprehensive evaluation frameworks for TTS and STT model behavior, establishing clear metrics for naturalness, accuracy, prosody, emotion, latency, and user satisfaction across diverse use cases
  • Conduct systematic competitive analysis by deeply using our products alongside competitors' offerings, identifying quality gaps, behavioral differences, and opportunities for differentiation
  • Partner closely with data teams to design data collection strategies, labeling guidelines, and dataset curation approaches that directly improve model behavior and performance
  • Collaborate with evaluation teams to build rigorous testing methodologies, automated evaluation pipelines, and human evaluation protocols that catch edge cases and quality regressions
  • Engage directly with customers across industries to understand their voice AI requirements, gather qualitative feedback on model behavior, and translate insights into actionable product improvements
  • Drive cross-functional alignment between research, engineering, data, and GTM teams to prioritize and execute on model behavior improvements that deliver maximum customer impact
  • Build a deep intuition for what makes TTS and STT models truly great—from subtle pronunciation nuances to handling of edge cases—and champion quality standards across the organization
  • Create frameworks, documentation, and best practices that help internal teams and customers understand model capabilities, limitations, and optimal usage patterns

What You Bring


  • 6+ years of product management experience with technical products, preferably in AI/ML, audio, or speech technologies
  • Strong analytical mindset with experience designing evaluation frameworks, defining success metrics, and making data-driven quality decisions
  • Deep customer empathy with proven ability to conduct user research, synthesize qualitative feedback, and translate needs into product requirements
  • Technical fluency to work effectively with ML researchers, data scientists, and engineers—understanding model behavior at a detailed level
  • Exceptional attention to detail and quality standards, with the ability to notice subtle differences in model outputs and articulate what makes one better than another
  • Experience working cross-functionally with data teams, engineering teams, and evaluation/testing teams
  • Strong communication skills to advocate for quality and influence technical teams toward customer-centric decisions

Nice to Have


  • Direct experience with speech technologies (TTS, STT, voice cloning, or conversational AI)
  • Background in linguistics, audio engineering or speech sciences
  • Experience with ML model evaluation, A/B testing methodologies, or human evaluation design
  • Familiarity with audio quality metrics (MOS, WER, CER, prosody analysis)
  • Prior experience at a company known for exceptional product quality and attention to detail

What We Offer


🍽 Lunch, dinner and snacks at the office
🏥 Fully covered medical, dental, and vision insurance for employees
🏦 401(k)
✈️ Relocation and immigration support
🦖 Your own personal Yoshi

Our Culture


🏢 We’re an in-person team based out of San Francisco. We love being in the office, hanging out together and learning from each other every day.
🚢 We ship fast. All of our work is novel and cutting edge, and execution speed is paramount. We have a high bar, and we don’t sacrifice quality and design along the way.
🤝 We support each other. We have an open and inclusive culture that’s focused on giving everyone the resources they need to succeed.

Compensation Range: $200K - $270K
