Breaker, Austin, TX, US, 78716
Machine Learning Engineer - Audio Specialist
Build audio understanding models (speech‑to‑text, speech‑to‑speech) from scratch for voice‑controlled robotic systems
Own the entire audio ML pipeline: data collection, training infrastructure, deployment, and optimization
Deep audio/signal processing expertise required – this isn't general ML
Travel for field testing in real‑world conditions
Join an exciting startup backed by globally recognised investors at the bleeding edge of physical AI
About Us
The way humans use robots is broken. Modern warfare demands more robots than we have operators. Every drone, ground vehicle, and maritime system requires dedicated training, manual control, and constant human oversight. One operator per robot. One pilot per mission. This operator bottleneck is the constraint on military capability today.
Breaker's AI agent, Avalon, breaks this constraint. Instead of piloting individual robots, operators command entire teams of autonomous systems across air, ground, and maritime domains, all through natural conversation.
A single operator can now coordinate multiple drones, ground vehicles, and other platforms simultaneously. Instead of flying search patterns on three different screens, you say “survey this area and flag anything unusual” – and a team of robots figures out how to divide the task, coordinate their movements, and report back what matters.
We're not just making robots easier to control. We're fundamentally changing the operator‑to‑robot ratio, turning small teams into force multipliers.
Our software deploys models directly onboard each robot, enabling real‑time, intent‑driven control even in contested environments with limited bandwidth. We're solving problems most AI companies never touch: sub‑second inference on edge hardware with strict latency, power, and connectivity constraints.
We're backed by some of the best global investors and are growing our team across Austin, Texas, and Sydney, Australia. We're a small team of experienced engineers, moving fast on technology that will define how humans and machines work together for decades to come.
Join us if you want to help create the robots we were promised
About the Role
Voice is the primary human–machine interface our customers use with autonomous systems – and you'll own that capability from the ground up. You'll build speech recognition from scratch: creating training pipelines, curating datasets, and shipping models that work reliably in real‑world conditions where radio communication, wind noise, and varying audio quality are the norm, not the exception.
This is a rare opportunity to build foundational IP rather than integrate off‑the‑shelf solutions. You'll establish the entire ML infrastructure for the speech stack, from data collection strategies to model deployment on edge hardware. The work spans the full spectrum: running field tests to capture training data, experimenting with state‑of‑the‑art architectures, and optimizing models to run efficiently on compute‑constrained robotic platforms.
The technical challenge is unusually deep for an ML role. You're not just training models – you're making architectural decisions about how audio signals are processed, understanding the tradeoffs between different model families for varying audio quality scenarios, and building the instrumentation to track performance improvements over time. This is the kind of work that could result in patents and define how voice‑controlled robotics systems perform for years to come.
Key Responsibilities
Evaluate and implement state‑of‑the‑art architectures, making informed decisions about model selection based on audio quality constraints and deployment requirements
Own metrics such as word error rate (WER), establishing baselines and demonstrating measurable improvements over time
Build and maintain infrastructure for model training, including experiment tracking, performance monitoring, and version control
Design data collection campaigns and field testing protocols to capture representative training data across varying environmental conditions
Establish audio quality requirements and provide input on hardware selection for optimal model performance
Deploy and optimize models for NVIDIA Jetson platforms, ensuring real‑time performance within compute and latency constraints
Conduct hands‑on field testing in varied environments (outdoor, windy conditions, different communication systems) to validate model performance
Stay current with rapidly evolving speech recognition and multimodal model research, evaluating new approaches for potential integration
About You
Required Skills and Experience
Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, Audio Engineering, or a related field
Proven track record designing, training, and shipping audio ML models end‑to‑end (e.g., speech‑to‑text, speech‑to‑speech), including dataset creation, training pipelines, evaluation, and deployment in real‑world applications
Deep understanding of how audio is represented and modeled for ML, including audio DSP and frequency‑domain processing (e.g., STFT, mel/spectrogram transforms) and how these choices affect model performance
Expert‑level Python for ML development, including building training loops, data/input pipelines, and experiment tracking
Hands‑on experience deploying, quantizing, and optimizing models for production environments
Open to field work and travel for data capture campaigns and system validation testing
Preferred Skills
Background in audio product companies or audio‑focused ML applications (microphone manufacturers, audio processing products, speech recognition systems)
Personal passion for audio (e.g., sound engineering background, audio enthusiast with technical depth)
Experience with data annotation workflows and managing labeling processes
Experience with edge deployment or resource‑constrained environments
Familiarity with ARM deployment or NVIDIA Jetson platforms
Exposure to multimodal models or bridging speech and language model systems
Data pipeline engineering experience for managing large‑scale training datasets
Proficiency with ML infrastructure tools (e.g., Weights & Biases, ClearML, or similar)
Experience with ROS/ROS2 development and integrating AI with robotic systems
Why Join Us?
You’ll be an owner, not a renter. We're at the stage where foundational decisions are still being made and entire systems need to be built from scratch. Your work won't be maintaining someone else’s legacy – you'll be creating what comes next. The problems you solve and the systems you build will define how Breaker scales.
You’ll work with people who’ve done this before. Our team has shipped production robotics systems, scaled infrastructure, and solved the kind of hard integration problems that only come up when software meets the physical world. You won’t be the only person in the room who’s debugged a sensor fusion pipeline or optimized inference on a Jetson.
You’ll solve problems that don’t exist anywhere else. Most companies are building incremental improvements on established technology. We’re defining new categories – which means the work is harder, more ambiguous, and infinitely more interesting.
You’ll work hard, together. We're in the office every day, grinding on hard problems alongside great people. We've built a workspace where the best work happens – access to hardware, quick decisions, real collaboration. We're flexible when life requires it, but we're looking for people who want to show up, get stuck in, and build something significant with a team they respect.
We’re going global. Backed by globally recognised investors, we're growing teams across Sydney, Australia, and Austin, Texas. If you want exposure to international expansion and the opportunity to help build across regions, that path exists here.
You’ll own what you build. Generous equity packages mean when Breaker wins, you win.
Location: Austin, Texas
Ready to Apply?
If you’re excited about the opportunity to work at the bleeding edge of physical AI, we’d love to hear from you.
Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Engineering and Information Technology
Industries: Software Development