
Director of AI (FDE)
Colossus Technologies Group, Mountain View, CA, United States
Director of Agent Systems Engineering (Forward Deployed Engineering)
Location: Remote (U.S.), monthly team meetups
Compensation: Up to ~$300K base + equity

About the Role

We're partnering with a fast-growing healthtech company building AI agents that operate real clinical workflows, including patient intake, administrative automation, and clinician support.

After deploying these systems in production, one thing became clear: the hard problem isn't the model; it's making the system reliable enough to run real workflows. These workflows are long-running, touch multiple systems (EHRs, APIs, internal tools), and require correctness, traceability, and resilience when things break.

To support this, the company is scaling its Forward Deployed Engineering (FDE) function from ~20 to 50 engineers this year. We're hiring a Director of Agent Systems Engineering to lead part of this organization.

What You'll Do

This role sits at the intersection of AI engineering and real-world deployment. You will lead teams responsible for turning complex workflows into production-grade agent systems and ensuring those systems are reliable, observable, and repeatable.

Key responsibilities include:

Designing repeatable delivery systems for deploying agent workflows into production
Mentoring and scaling engineering pods, driving execution and technical excellence
Capacity planning and delivery predictability across multiple concurrent deployments
Setting the technical bar for reliability, observability, and system correctness
Driving architecture and system design, including debugging multi-step workflows and failure modes
Feeding learnings back into the core platform as reusable primitives and abstractions

This is a hands-on leadership role: you'll be close to architecture, system behavior, and real production issues.

What We're Looking For

We're looking for engineers who think in systems, not just models. Strong candidates will have:

Experience building and scaling distributed systems or platform infrastructure
Exposure to AI/LLM-based systems, ideally including agent workflows or orchestration
A deep understanding of reliability, observability, and failure handling in production systems
Experience working with complex, multi-step workflows across multiple services or APIs
A track record of turning repeated patterns into reusable platform capabilities
Leadership experience managing teams and driving delivery in ambiguous environments

Backgrounds may include:

Platform / infrastructure engineering
ML platform or AI systems
Workflow orchestration / distributed systems
Forward deployed or customer-facing engineering roles