
Technical Program Manager, Infrastructure & Capacity Management at Cohere
New York, NY, United States
Who are we?
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.
Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.
Join us on our mission and shape the future!
Why this role?
We’re seeking an experienced Technical Program Manager to join Cohere’s Technical Program Management team. We need someone with curiosity, drive, independence, and leadership, who has hands‑on experience launching LLMs or enterprise‑grade software onto cloud platforms.
In return, you’ll have the unique opportunity to shape Cohere’s operations and collaborate with leading minds in the LLM space. You will get a chance to make extremely high‑impact contributions to our fast‑growing company, product, and culture.
This role is open to candidates based on the East Coast or within an hour of GMT.
As a Technical Program Manager for Infrastructure & Capacity Management, you will:
Coordinate with internal and external teams (cloud providers, GPU manufacturers, etc.) to manage the full lifecycle of compute resources, with a heavy emphasis on GPUs
Lead end‑to‑end programs within the infrastructure space from planning and execution to cross‑functional coordination and stakeholder management
Collaborate directly with technical and non‑technical teams to set, track, and manage timelines, deliverables, budgets, scope, etc.
Manage various overlapping projects and programs, ruthlessly prioritizing asks, to ensure that the company’s top priorities are met
Work with leadership on anticipating and planning capacity needs across the company, and turning these needs into practical execution plans on how capacity is allocated, tracked, and managed
Deliver clear, timely, and consistent updates across engineering, leadership, and non‑technical teams on the monthly container releases and wider model launches.
You may be a good fit if:
You have 5+ years of experience as a Technical Program Manager
You have 3+ years working with infrastructure teams, and have a solid understanding of the challenges with project and program management in this space
You know how to use GitHub & Grafana, can read Terraform, and are generally comfortable with onboarding and diving into new technologies
You have practical experience, either in a hands‑on role or as a TPM, with full lifecycle management of compute resources (GPUs, CPUs, VMs, etc.)
You have a blend of experience working at the intersection of software, hardware, and machine learning – particularly with GPUs used for ML training and inference
You have hands‑on experience with coordinating large‑scale, company‑wide programs within a deeply technical space
You have worked in highly dynamic, fast‑paced environments that require striking a balance between speed and structure
(bonus) You have experience with managing projects and programs related to serving and efficiency of LLMs
(bonus) You have past experience in a technical role, such as infrastructure engineer, software engineer, machine learning engineer, etc.
If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!
Full‑Time Employees at Cohere enjoy these Perks:
An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in‑office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top‑up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well‑being, quality time, and workspace improvement
Remote‑flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co‑working stipend
✈️ 6 weeks of vacation (30 working days!)
We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.