Goldman Sachs
Engineering Vice President – AI / ML Engineering
Goldman Sachs, Jersey City, New Jersey, United States, 07390
Who we are
Goldman Sachs is a leading global investment banking, securities, and investment management firm that provides a wide range of services worldwide to a substantial and diversified client base, including corporations, financial institutions, governments, and high net‑worth individuals.
Business Unit Overview
Enterprise Technology Operations (ETO) is a business unit within Core Engineering focused on running scalable production management services, with a mandate of operational excellence and operational risk reduction achieved through large‑scale automation, best‑in‑class engineering, and the application of data science and machine learning. The Production Runtime Experience (PRX) team in ETO applies software engineering and machine learning to production management services, processes, and activities to streamline monitoring, alerting, automation, and workflows.
Team Overview
The Machine Learning and Artificial Intelligence team in PRX applies advanced ML and GenAI to reduce the risk and cost of operating the firm’s large‑scale compute infrastructure and extensive application estate. We combine statistical modelling, anomaly detection, predictive modelling, and time‑series forecasting with foundation LLMs that orchestrate multi‑agent systems for automated production management services.
Role and Responsibilities
In this role, you will design, implement, and launch GenAI agentic solutions that reduce the risk and cost of managing large‑scale production environments of varying complexity. You will address production runtime challenges by developing agentic AI solutions that diagnose, reason, and take action in production environments to improve productivity and support.
What You’ll Do
Build agentic AI systems: design and implement tool‑calling agents that combine retrieval, structured reasoning, and secure action execution following the Model Context Protocol (MCP); engineer robust guardrails for safety, compliance, and least‑privilege access (a minimal agent‑loop sketch follows this list).
Productionize LLMs: build an evaluation framework for open‑source and foundation LLMs; implement retrieval pipelines, prompt synthesis, response validation, and self‑correction loops tailored to production operations (an evaluation‑loop sketch appears below).
Integrate with runtime ecosystems: connect agents to observability, incident management, and deployment systems to enable automated diagnostics, runbook execution, remediation, and post‑incident summarization with full traceability.
Collaborate directly with users: partner with production engineers and application teams to translate production pain points into agentic AI roadmaps; define objective functions linked to reliability, risk reduction, and cost; deliver auditable, business‑aligned outcomes.
Safety, reliability, and governance: build validator models, adversarial prompts, and policy checks into the stack; enforce deterministic fallbacks, circuit breakers, and rollback strategies; instrument continuous evaluations for usefulness, correctness, and risk (see the circuit‑breaker sketch below).
Scale and performance: optimize cost and latency via prompt engineering, context management, caching, model routing, and distillation; leverage batching, streaming, and parallel tool calls to meet stringent SLOs under real‑world load (see the routing‑and‑caching sketch below).
Build a RAG pipeline: curate domain knowledge; build a data‑quality validation framework; establish feedback loops and a milestone framework to maintain knowledge freshness (a retrieval sketch with a freshness check is shown after this list).
Raise the bar: drive design reviews, experiment rigor, and high‑quality engineering practices; mentor peers on agent architectures, evaluation methodologies, and safe deployment patterns.
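To make the agent‑building item above concrete, here is a minimal sketch of a single tool‑calling step with a least‑privilege guardrail. It is illustrative only: the tool registry, role table, and the stubbed propose_tool_call function are hypothetical placeholders for a real MCP‑style tool interface and model call, not this team’s implementation.
```python
# Minimal sketch: one tool-calling agent step gated by a least-privilege guardrail.
# All names here (ToolCall, TOOL_REGISTRY, propose_tool_call) are hypothetical.
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class ToolCall:
    name: str
    args: Dict[str, Any]

# Tools the agent may execute, and the roles permitted to call each one.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "get_host_metrics": lambda host: f"cpu=41% mem=67% on {host}",
    "restart_service": lambda service: f"restart requested for {service}",
}
ALLOWED_TOOLS_BY_ROLE = {
    "read_only": {"get_host_metrics"},
    "operator": {"get_host_metrics", "restart_service"},
}

def propose_tool_call(context: str) -> ToolCall:
    """Stand-in for an LLM that proposes the next tool call from the context.

    A real implementation would parse structured tool-call output (e.g. MCP or
    JSON function-calling responses) instead of returning a fixed value.
    """
    return ToolCall(name="restart_service", args={"service": "order-router"})

def run_agent_step(context: str, role: str) -> str:
    proposal = propose_tool_call(context)
    # Guardrail: enforce least-privilege before executing any action.
    if proposal.name not in ALLOWED_TOOLS_BY_ROLE.get(role, set()):
        return f"blocked: role '{role}' may not call '{proposal.name}'"
    return TOOL_REGISTRY[proposal.name](**proposal.args)

if __name__ == "__main__":
    alert = "Order-router is unresponsive after a deploy"
    print(run_agent_step(alert, role="read_only"))  # blocked by the guardrail
    print(run_agent_step(alert, role="operator"))   # executes the tool
```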
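The “Productionize LLMs” item mentions evaluation and self‑correction loops; the sketch below shows the shape of such a loop under simplified assumptions. The generate and validate functions are placeholders for a real model call and validator, and the tiny eval set is invented for illustration.
```python
# Minimal sketch: an LLM evaluation harness with a validation/self-correction loop.
from typing import List, Tuple

# Invented eval set of (prompt, reference answer) pairs.
EVAL_SET: List[Tuple[str, str]] = [
    ("Which runbook handles disk-full alerts?", "runbook-disk-full"),
    ("Which runbook handles certificate expiry?", "runbook-cert-expiry"),
]

def generate(prompt: str, feedback: str = "") -> str:
    """Placeholder for an LLM call; a real model would use the validator feedback."""
    return "runbook-disk-full" if "disk" in prompt else "unknown"

def validate(answer: str, reference: str) -> bool:
    """Placeholder validator; real checks might use a judge model or policy rules."""
    return answer == reference

def evaluate(max_retries: int = 2) -> float:
    correct = 0
    for prompt, reference in EVAL_SET:
        answer, feedback = generate(prompt), ""
        for _ in range(max_retries):
            if validate(answer, reference):
                break
            # Self-correction retry: feed the validation failure back to the model.
            feedback = f"previous answer '{answer}' failed validation"
            answer = generate(prompt, feedback)
        correct += int(validate(answer, reference))
    return correct / len(EVAL_SET)

if __name__ == "__main__":
    print(f"accuracy: {evaluate():.2f}")
```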
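For the safety and reliability item, this is one possible shape of a circuit breaker with a deterministic fallback around a model call; the failure threshold, the failing diagnose_with_llm stub, and the escalation fallback are assumptions for illustration, not the team’s actual design.
```python
# Minimal sketch: circuit breaker with a deterministic fallback around an LLM call.
class CircuitBreaker:
    """Stops calling the model after repeated failures and uses the fallback instead."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, llm_fn, fallback_fn, *args):
        if self.open:
            return fallback_fn(*args)  # deterministic path, no model call
        try:
            return llm_fn(*args)
        except Exception:
            self.failures += 1
            return fallback_fn(*args)

def diagnose_with_llm(alert: str) -> str:
    """Stand-in for an LLM diagnosis call; always fails here to exercise the breaker."""
    raise RuntimeError("model endpoint unavailable")

def diagnose_fallback(alert: str) -> str:
    """Deterministic fallback: route the alert to a human-owned queue."""
    return f"escalated to on-call queue: {alert}"

if __name__ == "__main__":
    breaker = CircuitBreaker(max_failures=2)
    for _ in range(3):
        print(breaker.call(diagnose_with_llm, diagnose_fallback, "disk full on db-07"))
```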
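For the scale and performance item, the sketch below illustrates simple model routing combined with a response cache to manage cost and latency. The two model tiers and the crude routing rule are invented; a production router might rely on classifiers, token budgets, or confidence scores instead.
```python
# Minimal sketch: route prompts between model tiers and cache repeated answers.
from functools import lru_cache

def small_model(prompt: str) -> str:
    """Stand-in for a cheap, fast model used for routine prompts."""
    return f"[small-model] summary of: {prompt[:40]}"

def large_model(prompt: str) -> str:
    """Stand-in for a slower, more capable model reserved for hard prompts."""
    return f"[large-model] detailed analysis of: {prompt[:40]}"

@lru_cache(maxsize=1024)
def answer(prompt: str) -> str:
    # Route on a crude complexity signal; purely illustrative.
    hard = len(prompt) > 200 or "root cause" in prompt.lower()
    return large_model(prompt) if hard else small_model(prompt)

if __name__ == "__main__":
    print(answer("Summarize today's deploy failures"))      # small model
    print(answer("What is the root cause of the outage?"))  # large model
    print(answer("Summarize today's deploy failures"))      # served from cache
    print(answer.cache_info())                              # shows one cache hit
```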
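Finally, for the RAG pipeline item, here is a minimal retrieval step with a staleness filter to illustrate knowledge freshness; the toy corpus, token‑overlap scoring, and the one‑year cutoff are simplifications, not a recommended design.
```python
# Minimal sketch: retrieval for a RAG pipeline with a freshness filter.
from datetime import datetime, timedelta
from typing import List, Tuple

# Toy knowledge base of (document text, last-updated timestamp) pairs.
DOCS: List[Tuple[str, datetime]] = [
    ("Runbook: restart the order-matching service after heap exhaustion.",
     datetime(2025, 5, 1)),
    ("Runbook: rotate expiring TLS certificates on the edge proxies.",
     datetime(2023, 1, 15)),
]
MAX_AGE = timedelta(days=365)

def score(query: str, doc: str) -> float:
    """Crude relevance score: fraction of query tokens present in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query: str, top_k: int = 1) -> List[str]:
    now = datetime(2025, 9, 1)  # fixed "now" so the sketch is deterministic
    fresh = [(text, ts) for text, ts in DOCS if now - ts <= MAX_AGE]  # freshness filter
    ranked = sorted(fresh, key=lambda item: score(query, item[0]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

if __name__ == "__main__":
    print(retrieve("service restart after heap exhaustion"))
```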
Qualifications
A Bachelor’s degree (Master’s/PhD preferred) in a computational field (Computer Science, Applied Mathematics, Engineering, or a related quantitative discipline) with 7+ years of experience as an applied data scientist or machine learning engineer.
Essential Skills
7+ years of software development in one or more languages (Python, C/C++, Go, Java); strong hands‑on experience building and maintaining large‑scale Python applications preferred.
3+ years designing, architecting, testing, and launching production ML systems, including model deployment/serving, evaluation and monitoring, data processing pipelines, and model fine‑tuning workflows.
Practical experience with Large Language Models (LLMs): API integration, prompt engineering, fine‑tuning/adaptation, and building applications using RAG and tool‑using agents (vector retrieval, function calling, secure tool execution).
Understanding of different LLMs, both commercial and open‑source, and their capabilities (e.g., OpenAI GPT models, Gemini, Llama, Qwen, Claude).
Solid grasp of applied statistics, core ML concepts, algorithms, and data structures to deliver efficient and reliable solutions.
Strong analytical problem‑solving, ownership, and urgency; ability to communicate complex ideas simply and collaborate effectively across global teams with a focus on measurable business impact.
Preferred
Proficiency building and operating on cloud infrastructure (ideally AWS), including containerized services (ECS/EKS), serverless (Lambda), data services (S3, DynamoDB, Redshift), orchestration (Step Functions), model serving (SageMaker), and infra‑as‑code (Terraform/CloudFormation).
Your Career
Goldman Sachs is a meritocracy that provides tools to advance your career. Access our comprehensive “Goldman Sachs University” training programme, spanning technical, business, and leadership skills.
Salary Range
The expected base salary for this Jersey City, New Jersey position is $130,000–$250,000. In addition, you may be eligible for a discretionary bonus if you are an active employee as of fiscal year‑end.
Benefits
Goldman Sachs is committed to providing valuable and competitive benefits and wellness offerings as part of a strong overall employee experience.
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Engineering and Information Technology