
Infrastructure Research Engineer
3B Staffing LLC, Los Angeles, CA, United States
U.S. citizens or green card holders only (USC/GC)
Location: Remote (must work PST hours)
12-Month Contract
Overview
We're seeking a senior engineer to lead GPU computing performance research and AI infrastructure optimization. This role focuses on Kubernetes-based distributed systems, benchmarking, and system tuning to maximize performance across compute, storage, and networking.
Responsibilities
Design, implement, and optimize large-scale infrastructure for AI workloads.
Run GPU/CPU benchmarking and performance analysis; recommend improvements.
Tune servers, GPUs, networking, and databases for efficiency and scalability.
Build and manage containerized environments (Kubernetes, Rancher, Kubeflow).
Write and maintain Python/system scripts for automation and debugging.
Troubleshoot infrastructure issues across servers, GPUs, networks, and storage.
Collaborate with research and engineering teams to meet performance goals.
Work occasional late-night hours to complete critical benchmarking deadlines.
Requirements
10+ years of hands-on experience with Kubernetes, containers, and distributed systems.
5-7 years in infrastructure research and GPU computing performance.
Strong Python and system-level scripting skills.
Deep knowledge of AI infrastructure optimization and debugging.
Experience with benchmarking tools and performance tuning methodologies.
Strong problem-solving skills in fast-paced environments.