
AI Optimization Engineer
Virtual Tech Gurus, Jersey City, NJ, United States
Job Summary
We are seeking an AI Optimization Engineer to support enterprise AI/ML initiatives with a focus on performance optimization, scalable infrastructure, and production deployment. The ideal candidate will have hands‑on experience with large language models (LLMs), GPU-accelerated environments, and high-performance computing (HPC) platforms.
Key Responsibilities
Design, optimize, and deploy machine learning and deep learning models into production
Build and manage scalable infrastructure for AI/ML workloads, including LLMs
Configure and maintain GPU-accelerated clusters for large-scale processing
Implement model optimization techniques (pruning, quantization, distillation)
Deploy models using containerized and microservices-based architectures
Develop secure REST APIs using Flask for inference and orchestration
Configure and optimize Triton Inference Server for model serving
Manage job scheduling and automation using SLURM
Monitor system and model performance using Prometheus and Grafana
Perform exploratory data analysis (EDA) and visualization
Collaborate with cross-functional AI teams (NLP, Computer Vision, GenAI)
Required Skills & Qualifications
Strong Python programming (NumPy, scikit-learn)
Experience with ML/DL frameworks: TensorFlow, PyTorch, or Keras
Hands‑on experience with HPC and GPU environments
Strong knowledge of ML algorithms (supervised & unsupervised learning)
Experience deploying ML models into production
Understanding of neural networks, transformers, and ensemble methods
Experience with hyperparameter tuning and transfer learning
Linux administration (RHEL/CentOS)
API development experience (Flask, REST)
Strong troubleshooting and performance optimization skills
Tools & Technologies
Containers & Orchestration: Docker, Kubernetes, Podman, Enroot, Pyxis
ML & AI Tools: MLflow, Jupyter, Hugging Face
Inference & Optimization: Triton Inference Server, TensorRT-LLM (TRT-LLM)
Monitoring: Prometheus, Grafana
CI/CD & Infra: GitHub, Jenkins, Terraform
Databases: Oracle, MS SQL, MySQL, MongoDB, Redis
Visualization: Matplotlib, Seaborn, Plotly
Scheduling: SLURM
Preferred Qualifications
Experience with AWS (SageMaker, EC2, Lambda)
Knowledge of vector embeddings and generative AI
Experience with data preprocessing (cleaning, scaling, normalization)
Frontend exposure (Angular, HTML, CSS, JavaScript)
SQL / PL-SQL scripting
JOBID: 12335