
LLM Engineer
Saroe, Inc., New York, NY, United States
About Saroe, Inc.
Saroe, Inc. is a technology consulting and staff-augmentation firm that helps small, mid-size, and enterprise organizations build scalable, modern software solutions. We partner closely with our clients to deliver high-impact work across cloud, AI, automation, and application development.
We are currently supporting one of our clients in hiring a skilled LLM Engineer to work on next-generation AI-powered applications.
About the Role
We are seeking a highly skilled LLM Engineer to design, develop, and optimize applications powered by Large Language Models (LLMs). In this role, you’ll work on cutting-edge AI systems involving tool-using agents, vector search, and modern orchestration frameworks.
You will collaborate with engineers, architects, and product stakeholders to translate real-world business needs into scalable, secure AI solutions.
Key Responsibilities
Design and develop LLM-powered applications using APIs such as OpenAI, Claude, Gemini, and similar platforms
Build AI workflows using frameworks like LangGraph, DSPy, and tool-use architectures
Implement MCP server integrations to extend LLM capabilities
Develop and integrate vector search solutions using Qdrant, Milvus, or pgvector
Optimize prompts, orchestration logic, and tool chaining for accuracy and performance
Collaborate with cross-functional teams to deliver AI-enabled solutions
Ensure best practices around security, scalability, and maintainability
Use Azure DevOps for source control, CI/CD, and deployments
Participate in Agile ceremonies including sprint planning and reviews
Required Skills & Qualifications
Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or a related field
4 years of software development experience, including 2 years working with LLM-based systems
Strong hands-on experience with LLM APIs (OpenAI, Claude, Gemini, etc.)
Experience with LangGraph, DSPy, and tool-use patterns
Familiarity with MCP server integration
Solid experience with vector databases (Qdrant, Milvus, pgvector)
Experience working in Agile teams and using Azure DevOps
Nice to Have
Experience building RAG (Retrieval-Augmented Generation) pipelines
Exposure to MLOps practices and AI model deployment
Experience in multi-cloud environments (Azure, AWS, GCP)
Familiarity with Docker and Kubernetes
Why Join Through Saroe?
Work on real-world, production-grade AI systems
Opportunity to collaborate with strong engineering teams and modern tech stacks
Hybrid work model with flexibility
Saroe acts as a long-term partner, not just a placement firm
Location
New York, Princeton, or Chicago
Work Model
Hybrid (3 days/week onsite)