LLM Engineer

Saroe, Inc., New York, NY, United States

About Saroe, Inc.
Saroe, Inc. is a technology consulting and staff-augmentation firm helping small, mid-size, and enterprise organizations build scalable, modern software solutions. We partner closely with our clients to deliver high-impact work across cloud, AI, automation, and application development.

We are currently supporting one of our clients in hiring a skilled LLM Engineer to work on next-generation AI-powered applications.

About the Role
We are seeking a highly skilled LLM Engineer to design, develop, and optimize applications powered by Large Language Models (LLMs). In this role, you’ll work on cutting-edge AI systems involving tool-using agents, vector search, and modern orchestration frameworks.

You will collaborate with engineers, architects, and product stakeholders to translate real-world business needs into scalable, secure AI solutions.

Key Responsibilities

Design and develop LLM-powered applications using APIs such as OpenAI, Claude, Gemini, and similar platforms

Build AI workflows using frameworks like LangGraph, DSPy, and tool-use architectures

Implement MCP server integrations to extend LLM capabilities

Develop and integrate vector search solutions using Qdrant, Milvus, or pgvector

Optimize prompts, orchestration logic, and tool chaining for accuracy and performance

Collaborate with cross-functional teams to deliver AI-enabled solutions

Ensure best practices around security, scalability, and maintainability

Use Azure DevOps for source control, CI/CD, and deployments

Participate in Agile ceremonies including sprint planning and reviews

Required Skills & Qualifications

Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or a related field

4 years of software development experience, including 2 years working with LLM-based systems

Strong hands-on experience with LLM APIs (OpenAI, Claude, Gemini, etc.)

Experience with LangGraph, DSPy, and tool-use patterns

Familiarity with MCP server integration

Solid experience with vector databases (Qdrant, Milvus, pgvector)

Experience working in Agile teams and using Azure DevOps

Nice to Have

Experience building RAG (Retrieval-Augmented Generation) pipelines

Exposure to MLOps practices and AI model deployment

Experience in multi-cloud environments (Azure, AWS, GCP)

Familiarity with Docker and Kubernetes

Why Join Through Saroe?

Work on real-world, production-grade AI systems

Opportunity to collaborate with strong engineering teams and modern tech stacks

Hybrid work model with flexibility

Saroe acts as a long-term partner, not just a placement firm

Location
New York, Princeton, or Chicago

Work Model
Hybrid (3 days/week onsite)
