AI Model Evaluation Specialist

Inizio Partners, New York, New York, US, 10261

About the job

Key Responsibilities:

Perform scoring and qualitative evaluations of LLM-generated responses across multiple use cases.
Develop and maintain scoring guidelines and rubrics to ensure consistency and objectivity.
Collaborate with data scientists, product managers, and engineering teams to align scoring with project goals.
Assist in the creation and labeling of high-quality evaluation datasets for prompt tuning or model fine-tuning.
Utilize NLP-based metrics and tools (e.g., ROUGE, BLEU, cosine similarity) for automated scoring support.
Document scoring patterns, common model errors, and improvement opportunities.
Contribute to prompt experimentation and help compare the effectiveness of different prompt strategies.

Qualifications:

Prior experience with LLMs (e.g., GPT, Claude, LLaMA, etc.) or AI/NLP projects is highly preferred.
Strong analytical skills and attention to detail, especially in assessing language quality.
Familiarity with prompt engineering, generative AI, or conversational AI tools is a plus.
Hands-on experience with Python, Jupyter, or evaluation libraries (optional but desirable).
Experience working with evaluation frameworks or annotation tools (Label Studio, Prodigy, etc.) is a bonus.
Excellent written and verbal communication skills.