
Evaluation Scenario Writer - AI Agent Testing Specialist
Mindrift, Saint Paul, Minnesota, United States
Please submit your CV in English and indicate your level of English proficiency.
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems.
Participation is project-based, not permanent employment.
What This Opportunity Involves
Create structured test cases that simulate complex human workflows
Define gold-standard behavior and scoring logic to evaluate agent actions
Analyze agent logs, failure modes, and decision paths
Work with code repositories and test frameworks to validate your scenarios
Iterate on prompts, instructions, and test cases to improve clarity and difficulty
Ensure that scenarios are production-ready, easy to run, and reusable
#J-18808-Ljbffr
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems.
Participation is project-based, not permanent employment.
What This Opportunity Involves
Create structured test cases that simulate complex human workflows
Define gold-standard behavior and scoring logic to evaluate agent actions
Analyze agent logs, failure modes, and decision paths
Work with code repositories and test frameworks to validate your scenarios
Iterate on prompts, instructions, and test cases to improve clarity and difficulty
Ensure that scenarios are production-ready, easy to run, and reusable
#J-18808-Ljbffr