PhD Chemistry Expert for AI Evaluation (Remote)

YO IT Consulting, Los Angeles, CA, United States

Engagement Type:

Independent Contractor

Work Mode:

Fully Remote

Schedule:

Flexible (Project-Based)

Role Overview

We partner with leading AI teams to improve the quality, usefulness, and reliability of conversational AI systems.

In chemistry-related contexts, AI systems must demonstrate:

Accurate mechanistic reasoning
Quantitative precision
Strong command of chemical theory and application

This role focuses on evaluating and improving how AI models reason about, explain, and communicate chemistry concepts across foundational and advanced topics.

Key Responsibilities

Write and refine prompts to guide AI model behavior in chemistry contexts
Evaluate AI-generated responses for scientific accuracy, mechanistic correctness, and quantitative reasoning
Conduct fact-checking using authoritative sources and domain expertise
Annotate responses by identifying strengths, weak reasoning, and conceptual inaccuracies
Assess explanation clarity and structure for different audience levels
Ensure responses align with conversational guidelines and evaluation standards
Apply structured taxonomies, benchmarks, and detailed review frameworks

Required Qualifications

PhD in Chemistry or a closely related field

Deep expertise in one or more of the following domains:

Organic & Biological Chemistry
Inorganic & Materials Chemistry
Physical & Theoretical Chemistry
Analytical & Instrumental Chemistry

Additional Requirements

Significant experience using large language models (LLMs)
Strong understanding of how LLMs are used and where they fail
Excellent technical writing and explanation skills
Exceptional attention to detail
Experience reviewing or editing academic or technical writing

Nice-to-Have Qualifications

Experience with RLHF, AI model evaluation, or data annotation
Teaching or mentoring experience
Familiarity with structured evaluation rubrics or benchmarking systems

What Success Looks Like

You identify subtle inaccuracies or weak reasoning in chemistry outputs
Your feedback improves scientific rigor and clarity
You deliver reproducible evaluation artifacts
AI systems become more reliable in chemistry-related applications

Why Join

Apply PhD-level expertise to frontier AI development
Contribute to high-impact AI research beyond traditional academic roles
Flexible schedule with project-based autonomy
Competitive compensation aligned with expertise

Contract & Payment Terms

Independent Contractor engagement
Fully remote with flexible working schedule
Project-based work (may be extended or concluded based on needs and performance)
Weekly payments via Stripe or Wise
Unable to support H-1B or STEM OPT candidates at this time
