
Remote | Operations Research Model Prompt Evaluator — $60–$80/hour
24-MAG, San Francisco, CA, United States
Overview
We are sharing a specialised part‑time consulting opportunity for experienced operations research professionals with strong quantitative judgment, deep technical knowledge, and the ability to craft and verify high‑quality open‑ended prompts for AI model evaluation. This role supports an exciting collaboration with leading AI companies focused on improving frontier language models through high‑quality prompt authoring, verification, and evaluation workflows across core operations research and decision‑science domains.
Key Responsibilities
Professionals in this role may contribute to:
Prompt Authoring for AI Evaluation
Create original, open‑ended operations research prompts from assigned subdomains at varying difficulty levels
Develop prompts that require human judgment to evaluate the quality of AI responses
Help ensure that prompts are clear, technically rigorous, and suitable for model evaluation
Prompt Verification & Quality Review
Review authored prompts for clarity, scope alignment, difficulty accuracy, and uniqueness
Edit prompts and difficulty ratings where needed
Help maintain high standards for precision, quality, and consistency across evaluation tasks
Operations Research Reasoning Assessment
Apply expert judgment to assess the depth and quality of quantitative reasoning required
Work across areas such as optimization modeling, algorithmic analysis, stochastic reasoning, and decision science
Help improve model quality through carefully designed and verified technical prompts
Ideal Profile
Strong candidates may have a Master's degree or higher in Operations Research, Industrial Engineering, Applied Mathematics, or a closely related field
2–6 years of professional or research experience in optimization, logistics, or decision science
Strong command of mathematical programming, probabilistic modeling, and algorithmic methods
Excellent written English and the ability to craft precise, well‑scoped technical questions
Preferred Qualifications
Experience with solvers such as Gurobi or CPLEX
Experience with simulation tools
Strong familiarity with operations research subdomains such as linear and integer programming, network optimization, queuing theory, game theory, supply chain optimization, and simulation
High attention to detail and strong consistency in technical evaluation workflows
Why This Opportunity
Contribute specialised operations research expertise to a cutting‑edge AI collaboration
Help establish rigorous evaluation standards for frontier language models
Work on high‑impact prompt design and verification tasks with strong technical relevance
Flexible remote work with competitive hourly compensation
Contract Details
Independent contractor role
Fully remote with flexible scheduling
Compensation of $60–$80 per hour
Expected commitment of 10+ hours per week
Asynchronous work format
Assignments may involve either authoring or verification tasks depending on project needs
Projects may be extended, shortened, or concluded early depending on project needs and performance
Weekly payments via Stripe or Wise
Work will not involve access to confidential or proprietary information from any employer, client, or institution
Please note: We are unable to support H-1B or STEM OPT candidates at this time
Start date: Immediate