AI Evaluation Specialist (Polish) | $15/hr Remote

Crossing Hurdles, Poland, NY, United States

Position:

LLM – AI Quality Analyst (Personalization) – Polish

Type:

Short-Term Contract

Location:

Remote (Global)

Commitment:

20-40 hours/week, with at least 4 hours of overlap with PST working hours

Engagement Length:

1 month

Start Date:

Immediate

Role Responsibilities

Design multi-turn conversational prompts based on personal context

Evaluate personalized AI responses for relevance, grounding, and helpfulness

Assess correct and incorrect use of personal data in model outputs

Perform side-by-side (SxS) evaluation and ranking of AI responses

Identify grounding errors, poor inferences, and forced personalization

Write clear, structured rationales referencing specific conversation turns

Extract and verify model debug information and data source usage

Maintain strict data hygiene by deleting evaluation conversations

Requirements

Polish fluency (reading and writing) is mandatory, as Polish is the focus language for this project

Experience in data annotation, AI quality evaluation, content moderation, or related roles is strongly preferred

Strong analytical thinking and attention to detail

Ability to evaluate nuanced and ambiguous AI responses

Comfortable using your primary personal Google account with data sources enabled

BS/BA degree or equivalent experience in a relevant analytical field

Strong written communication and structured feedback skills

Self-motivated and able to work independently in a remote setting

Reliable desktop or laptop with a stable internet connection
