
Python Infrastructure Engineer - Model Evaluation
Alignerr, Seattle, WA, United States
Python Infrastructure Engineer — Model Evaluation (AI Training)
What if your Python expertise could directly shape how the world’s most advanced AI models are built, tested, and improved? We’re looking for a Senior Python Infrastructure Engineer to design and build the data pipelines, annotation tooling, and evaluation systems that leading AI labs depend on to train and validate next-generation models.
This is a fully remote contract role with flexible hours — you’ll be working on real production systems at the cutting edge of AI development.
Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 20–40 hours/week
What You’ll Do
Design, build, and optimize high-performance Python systems supporting AI data pipelines and model evaluation workflows
Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
Build and maintain evaluation harnesses for ML models, integrating with inference frameworks
Improve reliability, performance, and safety across existing Python codebases
Implement observability, metrics collection, and monitoring to track system reliability and model performance
Identify bottlenecks and edge cases in data and system behavior, and ship scalable fixes
Collaborate with data, research, and engineering teams to support model training and evaluation workflows
Participate in synchronous design reviews to iterate on system architecture and implementation decisions
Who You Are
Native or fluent English speaker with clear written and verbal communication skills
Full-stack developer with a strong systems programming background
3–5+ years of professional experience writing production-grade Python
Experienced building evaluation harnesses for ML models and integrating with inference frameworks
Strong background in observability, metrics collection, and system reliability monitoring
Able to commit 20–40 hours per week consistently
Self-directed and comfortable working asynchronously across distributed teams
Nice to Have
Prior experience with data annotation, data quality, or evaluation systems
Familiarity with AI/ML workflows, model training, or benchmarking pipelines
Experience with distributed systems or developer tooling
Background in MLOps, infrastructure engineering, or platform engineering
Why Join Us
Work on real production systems powering some of the most advanced AI research in the world
Fully remote and flexible — structure your work around your life
Freelance autonomy with the depth and meaning of high-impact engineering work
Contribute directly to AI infrastructure that shapes how next-generation models are built and evaluated
Potential for ongoing work and contract extension as new projects launch