Machine Learning Engineer Intern (Computer Vision/Multimodal/Generative AI) Job

SpreeAI, San Francisco, CA, United States

About the Role
We are hiring Machine Learning Engineers who want to work on frontier problems in vision and generative AI where standard solutions break. You will work across photorealistic virtual try-on, video-based modeling, Smart Sizing, and multimodal representation learning. The work spans modern architectures such as diffusion models, transformers, and learned visual representations, with emphasis on controllability, compute efficiency, and production readiness. This role sits at the intersection of applied research and engineering execution.
What you'll do
Develop and improve multimodal AI systems involving image, video, and generative pipelines.
Work on diffusion model optimization, controllability, and step efficiency.
Design experiments and evaluation frameworks for visual realism and consistency.
Translate research prototypes into scalable production systems.
Collaborate closely with infrastructure teams to optimize training and inference.
Qualifications
Degree in Computer Science, AI, Robotics, or comparable combination of education and practical experience.
Strong programming skills in Python and familiarity with object-oriented languages (C++, Java, or similar).
Strong data structures and algorithms fundamentals.
Experience with PyTorch or similar frameworks.
Familiarity with CNNs, Vision Transformers (ViT), or diffusion architectures.
Preferred Qualifications
Experience with Stable Diffusion, ControlNet, LoRA, or generative pipelines.
Experience with human pose estimation, geometry-aware modeling, or video understanding.
Experience shipping ML systems into production.

Why Join SPREEAI?
Real Impact & Ownership: This is an opportunity to shape a product and brand at the forefront of fashion-tech innovation. Your design work will directly impact how thousands (eventually millions) of people experience shopping with SPREEAI. With no bureaucratic layers, your ideas can go live and make a difference immediately. You’ll own projects that truly matter in the company’s growth.
Visionary Team & Exposure: Work side-by-side with a passionate founding team and collaborate with industry visionaries. You’ll have direct access to our CEO and leadership, and even get to interact with world-class advisors (our board includes an iconic fashion figure). It’s a chance to learn from and contribute to the best of both the tech and fashion worlds.
Creative Freedom & Innovation: Join a high-growth startup environment that celebrates bold ideas and moves at lightning speed. You’ll have the autonomy to introduce new design concepts, test emerging technologies, and innovate without red tape. If you’ve ever wanted to combine your love of design, luxury fashion, and cutting-edge tech, you’ll have the freedom to do it here and see your vision realized.
SPREEAI is a fast-growing, innovative AI company at the forefront of fashion and e-commerce, revolutionizing how consumers engage with fashion through lifelike photorealistic try-on technology and hyper-personalized shopping experiences. Our mission is to redefine the retail landscape with cutting‑edge AI solutions that blend high fashion and technology. We thrive in a dynamic, fast‑paced environment where creativity meets technology to drive real impact. If you are passionate about innovation and shaping the future of fashion, SPREEAI offers a platform to make your mark.

In Summary: Machine Learning Engineers will work across photorealistic virtual try-on, video-based modeling, Smart Sizing, and multimodal representation learning. The work spans modern architectures such as diffusion models, transformers, and learned visual representations, with emphasis on controllability, compute efficiency, and production readiness.
