Fireworks AI

Software Engineer, Multimedia

Fireworks AI, Redwood City, California, United States, 94061

Overview

Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multimodal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.

Responsibilities

Collaborate with ML engineers and researchers to productionize models and support evolving multimedia capabilities

Identify, profile and address performance bottlenecks across the stack, from media preprocessing to vision/audio encoders to the core inference engine

Ensure high reliability, observability, and security across backend systems

Own the enablement and optimization of new model releases, ensuring we consistently deliver the fastest implementations in the market

Build and maintain performant APIs and services

Collaborate closely with customers and sales teams to implement custom features and optimizations that drive ARR growth

Propose new roadmap items based on customer needs

Minimum Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related field.

3+ years of experience as a backend or infrastructure engineer, ideally supporting ML/AI systems or data-intensive workloads.

Experience with PyTorch and deep learning frameworks for inference and training.

Strong programming skills in Python and/or Go, with a track record of building reliable distributed backend systems.

Experience with cloud platforms (e.g., AWS, GCP), infrastructure-as-code tools (e.g., Terraform), and containerization/orchestration tools (e.g., Docker, Kubernetes).

Preferred Qualifications

Experience supporting ML workloads in production (model fine-tuning, distributed training, inference optimization)

Experience working directly with LLMs, vision-language models, audio models (ASR, TTS) or other multimodal AI systems in production environments

Experience with performance optimization and profiling for high-throughput systems

Knowledge of model quantization, speculative decoding, or other ML optimization techniques

Compensation

Total compensation for this role includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range is intended as a guideline and may be adjusted.

$170,000 - $240,000 USD

Benefits

Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.

Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.

Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.

Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Equal Employment Opportunity

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
