Member of Technical Staff - Machine Learning Engineer; Audio Data
Liquid AI, Inc. - Oklahoma City, Oklahoma, United States
Overview
Liquid AI is looking for a Member of Technical Staff - Audio-Language Model Data to play a critical role in the development of Liquid Audio-Language models. This role focuses on gathering high-quality audio-text pre-training and SFT datasets.
Key Responsibilities
- Create and maintain data cleaning, filtering, and selection pipelines that can handle audio-text data.
- Monitor releases of public high-quality audio (ASR and SFT) datasets.
- Create and maintain a synthetic data generation pipeline to produce task-specific audio SFT data.
- Work with the multimodal audio team to run ablations on new datasets.
Required Qualifications
Experience Level:
B.S. + 5 years of experience, M.S. + 3 years of experience, or Ph.D. + 1 year of experience.
Dataset Engineering:
Expertise in data curation, cleaning, augmentation, and synthetic data generation techniques.
Machine Learning Expertise:
Ability to write and debug models in popular ML frameworks, and experience working with LLMs and VLMs.
Software Development:
Strong programming skills in Python, with an emphasis on writing clean, maintainable, and scalable code.
Preferred Qualifications
- M.S. or Ph.D. in Computer Science, Electrical Engineering, Math, or a related field.
- Experience training text-to-speech (TTS) or translation models.
- 2+ years working with audio data.
- First-author publications in top ML or audio conferences (e.g. NeurIPS, ICML, ICLR, ICASSP, Interspeech).
- Contributions to popular open-source projects.