
Senior Director, Applied Research

Capital One, San Francisco, CA, United States


* PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or a related field plus 6 years of experience in Applied Research, or an M.S. in one of those fields plus 8 years of experience in Applied Research
* At least 5 years of people leadership experience
* PhD focus on NLP, or a Master's with 10 years of industrial NLP research experience
* Core contributor to a team that has trained a large language model from scratch (10B+ parameters, 500B+ tokens)
* Numerous publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR on topics related to the pre-training of large language models (e.g., technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
* Has worked on an LLM (open source or commercial) that is currently available for use
* Demonstrated ability to guide the technical direction of a large-scale model training team
* Experience working with 500+ node GPU clusters
* Has worked on an LLM scaled to 70B parameters and 1T+ tokens
* Experience with common training optimization frameworks (DeepSpeed, NeMo)
* PhD focus on topics in geometric deep learning (graph neural networks, sequential models, multivariate time series)
* Member of technical leadership for model deployment of a very large user behavior model
* Multiple papers at KDD, ICML, NeurIPS, or ICLR on topics relevant to training models on graph and sequential data structures
* Worked on scaling graph models to greater than 50M nodes
* Experience with large-scale deep-learning-based recommender systems
* Experience with production real-time and streaming environments
* Contributions to common open-source frameworks (PyTorch Geometric, DGL)
* Proposed new methods for inference or representation learning on graphs or sequences
* Worked with datasets of 100M+ users
* PhD focused on topics related to optimizing the training of very large language models
* 5+ years of experience and/or publications on one of the following topics: model sparsification, quantization, training parallelism/partitioning design, gradient checkpointing, model compression
* PhD focused on topics related to guiding LLMs with further tasks (supervised fine-tuning, instruction tuning, dialogue fine-tuning, parameter tuning)
* Demonstrated knowledge of the principles of transfer learning, model adaptation, and model guidance
* Experience deploying a fine-tuned large language model

Capital One offers a comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being. Eligibility varies based on full- or part-time status, exempt or non-exempt status, and management level.