
Overview
The Director, Head of Research Data Integration & Analytics leads the strategic development and implementation of robust, FAIR (Findable, Accessible, Interoperable, Reusable) data analytics systems and advanced AI/ML and predictive modeling solutions across VIDRU. This role conceives and delivers the digital roadmap for the entire research lifecycle, transforming raw experimental outputs into standardized, analysis-ready data assets and embedding cutting-edge AI methodologies into daily operations by refining high-impact use cases.
This role establishes and maintains world-class data standards, governance, and quality for datasets, accelerating infectious disease research and vaccine development. It ensures seamless access to high-quality data and actionable insights, working with scientists within Discovery Technologies and across scientific areas, and embedding software engineering best practices into the scientific process. The role drives the promotion of digital literacy and fluency across VIDRU, ensuring alignment with Research Tech and the R&D Digital Network.
The Head leads and mentors a world-class team, ensuring scientific excellence and alignment with the Head of VIDRU Data Sciences and VIDRU R&D project/platform leaders, with adherence to top scientific and industry standards for all deliverables.
What You'll Do
- Provides strategic vision and leadership for research data integration and advanced analytics initiatives within VIDRU Data Sciences, focusing on accelerating infectious disease research through FAIR data and AI/ML, and transforming raw experimental outputs into analysis-ready data. Promotes cutting-edge AI methodologies in daily operations and refines high-impact use cases.
- Leads and manages a multidisciplinary team of data scientists, data architects, and scientific software/research engineers, fostering a culture of high performance, scientific innovation, and continuous professional development.
- Directs the design, development, and implementation of robust, scalable integrated data systems and automated, product-grade data processing and integration pipelines (e.g., in cloud computing environments) to consolidate and harmonize diverse bio-clinical datasets, including multi-omics, preclinical, translational, and early clinical data.
- Establishes and enforces world-class data standards, quality control processes, and governance frameworks (FAIR principles) to ensure data integrity, reliability, and reusability across all VIDRU research initiatives, from lab bench to final analysis, in collaboration with Research Technologies.
- Drives the development and application of advanced analytical methodologies, including deep learning, biomedical computer vision, and predictive modeling, to extract deep biological and clinical insights from integrated datasets. Promotes collaborative knowledge sharing and aligns with tech providers on emerging innovations.
- Collaborates closely with DPLs, VDLs, PILs, TPLs, and clinical sciences teams, as well as lab scientists within Discovery Technologies and scientific areas, to understand data needs and deliver integrated datasets and scalable analytical workflows.
- Partners with experimental scientists to optimize VIDRU data flows, ensuring high-quality data generation aligned with FAIR principles from the outset of experiments.
- Drives innovation in research data integration and predictive analytics by partnering with GSK's AIML, Research Tech and R&D Tech organizations, leveraging product-grade software development practices to scale successful research pipelines into reusable assets.
- Ensures data and analytical deliverables meet high standards of scientific excellence, quality, security, and timelines, translating complex data into actionable insights with reproducibility and reliability.
- Communicates complex data landscapes, integration strategies, and analytical findings effectively to internal and external stakeholders, acting as a bridge between biologists, data scientists, and IT engineers. Mentors scientists in leveraging LLMs and other digital tools for R&D breakthroughs.
- Contributes to the definition and implementation of VIDRU Data Science scientific strategy, processes, and objectives, ensuring alignment with the Head of VIDRU Data Sciences and the overall GSK Vaccines & Infectious Diseases R&D strategy, maintaining digital fluency with RTech and the R&D Digital Network.
Basic Qualifications
- PhD or equivalent experience in Data Science, Computer Science, Bioinformatics, Computational Biology, Statistics, Engineering, or a related field, with a strong focus on data systems, advanced analytics, and molecular biology insight.
- Research experience and publication record in relevant areas, demonstrating leadership in establishing integrated data environments and delivering impactful data-driven insights from datasets in an R&D setting.
- Eight to ten years of relevant scientific experience, including four years of direct/matrix people management and international leadership responsibilities (e.g., principal investigator for international R&D projects).
- Proven capacity to translate theoretical knowledge into solutions for R&D projects while leading cross-functional teams. Acts as a global reference for the function and performs people management and coaching of global staff.
Preferred Qualifications
- Demonstrated strong proficiency and publication record in one or more of the following areas:
- Designing and implementing robust, product-grade data systems and integration pipelines for diverse bio-clinical datasets in cloud and HPC environments.
- Developing and enforcing FAIR data governance frameworks and data quality standards for scientific research data.
- Advanced AI/ML techniques, including deep learning, biomedical computer vision, and predictive modeling on clinical and molecular data.
- Expertise in managing, integrating, and analyzing multi-omics data (genomics, transcriptomics, proteomics) and associated metadata, with understanding of data types like FASTQ, BAM, VCF.
- Molecular biology insight applied to data interpretation and model development, with familiarity with laboratory processes and experimental design.
- Proficiency in cloud-based data platforms and technologies (e.g., GCP, Azure) for large-scale scientific data processing and analytics.
- Experience with version control and automated testing for scientific software development.
- Good programming, data analytics, and modeling skills.
- Excellent line management skills.
- Excellent business understanding of the pharmaceutical industry.
- Good knowledge of vendors and state-of-the-art solutions for molecular design.
Compensation and Benefits
If based in Cambridge, MA; Waltham, MA; Rockville, MD; or San Francisco, CA, the annual base salary ranges from 178,200 to 297,000 USD. If based in another US location, the range is provided during recruitment. The US salary ranges consider location, skills, experience, education, and market rate. This position offers an annual bonus and eligibility for a share-based long-term incentive program. Benefits include health care and other insurance for employee and family, retirement benefits, paid holidays, vacation, and paid caregiver/parental and medical leave. If salary ranges are not displayed, compensation will be discussed during recruitment.
Please note that some benefits details are provided for informational purposes and may vary by location.
GSK is an Equal Opportunity Employer. This ensures that all qualified applicants will receive equal consideration for employment without regard to race, color, religion, sex, pregnancy, gender identity or expression, sexual orientation, parental status, national origin, age, disability, or genetic information. For adjustments in the recruitment process, contact HR at the Americas Recruitment email. Recruitment FAQ guidance is available via the provided link.
Important notice to employment businesses/agencies: GSK does not accept referrals from employment businesses or agencies without prior written authorization. This ensures that all parties understand the contract and fees. GSK may be required to capture and report certain expenses to comply with US transparency requirements; more information is available from CMS.