
Senior Data Engineer, NSV
NVIDIA AI, Chicago, IL, United States
Senior Data Engineer – Network Solutions Validation
NVIDIA is looking for a Senior Data Engineer to join the NSV (Network Solutions Validation) group. NSV builds high-performing software automation for NVIDIA's Data Center environments and helps drive the data growth of the world's biggest companies. In this role, you will design, build, and maintain scalable, high-performance data pipelines that handle massive volumes of data from hardware, communication modules, firmware, and large-scale AI and HPC clusters. You'll also contribute to our growing Agentic AI initiatives, helping develop AI Agents that bring our data capabilities to the next level.
What You'll Be Doing
Define and execute the group's data technical roadmap, aligning with Infra, DevOps, and Performance teams
Design and maintain flexible ETL/ELT frameworks for ingesting, transforming, and classifying cluster verification and telemetry data
Build and optimize streaming pipelines using Apache Spark, Kafka, and Databricks, ensuring high throughput, reliability, and adaptability to evolving data schemas
Ensure data quality and pipeline health through observability standards, schema validation, lineage tracking, monitoring, and alerting
Deliver reliable insights for cluster performance analysis, telemetry visibility, and end-to-end test coverage
Support self-service analytics for engineers and researchers via Databricks notebooks, APIs, and datasets
Drive best practices in data modeling, code quality, and operational excellence; collaborate with cross-functional teams to support data-driven decision-making
Contribute to the development of AI Agents that enhance the visibility and accessibility of insights and data for our users
What We Need To See
B.Sc. or M.Sc. in Computer Science, Data Science, or a related field
5+ years of hands-on experience in data engineering
Strong practical experience with Apache Spark (PySpark or Scala) and Databricks
Proficiency in Python and SQL for data transformation, automation, and pipeline logic
Experience with Apache Kafka, including stream ingestion and event processing
Experience with schema evolution, data versioning, and validation frameworks (Delta Lake, Iceberg, or Great Expectations)
Strong problem-solving skills and ability to debug and troubleshoot complex data-related issues
Strong communication skills and ability to work effectively across teams
Ways To Stand Out From The Crowd
Experience with real-time analytics frameworks (Spark Structured Streaming, Flink, Kafka Streams)
Exposure to hardware, firmware, or embedded telemetry environments
Experience with data cataloging or governance tools (DataHub, Collibra, or Alation)
Hands-on experience building or deploying AI Agents or LLM-based applications
NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.