
Senior Data Quality Engineer
Brooksource, Charlotte, NC, United States
12-month contract (high likelihood of extension or full-time conversion)
On-site 3 days a week in Charlotte, NC
Role Overview
We are seeking a Senior Data Quality Engineer to design, implement, and maintain automated data quality validations across our enterprise data engineering ecosystem. This role focuses on ensuring the accuracy, completeness, consistency, and timeliness of data flowing through both batch and streaming pipelines built on AWS.
You will work closely with data engineers, analytics teams, and business stakeholders to embed data quality controls into pipelines built with AWS Glue, PySpark, Kafka, AWS DMS, Lambda, and Aurora PostgreSQL, supporting trusted analytics and reporting in Qlik.
Key Responsibilities
Data Quality Engineering
Design and implement automated data quality checks across ingestion, transformation, and consumption layers.
Define and enforce data quality rules for key dimensions such as completeness, validity, uniqueness, consistency, and timeliness.
Build reusable Python and PySpark frameworks for validating large-scale datasets; a sketch follows this list.
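To make the "reusable framework" idea concrete, here is a minimal PySpark sketch of two such checks (completeness and uniqueness). The S3 path, table, and column names are hypothetical, not taken from this posting:

```python
# Minimal sketch of reusable PySpark data quality checks.
# The dataset path and column names (orders, order_id) are hypothetical.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def completeness_check(df: DataFrame, column: str, threshold: float = 1.0) -> dict:
    """Pass if the fraction of non-null values in `column` meets `threshold`."""
    total = df.count()
    non_null = df.filter(F.col(column).isNotNull()).count()
    ratio = non_null / total if total else 0.0
    return {"check": "completeness", "column": column,
            "ratio": ratio, "passed": ratio >= threshold}

def uniqueness_check(df: DataFrame, column: str) -> dict:
    """Fail if `column` contains any duplicated values."""
    dupes = (df.groupBy(column).count()
               .filter(F.col("count") > 1).count())
    return {"check": "uniqueness", "column": column,
            "duplicate_keys": dupes, "passed": dupes == 0}

if __name__ == "__main__":
    spark = SparkSession.builder.appName("dq-checks").getOrCreate()
    orders = spark.read.parquet("s3://my-bucket/curated/orders/")  # hypothetical path
    results = [completeness_check(orders, "order_id"),
               uniqueness_check(orders, "order_id")]
    failures = [r for r in results if not r["passed"]]
    if failures:
        raise RuntimeError(f"Data quality failures: {failures}")
```

In a Glue job, functions like these would typically live in a shared module so every pipeline applies the same rule definitions.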
Batch & Streaming Validation
Embed data quality validations into AWS Glue (PySpark) batch pipelines.
Implement real-time or near-real-time validations for Kafka-based streaming pipelines, including schema validation, duplicate detection, and latency checks; see the sketch after this list.
Monitor and validate event-time vs. processing-time behavior for streaming data.
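As an illustration of those streaming checks, here is a rough consumer-side sketch using the kafka-python client. The topic, brokers, payload fields, and latency SLA are all assumptions:

```python
# Sketch of consumer-side validations for a Kafka topic, assuming the
# kafka-python client and a JSON payload carrying an epoch `event_ts`.
# Topic name, brokers, and field names are hypothetical.
import json
import time
from collections import deque
from kafka import KafkaConsumer

REQUIRED_FIELDS = {"order_id", "amount", "event_ts"}   # assumed schema
MAX_LATENCY_SECONDS = 60                               # assumed freshness SLA

consumer = KafkaConsumer(
    "orders-events",                                   # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

seen_keys = deque(maxlen=100_000)  # bounded window for duplicate detection

for msg in consumer:
    record = msg.value

    # Schema validation: every required field must be present.
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        print(f"SCHEMA violation at offset {msg.offset}: missing {missing}")
        continue

    # Duplicate detection within a bounded window of recent keys.
    if record["order_id"] in seen_keys:
        print(f"DUPLICATE order_id {record['order_id']} at offset {msg.offset}")
    seen_keys.append(record["order_id"])

    # Latency check: event time vs. processing time.
    lag = time.time() - record["event_ts"]
    if lag > MAX_LATENCY_SECONDS:
        print(f"LATENCY breach: event is {lag:.0f}s old at offset {msg.offset}")
```

Production variants would usually use a schema registry for validation and emit metrics instead of printing, but the three check types are the same.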
CDC & Ingestion Quality
Validate AWS DMS change data capture pipelines, ensuring accuracy between source systems and downstream targets.
Perform reconciliation checks (row counts, aggregates, checksums) between source and target systems, as sketched after this list.
Detect and alert on data gaps, duplication, or schema drift in CDC pipelines.
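One way such a source-vs-target reconciliation could look, sketched with psycopg2 and PostgreSQL's hashtext() as a lightweight key checksum. The connection strings, table, and columns are hypothetical:

```python
# Sketch of a source-vs-target reconciliation for a DMS CDC pipeline,
# assuming both ends are PostgreSQL-compatible and reachable via psycopg2.
# DSNs and the orders/amount/order_id names are hypothetical.
import psycopg2

RECON_QUERY = """
    SELECT count(*)                                    AS row_count,
           coalesce(sum(amount), 0)                    AS amount_total,
           coalesce(sum(hashtext(order_id::text)), 0)  AS key_checksum
    FROM orders
"""

def snapshot(dsn: str) -> tuple:
    """Run the reconciliation query on one side and return its totals."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(RECON_QUERY)
        return cur.fetchone()

source = snapshot("postgresql://user:pass@source-db/app")      # hypothetical DSN
target = snapshot("postgresql://user:pass@aurora-target/app")  # hypothetical DSN

for name, s, t in zip(("row_count", "amount_total", "key_checksum"), source, target):
    status = "OK" if s == t else "MISMATCH"
    print(f"{name}: source={s} target={t} -> {status}")
```

With ongoing replication, both snapshots would need to be taken against a consistent point (e.g. after pausing the task or bounding by a high-water mark) to avoid false mismatches.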
Write advanced SQL-based data quality checks against Amazon Aurora PostgreSQL and curated data layers.
Ensure data delivered to Qlik meets defined quality thresholds and freshness SLAs.
Validate semantic consistency and completeness of datasets used for reporting and dashboards.
Implement data quality monitoring, logging, and alerting using AWS Lambda, CloudWatch, and pipeline metrics; a combined sketch follows this list.
Create dashboards and alerts for data quality failures and SLA breaches.
Perform root-cause analysis of data quality incidents and drive long-term remediation.
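Tying the SQL-check and monitoring responsibilities together, here is a hedged sketch of a Lambda handler that runs a freshness query against Aurora PostgreSQL and publishes the result to CloudWatch. The schema, metric name, SLA value, and AURORA_DSN environment variable are assumptions:

```python
# Sketch of an AWS Lambda handler that runs a SQL freshness check against
# Aurora PostgreSQL and publishes the result as a CloudWatch metric.
# Table, metric names, SLA, and the AURORA_DSN env var are hypothetical.
import os
import boto3
import psycopg2

FRESHNESS_SQL = """
    SELECT extract(epoch FROM now() - max(updated_at))
    FROM curated.orders
"""
MAX_STALENESS_SECONDS = 3600  # assumed freshness SLA

cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    with psycopg2.connect(os.environ["AURORA_DSN"]) as conn, conn.cursor() as cur:
        cur.execute(FRESHNESS_SQL)
        staleness = cur.fetchone()[0] or 0.0

    # Publish staleness so a CloudWatch alarm can page on SLA breaches.
    cloudwatch.put_metric_data(
        Namespace="DataQuality",
        MetricData=[{
            "MetricName": "OrdersStalenessSeconds",
            "Value": float(staleness),
            "Unit": "Seconds",
        }],
    )
    return {"staleness_seconds": staleness,
            "within_sla": staleness <= MAX_STALENESS_SECONDS}
```

A CloudWatch alarm on the published metric then provides the "dashboards and alerts for SLA breaches" described above, scheduled via EventBridge or a similar trigger.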
Standards, Governance & Collaboration
Partner with data engineers to embed quality gates into CI/CD and deployment workflows.
Contribute to data quality standards, documentation, and operational runbooks.
Act as a subject-matter expert for data quality best practices across batch and streaming architectures.
Required Qualifications
6+ years of experience in data engineering, analytics engineering, or data quality engineering.
Strong hands-on experience with AWS Glue, PySpark, and Python.
Experience validating batch and streaming data pipelines.
Practical knowledge of Kafka for streaming ingestion and validation use cases.
Experience working with AWS DMS for CDC pipelines and data reconciliation.
Advanced SQL skills and experience with Amazon Aurora PostgreSQL.
Experience implementing serverless workflows using AWS Lambda.
Understanding of data modeling concepts and multi-layer data architectures.
Strong analytical and problem-solving skills with attention to detail.
Ability to communicate data quality issues clearly to technical and non-technical stakeholders.
Preferred Qualifications
Experience supporting BI tools such as Qlik or similar analytics platforms.
Familiarity with data observability concepts and quality metrics.
Knowledge of schema management and schema evolution in streaming systems.
Experience in regulated or highly governed data environments.
Exposure to CI/CD pipelines and Infrastructure-as-Code practices.
What Success Looks Like
Critical datasets have automated, repeatable data quality validations.
Data quality issues are detected early and resolved before impacting analytics.
Streaming and batch pipelines meet defined quality and freshness SLAs.
Business users trust analytics and reporting outputs with minimal manual intervention.