
Data Engineer

Brooksource, Charlotte, NC, United States


12-month contract (high likelihood of extension or full-time conversion)

On-site 3 days a week in Charlotte, NC

Role Overview

We are seeking a Senior Data Quality Engineer to design, implement, and maintain automated data quality validations across our enterprise data engineering ecosystem. This role focuses on ensuring the accuracy, completeness, consistency, and timeliness of data flowing through both batch and streaming pipelines built on AWS.

You will work closely with data engineers, analytics teams, and business stakeholders to embed data quality controls into pipelines built with AWS Glue, PySpark, Kafka, AWS DMS, Lambda, and Aurora PostgreSQL, supporting trusted analytics and reporting in Qlik.

Key Responsibilities

Data Quality Engineering

Design and implement automated data quality checks across ingestion, transformation, and consumption layers.

Define and enforce data quality rules for key dimensions such as completeness, validity, uniqueness, consistency, and timeliness.

Build reusable Python and PySpark frameworks for validating large-scale datasets.
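As a rough illustration of the kind of reusable validation framework this responsibility describes (a sketch only: the function names, column names, and sample rows are hypothetical, and a production version would express these checks as PySpark column operations over large datasets rather than plain Python):

```python
"""Minimal data quality checks for completeness, uniqueness, and validity."""

def check_completeness(rows, column):
    """Fraction of rows where `column` is present and non-null."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)

def check_uniqueness(rows, column):
    """True when no non-null value of `column` appears more than once."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    return len(values) == len(set(values))

def check_validity(rows, column, predicate):
    """Fraction of non-null values of `column` satisfying `predicate`."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    if not values:
        return 1.0
    return sum(1 for v in values if predicate(v)) / len(values)

rows = [
    {"order_id": 1, "amount": 30.0},
    {"order_id": 2, "amount": None},   # incomplete
    {"order_id": 2, "amount": -5.0},   # duplicate id, invalid amount
]
print(check_completeness(rows, "amount"))                 # 2 of 3 rows filled
print(check_uniqueness(rows, "order_id"))                 # False
print(check_validity(rows, "amount", lambda a: a >= 0))   # 1 of 2 values valid
```

Each check returns a score or flag rather than raising, so results can be compared against configurable thresholds per dataset.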

Batch & Streaming Validation

Embed data quality validations into AWS Glue (PySpark) batch pipelines.

Implement real-time or near-real-time validations for Kafka-based streaming pipelines, including schema validation, duplicate detection, and latency checks.

Monitor and validate event-time vs. processing-time behavior for streaming data.
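The event-time vs. processing-time check above can be sketched as a lag comparison per event (illustrative only: the event shape and the 60-second threshold are assumptions, not taken from this posting; in a real Kafka consumer these timestamps would come from the record payload and the consumer clock):

```python
"""Flag streaming events whose processing time lags event time beyond a threshold."""

from datetime import datetime, timedelta

MAX_LAG = timedelta(seconds=60)  # assumed latency SLA for this sketch

def late_events(events):
    """Return events where processing time trails event time by more than MAX_LAG."""
    return [e for e in events if e["processed_at"] - e["event_time"] > MAX_LAG]

t0 = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"id": "a", "event_time": t0, "processed_at": t0 + timedelta(seconds=5)},
    {"id": "b", "event_time": t0, "processed_at": t0 + timedelta(seconds=300)},
]
print([e["id"] for e in late_events(events)])  # only "b" breaches the threshold
```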

CDC & Ingestion Quality

Validate AWS DMS change data capture pipelines, ensuring accuracy between source systems and downstream targets.

Perform reconciliation checks (row counts, aggregates, checksums) between source and target systems.

Detect and alert on data gaps, duplication, or schema drift in CDC pipelines.
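The reconciliation and schema-drift responsibilities above might look like the following sketch (table shapes, column names, and the choice of MD5 are illustrative assumptions; against real source and target databases the counts, sums, and checksums would be computed by SQL rather than in Python):

```python
"""Source-vs-target reconciliation (row counts, aggregate, order-insensitive
checksum) and a simple schema-drift check for a CDC feed."""

import hashlib

def table_checksum(rows, key_columns):
    """Order-insensitive checksum: XOR of per-row MD5 digests over key columns."""
    acc = 0
    for r in rows:
        payload = "|".join(str(r[c]) for c in key_columns).encode()
        acc ^= int(hashlib.md5(payload).hexdigest(), 16)
    return acc

def reconcile(source, target, key_columns, amount_column):
    """Compare the two tables on count, a business aggregate, and content."""
    return {
        "row_count_match": len(source) == len(target),
        "sum_match": sum(r[amount_column] for r in source)
                     == sum(r[amount_column] for r in target),
        "checksum_match": table_checksum(source, key_columns)
                          == table_checksum(target, key_columns),
    }

def schema_drift(expected_columns, observed_columns):
    """Columns missing from, or unexpectedly added to, the target schema."""
    return {
        "missing": sorted(set(expected_columns) - set(observed_columns)),
        "added": sorted(set(observed_columns) - set(expected_columns)),
    }

source = [{"id": 1, "amt": 10.0}, {"id": 2, "amt": 20.0}]
target = [{"id": 2, "amt": 20.0}, {"id": 1, "amt": 10.0}]  # same rows, new order
print(reconcile(source, target, ["id", "amt"], "amt"))
print(schema_drift(["id", "amt"], ["id", "amt", "loaded_at"]))
```

XOR makes the checksum insensitive to row order, which matters because CDC targets rarely preserve source ordering.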

Write advanced SQL-based data quality checks against Amazon Aurora PostgreSQL and curated data layers.
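SQL-based checks of this kind typically count rule violations per dimension; a small sketch follows, run here against SQLite purely as a stand-in for Aurora PostgreSQL (the `orders` table and its columns are illustrative assumptions):

```python
"""SQL data quality checks: completeness, uniqueness, and validity violations."""

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, NULL, 40.0), (2, 11, -3.0);
""")

# Completeness: rows missing a customer_id
null_customers = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL"
).fetchone()[0]

# Uniqueness: order_id values that appear more than once
dup_ids = conn.execute(
    "SELECT COUNT(*) FROM (SELECT order_id FROM orders "
    "GROUP BY order_id HAVING COUNT(*) > 1)"
).fetchone()[0]

# Validity: negative order amounts
bad_amounts = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE amount < 0"
).fetchone()[0]

print(null_customers, dup_ids, bad_amounts)  # 1 1 1
```

In practice each query's result would be compared against a per-table threshold, with failures routed to alerting.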

Ensure data delivered to Qlik meets defined quality thresholds and freshness SLAs.

Validate semantic consistency and completeness of datasets used for reporting and dashboards.

Implement data quality monitoring, logging, and alerting using AWS Lambda, CloudWatch, and pipeline metrics.

Create dashboards and alerts for data quality failures and SLA breaches.

Perform root-cause analysis of data quality incidents and drive long-term remediation.

Standards, Governance & Collaboration

Partner with data engineers to embed quality gates into CI/CD and deployment workflows.

Contribute to data quality standards, documentation, and operational runbooks.

Act as a subject-matter expert for data quality best practices across batch and streaming architectures.

Required Qualifications

6+ years of experience in data engineering, analytics engineering, or data quality engineering.

Strong hands-on experience with AWS Glue, PySpark, and Python.

Experience validating batch and streaming data pipelines.

Practical knowledge of Kafka for streaming ingestion and validation use cases.

Experience working with AWS DMS for CDC pipelines and data reconciliation.

Advanced SQL skills and experience with Amazon Aurora PostgreSQL.

Experience implementing serverless workflows using AWS Lambda.

Understanding of data modeling concepts and multi-layer data architectures.

Strong analytical and problem-solving skills with attention to detail.

Ability to communicate data quality issues clearly to technical and non-technical stakeholders.

Preferred Qualifications

Experience supporting BI tools such as Qlik or similar analytics platforms.

Familiarity with data observability concepts and quality metrics.

Knowledge of schema management and schema evolution in streaming systems.

Experience in regulated or highly governed data environments.

Exposure to CI/CD pipelines and Infrastructure-as-Code practices.

What Success Looks Like

Critical datasets have automated, repeatable data quality validations.

Data quality issues are detected early and resolved before impacting analytics.

Streaming and batch pipelines meet defined quality and freshness SLAs.

Business users trust analytics and reporting outputs with minimal manual intervention.
