
Data Engineer [Python, Spark, API, Data Pipelines] Contract On-Site at Cupertino

Red Oak Technologies, Cupertino, CA, United States

Position Overview:

We are looking for a passionate and skilled Data Engineer with expertise in Spark, Python, SQL, and API development to design, develop, and maintain end-to-end data solutions. The ideal candidate will work closely with cross-functional teams to build scalable data pipelines, ensure data quality, and enable analytics and reporting.

Key Responsibilities:

Design, develop, and maintain scalable data pipelines using Apache Spark and Python.
Develop and optimize SQL queries for data extraction, transformation, and loading (ETL).
Build and implement end-to-end API integrations for data ingestion and dissemination.
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements.
Ensure data accuracy, integrity, and security across all platforms.
Monitor and troubleshoot data pipeline issues, ensuring high availability and performance.
Document data workflows, architecture, and best practices.

Qualifications:

Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field.
Proven experience with Spark (PySpark) for large-scale data processing.
Strong programming skills in Python for data manipulation and automation.
Extensive experience with SQL and relational databases.
Hands-on experience developing and consuming APIs (RESTful/SOAP).
Knowledge of data warehousing concepts and tools.
Familiarity with cloud platforms (AWS, Azure, GCP) is a plus.
Excellent problem-solving skills and attention to detail.
Strong communication and teamwork skills.

Preferred Skills:

Experience with streaming data technologies (Kafka, Kinesis).
Knowledge of data governance and security best practices.
Experience with CI/CD pipelines for data workflows.