
Databricks Data Warehouse Architect
3B Staffing LLC, Nashville, TN, United States
Remote
Key Responsibilities:
• Design and deploy a new Databricks Lakehouse instance tailored to the client's product-level data needs.
• Architect and implement robust data ingestion pipelines using Spark (PySpark/Scala) and Delta Lake (an illustrative sketch follows this list).
• Integrate AWS-native services (S3, Glue, Athena, Redshift, Lambda) with Databricks for optimized performance and scalability.
• Define data models, optimize query performance, and establish warehouse governance best practices.
• Collaborate cross-functionally with product teams, data scientists, and DevOps to streamline data workflows.
• Maintain CI/CD for data pipelines (preferably with DBX), using GitOps and Infrastructure-as-Code.
• Monitor data jobs and resolve performance bottlenecks or failures across environments.
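
For context on the kind of pipeline described above, here is a minimal sketch of incremental ingestion from S3 into a Delta table using Databricks Auto Loader. Every bucket path and table name below is a hypothetical placeholder, not the client's actual environment.

# Minimal sketch: incremental ingestion from S3 into a Delta table on Databricks.
# All paths and table names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # supplied automatically on Databricks

# Auto Loader (cloudFiles) incrementally discovers newly landed JSON files.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/events")
    .load("s3://example-bucket/raw/events/")
)

# Stamp each record, then append to a managed Delta table with checkpointing.
(
    raw.withColumn("ingested_at", F.current_timestamp())
    .writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/events")
    .trigger(availableNow=True)  # drain all pending files, then stop
    .toTable("main.bronze.events")
)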
Required Skills & Experience:
Databricks / Lakehouse Architecture
• End-to-end setup of Databricks workspaces and Unity Catalog
• Expertise in Delta Lake internals, file compaction, and schema enforcement (both sketched below)
• Advanced PySpark/SQL skills for ETL and transformations
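
As a rough illustration of the compaction and schema-enforcement points above, assuming a Delta table with a hypothetical name:

# Illustrative sketch of Delta Lake file compaction and schema enforcement.
# The table name is a hypothetical placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# File compaction: OPTIMIZE rewrites many small files into fewer large ones,
# co-locating rows on a frequently filtered column to speed up scans.
spark.sql("OPTIMIZE main.bronze.events ZORDER BY (event_date)")

# Schema enforcement: Delta rejects appends whose schema does not match the
# table's schema, so this mismatched write raises an AnalysisException.
bad = spark.createDataFrame([(1, "oops")], ["id", "unexpected_col"])
try:
    bad.write.format("delta").mode("append").saveAsTable("main.bronze.events")
except Exception as err:
    print(f"Write rejected by schema enforcement: {err}")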
AWS Native Integration
• Deep experience with AWS Glue, S3, Redshift Spectrum, Lambda, and Athena
• IAM and VPC configuration knowledge for secure cloud integrations
Data Warehousing & Modeling
• Strong grasp of modern dimensional modeling (star/snowflake schemas; a star-schema sketch follows this list)
• Experience setting up lakehouse design patterns for mixed workloads
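
A minimal star-schema sketch expressed as Delta tables; the catalog, schema, and column names are hypothetical placeholders.

# Illustrative star-schema DDL for a lakehouse, expressed as Delta tables.
# Catalog, schema, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Dimension table: descriptive attributes, one row per product.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.gold.dim_product (
        product_key  BIGINT,
        product_name STRING,
        category     STRING
    ) USING DELTA
""")

# Fact table: one row per sale, keyed into the dimension; partitioned by date.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.gold.fact_sales (
        sale_id     BIGINT,
        product_key BIGINT,  -- joins to dim_product.product_key
        sale_date   DATE,
        quantity    INT,
        amount      DECIMAL(12, 2)
    ) USING DELTA
    PARTITIONED BY (sale_date)
""")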
Automation & DevOps
• Familiarity with CI/CD for data engineering using tools like DBX, Terraform, GitHub Actions, or Azure DevOps
• Proficiency with monitoring tools such as CloudWatch, Datadog, or New Relic for data pipelines
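
As one example of pipeline monitoring, a sketch of publishing a custom CloudWatch metric from a job follows; the namespace, metric, and pipeline names are made up for illustration.

# Hypothetical sketch: publishing a custom pipeline metric to CloudWatch so
# job health can be tracked and alarmed on. All names here are illustrative.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

def report_rows_written(pipeline: str, rows: int) -> None:
    """Publish a per-pipeline row count for a CloudWatch alarm to watch."""
    cloudwatch.put_metric_data(
        Namespace="DataPipelines",
        MetricData=[
            {
                "MetricName": "RowsWritten",
                "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
                "Value": float(rows),
                "Unit": "Count",
            }
        ],
    )

report_rows_written("bronze_events_ingest", 125_000)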
Bonus/Nice to Have:
• Experience supporting gaming or real-time analytics workloads
• Familiarity with Airflow, Kafka, or EventBridge
• Exposure to data privacy and compliance practices (GDPR, CCPA)