Senior Database DBA

Qode, Tampa, Florida, US, 33646

Job Title: MemSQL / SingleStore - Senior Database DBA

Location: New Jersey / Irving, TX / Tampa, FL

Role Overview

We are seeking a Senior MemSQL / SingleStore Cluster Administrator to own and manage mission-critical, large-scale distributed database platforms. This role requires a pure Database Administrator (DBA) with deep expertise in handling petabyte-scale data, complex distributed clusters, and real-time, latency-sensitive workloads.

Core Technical Expectations

Experience handling petabytes of data ingested every 15 minutes in large-scale environments.

Strong expertise managing large MemSQL / SingleStore clusters (multi-node, multi-TB to multi-PB).

Deep understanding of data distribution across aggregators and leaf nodes.

Expertise in:

Partitioning and shard key strategy

Data skew mitigation

Hot partition resolution

Worker node and leaf node optimization

Strong table-level knowledge, including:

Index strategy

Thread management

Connection pooling

Memory limits

Query plan optimization

Strong understanding of different MemSQL/SingleStore versions and corresponding architectural/feature changes.

Key Responsibilities

End-to-end ownership of large MemSQL/SingleStore clusters (design, build, upgrade, operate, decommission).

Architect and maintain High Availability (HA) and Disaster Recovery (DR) setups including:

Redundancy levels

Availability groups

Cross-region replication

Plan and execute:

Cluster expansion

Downsizing

Online partition rebalancing

Leaf node management with minimal/no downtime

Proactively monitor cluster health, throughput, latency, and capacity; define and maintain SLAs.

Perform advanced performance tuning:

Schema design

Shard key design

Index strategy

NUMA and memory tuning

Workload management

Implement backup/restore strategies and regularly test DR & failover.

Lead incident response and perform deep root cause analysis.

Enforce database security best practices:

Authentication & authorization

Encryption

Auditing

Network controls

Drive automation using scripting (Python/Bash) and Infrastructure as Code.

Maintain documentation, operational runbooks, and standards.

Evaluate new MemSQL/SingleStore features and lead version upgrades and migrations.

Required Experience & Skills

10+ years of total database engineering/administration experience.

4–5+ years of deep, production-grade experience administering MemSQL/SingleStore clusters at scale.

Strong hands‑on experience with:

Aggregators & leaf nodes

Licensing and memory limits

Cluster expansion & partition rebalancing

Replication & failover/failback

Proven ability to diagnose:

Locking issues

Data skew

Hot partitions

Bad execution plans

Strong Linux system tuning knowledge:

CPU/NUMA affinity

Disk & I/O optimization

Networking

ulimits & OS-level tuning

Experience with monitoring & alerting tools:

Prometheus / Grafana

Datadog

Splunk

ELK

Strong SQL expertise and scripting (Python/Bash).

Experience in Cloud/Container environments (AWS/Azure/GCP, Kubernetes) is highly preferred.

Excellent communication skills with ability to lead production calls and explain technical trade-offs clearly.
