
New York, NY Senior Edge + Backend Software Engineer
Eino, New York, NY, United States
We’re building an autonomous site-level connectivity intelligence platform for enterprise environments. This role owns the on-device agent and key parts of the device-facing middleware, and also contributes broadly to our Python backend as we scale from pilots to fleets.
You’re a “full-stack systems” engineer: comfortable shipping reliable software on edge devices and building clean backend services, APIs, and data pipelines in Python.
What you’ll build
Device software (primary ownership)
Edge agent runtime on OpenWrt/Linux:
service lifecycle, watchdog, structured logging, resource controls
Telemetry & probes:
device health (CPU/mem/disk/uptime/temp where available)
network health (interfaces, IP, routes, gateway reachability, DNS config)
internet probes (ping/DNS/HTTPS timing; derived loss, latency, and jitter)
private cellular telemetry: RSRP/RSRQ/SINR, RAT, band/channel, cell ID/PCI, attach/detach, SIM state
Wi‑Fi telemetry: passive scans (SSID/BSSID, RSSI, channel, band, density), link stats when associated (best‑effort)
Store‑and‑forward reliability:
local spooling (SQLite or equivalent), bounded storage, backoff/retry, idempotent batching, backfill after outages
snapshot capture on triggers, consistent schemas, versioning, compatibility discipline
Hardware abstraction layer to support multiple device SKUs over time
Middleware + backend (meaningful ownership)
Device registry & fleet health:
identity, capability discovery, last‑seen/heartbeat, data completeness scoring
Backend product primitives:
event processing, correlation inputs, incident objects, site‑level rollups
data models and query patterns supporting “evidence‑first” RCA
Ops tooling:
dashboards, ingest failure triage, replay tools, offline detection
Responsibilities
Architect and implement the edge agent to be production‑grade (unattended, self‑healing, resilient to flaky power/backhaul).
Own device-to-cloud schemas and compatibility strategy.
Build and operate core Python backend services and data pipelines.
Establish testing and release discipline for edge + backend:
unit/integration tests, network fault injection, replayable fixtures
Drive security/reliability best practices:
per‑device identity, TLS everywhere, least privilege, safe upgrade path
Required qualifications
Strong experience building production Python systems (services, APIs, workers).
Experience with telemetry pipelines: batching, retries, idempotency, persistence, backpressure.
Preferred / nice‑to‑have
Cellular: ModemManager, QMI/MBIM, AT commands, SIM/attach state.
Wi‑Fi: iw, nl80211 concepts, scan/link stats.
Backend infra: Postgres, Redis, queues/workers, time‑series/event storage.
Device fleet security: keys/certs, enrollment, signed updates.
Culture & operating principles (internal clarity)
Ownership is real: if it breaks in the field, we fix it fast, and we harden it so it doesn’t recur.
Bias to shipping: meaningful progress weekly; small PRs; tight feedback loops.
High standards on reliability: correctness > cleverness; explicit contracts; versioned schemas.
No “throw it over the wall”: edge + backend are one system; we optimize the whole loop.
Low ego, high velocity: debate the idea, not the person; be direct and respectful.
What success looks like
30+ day unattended stability on devices in real environments.
No data loss during outages (bounded spool + backfill works).
Ingestion is resilient and idempotent; device registry and fleet health are reliable.
Evidence bundles and symptom events are consistent and useful to downstream RCA.