Complete Guide to Edge Computing Architecture 2026

TL;DR:

Edge computing places compute close to data sources; the cloud handles analytics and long-term storage
Use edge when latency must be under 500ms, raw data exceeds 10GB/day per node, or data sovereignty applies
K3s, systemd, and Balena cover the three main deployment patterns depending on team skills and fleet size

Edge computing is a deployment topology where you run compute at or near the data source, outside a central data centre. You do it when latency or bandwidth cost makes sending data to the cloud prohibitive, or when data cannot leave the premises at all. For many UK organisations working under UK GDPR and ICO guidance on where personal data may be processed, that last point is increasingly the primary driver.

Most edge deployments follow a consistent three-tier structure.

The Three-Tier Model

Device layer — sensors, cameras, PLCs. Data acquisition and actuation only. Raw data stays local.

Edge layer — a Linux computer (Raspberry Pi 5, Jetson Orin NX, Dell Edge Gateway) physically near the devices. Runs inference, pre-processing, local storage, and protocol translation. LAN latency: 1–20ms.

Cloud layer — AWS/Azure/GCP or on-prem data centre. Handles long-term storage, cross-fleet analytics, ML training, and dashboards. WAN latency: 20–200ms.

The cloud never sees raw sensor streams. A 10kHz vibration feed gets reduced to a handful of anomaly events per day before leaving the edge node. Fog computing is Cisco’s term for distributed edge computation — in practice the terms are used interchangeably, and you’ll hear both in the same conversation.

Edge vs. Cloud: When Each Wins

The question is never “edge or cloud?” — it’s “which processing step runs where?” Every production deployment is a hybrid.

Use edge when response time must be under 500ms, raw data exceeds 10GB/day per node (a single IP camera at 50MB/s uncompressed generates 4.3TB/day), data is personal data subject to UK GDPR, or the site has intermittent connectivity.
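The thresholds above are easy to turn into a quick check. The following is a minimal sketch, not a definitive rule engine: the `Workload` structure and the `prefers_edge` function are hypothetical names introduced for illustration, and the only numbers used are the ones quoted in this section.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


def daily_volume_gb(rate_mb_per_s: float) -> float:
    """Convert a sustained data rate in MB/s to GB/day (decimal units)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000


@dataclass
class Workload:  # hypothetical structure, for illustration only
    max_latency_ms: float     # required sensor-to-action response time
    raw_mb_per_s: float       # sustained raw data rate per node
    has_personal_data: bool   # personal data subject to UK GDPR
    intermittent_wan: bool    # site loses connectivity regularly


def prefers_edge(w: Workload) -> bool:
    """Return True if any one of the edge criteria listed above applies."""
    return (
        w.max_latency_ms < 500
        or daily_volume_gb(w.raw_mb_per_s) > 10
        or w.has_personal_data
        or w.intermittent_wan
    )


# The uncompressed IP camera from the text, with a relaxed latency target.
camera = Workload(max_latency_ms=2_000, raw_mb_per_s=50,
                  has_personal_data=False, intermittent_wan=False)
print(daily_volume_gb(camera.raw_mb_per_s))  # 4320.0 GB/day, about 4.3TB
print(prefers_edge(camera))                  # True, on data volume alone
```

Any single criterion is enough to tip a processing step towards the edge, which is why the check uses `or` rather than a score.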

Use cloud when 500ms+ latency is acceptable, you need cross-site aggregation, or ML model training is required.

Inference costs are also telling: a Jetson Orin NX runs roughly £0.0001 per inference; the equivalent cloud API call is £0.004–£0.04, or 40 to 400 times as much. At 1,000 inferences per second (86.4 million per day), that gap becomes significant quickly.

Protocol Stack

Device to edge uses either MQTT or OPC-UA. MQTT (via Mosquitto 2.x) is the standard for sensor-to-edge messaging: it runs in under 50MB RAM, handles tens of thousands of messages per second on a Pi 5, and its fixed header can be as small as 2 bytes. OPC-UA is typically required when integrating with SCADA systems or brownfield PLC vendors (Siemens, Rockwell, Beckhoff). It adds overhead but brings a structured data model and security by default.
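To see where the 2-byte figure comes from, it helps to build an MQTT packet by hand. The sketch below encodes an MQTT 3.1.1 PUBLISH packet at QoS 0 using only the standard library. It is for illustration; a real node would use a client library rather than raw bytes, and the topic name and payload are made up.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode MQTT's variable-length 'remaining length' field (1 to 4 bytes)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def build_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build an MQTT 3.1.1 PUBLISH packet with QoS 0, no DUP, no RETAIN."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic itself.
    variable_header = struct.pack("!H", len(topic_bytes)) + topic_bytes
    body = variable_header + payload
    # Fixed header: packet type/flags byte (0x30 = PUBLISH, QoS 0) + length.
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


# Hypothetical topic and payload for a vibration reading.
packet = build_publish_qos0("plant/cnc07/vib", b"0.42")
print(packet[:2].hex(), len(packet))  # 3015 23
```

The fixed header stays at 2 bytes as long as the topic and payload together fit in 127 bytes; beyond that, the length field grows by one byte at a time up to four.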

Edge to cloud uses HTTPS or AMQP. HTTPS POST works for low-frequency payloads. AMQP (RabbitMQ or Azure Service Bus) adds durable queuing when retry logic matters.
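If you stay on plain HTTPS, the retry logic has to live on the edge node. One common approach is a local store-and-forward queue. The sketch below is a minimal version using SQLite and `urllib` from the standard library. The `Outbox` class, the `post_json` helper, and the table layout are assumptions made for illustration, not part of any product named here.

```python
import json
import sqlite3
import urllib.request
from typing import Callable


class Outbox:
    """Durable local queue: events stay on disk until the cloud accepts them."""

    def __init__(self, path: str = "outbox.db") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, event: dict) -> None:
        self.db.execute("INSERT INTO outbox (body) VALUES (?)",
                        (json.dumps(event),))
        self.db.commit()

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send queued events oldest first; stop at the first failure."""
        sent = 0
        rows = self.db.execute(
            "SELECT id, body FROM outbox ORDER BY id").fetchall()
        for row_id, body in rows:
            if not send(json.loads(body)):
                break  # keep this and later events for the next attempt
            self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent

    def pending(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]


def post_json(url: str, event: dict, timeout_s: float = 5.0) -> bool:
    """POST one event as JSON; return True only on a 2xx response."""
    req = urllib.request.Request(
        url, data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"}, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


outbox = Outbox(":memory:")  # in-memory for the demo; use a file on a real node
outbox.enqueue({"machine": "cnc07", "event": "anomaly", "score": 0.93})
delivered = outbox.flush(lambda event: True)  # stub standing in for post_json
print(delivered, outbox.pending())            # 1 0
```

On a real node you would pass something like `lambda e: post_json(ingest_url, e)` to `flush`, where `ingest_url` is your own endpoint, and call it on a timer. An event is deleted only after the sender reports success, so a crash mid-flush can produce a duplicate delivery but not a lost event.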

Deployment Patterns

You have three main options for running software on edge nodes.

K3s (Rancher’s lightweight Kubernetes) runs in ~512MB RAM versus 4GB+ for full Kubernetes. Use it when your team knows Kubernetes, the node has at least 4GB RAM, or you’re running multiple containerised workloads.

systemd services are the right choice for single-purpose nodes — one process, known inputs and outputs. Use systemd when the node has under 2GB RAM or your team is more comfortable with Linux than Kubernetes.

Balena (balenaOS + balenaCloud) handles fleet OTA updates, rollback, remote terminal access, and delta updates automatically. Use it when your fleet exceeds 10 devices and remote access is the priority. It’s particularly popular with UK IoT teams who want a managed fleet without building their own update infrastructure.
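The three sets of criteria overlap, so it helps to decide which one wins when more than one applies. The sketch below encodes one possible ordering: fleet management needs first, then K3s if the node and the team can support it, with systemd as the default. That ordering is an assumption for illustration, and `choose_deployment` is a hypothetical helper rather than a rule from any of the tools.

```python
def choose_deployment(ram_gb: float, fleet_size: int,
                      team_knows_kubernetes: bool, workloads: int,
                      remote_access_priority: bool) -> str:
    """Pick a deployment pattern from the criteria described above.

    The precedence (Balena, then K3s, then systemd) is an assumed ordering.
    """
    if fleet_size > 10 and remote_access_priority:
        return "balena"
    if ram_gb >= 4 and (team_knows_kubernetes or workloads > 1):
        return "k3s"
    # Single-purpose nodes, small-memory nodes, and Linux-first teams.
    return "systemd"


print(choose_deployment(4, 40, True, 3, False))   # k3s
print(choose_deployment(1, 1, False, 1, False))   # systemd
print(choose_deployment(2, 500, False, 1, True))  # balena
```

Treat the output as a starting point for discussion. A team with strong Kubernetes skills and a large fleet may reasonably reorder the checks.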

OTA Updates Are Non-Negotiable

A fleet of 500 edge devices that can’t be updated remotely is a liability. Every security patch requires a physical visit — or it doesn’t get deployed.

A production OTA system needs four things:

Atomic updates with automatic rollback: an A/B partition scheme, where a device whose new partition fails a health check reboots to the previous known-good state.
Signed artefacts: GPG or Ed25519 signing, with devices verifying before applying.
Staged rollouts: push to 5%, monitor, then 25%, then 100%.
Bandwidth-aware delivery: delta updates that push only what changed.
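Two of these requirements, staged rollouts and A/B rollback, come down to small pieces of logic. The sketch below shows one way to express them, assuming devices are assigned to rollout cohorts by hashing their ID. The function names and the hashing scheme are illustrative assumptions, and signature verification and delta delivery are deliberately left out.

```python
import hashlib

STAGES = (5, 25, 100)  # rollout percentages from the text


def rollout_bucket(device_id: str, release: str) -> int:
    """Map a device to a stable bucket in 0..99 for a given release."""
    digest = hashlib.sha256(f"{release}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_stage(device_id: str, release: str, stage_percent: int) -> bool:
    """True if the device should receive the release at this stage.

    Buckets below the percentage are included, so a device that is in the
    5% stage is automatically in the 25% and 100% stages as well.
    """
    return rollout_bucket(device_id, release) < stage_percent


def slot_after_update(previous_slot: str, health_ok: bool) -> str:
    """A/B scheme: keep the new slot only if the health check passes."""
    candidate = "B" if previous_slot == "A" else "A"
    return candidate if health_ok else previous_slot


# Hypothetical fleet of 500 device IDs and a made-up release name.
fleet = [f"node-{i:04d}" for i in range(500)]
for percent in STAGES:
    eligible = sum(in_stage(d, "v2.1.0", percent) for d in fleet)
    print(percent, eligible)

print(slot_after_update("A", health_ok=True))   # B
print(slot_after_update("A", health_ok=False))  # A
```

Salting the hash with the release name means a different 5% of the fleet goes first each time, so the same devices are not always the canaries.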

Mender (open source, self-hosted) covers all four and takes about a week to set up.

Real-World Example: Predictive Maintenance

A manufacturing plant with 40 CNC machines uses a Raspberry Pi 5 (4GB) per machine with an ADXL345 accelerometer sampling at 10kHz.

The stack: Mosquitto broker → Python FFT feature extraction (1-second windows) → ONNX Runtime Random Forest model → anomaly events forwarded over HTTPS.

Raw data: 80KB/s per machine, or about 6.9GB/day. Forwarded to cloud: under 10KB/day. That is a data reduction of more than 99.999%. Inference runs in under 2ms per window on the Pi 5’s Cortex-A76 cores. This pattern is running at scale in UK manufacturing facilities today; it’s not theoretical.
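The feature-extraction step in that stack can be sketched in a few lines. To keep the example self-contained it uses a naive DFT from the standard library rather than a real FFT, so it is only practical for small windows; a production pipeline on a 10kHz feed would use an FFT routine such as NumPy's `numpy.fft.rfft`. The feature names and the test signal are assumptions for illustration, and the model inference step is not shown.

```python
import cmath
import math


def dft_magnitudes(samples: list[float]) -> list[float]:
    """Naive O(N^2) DFT; returns magnitudes for bins 0..N//2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def extract_features(samples: list[float],
                     sample_rate_hz: float) -> dict[str, float]:
    """Reduce one window of raw samples to a handful of numbers."""
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    mags = dft_magnitudes(samples)
    # Skip bin 0 (the DC offset) when looking for the dominant frequency.
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)
    return {
        "rms": rms,
        "peak_hz": peak_bin * sample_rate_hz / n,
        "peak_magnitude": mags[peak_bin],
    }


# Synthetic test signal: a 50Hz sine sampled at 1kHz for 0.2 seconds.
fs = 1_000
window = [math.sin(2 * math.pi * 50 * t / fs) for t in range(200)]
features = extract_features(window, fs)
print(features["peak_hz"])  # 50.0
```

Each window of raw samples collapses to three floats here, which is where the bulk of the data reduction comes from. The classifier then only needs to flag the windows that look abnormal.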

5 Questions Before You Buy Hardware

What’s the exact maximum latency between sensor reading and action? If hours are fine, edge may add cost without benefit.
What’s the raw data volume per node per day? If it is under 1GB in total, cloud ingestion is probably sufficient.
What’s the actual WAN uptime at this site? Ask network operators, not vendors.
Which hardware tier fits the workload: Pi 5 (~£65) for protocol translation, or Jetson Orin NX (~£400–800) for ML inference?
Have you run a single-node pilot for 30 days? Do this before buying 500 units: it surfaces OTA behaviour, network reliability, and thermal issues.

The Bottom Line

Edge computing architecture is a set of decisions about where processing runs and why. The hard parts are operational — remote update reliability, fleet monitoring, connectivity resilience. Build those before the pilot expands. A Raspberry Pi 5 running Ubuntu 22.04, K3s, and ONNX Runtime is a legitimate production edge node in 2026. Every edge deployment that went smoothly at scale was someone’s careful single-node pilot first.