Overview
Designed and implemented distributed systems features for peer discovery, replication, and encrypted node-to-node communication, with failure-tolerant synchronization and SQLite-backed local state.
The Challenge
The system needed nodes to find each other, replicate data, and stay consistent without a central coordinator. Centralized discovery is a single point of failure, so the architecture had to be fully decentralized from the start.
Peer Discovery
Built on a Kademlia DHT variant. Each node maintains a routing table of contacts organized by XOR distance from its own ID. Lookups converge in O(log n) hops, where n is the number of nodes, so lookup cost grows slowly as the cluster scales.
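To make the XOR-distance idea concrete, here is a minimal sketch in Go. The 160-bit ID size and the names NodeID, Distance, and BucketIndex are assumptions for illustration, not the project's actual types. It shows how the highest differing bit between two IDs picks the routing-table bucket a contact would fall into.

```go
package main

import (
	"fmt"
	"math/bits"
)

// NodeID is a hypothetical 160-bit identifier (the size used by classic
// Kademlia); the real system may use a different width.
type NodeID [20]byte

// Distance returns the bytewise XOR of two IDs, read as a big-endian integer.
func Distance(a, b NodeID) NodeID {
	var d NodeID
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// BucketIndex returns the position of the highest set bit in the XOR
// distance (0..159), or -1 when the two IDs are identical.
func BucketIndex(self, other NodeID) int {
	d := Distance(self, other)
	for i, b := range d {
		if b != 0 {
			return (len(d)-1-i)*8 + bits.Len8(b) - 1
		}
	}
	return -1
}

func main() {
	var self, peer NodeID
	peer[19] = 0x05 // differs from self only in the lowest byte
	fmt.Println(BucketIndex(self, peer)) // prints 2
}
```

Each higher bucket covers an ID range twice as large as the one below it, which is what lets each lookup step halve the remaining distance to the target.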
Routing state is persisted to SQLite so nodes can restart without a full re-bootstrap.
Encrypted Communication
All node-to-node traffic is encrypted. Nodes authenticate each other using long-term keypairs; session keys are established via a Diffie-Hellman exchange and rotated periodically.
This guards against a rogue node injecting data into the replication stream: every message is authenticated before it is processed.
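The key agreement step can be sketched with Go's standard crypto/ecdh package. This is an illustration under assumptions, not the project's handshake: the curve (X25519), the epoch counter used to model rotation, and the helper names newKeyPair and deriveSessionKey are all hypothetical. The HMAC-based derivation is a simplification, and a production system would more likely rely on the TLS handshake or a standard KDF such as HKDF. Authentication against the long-term keypairs is not shown.

```go
package main

import (
	"crypto/ecdh"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newKeyPair generates an X25519 key pair (curve choice is an assumption).
func newKeyPair() (*ecdh.PrivateKey, error) {
	return ecdh.X25519().GenerateKey(rand.Reader)
}

// deriveSessionKey runs the Diffie-Hellman exchange and mixes in a
// hypothetical rotation epoch, so each epoch yields a distinct 32-byte key.
// Simplified derivation for illustration only.
func deriveSessionKey(priv *ecdh.PrivateKey, peerPub *ecdh.PublicKey, epoch uint64) ([]byte, error) {
	shared, err := priv.ECDH(peerPub)
	if err != nil {
		return nil, err
	}
	mac := hmac.New(sha256.New, shared)
	fmt.Fprintf(mac, "session-key-epoch-%d", epoch)
	return mac.Sum(nil), nil
}

func main() {
	alice, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	bob, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	// Each side combines its own private key with the peer's public key.
	ka, err := deriveSessionKey(alice, bob.PublicKey(), 1)
	if err != nil {
		panic(err)
	}
	kb, err := deriveSessionKey(bob, alice.PublicKey(), 1)
	if err != nil {
		panic(err)
	}
	fmt.Println("keys match:", hmac.Equal(ka, kb))
}
```

Both sides arrive at the same key without ever sending it over the wire, and bumping the epoch produces a fresh key from the same exchange.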
Replication and Sync
Replication is push-based with pull-based reconciliation as a fallback. If a node misses a replication event (network partition, restart), it detects the gap by comparing vector clocks and initiates a targeted sync from a peer.
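Gap detection by vector clock comparison might look like the following sketch. The VectorClock type, the Compare and Gaps functions, and the node names are assumptions made for illustration. A clock here maps each origin node to the count of its events a replica has seen.

```go
package main

import "fmt"

// VectorClock maps a node ID to the number of that node's events seen so far.
type VectorClock map[string]uint64

// Ordering describes how two clocks relate.
type Ordering int

const (
	Equal Ordering = iota
	Before
	After
	Concurrent
)

// Compare reports whether a is behind, ahead of, equal to, or concurrent with b.
// A node ID missing from a clock counts as zero.
func Compare(a, b VectorClock) Ordering {
	aAhead, bAhead := false, false
	for id, av := range a {
		bv := b[id]
		if av > bv {
			aAhead = true
		} else if av < bv {
			bAhead = true
		}
	}
	for id, bv := range b {
		if _, ok := a[id]; !ok && bv > 0 {
			bAhead = true
		}
	}
	switch {
	case aAhead && bAhead:
		return Concurrent
	case aAhead:
		return After
	case bAhead:
		return Before
	default:
		return Equal
	}
}

// Gaps returns, per origin node, how many events local has missed relative
// to remote. A targeted sync only needs to request these.
func Gaps(local, remote VectorClock) map[string]uint64 {
	gaps := make(map[string]uint64)
	for id, rv := range remote {
		if lv := local[id]; rv > lv {
			gaps[id] = rv - lv
		}
	}
	return gaps
}

func main() {
	local := VectorClock{"node-a": 3, "node-b": 1}
	remote := VectorClock{"node-a": 3, "node-b": 4, "node-c": 2}
	fmt.Println(Compare(local, remote) == Before)
	fmt.Println(Gaps(local, remote))
}
```

In this example the local node is strictly behind and has missed three events from node-b and two from node-c, so it can request exactly those rather than a full resync.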
The sync protocol is failure-tolerant: partial transfers are checksummed and resumed rather than restarted from scratch.
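One way to get resumable, checksummed transfers is to split the payload into chunks, verify each chunk on arrival, and track which ones are still outstanding. The sketch below assumes SHA-256 per-chunk checksums and uses hypothetical Chunk and Transfer types. The actual wire format and checksum algorithm are not specified here.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Chunk is one piece of a transfer, carrying its own checksum.
type Chunk struct {
	Index int
	Data  []byte
	Sum   [sha256.Size]byte
}

// MakeChunk builds a chunk and computes its checksum (sender side).
func MakeChunk(index int, data []byte) Chunk {
	return Chunk{Index: index, Data: data, Sum: sha256.Sum256(data)}
}

// ErrChecksum is returned when a chunk's data does not match its checksum.
var ErrChecksum = errors.New("chunk checksum mismatch")

// Transfer tracks which chunks have been received and verified.
type Transfer struct {
	received [][]byte // a nil entry means the chunk is still missing
}

// NewTransfer prepares to receive a fixed number of chunks.
func NewTransfer(total int) *Transfer {
	return &Transfer{received: make([][]byte, total)}
}

// Accept verifies a chunk and stores it; rejected chunks stay missing.
func (t *Transfer) Accept(c Chunk) error {
	if c.Index < 0 || c.Index >= len(t.received) {
		return fmt.Errorf("chunk index %d out of range", c.Index)
	}
	if sha256.Sum256(c.Data) != c.Sum {
		return ErrChecksum
	}
	t.received[c.Index] = append([]byte{}, c.Data...)
	return nil
}

// Missing lists the chunk indexes a resumed sync still has to fetch.
func (t *Transfer) Missing() []int {
	var missing []int
	for i, data := range t.received {
		if data == nil {
			missing = append(missing, i)
		}
	}
	return missing
}

func main() {
	t := NewTransfer(3)
	_ = t.Accept(MakeChunk(0, []byte("first")))

	corrupt := MakeChunk(1, []byte("second"))
	corrupt.Data = []byte("sec0nd") // simulate corruption in transit
	fmt.Println(t.Accept(corrupt))

	// After an interruption, only the missing chunks are requested again.
	fmt.Println(t.Missing())
}
```

Because verified chunks are kept across interruptions, a resumed sync re-requests only what Missing reports instead of starting over.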
Local State with SQLite
SQLite backs the local state for routing tables, pending sync queues, and object metadata. The embedded database eliminates the operational overhead of a separate data service per node, while WAL mode provides sufficient write concurrency for the access patterns involved.
Technical Stack
Go · Kademlia DHT · TLS/ECDH · SQLite (WAL mode) · Protocol Buffers · Docker