CI/CD & Test Infrastructure Tooling

Overview

Built the tooling behind the local and CI test environments: automated multi-node network setup, upgrade orchestration, and the operational workflows used for release validation and end-to-end testing.

The Problem With Distributed System Tests

Unit tests are necessary but insufficient for distributed systems. A service can pass all its unit tests and still fail in integration because the network is flaky, a peer sends unexpected data, or a migration doesn’t handle concurrent traffic correctly.

The goal was a test suite that could catch these failures before they hit production.

Test Layers

The automation covers six distinct layers:

Layer	What It Tests
Unit	Individual functions and types
Integration	Service behaviour against real dependencies (DB, cache)
System	Multi-service scenarios, happy paths and error paths
Simulation	Injected faults: network partitions, node crashes, slow peers
E2E	Full cluster scenarios from client to storage
Benchmark	Performance regression detection on hot paths
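
To make the Simulation layer concrete, here is a minimal sketch of fault injection using an in-memory network. It is not the project's actual harness: `SimNetwork`, `Partition`, `Heal`, and `ErrPartitioned` are hypothetical names chosen for illustration, and only network partitions are modelled (not node crashes or slow peers).

```go
package main

import (
	"errors"
	"sync"
)

// ErrPartitioned is returned when a message would cross a simulated partition.
var ErrPartitioned = errors.New("simnet: nodes are partitioned")

// SimNetwork is a hypothetical in-memory network that can drop traffic
// between chosen pairs of nodes.
type SimNetwork struct {
	mu      sync.Mutex
	blocked map[[2]string]bool  // node pairs that currently cannot talk
	inbox   map[string][]string // messages delivered to each node
}

func NewSimNetwork() *SimNetwork {
	return &SimNetwork{
		blocked: make(map[[2]string]bool),
		inbox:   make(map[string][]string),
	}
}

// link returns an order-independent key so a partition applies in both directions.
func link(a, b string) [2]string {
	if a > b {
		a, b = b, a
	}
	return [2]string{a, b}
}

// Partition cuts the link between a and b.
func (n *SimNetwork) Partition(a, b string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.blocked[link(a, b)] = true
}

// Heal restores the link between a and b.
func (n *SimNetwork) Heal(a, b string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	delete(n.blocked, link(a, b))
}

// Send delivers msg to the recipient's inbox unless the link is partitioned.
func (n *SimNetwork) Send(from, to, msg string) error {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.blocked[link(from, to)] {
		return ErrPartitioned
	}
	n.inbox[to] = append(n.inbox[to], msg)
	return nil
}

// Received returns a copy of the messages delivered to node so far.
func (n *SimNetwork) Received(node string) []string {
	n.mu.Lock()
	defer n.mu.Unlock()
	return append([]string(nil), n.inbox[node]...)
}
```

A simulation test would drive the service through a transport like this, call `Partition` mid-scenario, and assert that the system converges once `Heal` is called.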

Local Multi-Node Setup

A single CLI command spins up a configurable number of nodes locally, bootstraps the peer discovery network, and connects them into a test cluster. This lets engineers run multi-node scenarios on their laptop without needing a staging environment.

testnet up --nodes 5 --scenario replication-under-partition
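
One way a command like this could derive per-node settings is sketched below. It is an assumption, not the tool's real logic: `NodeConfig`, `PlanCluster`, the loopback addresses, and the choice of the first node as the sole bootstrap peer are all hypothetical.

```go
package main

import "fmt"

// NodeConfig is a hypothetical description of one local test node.
type NodeConfig struct {
	Name       string
	ListenAddr string
	Bootstrap  []string // peers to contact for discovery on startup
}

// PlanCluster assigns each of n nodes a consecutive loopback port starting
// at basePort. Every node except the first uses the first node as its
// bootstrap peer, so discovery has a single well-known entry point.
func PlanCluster(n, basePort int) ([]NodeConfig, error) {
	if n < 1 {
		return nil, fmt.Errorf("need at least one node, got %d", n)
	}
	nodes := make([]NodeConfig, 0, n)
	for i := 0; i < n; i++ {
		cfg := NodeConfig{
			Name:       fmt.Sprintf("node-%d", i),
			ListenAddr: fmt.Sprintf("127.0.0.1:%d", basePort+i),
		}
		if i > 0 {
			cfg.Bootstrap = []string{nodes[0].ListenAddr}
		}
		nodes = append(nodes, cfg)
	}
	return nodes, nil
}
```

With `PlanCluster(5, 9000)`, the last node listens on `127.0.0.1:9004` and bootstraps from `127.0.0.1:9000`. A single bootstrap node keeps the sketch short; a real setup might list several peers so discovery survives the first node going down.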

Upgrade Orchestration

Schema migrations and binary upgrades are tested against a running cluster by upgrading nodes one at a time and verifying consistency at each step. This catches the class of bugs where new code makes assumptions about data written by old code.

CI Integration

All layers run in CI. Unit and integration tests run on every PR. System, simulation, and E2E tests run on merge to main. Benchmarks run nightly and alert on regressions exceeding a configurable threshold.
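
The nightly regression check can be illustrated with a small comparison function. This is a sketch under assumptions: the source does not say how results are stored or compared, so `Regression`, `FindRegressions`, and the use of ns/op maps keyed by benchmark name are hypothetical.

```go
package main

import "sort"

// Regression describes a benchmark that slowed down beyond the threshold.
type Regression struct {
	Name       string
	BaselineNs float64
	CurrentNs  float64
	Change     float64 // fractional slowdown, e.g. 0.15 means 15% slower
}

// FindRegressions compares current ns/op figures against a baseline and
// returns every benchmark whose slowdown exceeds threshold (a fraction).
// Benchmarks missing from current, or with a non-positive baseline, are skipped.
func FindRegressions(baseline, current map[string]float64, threshold float64) []Regression {
	var out []Regression
	for name, base := range baseline {
		cur, ok := current[name]
		if !ok || base <= 0 {
			continue
		}
		change := (cur - base) / base
		if change > threshold {
			out = append(out, Regression{
				Name:       name,
				BaselineNs: base,
				CurrentNs:  cur,
				Change:     change,
			})
		}
	}
	// Sort by name so reports are stable across runs (map order is random).
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}
```

With a threshold of 0.10, a benchmark that moves from 100 ns/op to 115 ns/op is reported, while one that moves from 200 ns/op to 204 ns/op is not. A CI job could fail or send an alert whenever the returned slice is non-empty.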

Technical Stack

Go · gomock · Docker Compose · GitHub Actions · testcontainers-go