🐒
chaos engineering

Break things on purpose

Production is going to fail. The question is whether you find the failure mode first, or your customers do. Chaos Monkey helps you find it first.

Start Breaking Things 💥

Chaos Experiments

Controlled failure injection for every layer of your stack.

🔌

Kill Processes

Randomly terminate application processes to verify your restart policies, health checks, and process supervisors work as expected.
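
To make the idea concrete, here is a minimal sketch of random process termination. It is an illustration only, not Chaos Monkey's actual implementation: the helper name `kill_random_process` and the list of candidate PIDs are assumptions made for this example.

```python
import os
import random
import signal


def kill_random_process(candidate_pids, rng=None):
    """Send SIGTERM to one randomly chosen PID and return that PID.

    Hypothetical helper for illustration; the caller decides which
    PIDs are in scope for the experiment.
    """
    candidates = list(candidate_pids)
    if not candidates:
        raise ValueError("no candidate processes in scope")
    rng = rng or random.Random()
    victim = rng.choice(candidates)
    # SIGTERM asks the process to exit; a supervisor should then restart it.
    os.kill(victim, signal.SIGTERM)
    return victim
```

After the call, the interesting part is what happens next: whether your supervisor restarts the process and how long the health check takes to go green again.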

🌐

Network Chaos

Inject latency, packet loss, and partitions. Simulate cross-region failures, DNS outages, and upstream timeouts.
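
As a rough sketch of latency and packet-loss injection, the wrapper below simulates both at the application level. This is an assumption-laden stand-in: real network chaos is usually applied lower in the stack, and the names `with_network_chaos`, `latency_s`, and `loss_rate` are invented for this example.

```python
import random
import time


def with_network_chaos(call, latency_s=0.2, loss_rate=0.1, rng=None, sleep=time.sleep):
    """Wrap `call` so each invocation is delayed and may be dropped.

    Hypothetical helper: `loss_rate` is the probability of a simulated
    drop, `latency_s` is the delay added to every surviving call.
    """
    rng = rng or random.Random()

    def wrapped(*args, **kwargs):
        if rng.random() < loss_rate:
            # Simulated packet loss: the caller sees a timeout, not a reply.
            raise TimeoutError("simulated packet loss")
        sleep(latency_s)  # simulated added latency
        return call(*args, **kwargs)

    return wrapped
```

Wrapping an upstream client call this way lets you check that retries, timeouts, and fallbacks behave as intended before testing against a real partition.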

💾

Disk & I/O Stress

Fill disks, slow I/O, corrupt writes. Find out what your app does when storage misbehaves — before your SSD surprises you.
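
One way to rehearse a full disk without filling a real one is to wrap a file object so that writes fail once a quota is used up. The sketch below assumes that approach; `QuotaFile` is a hypothetical class for this example, not part of any product API.

```python
import errno


class QuotaFile:
    """File-like wrapper that raises ENOSPC once `quota_bytes` is used up.

    Hypothetical test double: it only intercepts `write`.
    """

    def __init__(self, raw, quota_bytes):
        self._raw = raw
        self._remaining = quota_bytes

    def write(self, data):
        if len(data) > self._remaining:
            # Same errno a real full disk would produce.
            raise OSError(errno.ENOSPC, "No space left on device (simulated)")
        self._remaining -= len(data)
        return self._raw.write(data)
```

Passing a `QuotaFile` to the code under test shows whether it handles the error cleanly or leaves a half-written file behind.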

☸️

Kubernetes Chaos

Evict pods, drain nodes, delete namespaces. Test your PodDisruptionBudgets, liveness probes, and autoscaler response times.

Clock Skew

Shift system clocks forward or backward to expose bugs in certificate expiry, cron scheduling, and distributed consensus.
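
To illustrate why clock skew matters, the sketch below runs a certificate validity check against an injectable clock. It assumes your code reads time through a clock object rather than calling the system clock directly; `SkewedClock` and `certificate_is_valid` are hypothetical names for this example.

```python
from datetime import datetime, timedelta, timezone


class SkewedClock:
    """Clock that reports the real UTC time shifted by `skew` (hypothetical)."""

    def __init__(self, skew=timedelta(0)):
        self.skew = skew

    def now(self):
        return datetime.now(timezone.utc) + self.skew


def certificate_is_valid(not_before, not_after, clock):
    """Return True if the clock's current time falls inside the validity window."""
    return not_before <= clock.now() <= not_after
```

A certificate that validates under a normal clock is rejected once the clock is pushed past `not_after` or pulled back before `not_before`, which is the failure this experiment is meant to surface.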

🔥

CPU & Memory Burn

Saturate CPU cores and exhaust memory to test OOM handling, autoscaling triggers, and graceful degradation under load.
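
A time-boxed busy loop is about the simplest form of CPU burn. The sketch below keeps one core busy for a fixed duration; `burn_cpu` is a hypothetical helper, and saturating several cores would need one such loop per core, for example in separate processes.

```python
import time


def burn_cpu(duration_s):
    """Busy-loop on one core for roughly `duration_s` seconds.

    Returns the number of loop iterations completed. The deadline keeps
    the experiment bounded so it cannot run away.
    """
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1  # pointless work that keeps the core busy
    return iterations
```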

How It Works

Controlled chaos, not reckless destruction.

Define Blast Radius

Pick your target: a single pod, a node, or an availability zone. Set limits so the experiment cannot spread beyond it.
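
A blast radius can be thought of as a filter plus a cap. The sketch below models it that way; the `Target` and `BlastRadius` types and their field names are assumptions for illustration, not Chaos Monkey's configuration format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    pod: str
    node: str
    zone: str


@dataclass(frozen=True)
class BlastRadius:
    """Scope for one experiment; a field left as None matches anything."""

    pod: Optional[str] = None
    node: Optional[str] = None
    zone: Optional[str] = None
    max_targets: int = 1  # hard cap on how many targets may be affected

    def select(self, targets):
        """Return the in-scope targets, truncated to `max_targets`."""
        matches = [
            t for t in targets
            if all(
                want is None or want == have
                for want, have in ((self.pod, t.pod), (self.node, t.node), (self.zone, t.zone))
            )
        ]
        return matches[: self.max_targets]
```

For example, `BlastRadius(zone="us-east-1a", max_targets=2)` would select at most two targets in that zone, however many happen to be running there.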

Inject Failure

Run the experiment. Chaos Monkey applies the failure condition within your defined scope.

Observe Impact

Watch your metrics, alerts, and dashboards. Did the system self-heal? Did you get paged?

Fix & Repeat

Harden the weakness. Add the experiment to your test suite, then run it again to verify the fix.
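
The four steps above can be sketched as a small harness. This is a minimal outline under stated assumptions: the four callables (`steady_state_ok`, `inject`, `observe`, `rollback`) are hypothetical hooks you would supply, and the report format is invented for this example.

```python
def run_experiment(steady_state_ok, inject, observe, rollback):
    """Run one chaos experiment and return a small report dict.

    The experiment is skipped if the system is already unhealthy, and
    the failure is always rolled back, even if observation raises.
    """
    if not steady_state_ok():
        return {"ran": False, "reason": "system unhealthy before injection"}
    inject()
    try:
        observations = observe()
    finally:
        rollback()
    return {
        "ran": True,
        "observations": observations,
        "recovered": steady_state_ok(),  # did the system return to steady state?
    }
```

Refusing to inject into an already unhealthy system is a deliberate choice here: it keeps the experiment from making a real incident worse.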