Steady state is the measurable, normal operating behavior of a system, expressed as output metrics rather than internal details. It is the anchor of every chaos experiment: you confirm the steady state before a fault, then watch whether it holds during the fault.
Choosing the right signal matters: a gray failure can keep error rate at zero while latency quietly climbs, so a steady state defined only on error rate would miss it. This is why observability comes before chaos.
The practice of deliberately injecting failures into a system to discover weaknesses before they cause outages, by forming a hypothesis about steady state and testing it under real-world fault conditions.
The managed AWS service for running controlled chaos engineering experiments, injecting faults such as stopping instances, adding latency, failing over databases, or isolating an Availability Zone, with built-in stop conditions.
Toc Consulting: AWS Security & Cloud Architecture
Our team helps engineering teams secure and architect AWS the right way: assessment in week one, a prioritized action plan in week two.