All Stories

Making on-call superheroes

Building a world-class service is as much about maintaining software as it is about developing it. On-call engineers are typically responsible for ensuring the reliability and availability of your service...

Incident Response 2.0 — The Zenduty Incident Command System (ICS)

We are super excited today to introduce our latest Zenduty integration with Slack, which we are calling the Zenduty Slack Incident Command System (Slack-ICS). This was many months in the making...

Zenduty — SRE Puzzle of the week — Forgotten password

Incident Alerts — Reducing Alert Noise

Incident Alert Routing — Reducing Alert Noise

On-call doesn't have to be stressful

“Being on-call is a critical duty that many operations and engineering teams must undertake to keep their services reliable and available. However, there are several pitfalls in the organization of...

The importance of GameDays

The term GameDay was coined by Amazon’s “Master of Disaster” Jesse Robbins, who created GameDays to increase reliability by purposefully causing major failures on pre-planned dates. GameDays help...