How to handle incidents, downtime and outages

Published: 01 January 1970
on channel: Server Density


This will be an open discussion (no slides) on best practices for handling incidents, downtime and outages. Drawing on experience at both Server Density and Yelp, and contrasting a small company with a much larger one, we'll take you through how we deal with things such as:

On call - rotations, scheduling, systems and policies
Preparing for downtime - teams, systems and product architecture
Documentation
Checklists and playbooks
How we actually handle incidents
Post mortems

Speakers

* Charlie Allom, Network Engineer, Yelp
* David Mytton, CEO, Server Density
* Brian Trump, Site Reliability Engineer, Yelp
