In this session, Google's Steve McGhee will present an overview of how reliable services and systems are built on the cloud. A recent report published by O'Reilly (https://info.blameless.com/oreilly-building-reliable-services-on-the-cloud) outlines four easy-to-follow steps for building a reliable service on the cloud:
- Define objectives for the service that satisfy both users and the business, and use them to frame the subsequent design decisions and trade-off discussions.
- Know the dependencies that you’ll use to build a service so that you can leverage them effectively to reach objectives.
- Architect your service by developing application programming interfaces (APIs), decomposing the system into components and designing the components so they contribute to the service objectives.
- Avoid common failure modes, which can cause outages or otherwise make your service miss its objectives, by deploying solutions that target each one.
This model will help teams build modern, scalable systems and understand the design considerations that apply on today's cloud platforms. Attendees should leave this session understanding:
- How to reason about reliability objectives and how they interact with failure domains, redundancy, scalability and efficiency.
- How to design a system while considering dependencies, as well as how different architectures scale.
- How to avoid some common failure modes when scaling a system on the cloud.
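
As a rough illustration of how reliability objectives interact with dependencies and redundancy, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch and is not taken from the report or the session: the function names are hypothetical, the 30-day window is an arbitrary choice, and the formulas assume that components fail independently, which rarely holds exactly when replicas share a failure domain.

```python
# Illustrative only: names and the 30-day window are assumptions,
# and all formulas assume independent failures.

MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes


def error_budget_minutes(slo, window_minutes=MINUTES_PER_30_DAYS):
    """Downtime allowed in the window by an availability objective (0..1)."""
    return (1.0 - slo) * window_minutes


def serial_availability(availabilities):
    """Availability of a service that needs every listed dependency to be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def redundant_availability(replica_availability, replicas):
    """Availability when any one of several independent replicas suffices."""
    return 1.0 - (1.0 - replica_availability) ** replicas


if __name__ == "__main__":
    # A 99.9% objective leaves about 43.2 minutes of downtime per 30 days.
    print(error_budget_minutes(0.999))
    # Two hard dependencies at 99.9% each give about 99.8% combined.
    print(serial_availability([0.999, 0.999]))
    # Two independent replicas at 99% each give about 99.99% combined.
    print(redundant_availability(0.99, 2))
```

The sketch shows why every hard dependency lowers the best objective a service can promise, while redundancy across independent failure domains raises it, at the cost of running extra capacity.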