Hands-on Site Reliability Engineering
US$ 19.95
Language: English
ISBN: 9789391030322
Cover Page
Title Page
Copyright Page
Foreword
Dedication Page
About the Authors
About the Reviewer
Acknowledgement
Preface
Errata
Table of Contents
1. Understanding the World of IT
Structure
Objective
What is the role of IT in an organization?
Hardware availability
Core software services
Compliance and security
Application development and hosting
Enterprise Architecture (EA)
Software delivery
Understanding the IT organization structure
Role of infrastructure teams
Data centers
Virtualization
Containerization
On-premise infrastructure
Cloud infrastructure
Development and deployment platforms
Role of application teams
Cross-functional development teams
DevOps teams
Production support/operations teams
IT security
Change management team
The TCP/IP protocol suite
Domain Name System
Conclusion
Multiple choice questions
Answers
2. Introduction to DevOps
Structure
Objective
Introduction to DevOps
DevOps principles and practices
DevOps principles
DevOps practices
Benefits of DevOps
Overview of DevOps tools
Git
Ansible
Jenkins
Conclusion
Multiple choice questions
Answers
3. Introduction to SRE
Structure
Objective
DevOps and SRE
Rise of internet companies
SRE overview
SRE terms
SRE team responsibilities
Skill set of SREs
Conclusion
Multiple choice questions
Answers
4. Identify and Eliminate Toil
Structure
Objective
Understanding toil
Importance of eliminating toil
Process optimization with automation
Examples of toil with approaches to automate
Purging and archiving of files
Purging of database tables
Installation/Patching
Monitoring
Checking log files
Identity and Access Management
Vulnerability scans
Infrastructure provisioning/decommissioning
Incident management
Conclusion
Multiple choice questions
Answers
5. Release Management
Structure
Objective
Understanding release management
Release planning
Build package
Test for quality and security
Deployment
Release automation with CI/CD
Using IaC for release management
Blue-green deployments
Canary deployments
Conclusion
Multiple choice questions
Answers
6. Incident Management
Structure
Objective
Understanding incident management
Incident
Incident lifecycle
Blameless postmortems
Incident example
Incident detection/notification
Incident triage
Incident communication
Incident resolution
Incident retrospective/postmortem
Incident knowledge base
Role of development teams
Conclusion
Multiple choice questions
Answers
7. IT Monitoring
Structure
Objective
End-to-end monitoring strategy
Infrastructure monitoring
Server monitoring
Network monitoring
Storage monitoring
Application monitoring
Probes
Checking logs
Capturing processing time
MQ monitoring
Database monitoring
End-user monitoring
DNS monitoring
Monitoring tools
Agents
Transport
Collectors
Data transformation
Storage
Alerting
Dashboarding
Prometheus
Metricbeat
Grafana
ElastAlert
Conclusion
Multiple choice questions
Answers
8. Observability
Structure
Objective
Goals of observability
Service reliability
Operational efficiency
Security and compliance
Three pillars of observability
Standardized libraries/APIs/SDKs
Standardized trace context
Tracers
Cardinality attributes
Open source libraries and tools
Filebeat
Logstash
Fluentd
OpenTelemetry
Conclusion
Multiple choice questions
Answers
9. Key SRE KPIs: SLAs, SLOs, SLIs, and Error Budgets
Structure
Objective
Key metrics for SRE
Service level indicator (SLI)
Service level objective (SLO)
Service level agreement (SLA)
Error budgets
Error budget policy
Conclusion
Multiple choice questions
Answers
10. Chaos Engineering
Structure
Objective
Introducing chaos engineering
Application/service unavailability
Network delays
Network failures
Resource unavailability
Configuration errors
Database failures
Chaos engineering process
Define steady state
Build a hypothesis
Minimize blast radius
Inject the failure condition
Verify hypothesis
Reverse failure condition
Fix any issues
Automate to run continuously
Chaos GameDays
Injecting failures
Killing a process
Network failures
HTTP failures
Injecting multiple failures
Techniques for building resiliency
Single points of failure
Rate limiting/throttling
Circuit breaker
Handle retry storms
Conclusion
Multiple choice questions
Answers
11. DevSecOps and AIOps
Structure
Objective
Understanding DevSecOps
Code scanning for security
Secure releases using Infrastructure as Code
Introduction to AIOps
Use cases with AIOps
Intelligent alerting
Noise reduction
Automated root cause analysis
Automated remediation
ChatOps
ChatOps example with Rasa, Flask, and Telegram
Conclusion
Multiple choice questions
Answers
12. Culture of Site Reliability Engineering
Structure
Objective
Breaking silos in the organization
Embracing risk
Continuous improvement
Intelligent automation
Shift-left mindset
Conclusion
Multiple choice questions
Answers
Index