ThirdBrAIn.tech

Navigating the Chaos - A Holistic Approach to Incident Management by Hila Fish

Apr 06, 2025 · 2 min read

AI Summary

Summary: Incident Management Talk by Hila Fish

Introduction

Speaker: Hila Fish, Senior DevOps Engineer with 15+ years of experience.

Topic: Why incident management matters, given that system failures are inevitable.

Agenda

Mindset for managing incidents.

Structured process for incident management.

Traits necessary for efficient incident management.

Proactive measures for incident preparedness.

What is Incident Management?

A set of actions to resolve critical incidents.

Involves detection, communication, responsibility assignment, investigation, response, and resolution.

Mindset

Shift from reactive to proactive handling.

Understand the business impact of incidents.

Prioritize incidents based on potential loss of revenue, customers, data, and reputation.
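
The prioritization idea above can be sketched in a few lines of Python. This is only an illustration: the summary names the four loss categories (revenue, customers, data, reputation) but gives no numbers, so the weights, the 0-3 impact scale, and the names `WEIGHTS`, `Incident`, `priority_score`, and `triage` are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical weights: the talk lists these four loss categories
# but does not assign numbers to them.
WEIGHTS = {"revenue": 4, "customers": 3, "data": 3, "reputation": 2}


@dataclass
class Incident:
    title: str
    # Estimated impact per category on an assumed 0-3 scale
    # (0 = none, 3 = severe). Missing categories count as 0.
    impact: dict[str, int]


def priority_score(incident: Incident) -> int:
    """Weighted sum of estimated impact across the four loss categories."""
    return sum(WEIGHTS[k] * incident.impact.get(k, 0) for k in WEIGHTS)


def triage(incidents: list[Incident]) -> list[Incident]:
    """Return incidents ordered from highest to lowest estimated business impact."""
    return sorted(incidents, key=priority_score, reverse=True)


checkout = Incident("checkout down", {"revenue": 3, "customers": 2, "reputation": 1})
dashboard = Incident("internal dashboard slow", {"reputation": 1})
ordered = triage([dashboard, checkout])
```

With these assumed weights, the checkout outage sorts ahead of the slow internal dashboard. In practice a team would likely tune the categories and weights to its own business rather than reuse these values.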

Structured Process

Business mindset: Understand the “why” behind actions.

A structured process helps prevent incidents, shortens resolution time, reduces cost, and protects the business and its reputation.

Five Pillars of Incident Management

Identify and Categorize

Assess the full extent and business impact.

Determine urgency and proper notification channels.

Notify and Escalate

Inform relevant stakeholders (customers, internal teams, management).

Decide if escalation to other teams is necessary.

Investigate and Diagnose

Focus on relevant information for resolution.

Identify and understand the root cause.

Resolve and Recover

Choose the best remediation step.

Address any action items post-resolution.

Review and Learn

Notify stakeholders upon incident closure.

Review and update alerts and runbooks.

Determine if a postmortem is needed.
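
One way to keep the five pillars in front of whoever is on call is to encode them as ordered stages with a checklist per stage. The sketch below is an assumption about how that might look, not something shown in the talk; the names `Stage`, `CHECKLIST`, and `next_stage` are hypothetical, and the checklist text is taken from the items listed above.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    IDENTIFY_AND_CATEGORIZE = 1
    NOTIFY_AND_ESCALATE = 2
    INVESTIGATE_AND_DIAGNOSE = 3
    RESOLVE_AND_RECOVER = 4
    REVIEW_AND_LEARN = 5


# Checklist items per stage, copied from the five pillars in the summary.
CHECKLIST = {
    Stage.IDENTIFY_AND_CATEGORIZE: [
        "Assess the full extent and business impact",
        "Determine urgency and proper notification channels",
    ],
    Stage.NOTIFY_AND_ESCALATE: [
        "Inform relevant stakeholders",
        "Decide if escalation to other teams is necessary",
    ],
    Stage.INVESTIGATE_AND_DIAGNOSE: [
        "Focus on relevant information for resolution",
        "Identify and understand the root cause",
    ],
    Stage.RESOLVE_AND_RECOVER: [
        "Choose the best remediation step",
        "Address any action items post-resolution",
    ],
    Stage.REVIEW_AND_LEARN: [
        "Notify stakeholders upon incident closure",
        "Review and update alerts and runbooks",
        "Determine if a postmortem is needed",
    ],
}


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the last one."""
    stages = list(Stage)  # Enum members iterate in definition order
    idx = stages.index(stage)
    return stages[idx + 1] if idx + 1 < len(stages) else None


# Walk the stages in order and print each checklist.
stage: Optional[Stage] = Stage.IDENTIFY_AND_CATEGORIZE
while stage is not None:
    print(stage.name, CHECKLIST[stage])
    stage = next_stage(stage)
```

Real incidents rarely move through the stages this linearly (notification and investigation often overlap), so the strict ordering here is a simplification for the sake of the example.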

Traits of an Incident Manager

Think on your feet, separate relevant from irrelevant information, operate under pressure, work methodically, ask for help, keep a problem-solving mindset, take ownership, communicate well, lead without authority, and care.

Proactive Measures

Post-incident: Write shift handoffs and postmortem notes, create follow-up tasks, modify alerts, and update runbooks.

Day-to-day: Read shift handoffs, know escalation contacts, understand system architecture, learn application flows, be aware of team tasks, be a go-to person.

Conclusion

Emphasize the business mindset, follow structured processes, develop necessary traits, and be proactive to prepare for and potentially prevent incidents.

Q&A

Implementation: Start with a workshop, follow up with documentation, use incident runbooks, and integrate reminders into tools like PagerDuty.
