
What is an ITSM Major Incident? ITIL doesn't say.

Submitted by skeptic on Mon, 2014-04-21 02:51


This post is a little easter egg for you from the IT Skeptic. I hope it is useful!

One of the enigmatic parts of ITIL is Major Incidents. Here are my tips for better Major Incident Management.

If you look at ITIL 2011 Service Operation, there is a paragraph under 4.2.4.2 headed Major Incidents (MI). It tells you that:

There is a separate procedure
You must agree what constitutes a MI
Form a separate team
Keep incident and problem separate

...um... that's about it.

On Figure 4.3 in that book, the incident process flow branches to a box marked "Major Incident" and never comes back. That box is defined.... nowhere.

So you are pretty much on your own when it comes to Major Incident. Here are a few of my thoughts, as I'm hot on Major Incident Management, or MIM.

MIM is very important. It needs to be well defined.

Some organisations equate a MI with a Priority 1 Incident (or a Severity 1 Incident). I don't think the mapping is that crisp. Incident priority is for sorting and prioritising (and measuring and reporting). A MI is about abandoning the normal process and switching to different procedures. As discussed below, a MI is about having to invent process. So a MI is about the recognition that normal Incident and Problem Management are not going to cut it. A Major Incident is a declaration of a state of emergency.
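To make the distinction concrete, here is a minimal sketch of how a ticketing tool might record it. Everything in it is my own illustration, not anything ITIL or a real product defines: the `Incident` class, the `major` flag and the `procedure_for` function are hypothetical names. The point is only that priority and the Major Incident declaration are held separately, and the procedure follows the declaration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Ordinary incident priority: for sorting, measuring and reporting."""
    P1 = 1
    P2 = 2
    P3 = 3
    P4 = 4


@dataclass
class Incident:
    summary: str
    priority: Priority
    # Hypothetical flag: set only when somebody declares a state of emergency.
    major: bool = False


def procedure_for(incident: Incident) -> str:
    """Choose the procedure from the declaration, never from the priority."""
    return "major-incident" if incident.major else "normal"
```

With this shape a Priority 1 incident stays on the normal process until somebody declares it a MI, and a lower-priority incident can still be declared one.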

[From a comment by BoonNam Goh (thank-you!):
A major incident is mid-way between a normal incident and a disaster (where the IT Service Continuity Management process kicks in). It is mid-way towards a disaster in terms of impact (especially public impact) but is not yet a disaster in terms of having to activate Disaster Recovery (usually in a major incident the infrastructure, or the bulk of it, is still intact, so it does not make sense to go to the DRC).
Since Major Incident and ITSCM are similar, some of the activities and organisation structures pre-planned for use in a major incident could be a lighter variation or a reuse of those used for ITSCM (e.g. notification of organisation management, involvement of the comms manager etc).]

Don't bother defining what constitutes a MI. Many circumstances will be unforeseen: guidelines on spotting one don't help. A MI is like art: hard to define but you know it when you see it. I call this the "Oh shit!" test.

So put the effort into defining who declares a MI. Specify the roles that can push the big red button: e.g. service desk manager, service delivery manager, operations manager, business owner.
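If your tool lets you restrict who can declare a MI, the rule can be this simple. This is a rough sketch, assuming roles are stored as plain strings; the set holds the example roles listed above, and `DECLARING_ROLES` and `can_declare_major_incident` are names I have made up for illustration.

```python
# Example roles allowed to push the big red button (taken from the list above).
DECLARING_ROLES = frozenset({
    "service desk manager",
    "service delivery manager",
    "operations manager",
    "business owner",
})


def can_declare_major_incident(role: str) -> bool:
    """Return True if this role is allowed to declare a Major Incident."""
    # Normalise case and surrounding whitespace so "Operations Manager" matches.
    return role.strip().lower() in DECLARING_ROLES
```

Note there is no test of what the incident is, only of who is declaring it.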

Likewise the process cannot be tightly defined (to be fair to ITIL with their mystery process box). A MI is all about Case Management: you need to take each one on its merits and work it out as you go along. For much more on Case Management see my Standard+Case approach.

What needs to be well defined are:

Policy: if people are making decisions on the fly give them principles, guidelines, rules, bounds, goals, inputs, and outputs.
Roles and responsibilities: especially a Comms Manager and a Technical Manager, who work back to back - one faces outwards and one inwards. One of these could be the overall Major Incident Manager or it could be a separate person.
Procedures: comms plan, war-rooms, supplier mobilisation, RCA...

The Major Incident Manager is not automatically the same person as the Incident Manager. The skillsets are different. See Choose your Major Incident Manager.

MIM is about restoring service. Problem resolution is a closely related but distinct process. Don't let chasing the problem distract tech staff from getting the service back on the air as their top priority. Best to have two teams: incident resolution and problem resolution. But then I've been saying that for years, most recently here.

MIM is as much about managing the impacted customers as it is about managing service restoration.

Once you have it defined, then rehearse, rehearse, rehearse. You traipse down the fire stairs twice a year, but the only time you practise MIM is when it happens, right? Organisations that get lots of MIs are well practised in MIM. Good stable production environments get complacent and really screw up when the inevitable MI happens. Stay sharp: do MIM drills.

Note: there is such an animal as a Major Problem. Service has been restored but it is still out there ready to strike again, and it passes the "Oh shit" test. Proceed as above, with slightly less urgency.

For more on MIM see

my book Plus! The Standard+Case Approach
Checklist: Declare a Major Incident
Checklist: Resolve a Major Incident
Choose your Major Incident Manager
Braun Tacon's Major Incident Handling
The ITIL Wizard's view (satire)

"The IT Skeptic™", "The Skeptical Informer™", "The IT Swami™", "Chokey the Chimp™" and "BOKKED™" are trademarks of Two Hills Ltd.
Except where indicated otherwise, all contents of this site are © Copyright Two Hills Ltd www.twohills.co.nz.

Except where indicated otherwise, the text on this page by Two Hills Ltd is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Content must be attributed to "© Copyright Two Hills Ltd www.twohills.co.nz by Rob England, The IT Skeptic" plus the URL of the source material.

Made in New Zealand by Rob England