ITSM incident and problem: two names for three things

Debate around the definitions of Incident and Problem never seems to end.
Here's my take on the fundamental issue that fuels the endless arguments: we have two entities trying to do three jobs.

When a service breaks, we have to deal with three things in support:

Measuring problem management

How best to measure your Problem Management practice?

ITSM in Cherry Valley

I'm getting lots of positive feedback about my series of articles for The ITSM Review, which use a train crash in Cherry Valley, Illinois as a case study for understanding incident and problem management. (It is part of a wider theme in my articles for The ITSM Review, using railroad examples for service management.)

It always mystifies me that people (and ITIL) don't grok this simple model: incident management is about users, problem management is about causes.

How ITIL gets Incident vs Problem wrong

In ITIL, we don't separate Incidents from Problems properly. This causes a muddy and confused definition of both. Join me as I try one more time to make this clear.

Riddle me this: matching ITIL theory to the real world

Calling all you ITIL theorists, philosophers, pontificators and pundits. Marty is back: our follower from the real world, trying to make sense of ITIL on its home ground, the operations of big iron batch computing. Marty asks: what happens after a service is restored? What does ITIL call the function of undoing the damage done while a service was unavailable? I have a view - of course - but I'm going to stay quiet - for a while - and hear what everyone else thinks. So have at it.

ITIL Problem versus Risk

It was one of the great ITSM philosophers, Jan van Bon, who first explained to me that Problem Management is but a special case of Risk Management.

In a purist theoretical sense he is right, but on a practical level I think the distinction is useful. It is certainly entrenched.

Problem detection is everyone's duty

When a train rolls by, the guys on shovels and brooms, track gangs, crews on the ground, crews on other trains, clerks, station-masters, everyone stops and watches the train and waves to the crew on board. Lazy? Hell no.

Shit happens, or how I learned to love the incident

Complex systems are by definition broken. They will always break, and sometimes they will break even when everybody did what they were supposed to. Fixing the problem won't necessarily reduce the risk of another incident.

We should create the problem record right up front in an incident

A BOKKED post three months ago drew a lot of attention. It was about the disconnect between Incident and Problem Management in ITIL V3 Service Operation. [See also the ITIL Wizard stirring the pot about Major Incidents.] I've just discovered a response to that post which has popped my brain with its simplicity and clarity.

ITIL V3 Service Operation disconnect between Incident and Problem Management

There seems to be a major disconnect between ITIL V3 Incident and Problem Management.
