ITIL Problem versus Risk

It was one of the great ITSM philosophers, Jan van Bon, who first explained to me that Problem Management is but a special case of Risk Management.

In a purist theoretical sense he is right, but on a practical level I think the distinction is useful. It is certainly entrenched.

And - for me at least - the distinction seems intuitively right. Here are some thoughts on the distinction.

  1. Problem and Risk are opposite ends of a spectrum which has another name --- ?? (Jan would probably say Problem is a point on the Risk spectrum)
  2. Risks are problems that aren't going away any time soon. Problems are risks that can be fixed
  3. Or, as M.McEvoy put it: a problem is something that already exists and is having an impact, so the probability is 100%. A risk is something that has not occurred but has a probability of occurring and causing an impact, so the probability is anything under 100%.
  4. Or another way, by an unknown visitor: "Risks represent future problems that have not yet resulted in impacts. Problems are risks that were not mitigated and thereby generate impact."
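
The probability-based definition in point 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the `Exposure` record, its fields, and the function names are hypothetical and do not come from ITIL or any tool.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    """Hypothetical record: something that has, or might have, an impact."""
    name: str
    probability: float  # 0.0 to 1.0; 1.0 means it is already happening
    impact: float       # assumed cost if it occurs, in arbitrary units


def classify(exposure: Exposure) -> str:
    """Label an exposure using the probability rule quoted in point 3."""
    if not 0.0 <= exposure.probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Already occurring (probability 100%) is a problem; anything less is a risk.
    return "problem" if exposure.probability == 1.0 else "risk"


def expected_impact(exposure: Exposure) -> float:
    """Probability-weighted impact, a common way to rank exposures."""
    return exposure.probability * exposure.impact
```

For example, `classify(Exposure("mail server down", 1.0, 10.0))` gives `"problem"`, while the same record with a probability of 0.5 is a `"risk"`.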

Certainly Jan is right that ITIL is light on Risk - non-risk-centric as it were. Risk should be a central function of IT Management. In fact in recent debates over the ridiculous idea that Cloud might eliminate the need for IT or ITSM or ITIL or chemotherapy or whatever daft New Age nonsense is being peddled about Cloud this week, I defined "IT Management is about managing the IT risks on behalf of the business". Maybe I'm indulging in my usual hyperbole to suggest Risk is the reason for the existence of an IT Management function. There's all that customer stuff and improvement stuff to worry about too. But Risk is up there in any business and should be one of our top drivers and focus points in IT too. But not if you read ITIL. Risk in ITIL is like customers in ITIL: popping up here and there, lurking in the shadows, implied more often than explicitly addressed, left out in some surprising places, and under-rated everywhere. Improvement gets a whole book. Culture/politics/people don't. Customers don't. Risk doesn't.

To me this shows ITIL is more IT-aligned than business-aligned despite its protestations. Risk management is not about listing a few self-evident risks for each topic. It is a systematic function that should have been its own "process" or whatever you want to call it in ITIL. (Can I call them "practices" please? "Processes" sticks in my craw).

Problem Management gets a whole practice to itself. So does Change. And Information Security. And Access. But they are all different views of Risk management. Should IT have its own Risk practice? Or can ITIL abdicate responsibility for risk, saying that risk is a business practice and no concern of ITSM? Well yes, you can argue Risk is a business practice, but only if Change is a business practice too, and Continuity, and Problem, and myriad others that in theory IT should not be replicating in miniature.

If it is good enough for IT to do its own Problem Management then I think it is good enough for IT to do its own Risk Management. And unlike JvB I think Problem and Risk are distinct practices... at least in practice.

See also all the great discussion in the comments on Risk Management - the lost process of ITIL V3

I started writing this post just over two years ago. I really must clear the backlog of unfinished posts for this blog. There's getting up towards 80 of them!


commoditization of ITSM

Of course there are many types of risks. They may vary in terms of likelihood, impact, etc. Even an opportunity is a type of risk. But if you look at the process, it's all the very same.
Therefore, to make this simple, you can do with a single pure process in an integrated service management process model. We practice this with great success every day, in the ISM Method.
'Practices' like Security Management are functions, not processes. As a consequence, they use the elements People, Process and Product to achieve their specific goals. This simply means that Security uses the processes of contracting, changing, restoring, delivering, preventing, etc. Security also uses People: we're currently developing a management guide on this in the Roles books, to be published by TSO around this summer. And you can imagine the Products that are used for security goals.
Setting up a Security Management function is easy, once you have a management system that covers all three elements. Achieving an ISO standard is a lot less difficult if you have this in place.
Believe me - this is much easier than everyone is led to believe. Just look through the ITIL papers and see the structure. Once you see that, you can speed up the results of your projects by a factor of 10, and create lasting results.
This is how ITSM is now commoditized in the Netherlands. We put the ITIL books at their rightful place, the bookshelf. We take them off when we want to be inspired by the best practices that are described in the books. We use an integrated service management system to realize these practices.
I've written plenty of articles that provide details about the architecture and the view of this methodology. It's not rocket science, it's easy to learn, and it brings great results. Once you've got it, you can focus on all the topics that really require attention....

There is no Problem Management

I spoke about this in Pink11 but unfortunately you were doing an interesting panel discussion at the same time.

Problem Management appears only in ITIL; there is no such thing anywhere else. ITIL PM is the unhappy marriage of Problem Solving and Risk mgmt. Problem solving is not a process but a capability or practice that support needs. If an incident (request for support, consumer problem) is hard to solve, you need to activate the problem solvers in your organization, but you do not need to start a new process; IM will do just fine.

If you use a workaround and are unable to fix the cause of the incident, there usually is a risk that it will repeat. That would then be a moment to open a risk ticket. All known errors contain some risk and can cause various incidents. This activity is reactive risk management; proactive risk management seeks to prevent things from happening the first time. Managing risk and solving technical problems are two very different practices.
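
As a rough sketch of that hand-off, assuming hypothetical `Incident` and `RiskTicket` records (these names and fields are invented for illustration, not taken from any ITSM tool): closing an incident on a workaround, with the cause unfixed, opens a risk ticket rather than starting a separate problem process.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    ident: str
    summary: str
    workaround_applied: bool
    root_cause_fixed: bool


@dataclass
class RiskTicket:
    source_incident: str
    description: str


def close_incident(incident: Incident) -> Optional[RiskTicket]:
    """Return a risk ticket when the incident may recur, otherwise None."""
    if incident.workaround_applied and not incident.root_cause_fixed:
        # Known error: service is restored but the cause remains,
        # so the chance of a repeat is recorded as a risk.
        return RiskTicket(
            source_incident=incident.ident,
            description=f"Possible recurrence: {incident.summary}",
        )
    return None
```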


Excellent post and I also agree completely regarding customers in ITIL. Service lifecycle management without customers, sales, business relationship mgmt etc is just plain silly.

Risk as a special case of continuous improvement

And why would we not see Risk Management as one form of Continuous Improvement?

Charles T. Betz

Different things

Charlie, you should talk with your business people ;)

PDCA and Risk

Got a number of interesting hits when I Googled "PDCA and Risk." Including a Van Haren publication, Information Security based on ISO 27001, which advocates using PDCA for implementing a risk management program.

I think I am not the first person to see essential similarity here. Can you be more specific on why it would be harmful to see risk as a continuous improvement opportunity? On a practical level, both need tracking and often involve the same sorts of investigations. At least that's been my experience in working with business partners on both kinds of effort. Certainly, I have been involved in any of a number of continuous improvement reviews that have resulted in risk identification. I've also seen risks identified that resulted in a continuous improvement cycle.

Charles T. Betz

No you are not

If you wish, you can check my presentation at Pink11 (Session 809, 4th slide from end) where I have Security, Capacity, Availability, Continuity and Problem mgmt as all part of both risk management and continuous service improvement. Risk and CSI are the opposite ends of the box, so yes they have much in common. But then there is a difference. For example availability, CSI and risk management all want to reduce interruptions. Improving availability by reducing response times is not risk management; it is service improvement.

I have been wondering whether there is any need for separate availability, continuity and capacity planning functions (they are not processes). My recommendation has been to make a service plan according to ISO 20000 which contains plans for availability etc. For example, planning for service desk capacity needs is quite different from planning storage needs; there is no benefit in trying to centralize those under one "process". I suppose all those things can be split in risk and CSI.

The difference is in the procedure. Risks can and should be identified and recorded. CSI is more of a collection of proposals, programs, activities etc. There can also be some tension between the two; password rules are a good example. Security (risk) wants to have complex passwords and force people to change them often, while CSI sees password resets as a recurring incident which reduces customer satisfaction.


Every management practice is

Every management practice is a special case of PDCA. I think that generalises to the point of being unhelpful. I was worried that calling problem a case of risk was similarly over-abstracting, but everything maps to PDCA, surely?

less theoretical

I've consulted with the owners of formal risk, problem, and continuous improvement processes, with specific attention to the nuts and bolts of how they work and their enabling data structures.

At the end of the day, they are all relatively unstructured efforts (compared to formalized transactional processes like originating a mortgage) that need identification, scoping, assignment (typically to overutilized SMEs), and tracking to completion.

I see no reason why both Risk and Problem could not be seen as subtypes of a more generalized Improvement Opportunity, which might also include dimensions such as Capacity, Availability, Architecture, Continuity, Security, and so forth.

I think the practical benefit in a generalized Improvement Opportunity process is queue reduction; queue proliferation I think is one of the biggest problems in large, complex, matrixed organizations.
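
A minimal sketch of what such a generalized record and single queue might look like. Everything here is an assumption for illustration: the `ImprovementOpportunity` type, the `Kind` values (taken from the dimensions listed above), and the convention that a lower priority number is more urgent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Kind(Enum):
    RISK = "risk"
    PROBLEM = "problem"
    CAPACITY = "capacity"
    AVAILABILITY = "availability"
    ARCHITECTURE = "architecture"
    CONTINUITY = "continuity"
    SECURITY = "security"


@dataclass
class ImprovementOpportunity:
    title: str
    kind: Kind
    owner: str      # typically an overutilized SME
    priority: int   # assumed convention: lower number is more urgent
    done: bool = False


def open_items(queue: List[ImprovementOpportunity]) -> List[ImprovementOpportunity]:
    """One queue for every subtype: open items, most urgent first."""
    return sorted((i for i in queue if not i.done), key=lambda i: i.priority)


def workload(queue: List[ImprovementOpportunity], owner: str) -> int:
    """Count open items assigned to one owner, whatever their subtype."""
    return sum(1 for i in queue if i.owner == owner and not i.done)
```

The subtype survives as a field, so each kind can still have its own owner and reporting, while assignment and tracking happen in one place.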

Charles T. Betz

centralised management of an improvement programme

I can't see that. Even if they were in one repository or queue, they'd be there as subtypes with different owners. I'm all for centralised management of an improvement programme - in fact I do that for smaller clients. But down to the level of each improvement task in one queue??? I suppose so, but my instinct is still that this is abstracting beyond the useful.

ISO 9000

Just heard that the new version of ISO 9000 will drop proactive activities for the sake of clarity; there will be just Control and Improve. I suppose one should have many improvement queues.

