A hundred users call up and say they can't get emails. One incident or 100?


Comment -A hundred users call up and say they can't get emails.

The question says that 100 calls have come in, each reporting that the caller's email is down. There is not enough information - YET - to know if it is one incident or 100, but it is pretty clear that it is some number between 1 and 100. It is highly unlikely but possible that there are 100 separate incidents, each affecting one email account. It is much more likely that a single failure has affected 100 email accounts. But one or more of the reports may be unrelated to the major failure, so you might have 98 accounts affected by a single failure (a single incident) and two that come from some other cause or causes. At the point in time described by the question, I think you can only assume there are somewhere between 1 and 100 incidents. You probably associate all the calls with a single incident ticket initially, restore the service, then confirm that all accounts are active. If there are outliers, they surface at that point.
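
That last step (link everything to one ticket, restore, confirm, split off the outliers) can be sketched in a few lines of Python. This is only an illustration of the idea, not anything ITIL prescribes: the `Call` record, the `still_down` flag and `triage_after_restore` are all made-up names, and which accounts are "still down" is invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Call:
    user: str
    incident_id: int          # ticket this call is currently linked to
    still_down: bool = False  # hypothetical result of checking the account after the fix

def triage_after_restore(calls, next_id):
    """After the main fix, give every call that is still failing its own incident."""
    for call in calls:
        if call.still_down:
            call.incident_id = next_id  # outlier: not explained by the main failure
            next_id += 1
    return {c.incident_id for c in calls}

# 100 calls, all initially linked to incident 1
calls = [Call(f"user{i}", incident_id=1) for i in range(100)]

# Assumption for the example: two accounts are still down after the restore
calls[7].still_down = True
calls[42].still_down = True

incident_ids = triage_after_restore(calls, next_id=2)
print(len(incident_ids))  # 3 incidents: the main one plus two outliers
```

The point is that the count is not known up front; it falls out of the confirmation step.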

all you folk who chose "One" you are indefensibly WRONG

Indeed. I was getting concerned at the very low number of people choosing "Anywhere between one and a hundred", though I see it is now increasing. You can debate whether it is that option or always "A hundred" (see the discussion that triggered this poll), but all you folk who chose "One", you are indefensibly WRONG for the reasons in the comment above. Since that is half the respondents, that's a worry. It is also a lot of readers I've just offended :)

Clarify pls

You say one is WRONG and yet the point above says:
You probably associate all the calls to a single instance of an incident ticket initially, restore the service, then confirm that all accounts are active

It's probably a good practical approach to assume that if you have 100 calls between 10:00 and 10:30 on a wet Tuesday morning, the vast majority can be solved by finding one solution. Ultimately, is this question about resources and how to ensure they serve the customer most efficiently? Intuitively, if you focus your resources on solving the one problem, the chances are you'll satisfy some 90% of your customers within the hour. If you focus your resources on solving 100 unique calls one by one, it's not very efficient, is it?

'One' might be wrong technically but treating it as 'one' initially is probably a practical approach...

Happy to be shown to be wrong :-)

Service support is a Trust Tree

I like your argument but I might see it differently if I was one of the 10 who got regular updates that things were being fixed and then they weren't.

If you never screw up, people are neutrally content. If you screw up and fix it well they are - perversely - actually happier.

If you screw up and then screw up the fixing, they remember and it takes a LOT of great fixing later before they forget.

Service support is a Trust Tree - years to grow it and a minute to chop it down.

10% or even 1% poisseed off every time you have an outage soon adds up.

How many are entitled to support?

Well in my fictitious best practice run site we would first check if they were each covered by a support agreement.... which also means that if they were not - we might open 1 ticket to that effect... to address resource use....

If you rephrase the question ...


If you rephrase the question to read "how many contacts are logged" I would expect 100 - and from there they are classified as service requests, incidents, information requests ... (and so forth, similar to your classification list from another post), which also confirms that anywhere between one and a hundred incidents could be logged.

when ITIL does

I'll clarify the question when ITIL does - that was my whole point. See the original post: ITIL doesn't even mention logging something called a "call" or a "contact"

an attempt

ok - I've got a copy of some of the ITIL books from our library. Let me have a crack and PLEASE show me where I'm going wrong. This is me trying to learn.

Email server dies - let's keep it simple and clear and ignore the initial lag where the service desk may not realize all calls are about the same thing. So they get 100 calls.

Calls need to be logged somehow. There's a fair bit in the metrics section of "Service Operation" on the need to report the number of calls. So calls must be logged. This seems in line with previous replies saying that you'd have 100 contacts logged. ITIL *does* seem to agree (at least if contacts == calls).

However, calls don't necessarily need to be logged as an incident or a request. There's mention of Service Desk dealing with "incidents, requests and other calls".

Somewhere along the line the event is recorded. The calls are not the event, the calls are calls. The event is the change of state of the service (from glossary). That event needs to be recorded - but not 100 times. There was only one event wasn't there? Unless you wish to count the change of state of each individual email account. If you were using multiple monitoring systems and they all reported the event, that's still only one event. In this case how is a call different to an automatic monitor (wetware monitoring)?

Assuming the SD knows of the event (which I am assuming) and is able to confidently link the call to the event, then doing so would not be wrong from ITIL's perspective? But what about the alternative? Would it be wrong to log 100 events or 100 incidents? BTW it bothers me that this book seems to ignore events being reported from a call. Maybe I'm confusing events & incidents.

Anyway - the diagram for event management shows event notifications being the precursor for incident management. And the incident management diagram shows calls also being a separate precursor to incidents. Maybe I am confused on event/incident.

OK whatever - what about an incident. The glossary lists that as an "unplanned interruption to service". Multiple calls come in for the same event (change of state) and for the same incident (service loss). Do we have multiple incidents? I guess it depends on how you count the service and whether you allow aggregation.

The incident definitely needs an incident record. But does ITIL say that every call needs to have a unique incident record? Not that I can see. The incident is a different entity to a call.

If the underlying cause is a power outage to the mail server (power supply, fuse, axe through cable) do you have a separate incident for every service on that server or is it a single incident - hardware not going?

If the mail server is not working - if the service loss is "email" - then we've only one incident. If the service loss is regarded separately for Alice's email account, Bob's email account, Carol's email account ... then we might have 100 I guess.

I can buy the argument that ITIL is vague and not prescriptive on this. Calls must be logged separately as unique calls. But I don't see anything which says that there needs to be a 1:1 correspondence between calls, events, and incidents.
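
To make that concrete for myself, here is a rough Python sketch of how the same 100 calls give different incident counts depending on what you decide "the service" is. It is purely an illustration under my own assumptions: the tuple layout, the `mail-server-01` name and `count_incidents` are invented, and nothing here comes from the ITIL books.

```python
# Hypothetical call log: (caller, service, failed component)
calls = [(f"user{i}", "email", "mail-server-01") for i in range(100)]

def count_incidents(calls, key):
    """Count distinct incidents under a chosen definition of 'the service'."""
    return len({key(c) for c in calls})

# Service loss regarded as "email" as a whole
per_service = count_incidents(calls, key=lambda c: c[1])

# Service loss regarded separately for each user's email account
per_user = count_incidents(calls, key=lambda c: (c[0], c[1]))

print(len(calls), per_service, per_user)  # 100 calls, 1 incident, or 100 incidents
```

The calls are always logged 100 times; only the grouping key changes, which is exactly the choice ITIL leaves open.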

The question I see is - how many incidents (loss of service) do you actually have? Or at least - how many is the Service Desk aware of? If there is only one, then isn't it mis-reporting to count 100?

the Higgs boson of ITIL

We can debate this endlessly. The interesting thing is that over 20 years the ITIL pundits either haven't debated it or don't see fit to share the considerations in the book. And there are major implications for your service metrics depending on which way you jump, and a very expensive change if you decide later that you went the wrong way. So a little guidance would be useful.

"An interruption to a service". Do we see the service from the customer's perspective? One email service. Or from the users'?

Then the point made earlier: the moment you say this call is the same as those calls you stop paying attention to that individual user and trying to determine whether maybe their email stopped for a different reason. Both Murphy's Law and Poisson's statistical distribution say it might have. Random uncorrelated events WILL happen at the same time.
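
A back-of-envelope sketch of the Poisson point, in Python. The numbers are pure assumptions for illustration: I am supposing each mailbox independently suffers an unrelated fault at a rate of 1 per 1000 mailbox-hours, and looking at a one-hour window across the 100 callers. The function name `p_at_least_one` is made up too.

```python
import math

def p_at_least_one(n_users, rate_per_user_hour, hours):
    """P(at least one independent fault) when the fault count is Poisson distributed."""
    lam = n_users * rate_per_user_hour * hours  # expected number of unrelated faults
    return 1 - math.exp(-lam)                   # 1 - P(zero faults)

# Assumed figures: 100 callers, 0.001 faults per mailbox-hour, 1 hour window
p = p_at_least_one(100, 0.001, 1.0)
print(round(p, 3))  # 0.095
```

With those made-up figures there is roughly a one in ten chance that at least one of the hundred callers has a different problem, on every major outage. Change the assumed rate and the number moves, but it is never zero.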

"Call": if a call exists as a distinct entity then it is the Higgs boson of ITIL. Everybody is inferring its existence from some pretty tenuous evidence. Don't you think ITIL would say "log the call. Now think about what it is and categorise it as Incident or Request [or other call]"? Since you have the book out, look at how an Incident is logged: with "name/department/phone/location of user". That doesn't sound like a many-to-one relationship to me.
6.2.2 service desk exists for "logging all relevant incident/service request details" nothing about "take calls"
6.2.2 the line you quoted "incidents, requests and other calls" so incidents ARE a type of call
7.3.1 "allow automated relationships to be made and maintained between incidents, service requests, problems, Known errors and all other configuration items" no calls?

And so on and so on... there are many more. We have a couple of ambiguous points in the doc where maybe a separate call entity could be inferred to exist, but we also have many points in the doc where, if it existed, it should show up loud and clear.

It might exist in the wild (I think so) but it doesn't exist in ITIL

Incident vs fault management

The reason why this discussion pops up now is the fact that V3 changed the definition of an incident. Old incident became fault and therefore V3 incident management is really fault management. As a consequence there is now a void in the model. What should we do with the incidents that are not faults?

Old V2 stated firmly that all calls should be registered, SS 4.4.2. It was and is a good practice. Let's hope the coming version will fix this.

This incident discussion shows me that all ITIL V3 students need to take a new class to unlearn the bad V3.0 practice and then learn the new and hopefully good V3.x practice.


The question wasn't prefixed

The question wasn't prefixed with ITIL specifics, but anyway, if 100 people have called and an initial diagnosis points to the same symptoms/issue, then I'd say 1 incident - mail failure - is reasonable. But that doesn't mean there was only one cause and one fix.

1 or 100?

First, you have to know the meaning of the word "incident."

Incident - (noun) an individual occurrence or event.

Then, you have to know WHAT occurred.

It's not until you find out WHY they couldn't access it, that you know how many incidents it is.
- If a fire caused the server to go ka-blooey (a technical term), then it would be ONE incident.
- If all 100 people clicked on the same spam email, causing only their individual email client to go ka-blooey, then AND ONLY THEN is it 100 incidents.

It helps to think about it in terms of how much work has to be done. Do you only have one server, or 100 individual PCs?

'Nuff said. This string must be retired.

No not 'nuff said

No not 'nuff said. An incident is an interruption to the service to a user.

An event is something different

So is a fault, i.e. "what occurred".

An incident is unique to the user experiencing it. If you are truly customer-centric you will see that and give the user the respect of their own incident.

Perhaps you should read (a) ITIL and (b) the discussion that triggered this poll.

As I said in a comment above:
I was getting concerned at the very low number of people choosing "Anywhere between one and a hundred" though I see it is now increasing. You can debate whether it is that option or always "A hundred", but all you folk who chose "One" you are indefensibly WRONG for the reasons in the first comment made right at the top of this post. Since that is half the respondents, that's a worry. It is also a lot of readers I've just offended :)
