Sorry about the outages over the last 24 hours

Sorry about the outages over the last 24 hours. WestHost has had a power problem. It is one hiccup in normally superb service. Wonder who their UPS manufacturer is eh?

Comments

Ponderables...

Skep,

  • If a tree falls in the woods and there is no one around to hear it, does it still make a sound?
  • If a site goes down and no one noticed it, was there an interruption of service?
  • If a husband and wife are arguing, is the husband still wrong?

Bottom line is, stuff happens. Thanks for your efforts in running the site.

kengon

variant

I hit the problem twice, 12 hours apart, but then I have a special relationship with technology, which is apparently organised by Mr Murphy.

P.S. The variant I saw is: "If a man speaks in the forest and there is no woman to hear him, is he still wrong?".

Proximity Bias?

Heh. Maybe my interpretation of this was influenced by factors a bit closer to home...

kengon

Skeptic, just wondering.

Skeptic,

just wondering. Does your web hosting service provider align its delivery of services to ITIL or a similar Service Management "good practice" or "framework"? Do they have incident, problem, and change management, etc.?

Just wondering if the people who use this site and provide this site demand the same principles from their service providers as they do when providing consultancy and advice to their customers.

This is not a negative post, but I am just wondering: do we demand what we preach from the service providers that we rely on (in this instance, the service is a web hosting service)?

If IT ain't broke don't fix it

Fair question and I don't know the answer.

ITIL transformation is done when we need it. WestHost does a superb job. So I don't care and the question is academic. If IT ain't broke don't fix it

Hi, I do not agree that this

Hi,

I do not agree that this is an academic question. I think this is an important question. I see huge importance in customers asking their service providers what they have in place to deal with the issues and problems that can present themselves.

I do not buy "If it ain't broke don't fix it" at all, because I also look to examine the quality of the service being provided. Granted, increasing the quality may require an increase in cost, but regardless I think this needs to be looked at in the context of the overall service.

I am talking about service and service provisioning as opposed to process and functions.

This post is not suggesting that your current provider is or is not providing a quality service. That is for you to know or find out if you want to. I believe that customers should demand that their service providers "demonstrate" that they can provide a quality service that meets the needs of the customers.

Do they manage change.. at a power level?

Hi, I agree with the visitor who doesn't agree! Why do we use the term service provider when few in the hosting or outsourcing world apply the best practices that we know work?

I'm in a crossover role of developing best practices in managing data centres, applying some of the ITIL concepts where applicable. Guess what, a lot of current data centre power problems arise because no one has:
1. documented the power infrastructure,
2. set a baseline for design and then verified it using monitoring tools, and
3. embedded in the change activities a review of power impact as part of build and move tasks.

To get a picture of how power is distributed and where capacity is close or exceeded requires you to know everything you have in your data centre, know how many power connections it has (plus which are active/standby) so you can predict the impact of change.

The emerging technique is to label each power cable uniquely, note the port and power strip it goes into, and then document it so you can ensure you don't trip a circuit breaker. If you think this is a bit excessive, whinge away, but this is "best practice" as it avoids problems. So does your hosting company manage at that level? Because if not, you can expect more disruption, even if they adopt ITIL.
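As a rough illustration of that kind of cable register, here is a minimal Python sketch. The `PowerCable` record, the cable labels, the strip names, the amp figures and the 16 A breaker ratings are all made-up assumptions for the example, not data from any real site or tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerCable:
    """One documented power cable (hypothetical record layout)."""
    label: str    # unique label physically attached to the cable
    device: str   # equipment the cable feeds
    strip: str    # power strip it plugs into
    port: int     # outlet number on that strip
    amps: float   # documented draw in amps (illustrative values)


def overloaded_strips(cables, breaker_amps):
    """Return {strip: total amps} for strips whose documented load exceeds the breaker rating."""
    load = defaultdict(float)
    for cable in cables:
        load[cable.strip] += cable.amps
    return {strip: amps for strip, amps in load.items() if amps > breaker_amps[strip]}


# Illustrative register: db01 is dual-fed from two strips.
cables = [
    PowerCable("PWR-0001", "web01", "strip-A1", 1, 6.0),
    PowerCable("PWR-0002", "web02", "strip-A1", 2, 6.0),
    PowerCable("PWR-0003", "db01", "strip-A1", 3, 5.5),
    PowerCable("PWR-0004", "db01", "strip-B1", 1, 5.5),
]
breakers = {"strip-A1": 16.0, "strip-B1": 16.0}  # assumed breaker ratings

print(overloaded_strips(cables, breakers))
```

Running the same check with a proposed cable added to the list before the install is one way to fold a power review into a build or move change.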

Is this another role for the CMDB? (joke)

Dave

Once upon a time there was a service provider

Once upon a time there was a service provider in a little city in a little island country far away. That SP delivered pretty good service, and they did it cheap. They did it based on individual heroics, and sometimes the "Chinese Army strategy": throw people at it until it works. Many people who knew the inside workings marvelled at how they could function effectively with such ad-hoc processes, but they did. Nothing flash but it worked.

Then they grew, and they grew until the strategy didn't work any more. Their major customer started to get pissed off, and they renegotiated their contract with each other, and they used ITIL as the framework for that. And it worked. The SP is now one of the more advanced in processes, the client is getting better service (though it is never perfect eh?), and there is lots more transparency and accountability.

The moral: whatever works. And if it stops working or threatens to soon stop working, fix it with something like ITIL. But while it works, leave it alone. Whether it is "right" or not is irrelevant. Haven't you got something better to do with the money?

Power is no joke

Power and cabinet distribution units, and the dependencies of devices thereupon, are entirely appropriate for the CMDB.

One requirement however that no CMDB covers to my knowledge is electrical analytics - "what if" capacity analysis stuff. Not sure that kind of math belongs in a CMDB.

Charles T. Betz
http://www.erp4it.com

Power Issues

Hi Charles,

Totally agree, a service focused "CMDB" has enough problems handling the current service/system mapping without trying to model stuff that's outside the scope of service management and more at the techie level. That doesn't mean that an organisation shouldn't understand its power delivery at a detailed level.

The "what if" analysis isn't easy (I give talks on it) as it assumes that you don't have a baseline, but you can add to it with upcoming project requirements. The same applies to reserving space in racks, ports on switches and power connections. A few of the data centre management tools have this capability, where you log a request to install a few servers, add in the data as you have been given it and check against circuit breaker limits. It does get more complex where you have to take into account primary/standby power feeds and what happens in a failover. Some blade servers and switches have 6 power supply feeds (or more), so you end up planning for 40% loading on a breaker, to cope with another feed failing over.
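The 40% figure can be sketched as simple arithmetic. The Python below assumes, purely for illustration, that a load is shared evenly across a set of feeds and that a surviving breaker should stay at or below 80% of its rating after a failover; the 80% ceiling and the 16 A rating are assumptions, not figures from this thread.

```python
def max_planned_load(breaker_amps, post_failover_ceiling=0.8, feeds_sharing=2):
    """Highest normal-running load per breaker such that, if one of the feeds
    sharing the load fails, each surviving breaker stays under the ceiling.

    Assumes the load is spread evenly across the feeds (a simplification).
    """
    surviving = feeds_sharing - 1
    if surviving < 1:
        raise ValueError("need at least two feeds to plan for a failover")
    # Each survivor picks up total_load / surviving, where
    # total_load = feeds_sharing * per_feed_load.
    return breaker_amps * post_failover_ceiling * surviving / feeds_sharing


def survives_failover(load_a, load_b, breaker_amps, post_failover_ceiling=0.8):
    """True if one breaker could carry both loads without passing the ceiling."""
    return load_a + load_b <= breaker_amps * post_failover_ceiling


limit = max_planned_load(16.0)           # A/B pair on assumed 16 A breakers
print(limit, limit / 16.0)               # per-breaker limit in amps, and as a fraction of rating
print(survives_failover(6.0, 6.0, 16.0))
print(survives_failover(7.0, 7.0, 16.0))
```

With two feeds and an 80% ceiling, the planning limit works out to 40% of the breaker rating, which matches the rule of thumb above. Equipment with more feeds changes the `feeds_sharing` value and the resulting limit.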

On top of this you have the cooling problem - power in = heat out. So where you put a box becomes important, so that you haven't used up the cooling budget for a zone. For example, a lot of data centres are designed around roughly 2 kW to 4 kW per cabinet. An HP blade chassis can generate 12 kW and you can get 3 in a rack! The end result is that planning is becoming more complex and, in the best tradition of IT, everyone creates another spreadsheet. In the best traditions of ITIL, someone assumes that the CMDB must be able to do that as well.

There are some things that the "CMDB" should not try to do - it's too complex and, as with another blog, how would the user interface help you understand? Let the CMDB stick to service management so it might deliver what ITIL promised.
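To put numbers on the cooling budget, here is a small Python sketch. The zone layout (ten cabinets designed at 4 kW each, eight of them already running at 3 kW) is invented for the example; only the 4 kW design figure and the 3 x 12 kW blade rack come from the paragraph above.

```python
def zone_headroom_kw(zone_budget_kw, installed_kw, request_kw):
    """Cooling headroom left in a zone after a proposed install (negative = over budget)."""
    return zone_budget_kw - sum(installed_kw) - request_kw


zone_budget = 10 * 4.0   # assumed zone: ten cabinets designed at 4 kW each
installed = [3.0] * 8    # assumed current load: eight cabinets at 3 kW
blade_rack = 3 * 12.0    # one rack holding three 12 kW blade chassis

print(zone_headroom_kw(zone_budget, installed, blade_rack))
```

In this made-up zone a single fully loaded blade rack asks for 36 kW against 16 kW of remaining headroom, which is why placement has to be checked per zone rather than per rack.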

Dave

Power and space

Dave, we're clearly dealing with matters of common interest. What is best practice in your opinion for forecasting and allocating space & power? Where does the "capacity management" rubber meet the road at the tooling and procedural level, so to speak? I am familiar w/Aperture but not much else.

I do have a use case:

"For a given work order concerning a power distribution unit, what services might be impacted?" This needs to be solved for the general on-demand case; informal research is deemed no longer sufficient.

Thoughts?

Charles T. Betz
http://www.erp4it.com

Power/Space Management

Hi Charles,

For allocating space and power there are a number of issues which make it difficult for home grown databases.
1. There are things in cabinets which don't impact usable space but are still there such as vertical power strips, side mounted patch panels etc.
2. Blanking plates (needed for air flow) take up space, but can be removed so searching for rack space needs to include/exclude these
3. Cabinets don't tend to be general purpose, so you have to have attributes for cabinets such as function (server, network, etc.) , network side (green/red), rated power in addition to model.
4. Power is not always delivered via power strips, as Unix boxes and high end switches have direct 16A/32A feeds so there is no "power strip" as such. Plus there are the multiple power connections needed (last week a Sun box had 12 power connections alone).
5. Cabinets need to be zoned for cooling purposes to ensure that local cooling capability is not exceeded.

Existing "Best practice" in this area is to maintain views of racks, power, assets, space in a number of tools and spreadsheets, with manual correlation between teams for planning choices. For forecasting and allocating purposes the technique I use is to create a dummy device called "Request" which has attributes for power, switch ports, U height, requestor, date etc. so you can capture a vague request for "3 Cabs" of servers into the existing repositories. It's then easy to search and filter out current and projected space and power.

For your use case of identifying the impact of a PDU power down, we are just finishing a piece of software called an impact analyser to do just this. The way it works is as follows:
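The "Request" placeholder idea can be sketched in a few lines of Python. The `RackItem` record, its field names and the sample figures are hypothetical, chosen only to show how current and projected usage fall out of the same repository with one filter.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RackItem:
    """An installed device or a placeholder request (hypothetical record layout)."""
    name: str
    cabinet: str
    u_height: int
    power_kw: float
    switch_ports: int
    kind: str = "device"               # "device" = installed kit, "request" = dummy placeholder
    requestor: Optional[str] = None
    needed_by: Optional[date] = None


def cabinet_usage(items, cabinet, include_requests=True):
    """Total U space, power and switch ports for a cabinet, with or without pending requests."""
    selected = [
        item for item in items
        if item.cabinet == cabinet and (include_requests or item.kind != "request")
    ]
    return {
        "u": sum(item.u_height for item in selected),
        "power_kw": sum(item.power_kw for item in selected),
        "switch_ports": sum(item.switch_ports for item in selected),
    }


# Illustrative data: two installed servers plus one vague request for more.
items = [
    RackItem("web01", "cab-07", 2, 0.5, 2),
    RackItem("web02", "cab-07", 2, 0.5, 2),
    RackItem("Request", "cab-07", 8, 1.5, 8, kind="request",
             requestor="project-x", needed_by=date(2008, 6, 1)),
]

print(cabinet_usage(items, "cab-07", include_requests=False))  # current
print(cabinet_usage(items, "cab-07"))                          # projected
```

Because a request is stored like any other device, the existing searches and reports keep working, and the placeholder is simply replaced by real records once the kit arrives.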

Stage 1 - Physical Connectivity
1. Select a PDU
2. Trace all connected power strips and directly attached devices to the PDU
3. Trace all devices connected to the power strips
4. From the results, filter out servers and other service mapped devices - add to a bucket
5. For any switches, SAN controllers etc., trace their connectivity to end devices
6. Filter out any new servers and add to the bucket.
7. We now have a list of potentially affected devices.
8. Look at the list to see if they have alternate connectivity for resilience

Stage 2 - Service Impact analysis
9. Present the list to the "CMDB" analyser and choose target CI level - software, services, systems
10. Bring back a reconciled list of services potentially impacted
11. If necessary consult or draw a service map to recognise the nature of the mappings between the servers and services.
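The two stages above can be sketched as a graph trace. This Python is not the impact analyser described here, just a minimal illustration under assumptions: the topology, device names and service map are invented, the "CMDB" is reduced to a dictionary lookup, and step 8 is simplified to checking whether a device is also powered from another PDU.

```python
from collections import deque

# All names and topology below are hypothetical sample data.
PDUS = ["pdu-1", "pdu-2"]
POWER_FEEDS = {                      # upstream item -> things it powers
    "pdu-1": ["strip-a", "san-1"],
    "pdu-2": ["strip-b"],
    "strip-a": ["web01", "sw-1", "db01"],
    "strip-b": ["db01"],
}
NETWORK_LINKS = {"sw-1": ["app01"], "san-1": ["db01"]}  # infrastructure -> end devices
INFRASTRUCTURE = {"sw-1", "san-1"}                      # switches, SAN controllers etc.
SERVICE_MAP = {                                         # stand-in for the "CMDB" lookup
    "web01": ["webshop"],
    "app01": ["webshop", "billing"],
    "db01": ["billing"],
}


def downstream(start, edges):
    """Breadth-first trace of everything reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def impact_of_pdu(pdu):
    # Stage 1, steps 1-3: trace strips and devices powered from the PDU.
    powered = downstream(pdu, POWER_FEEDS)
    # Step 4: keep the service-mapped devices.
    bucket = {d for d in powered if d in SERVICE_MAP}
    # Steps 5-6: follow switches and SAN controllers to their end devices.
    for infra in powered & INFRASTRUCTURE:
        bucket |= {d for d in downstream(infra, NETWORK_LINKS) if d in SERVICE_MAP}
    # Step 8 (simplified): which affected devices also have power from another PDU?
    other_power = set()
    for other in PDUS:
        if other != pdu:
            other_power |= downstream(other, POWER_FEEDS)
    # Stage 2, steps 9-10: reconcile the device list into services.
    services = {s for d in bucket for s in SERVICE_MAP[d]}
    return {
        "devices": sorted(bucket),
        "alternate_power": sorted(bucket & other_power),
        "services": sorted(services),
    }


print(impact_of_pdu("pdu-1"))
```

Note that in the sample data db01 keeps power from pdu-2 but still depends on a SAN controller fed only from pdu-1, which is the kind of case where step 11 (consulting a service map) is still needed.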

It was a good question! It also shows why the "service focused" CMDB needs to be available, once the physical components have been identified. If you want to get difficult, try to identify the impact of taking down a building, room, cabinet on services.

It was very convenient that I was just testing this bit of code so I could rattle off the thought process.

Dave

...or not

...or not

ITIL offers nothing as a measure of a potential SP.

I agree that if I were hiring a service provider I'd want to know how well they perform. That is why I researched carefully before choosing WestHost.

And I have seen the re-negotiation of a service contract between an outsource provider and a client framed very successfully in ITIL terms.

But:

  • ITIL compliance has no meaning, except perhaps as ISO20000 certification, so they can say what they like. Measure on results not promises.
  • Even if I measured their ITIL compliance to my own satisfaction it would offer no indicator, let alone assurance, that their service is any better than a non-compliant provider.

I went on unsolicited user references in public forums as my number 1 metric. And my #2 measure was gut feel on first dealing with them. Nice Utah boys. Straight as.

ITIL offers nothing as a measure of a potential SP, other than perhaps as a framework for a due diligence, and I'd be more likely to use COBIT or ISO20000 for that.

Hi, I am referring not just

Hi,

I am referring not just to ITIL but to any other "good practice" or "framework" including ISO20000.

"Does your web hosting service provider align its delivery of services to ITIL or similar Service Management "good practice" or "framework". "

I agree that we need to "measure on results not promises" - and isn't that what a framework or best practice is meant to help drive / achieve (again not sticking to ITIL here).

I am just wondering: do we practice what we preach to our customers?

the Cobbler's Children

You are right that the answer is often no. There is a reason Charles Betz's book is called "Making Shoes for the Cobbler's Children" :-D Plumbers have the worst plumbing.

We can however err too far the other way, which is what I'm on about. WestHost do a fine job, but I might have rejected them because they did not adhere to some arbitrary standard I required. See my old post on this for another example.

If a consultant insists that all clients must comply with say ITIL, then they should be fired. Follow ITIL when there is a business case for it. Likewise I don't insist that my service providers do so. I just insist they deliver a quality service. How they do that is their business.

If WestHost's service levels fell then I might recommend they look at ITIL. Or I might just shift my servers.

Hi, I think we can put this

Hi,

I think we can put this to bed now but one final comment ...

How do you define a quality service? "I just insist they deliver a quality service"

I am not saying that every consultant should insist that every service provider should "comply" (if you can) with a best / good practice or framework.

I do think however that it will become more important that service providers do use a best / good practice or framework, as customers are demanding more quality of service at reduced prices. Using an efficient and effective best / good practice or framework can help achieve this. I do not refer solely to ITIL or SM practices or frameworks here.

I think the market will drive the requirement and the consultants will help (as they do); some will provide excellent help and others the opposite. Now how do we ensure that the consultants deliver a Quality Service ???

acceptable service

An acceptable service is one that exceeds the agreed service levels. Quality service is harder to define :-D

Application of ITIL Version 3 Service Concepts

It's time to get into the Service Mindset!

Here's my take on how ITIL V3 could apply in this case:

ITIL Version 3, Definition of a Service:

“A means of delivering value to Customers by facilitating Outcomes Customers want to achieve without the ownership of specific Costs and Risks.”

Two other important concepts V3 offers the ITSM community are the concepts of Utility and Warranty:

Utility: Fit for purpose. Simply put the service provides utility, in other words it allows things to be done more efficiently.
In this case the Service Provider provides web hosting which allows the Skeptic to facilitate and improve the knowledge of the ITSM community around the world. When the site went down the customers (or in this case professionals/bloggers) temporarily lost the utility.

Warranty: Fit for Use. Think of the old V2 processes of Availability, Capacity, IT Service Continuity, and Security (it was a separate book).
All the goals of these processes should have been met in some way, shape or form by the service provider.

When the site was down, the process that comes to mind first is availability (we are all screaming because we can't post or read), and then continuity seems to be the next thought - did the service provider provide a valuable service even though the site was down and continuity of service was not perfect?

From my perspective, yes. It's obvious because I continue to post, and I value the ability to connect easily with many other service management professionals over the temporary, limited loss of service.

But thanks for noticing Skep, and bringing it to our attention.......
