TiPSTalk

Alarm Management Commentary from the Hallways of TiPS, Inc

The Smell of Mendacity


Faux pas- (Webster's): An embarrassing blunder that breaks a social convention, a gaffe, error of judgement, etc. Perhaps it should now be termed faux-PAS.

One of my favorite lines from a movie was in "Cat On a Hot Tin Roof". When his daughter-in-law Mae makes a positioning statement for his inheritance, the father Big Daddy (Burl Ives) asks the son (Paul Newman) a question. He says "Do you smell that, Brick? (Brick is his son's name.) That's the smell of mendacity." I had to look the word up the first time I heard it as a kid. I often recall this line when I witness some of the things vendors tell potential clients when attempting to position themselves for "inheritance". Most of you can guess just what odor he refers to. This turned out to be a faux-pas on the part of the daughter-in-law.

I'd like to share a recent faux pas we witnessed. It was like watching a train wreck. You just can't turn your head when it's happening, though you probably should for your own future sanity.

Faux pas #1- A service vendor releases a 150-page hardback marketing brochure, and positions it as a best-seller non-fiction release with truths that have been until now kept under wraps. They even go so far as to claim that their book is "better than EEMUA". That's Mendacity. Further, the so-called book does not address alarm management as a lifecycle issue. Rather, it deals only with rationalization as if that were a means to the end of alarm management issues. It is not.

I'm here to tell you right now: THERE IS NO DOCUMENT CURRENTLY AVAILABLE, EITHER PURCHASED, DOWNLOADED, OR STOLEN, THAT ECLIPSES EEMUA 191 as an alarm management reference. It is simply the pre-eminent authority on alarm management available today. All written alarm management works point to it.

Faux pas #2- Since that time, the same service vendor has tried to blur the lines of truth between that book, and another published with the ISA, and their adherence to and ownership of ISA "specifications". They even went so far as to announce the same at the ISA Expo in Houston, where it was almost immediately repudiated by the ISA, and any others involved in the establishment of a specification. Their goal is to spread what is referred to in the sales world as "FUD" or Fear, Uncertainty and Doubt. The hope is that they can make alarm management seem so complex as to require assistance from an outside consultant- perhaps one who wrote a book on part of it.

Faux pas #3- To befuddle this issue even more, they claim the existence of a so-called specification that only they can meet. Since no specification exists, they hope the FUD will convince you one is looming. Truth: The ISA is currently working on a specification, but to date none exists other than the SP-18.01 which deals with annunciator panels. I don't believe they can claim to address that specification since their book contains a specific chapter on the death of the annunciator panel.

Alarm management is neither a secret "void" proprietary to one company, nor is it rocket science. Most of the information needed to perform Alarm Management activities (the how-to) has been published in public documents. (see www.tipsweb.com/amlibrary.asp ) And, as stated, the EEMUA document gives a very concise view of the technical expectations.

So, buy a copy of the new EEMUA 191 Guide, and you will be 80% of the way there. It tells you how to measure, what to examine, and how to go about it. (Order one at www.eemua.co.uk )

If you're looking for a book on the subject, save your money until one comes out from a true expert on the subject- one who has no service sales axe to grind- one that addresses more than just the rationalization side of alarm management. Until that time, there is plenty available in the papers (www.tipsweb.com/amlibrary.asp ) and EEMUA (www.eemua.co.uk ) guidelines.

Mendacity stinks. Don't get it stuck to your shoes.

Rationalization Quality Assurance

Though they didn't really know what to fully expect from the result of the rationalization exercise, they did know that what they received was NOT what they expected.

Total Cost of Ownership (TCO)


TCO is a term slowly gaining acceptance in the process market. It has long been a term of endearment for many other industries- such as Risk Management, Banking/Finance, and others who have been very software-dependent. These were the industries who first gave us computing a la COBOL, and other historical giants. In fact, the first spreadsheets (anybody remember VisiCalc, Lotus 1-2-3, etc.?) were designed for those industries.

It's not surprising therefore that the processing industry remains a step behind these other markets in acceptance of both technologies, and norms that seem everyday in those markets.

So, just what is TCO? Total Cost of Ownership is a measurement of the cradle-to-grave costs of owning a specific piece of software. It gives recognition to the fact that when you buy a specific piece of software, the price you pay is not your total cost of owning it. There are other factors to consider. As a result of this, the aforementioned industries generally have professional IT software guys who grill you about the underlying architecture of your product. They couldn't care less about the features if the architecture is one that they know to be problematic. These days, most of them point to Microsoft ASP.NET as the preferred platform, with thin-client web browser access to the components. They've learned that no matter how attractive the features of a software package may be, if it takes a human pretzel to make it work, it just isn't worth the investment.

So, TCO should be top on your list when considering software, and many factors add up to influence it. What are some of those factors?

Ongoing license costs- Is this package all that you need? Are there additional licenses from this vendor, or other vendors that will be necessary to realize the desired benefit from this package? Some vendors try to cloud this picture by doing a very good job of listening to exactly what you think you expect, and giving you a price for only those pieces, failing to tell you that you will not resolve your problem until you expend further investment. Good software vendors listen closely to your problem, and help you to understand the additional software license costs involved and potentially other technologies involved in handling that problem. We know of many clients who went with a competitor (who shall remain nameless) only to find out that the pieces they really needed were not included in what was delivered, because it was not specifically asked for. The old saying is be careful of what you ask for...

Ease of Use- What will be the learning curve associated with this package? Can I get in quickly and get many of the things I want, or will I need specific training? One good example of this is Microsoft Word. It does a lot of things I will never use, but for the most part, I can get into it, and type up a report, and print it out with very little help. One area to be very wary of in the technical arena is products that are built on Microsoft Excel. On the surface it seems like these would be the easiest to use. On the contrary, remember back to the first time somebody gave you their Microsoft Excel spreadsheet that they had designed. Each of us has learned how to use Excel in our own way, and we like to use it that way. Add to that the fact that when you use it as a product basis, you lock down certain things- so you may even be blocked from using it the way you like. And if the product is trying to do serious statistical calculations, you enter the realm of things like pivot tables, etc. Have you ever used them? If not, you won't like a tool that does. If you are attracted to Excel spreadsheets to resolve a technical issue, I would encourage you to build your own rather than buying somebody else's. It's much less frustrating, and the learning curve of what you CANNOT do is much less. Oh- and be sure that results are exportable to Excel- That part IS valuable whether for further processing, or for writing reports, and sharing.

Of course, most vendors who offer high-difficulty packages make up for it with a deep services offering. Their hope is that you will purchase the (hook) software having seen the demo. You’ll then try to fly into it like the experienced tech guy who demo’ed it to you and get lost. You then immediately phone for help, and experience what the guys at Fluor used to refer to as a change-order-driven income stream. It's not what you expected when you purchased, but you are now trapped. An initial small investment in software can lead to a very large investment in ongoing services to obtain the expected value. I wrote another blog referring to stone-soup software that approximates this problem. One way of checking this is to examine a vendor's historical and expected income. A true Software Vendor expects less than 20% of their income to come from services. And that means on any given project (so examine the quote for this balance) and for any given product. Note that some companies sell certain products that almost install themselves, but have higher services on others. If the latter is a smaller percentage of their business, they may skew the numbers when talking with a prospect to invite them into the trap for the services-oriented piece.
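The quote check described above is simple arithmetic, and a minimal sketch of it might look like the following. The function names and the dollar figures are made up for illustration; only the 20% rule of thumb comes from the paragraph above.

```python
def services_share(software_price, services_price):
    """Fraction of a quote's total that is services rather than software."""
    total = software_price + services_price
    if total <= 0:
        raise ValueError("quote total must be positive")
    return services_price / total


def looks_services_heavy(software_price, services_price, threshold=0.20):
    """True when services reach or exceed the rule-of-thumb share (20% here)."""
    return services_share(software_price, services_price) >= threshold


# Hypothetical quotes, purely for illustration.
print(looks_services_heavy(80_000, 15_000))   # software-led quote
print(looks_services_heavy(40_000, 60_000))   # services-led quote
```

Run the same check per product line as well as per project, since a vendor's blended numbers can hide a services-heavy piece.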

A side note in the Alarm Management field here is that we’ve learned over the last few months that many people went to certain service vendors to get a rationalization performed. They didn’t really know what to expect when it was performed, but they did know that what they received for their money did not meet their expectations. (there’s some deep irony in that statement.) In fact, they’re still not sure just what they got for their money. TiPS is soon to release some standards that will help to shed some light on this issue, and increase both the expectation and understanding of the recipient of such services.

Speed of Processing the job- This often has to do with work flow built into the product, and the basic product design elements. It also gets into issues like ease of installation. Has the vendor made the software so it is easy to install, configure, and process the information necessary to accomplish the job? Or, will they be on site inventing a new way to tie their so-called OPC driver into your system, while you teach them how to do it? The sooner you start receiving results, the sooner you can recoup your investment. Conversely, the longer you spend making it work, the more its base installation is costing you, and the more man-hours are adding to TCO- whether they are apparent or not.

The next piece of that is of course, how much they have automated their product to perform the task at hand. Did they build the product with the mindset that they would augment it with services, so it was not necessary to automate its actions? Or did they build it with the hope of making it so powerful they would never have to visit your installation site? Have they strived to make most functions have two-click access to the underlying information? One easy way of judging this is to find out how many of their clients are using the product to accomplish the end-use goals themselves. Is it greater than 50%? Or do they try to make excuses for how busy their clients are, and how this operation is one that just shouldn’t be tried by those not familiar with its intricacies?

One other test for this is to notice how many times they talk about what they have done for their users versus how many things their users have done. A very experienced technical person may often  be the sign of a very weak software package. The more they try to impress you with their own knowledge, and the less they try to show their software architecture, the greater chance their software is not designed with automation in mind.

Note that in some particular instances, software simply cannot do everything, and services will be necessary. In those instances, you’re looking for the package that requires the least amount of services, having built in as much automation as possible. The package which requires the least on-site time usually wins this category.

Costs to Upgrade- What will be the upgrade path? If it is a client server application, do I have to install a client package on every platform that wants to use it, or does it use pure thin-client architecture, requiring only server installation? The cost for this difference can be HUGE, and is the least examined. Many try to cloud the issue by having certain parts of their application be thin-client oriented, while the upper level application layers require additional client installations. The other dodge is to offer a "web-based" wrapper that accesses the legacy applications via a "look-in" window. This is very dangerous, and maintenance-prone. Think through the process of what will happen when Microsoft releases a new Service Pack, and you require an upgrade installed. The average software company has one major release a year, and almost quarterly releases for patches to accommodate Microsoft issues, etc. What is your corporate upgrade policy? Does it guard against periodic upgrades? What is the danger of being caught a few releases behind? What is your potential cost to stay up to date with their releases? Get a true IT professional involved in the negotiations to help examine these costs.

Training Costs- These costs go hand-in-hand with those mentioned above. Sometimes you may find that the company either requires extensive training, or doesn’t even offer training. The second is acceptable if the product has high ease-of-use. Ask which different layers of personnel need training. Do you need specialized administrative training for DBA’s or such? How far does the involvement string continue into your personnel? Does every department have to attend for the software to work? Or is it controllable at the source by a singular person? These are two extremes, but they illustrate the difference.

Periodic Maintenance- What is the maintenance cycle on the product? Does it require constant attention? How often must an administrator go to the server that contains the application to keep it refreshed, and running? How does it fail when things go wrong? Does it fail “gracefully”, and restore itself without a hitch? Or does it require human intervention every time it has a glitch? The cumulative costs of this aspect can be staggering. Not to mention that operations personnel will eventually just turn it off, and you will find it is not running at the time you want its results the most.

Other intangibles- Has the vendor included any specialty features that add to the value, and lower the cost to operate? Some examples might be automated report generation, remote service access, error report generation, and even e-mail of glitches and system issues. Some even provide pager access to prevent extended down-time. Others provide remote service support of the application, so long as the client’s security procedures will allow it. For example, we maintain remote access keys to some systems, and we can actually log into the client systems from our offices. There may be small things which can greatly reduce TCO.

Examine TCO. This is just a primer. You'll find more to it than I have listed. A good look at these costs on the front end can help to avoid a massive unplanned outlay.
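To make the idea concrete, here is a rough sketch of how the factors above might be tallied. Every category name and dollar figure below is a hypothetical placeholder, not data from any real quote, and a real analysis would have more line items than these.

```python
def total_cost_of_ownership(purchase_price, one_time_costs, annual_costs, years):
    """Purchase price plus one-time costs plus recurring costs over the ownership period."""
    return purchase_price + sum(one_time_costs.values()) + years * sum(annual_costs.values())


# Hypothetical numbers, for illustration only.
one_time = {"installation": 10_000, "initial_training": 5_000}
annual = {
    "additional_licenses": 8_000,
    "upgrades": 4_000,
    "maintenance_hours": 6_000,
    "services": 12_000,
}

tco = total_cost_of_ownership(50_000, one_time, annual, years=5)
print(tco)  # → 215000
print(f"purchase price is {50_000 / tco:.0%} of TCO")
```

With these made-up numbers the sticker price is less than a quarter of the five-year cost, which is the point of the exercise: compare vendors on the total, not on the quote.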

 

Alarmaholics Anonymous

I was discussing this post with a friend and he noted that Alcoholism is a serious problem in the US, not to be taken lightly. I know this- my brother has a coin that he is very proud of for his years away from the bottle. Fate and genetics can strike any of us. I know that in ways I don't care to reveal. However, I also know that sometimes we have to look at the world in a lighter context. So I hope you appreciate the humorous side of this posting.

This same friend was noting that one service company out there claims that they have a 7-step approach to alarm management. Another claims a 6-step approach. I'm trying to figure out what it is supposed to cure you of, since everybody that has tried them is still looking for answers. It's like the Britney Spears of rehab programs. They want you to have to return for rehab often and expensively.

This friend advocated a 12-step approach which stole from the famous AA approach. A few conversations later, and shazam- the concept of Alarmaholics Anonymous was born. We urge you to join if you share the symptoms, and are ready to solve the problem. Help us end the senseless addiction to more alarms. What are the symptoms? Here are some:

Have you been sneaking in at lunch when nobody was watching and putting an alarm on the system?

Do you need an alarm when you get up in the morning, and perhaps do you prepare one before you go to bed at night?

Do you have dreams about alarms and you can't reach the acknowledge button? (this one came from a client)

Do you feel you can handle a larger number of alarms than others?

Do you keep a small alarming system in your pocket in case you need to be alarmed in a place where you might otherwise have not had access to them? (Crackberry users apply here)

When you arrive at work, do you just not feel right until you've had your first alarm?

If you go to a party, is it important that the house have alarms for you to feel comfortable?

Do you prefer to have your alarms alone? Do you find yourself not wanting to be in a crowd when an alarm goes off?

Do you feel that others don't notice that you are dealing with alarm issues every day?

Do others tell lies for you to cover up your alarm problem?

All of these can be symptoms. Please e-mail me if you have more. So here's the 12-step approach:

1. You have to admit that you have an alarm problem, and you want to solve it.

2. You must believe that you can be helped.

3. You're willing to ask for that help no matter what it means.

4. You are ready to inventory how big the problem is.

5. You begin by admitting to others that you have an alarm problem.

6. You are prepared to resolve whatever defects are found.

7. Now- make the call to ask somebody to help you. (TiPS has operators waiting...)

8. Make a list of all the problems this has caused besides just the alarms themselves. (We call this the situation awareness approach)

9. You fix those problems wherever possible except when the cost is greater than the return.

10. Continue to inventory and track the problem symptoms. (KPI's come in here.)

11. You now enter a greater path of situation awareness understanding, and study all of its nuances, while continuing to monitor for potential back-sliding.

12. Have a spiritual awakening as a result of these steps, and want to spread the results in the form of assistance to others who are trapped in the same issues. (Attend TiPS User group meeting.)

This was written with a tongue-in-cheek approach, but as I examine the steps, I realize how close they are to what actually has to be done to resolve an alarm management problem. And I can guarantee you that every alarm management problem is so different that there is no tried-and-true 6- or 7-step approach that will cure them all. Each alarm management situation needs to be considered in its own standing, and an approach crafted to fit the particular situation.

As I write that last paragraph, I also note that there is another service company who advocates that they have utilized four drastically different approaches depending on the circumstances. I believe that is getting a little closer to the reality of the situation. But they negate this idea by then proposing the aforementioned 6-step approach.

Do you have an alarm problem? We can help. We've been helping people SUCCESSFULLY resolve alarm issues since 1990. Come to see us at the TiPS/ Expertune user conference. Held in Beautiful Austin, Texas- we promise to have at your availability the world's leading experts in alarm management. Sure cures to alarmaholism. You can schedule some private time with these experts, and hear their presentations. Product training is available.

Yep- LogMate (http://www.tipsweb.com/products/logmate/) is the sure cure for Alarmaholics. It won't just help you inventory the size of the problem, and get it behind you- it will keep you informed should it ever rear its ugly head again.

Rations on Libations?

If you're about my age, you probably grew up on a diet of Saturday Night Live, with John Belushi, Steve Martin, Gilda Radner and the rest of that wacky bunch. I loved the one when Gilda Radner took off (as Emily Litella) on the righteous question of what was wrong with Saxophones and Violins on TV. When they stopped her and explained that it was actually sex and violence, she responded in the traditional "Oh-well- Never Mind" for which her character was famous. Recall she also did pieces on the "Youth in Asia" and other memorable fractured phrases.

So- to borrow from Gilda, a lot of you are probably wondering "What's all this talk about rations on libations?" Well, I'm here to enlighten you. And like the "Oh-well- Never mind," I also realized they're talking about Rationalization. Was that ever a shock. I was wondering why so many people were inviting a service company to come in for a project that never seemed to make anybody much better off. Especially if it meant they were cut off on libations. Here's what my research has found.

To start with, I looked up the word RATIONALIZE in Webster's. ra·tio·nal·ize: to attribute (one's actions) to rational and creditable motives without analysis of true and especially unconscious motives (rationalized his dislike of his brother); broadly: to create an EXCUSE or more attractive explanation for (rationalize the problem)

So, do I need for people to come to my plant and make up excuses for my alarms? Rationalization as a word and a practice is at the least overused, and is nominally something done to make an excuse for the fact that we put in a bad design to start with. So, let's re-look at alarm systems as a design problem rather than a fix-it problem. If you get the design right, it's fixed for good.

Perhaps we should look at the problem in the way that is supported by the ISA's SP-18 committee. As already stated, the fact that you have a bad alarm system is due to the fact that it was not designed properly to start with. Or, perhaps it was designed properly, but it was not engineered properly with the protections and safeguards in place to ensure that it would not grow to an unmanageable state. That is because there was no attention paid to the alarm system design lifecycle. The upcoming ISA SP-18 specification will pay particular attention to alarm system design lifecycle. The point is that if you put the proper design in place, and install proper safeguards and practices, your alarm system should NEVER become a problem. So, let's stop paying so much lip service to alarm rationalization, and instead pay particular attention to alarm system design and the tools to protect its integrity.

How do I do that?

Alarm management is not rocket science. There are no special algorithms, and no special proprietary secrets necessary to make it work. The EEMUA 191 document is written just for those who need to know how to get their arms around it, and it's not really that bad to read. You can buy it with your credit card at www.eemua.co.uk. If you have designed your system such that it is now out of hand, an alarm system lifecycle review will tell you that you need to rectify that to get it back within a state of control.

In the late 80's and early 90's there was a lot of talk about process optimization. There were people who would come to your site, and optimize your process. When they left, things ran well- all the loop controllers were tuned, all the targets were properly set, and things seemed to run fantastic for a few weeks. A few weeks later- back to square one. They did not install the tools to protect your unit from falling out of an optimized state. In fact, they did not want to install such tools, as that was how they made money. If they did install a tool, it was so rudimentary as to make you need their help anyway.

Such is the state of alarm management that is being sold by a handful of service companies. Yes, they offer software packages, but it is not in their best interests to make that package address your alarm system lifecycle. If it did, their income would drop drastically.

What is the pattern? A team comes on site, does a rationalization, and all is well for a while, and then we're right back where we started. Why is that? It's because to be successful, we need to institutionalize and internalize the processes associated with good alarm management. And we need to correct the bad practices that allowed the alarm system to get into the shape it is in to start with. These need to be incorporated as a part of our internal processes and practices. Just as we did with process optimization (if we did). And as Mr. Ian Nimmo says (www.mycontrolroom.com) you need to consider all the factors associated with good SITUATION AWARENESS (his term- read his papers).

What is meant by that? It's simple- other than absolute nuisance alarms- all alarms exist for a reason. Unless you discover and resolve the cause for an alarm, it makes no sense to get rid of it just for the sake of reducing alarms. It will simply reappear, or- lead to an incident. Too few alarms can be more damaging than too many alarms. This means that some alarms just cannot be reduced without additional work around the control automation subsystems. Some have learned this the hard way on some of their initial rationalization attempts.

So- just what am I getting at?

Some service vendors out there want to sell you a rationalization project. They would also like to sell you a product which supports rationalization the way they see it. And at least a few of them have a very narrow view of the subject. They do not recognize the lifecycle aspects of an alarm system, just their desire to rationalize. So what do you do with their product once you've rationalized? Well- it's only valuable if you need to rationalize again, and it is their goal to ensure that need will exist. That's rather disappointing if you've designed a complete bid process around finding a product that will help you rationalize your systems. Quite frankly if that product did what it was desired to do, you could use it once and throw it away.

Much like the Webster's definition of the word rationalize, their motives are to create a rationale to hide the true motives behind their actions, i.e. they've rationalized rationalization, and made it sound as if it's a final answer when it is not. It's ignoring the causal drivers for bad alarms. Essentially, it is akin to fixing a flat tire on your car every day, rather than finding why it goes flat.

Ignore the noise you're hearing in this area long enough to consider a lifecycle approach to alarm management. Think of your alarm system as a system that needs to be properly designed to start with. If it is not, or was not, then rectify that situation, but be certain that you consider the need to get the system under control. There's a lot that's been written on this issue, but it's not attractive because it says you have to alter practices and procedures and perhaps even take on additional automation changes rather than just doing a project and checking it off the list of to-do's.

So, if you have a new alarm system, put the tools on to start with to be certain it is properly tracked and kept within reasonable limits. You'll save massively in the years to come. If you have an old one, consider what is needed to get it back to a state of good design.

Yes, excuses may thus have to be made for your existing alarms, but your mindset will be one of getting it within a realm of control and tracking to be certain it stays there rather than just having to re-do it every six months. And the tools you choose will support this approach rather than one of continuous rationalization.

Place rations on their libations and they might come to the same conclusion...

I'll be addressing this issue more in the coming months.

Do you have alarm floods?

Recently there has been a lot of discussion about avoidance of alarm floods- the holy grail of alarm management. Despite the efforts of hundreds of smart engineers and scientists in eliminating alarm floods, they still plague operators. In my opinion, it is because we keep trying to eliminate it, rather than just trying to help the operator deal with it.

I recently attended a presentation where the discussion was mainly focused on alarm rationalization. One of the reasons given for alarm rationalization was to reduce the number of floods. There was disagreement about what that meant, and several opinions offered. Prior to that I had been at another meeting where half of the group seemed to think they were in a state of constant flood, while the other half seemed to think that they had no alarm issues at all. At this point, I realized that perhaps everybody does not measure a flood using the same metrics. So, that gets us to the root of today's subject - just what is a "flood"?

According to the EEMUA guide, a flood is anything exceeding 10 alarms in 10 minutes - or one per minute. I have looked at enough data to tell you many units are in constant flood by that measurement. However, the operating team does not perceive themselves as being in a flood condition. It is simply their normal alarm load. That team considers a flood to be a group of alarms that they simply don't have time to deal with before something bad might happen. In other words, alarms associated with an upset condition, an equipment malfunction, or a process wobble, and perhaps a potential incident in the making. Often, a flood consists of unfamiliar alarms, since they fall outside normal operating conditions and the range of normal training and familiarity.
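For anyone who wants to see where their own data lands against that yardstick, here is a minimal sketch of the measurement. It assumes a plain list of alarm timestamps pulled from an alarm log and reads "exceeding 10 alarms in 10 minutes" as more than 10 alarms inside any sliding 10-minute window. The function name, the sliding-window reading, and the sample data are my assumptions for illustration, not wording from the EEMUA guide.

```python
from datetime import datetime, timedelta


def flood_windows(timestamps, limit=10, window=timedelta(minutes=10)):
    """Return (start, end) spans where more than `limit` alarms fall inside `window`."""
    times = sorted(timestamps)
    spans = []
    left = 0
    for right, t in enumerate(times):
        # Advance the left edge until the window covers less than `window` of time.
        while t - times[left] >= window:
            left += 1
        if right - left + 1 > limit:
            start, end = times[left], t
            if spans and start <= spans[-1][1]:
                # Overlaps the previous flood span, so extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


# Hypothetical data: a burst of 11 alarms at 30-second intervals.
base = datetime(2024, 1, 1, 8, 0, 0)
burst = [base + timedelta(seconds=30 * i) for i in range(11)]
print(flood_windows(burst))
```

Note that a steady one-alarm-per-minute load sits right at the limit and is not flagged by this sketch; anything above that rate is. That is the data view of a flood, which, as discussed here, is not necessarily the operator's view.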

That leads us to the unusual conclusion that a "flood" as specified by data is not necessarily a flood to the operator. The operators know why the alarms are there, and they are essentially ignoring them because they know the alarms will not lead to anything significant. Some of the alarms may even be there because the operators elect to leave them active for any number of reasons. In fact, once inured to the sound of the alarm annunciator, an operator is so comfortable with these alarms that he'd probably feel uncomfortable if they were not there. It's kind of like the friend with the overly talkative spouse. Did you ever notice how they were able to function through the incessant prattle despite the fact that it was starting to drive you up the wall? The human mind is capable of ignoring things in this way. This is why in the case of many incidents, ignorance of alarms has been noted as a contributing factor. Ignorance is a strong tool, and can in fact be a good quality in the right circumstances. Soldiers learn to focus on their goals with bullets flying over their heads. Air traffic controllers learn to analyze situations under circumstances that are increasingly pressurized. Beekeepers learn how to ignore even more.

So, how does this relate to EEMUA guidelines? Simply stated, EEMUA guidelines are not applicable to a unit until that unit has observed, and put into practice ALL of the EEMUA recommendations for alarm management. Paradoxically, if you do not clean up your nuisance alarms, you will not pass the EEMUA spec of normal alarm state, yet your operator may not feel that he is in any state of disrepair. The EEMUA benchmark becomes vital only at the moment you accept the first tenet of the EEMUA document: Every alarm has to have an associated operator action. Subject to those conditions, every alarm that occurs beyond the allowable level contributes to a flood. On the flip side, every alarm that does not require action may not be contributory to a flood. From a purist viewpoint, it is, but not necessarily to the operator who is ignoring it. So why all the rush to rationalize alarm systems to reduce flooding? Does it really work?

Our data shows rationalization costs a lot of money, but doesn't necessarily solve the problem. That is the reason for all of the papers purporting new cost-effective methods. In fact, our experience shows that the most bang for the buck comes when you get rid of all nuisance alarms and use the information around the others to point you to the situations for which they exist. In other words, it points to Situation Awareness as the cure. This is contrary to all the written expert opinions currently in print.

Much study has been done surrounding this problem by the ASM Consortium (ASM is a registered trademark of Honeywell). Their members were the first to explore alarm rationalization some 15+ years ago. Shouldn't that mean they are now satisfying the EEMUA constraints of the EEMUA 191 report which they co-produced? Their evidence says no.

See the paper they published on this issue.

Yet, their members don't seem to think they have drastic alarm problems, and many feel they have the alarm problem pretty well in hand. Not totally resolved, but in hand. And that's fifteen years after having started. Again that points to the fact that perhaps the problem solution does not lie in all of the things they've tried. Again- I point to a study in Situation Awareness as the required answer. That is the real jewel they have uncovered.

As a short diversion, another path has been taken lately that deserves some attention- tools that will resolve the issue once you have fixed the basics. These tools are based on principles for smart alarming, or state-based alarming. I have met lately with DCS providers and learned that tools are coming which will address this situation in most new DCS systems. I think you'll like them once you see them. Unfortunately, most require system upgrades to recent releases to make use of them. Where does that leave us with respect to legacy systems which will be with us for many years to come?

It has not been TiPS policy to recommend tools that attempt to handle alarm dynamics POST-DCS. There are many reasons for that. Read my post on dynamic alarming to see a few. However, we have seen some tool sets that deserve consideration.

The first is from a company called UReason. See their website at http://www.ureason.com/ . These tools allow for a super-imposed alarm handling screen that "subsumes" alarms, creating a display that makes more sense. For example, it will recognize patterns, and reduce the alarm count automatically by knowing such things as when a pump shuts off, or when a start-up or shut down condition is occurring. Their OASIS system will use pre-designed filters to deliver only the information pertinent to the situation. In other words, those seventeen alarms you once received will be subsumed, and replaced by a single alarm that tells you the pump is down. Or, if the pump is automatically replaced, it may not bother the operator with the information at all - only sending its data to maintenance for service follow-up.
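
To make the idea of "subsuming" alarms concrete, here is a minimal sketch in Python. It is not UReason's OASIS implementation, which I have not seen the inside of; the rule table, the tag names, and the subsume function are all hypothetical and exist only to illustrate how a known root cause (a tripped pump) can hide the alarms that merely follow from it.

```python
from dataclasses import dataclass

# Hypothetical rule table: each root-cause alarm lists the consequence
# alarms it subsumes. Tag names are invented for illustration only.
SUBSUMPTION_RULES = {
    "P101_TRIP": {"FI101_LO", "PI102_LO", "LI103_HI"},
}


@dataclass(frozen=True)
class Alarm:
    tag: str
    message: str


def subsume(active):
    """Return the alarms to display, hiding consequences of any active root cause."""
    active_tags = {alarm.tag for alarm in active}
    hidden = set()
    for root, consequences in SUBSUMPTION_RULES.items():
        if root in active_tags:
            hidden |= consequences
    return [alarm for alarm in active if alarm.tag not in hidden]


active = [
    Alarm("P101_TRIP", "Pump P-101 is down"),
    Alarm("FI101_LO", "Low discharge flow"),
    Alarm("PI102_LO", "Low discharge pressure"),
    Alarm("TI200_HI", "High reactor temperature"),
]

displayed = subsume(active)
for alarm in displayed:
    print(alarm.tag, "-", alarm.message)
```

In this sketch the two low-flow and low-pressure alarms disappear behind the single pump alarm, while the unrelated temperature alarm still reaches the operator. If the pump alarm is not active, nothing is hidden.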

Note that the level of attention and maintenance for such a super-imposed system is increased to levels beyond even the maintenance you have NOT been providing to your alarm system. However, for those who have gone to the trouble of cleaning up their alarms, this could be a next logical step. WARNING!! DON'T TRY IT WITHOUT HAVING CLEANED UP THE BASIC ALARM SYSTEM FIRST. As a voice of experience from one who has tried it both ways, I can tell you that it has no chance of success if you don't do it in the right order. Also, don't try it in conjunction with complex state-estimation models. They have never worked, for a variety of reasons. Handle only the simplest configurations first, increasing complexity only as far as you see the operations team can support it. The rule here is that if it must be maintained by engineers or mathematicians, then operators will not receive sustained benefits from it.

To be fair, I should state that I have seen cases where complex model-based systems did work, and were maintained past their initial installation for some period. All of these successful examples share two common traits. The first was that the problem could not have been solved any other way. The second was that the value of solving the problem was so great that it justified the PhD mathematician who had to be kept on the problem. Note that one other issue was that the PhD mathematician's interest was also maintained because there were enough twists and turns to the solution that it required his ongoing inventiveness to keep resolving new issues. Perhaps more of these would be in existence if somebody were to offer an inexpensive and efficient expert system package...

We have also seen products from a company in Louisiana called Prosys. http://www.prosys.com/. Having not seen their tools first-hand, I can only guess what they have from descriptions of their products. The approach is similar - a replacement of the DCS alarm screen, but giving a more powerful view to the operator. My understanding is that they have a few examples of these having been installed and maintained for an extended period.

With proper implementation of these tools - not trying to take on the world, but simply to give the operator better information, it is possible that flood management - the holy grail of alarm management - may be within reach. We're certainly heading that way.

The ROI on Alarm Management

I was at a mixed-company training a while back. There were companies there who used software from various vendors, and had solution projects performed by the same, and a greater mixture of "alarm management newbies"- i.e. those only recently addicted/afflicted. The question of ROI for alarm management came up. The Newbies wanted to know how the experienced ones evaluated ROI for the effort.

The comment from a battle-scarred veteran was "Yes- our management is asking that same question. They want to know what they have received back for the $x Million (it was a shocking number) they've invested in all of these rationalizations, etc. They noticed that the control room seemed a bit quieter, but they still saw the same shut-downs and plant upsets as before."

Their answer to management was that they wanted to work with the same vendor who had not yet solved their problem to invent plant state estimators, and develop a smart alarming system. WARNING Will Robinson! Aliens approaching. Be careful of what these aliens promise you. You'll be stuck with the result, and they'll migrate back to their homeland. Management's response back was "You mean we haven't already got that for what we've paid?" I repeat my mantra. You ain't seen an alarm flood until you've seen a smart alarm flood. As a veteran with battle scars in that area, I've seen too many brilliant people wreck their ships on those shoals to bet that a service provider is going to resolve this problem.

Don't take this wrong- the vendor they hired did just as promised, and did a good job of rationalizing their alarm system. It's just that at the end of that alarm rationalization rainbow, the leprechaun didn't have the pot of gold they'd hoped for. This is not necessarily the service vendor's fault (though promises were PROBABLY made to get the project started). It should have been obvious that a simple rationalization could not solve the problems that made the system get bad to start with- after all, it didn't change any processes, just an underlying configuration.

Again- a warning. This alarm management stuff is NOT rocket science, and reducing alarms is not the end resolution of your problems. Alarms are the automation backstop of the plant, and thus the best indicator of your problems. But, other than nuisance alarms which are easily dispatched, the real bottlenecks lie elsewhere. So, perhaps you can save all that rationalization money and invest it instead in the bottlenecks your alarm system is pointing to. Upgrade your control room. Redo your graphics. Increase your maintenance frequency. Install better communications. These (and their cousins) are usually the root cause of alarm problems.

So- let's take this smart alarming thought process a step further- because I've been there and done that. Suppose you and I- with our vast intelligence- sat in a room and came up with every possible scenario the plant might face, and designed an expert system that would automate the plant through and past each problem. Well, guess what- we'd miss. And the ones that caught us would be the no-brainers that we were too smart to think of. We used to say that the solutions we offered didn't solve the problem, but only confused you more than before. However, we felt you were confused on a higher level and about more important things.

Alright- where is the ROI on alarm management? Actually, it's pretty easy, and it has to come from operations. How many times have you lost a pump, and how much did it cost? If you can avoid a lost pump due to a clearer message, what does it save? The answer to this question lies in benchmarking pre- and post-alarm management effort. Problem here is- if I do prevent a lost pump, how can I claim that savings if I don't know whether it would have happened otherwise? I cannot run parallel timelines and see what happens in another universe with different decisions...

So, let me help you get true ROI out of your alarm management investment. Don't bite off too much too quickly. Put the measurement tools and benchmarks in place up front. Let the data tell you where your problems lie. If you have a problem that can be solved by alarm rationalization, analysis of the data will indicate that. If your problems lie elsewhere, analysis of the data will indicate that. But don't put the cart before the horse. Don't spend $5 million on rationalization exercises that will produce no measurable effect. If you've got $5 million to spend, let the data point you to where your problems lie.

I suspect you'll learn that your alarm data will point you to several areas that seem obvious once you've seen the data. Oh- and stop looking at alarm management as an ROI exercise- it's NOT. It's really a RISK MANAGEMENT exercise. You should be looking at the dollars you are putting at risk by NOT optimizing your alarm system. This is always the scarier side of this equation, but infinitely more realistic. We'll have more articles dealing with this issue in the future.

The Value of Process Data to Alarm Management

For those of you who haven't seen it yet, TiPS has an add-on feature to our LogMate product called TRAC. http://www.tipsweb.com/products/logmate/forensics_trends.asp This product was something that came out of our relationships with pharmaceutical clients. Those clients have a great need to have alarm data and process data together in the same spot to assist with variance reviews. In a straightforward implementation of the best of Microsoft, we are able to extract the process data, and align it with alarm data for any given period. There's even a relative time capability so you can view it batch-by-batch.

Cool- huh? But, that's not the data I want to discuss. The item I'm interested in today is that of the process data's importance to alarm management during the rationalization exercise.

Rationalization is something which TiPS advocates should be internalized as much as possible, and in most cases, should be under the realm of operations management. I expand on this philosophy in a two-part technical paper series. You can see the first installment of that series at http://www.tipsweb.com/downloads/A_Practical_Approach_to_Alarm_Management_Part_1.pdf

The process of rationalization often is one of the few opportunities for the DCS/instrument engineers, control engineers, process engineers, maintenance and operators to sit together in a room and talk about how operators use the control environment. The tools involved are what each of these other factions works to make available so that the unit makes what the process engineer planned. As such, there is a lot revealed about just which pieces are used, and how they are used, maintained, and intended. This process is thus very valuable. In fact, it is so valuable that it should not be shopped out to a vendor, except as an enabler, or as an internal partner to the process. Much of the time spent with a vendor is in the process of educating them to the level of what your internal people already know. The exception here is that alarm management specialists DO have a knowledge of the process necessary to accomplish a good project, and a partner in this process has a stake in making things work. Bottom line- avoid strictly project work except to fill in the blanks.

I used to visit an operations manager (who shall remain nameless) in Canada who told me a story which I'll share.

"I've looked this process over from a historical standpoint, and discovered we're operating at about 150% efficiency. I base that discovery on the fact of all of the justifications that have been put through for all of the projects we've performed over the past 10 years. Each small project was justified to get, and post-audited as receiving x% improvement. Well, all those improvements added up equals 150% or thereabouts. Now, we both know this can't be true, but it's taught me one thing. That is the importance of having continual projects in the plant. It seems that projects cause people to be interested in what we're doing, and they continue to fight against the natural tendencies for things to return to chaos. (remember the second law of thermodynamics- all things tend to greater randomness- it's still true). So, I like to keep projects going that keep people communicating and pushing each other. Otherwise, they get bored, and things fall into disrepair."

That's an interesting concept. I get from that the need to communicate to one another what is going on, and what is working, and what is not.

Years ago, I was doing some modeling on a unit. I was using one of those fancy Neural Net packages. (Not the nets with scribed steel handles- those would be knurled nets- totally different) Again- no names. I was told by the engineer who gave me the data that two specific variables were the most important variables to the control of the unit. Guess what? They were flat-lined. The engineer thought this was possibly correct, as they needed to be relatively flat to accomplish tight control. No- I mean they were flat-lined. NO data change at all. Just one value. In talking with operations, those sensors had been dead for over two years. They had developed other methods to accomplish control (essentially things which mirrored the action of these variables). Without that discussion, and a review of the data, the engineers would have never understood how operations was controlling the unit. This is just one story among many where similar discoveries were made. Neural nets were cool for the fact that they rendered the sensitivity, and data range of the variables involved in the model. I could go on, as that's fun stuff to talk about, but let's get back to how this relates to alarm management.

The point here is that the rationalization process allows a unique opportunity for a thorough operations review. TREAT IT AS SUCH, and you will benefit from more than a rationalized alarm list out of the exercise. One of the most overlooked items in this process- and you will not see this in any book, guidelines, or paper yet published- is examining the data activity around the alarms you assign. What does that mean?

Within the rationalization process (assuming you rationalize the complete operating unit), you will end up with a prioritized list of alarms. For at least the top priority alarms, and safety critical alarms, examine the data activity under these alarms. If you have time, you should do it for ALL configured alarms. To do this, you will need access to your process data historian, and the statistical toolkit it contains. Examine the data distribution, range, and standard deviation of each of the sensors you have rationalized. What do you see? Is it what you would expect? Why or why not? Have you assigned a high priority to alarms on sensors that are flat-lined? Have you assigned a low priority to an alarm on a sensor that is all over the map, and reflects something important to the process? This is just the tip of the iceberg of the discoveries you will make with a process data review of the important variables.
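
As a rough illustration of that review, here is a minimal Python sketch. It assumes you have already exported samples from your historian into plain lists; the tag names, the priorities, the flat-line tolerance, and the "all over the map" threshold are all hypothetical values chosen for the example, not recommendations.

```python
import statistics


def summarize(samples):
    """Basic distribution summary for one sensor's historian samples."""
    return {
        "min": min(samples),
        "max": max(samples),
        "range": max(samples) - min(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.pstdev(samples),
    }


def review(alarm_priorities, history, flat_tol=1e-6, noisy_ratio=0.2):
    """Flag alarms whose sensor data contradicts the priority assigned.

    flat_tol and noisy_ratio are hypothetical thresholds for this sketch.
    """
    findings = []
    for tag, priority in alarm_priorities.items():
        stats = summarize(history[tag])
        if stats["range"] <= flat_tol:
            # No movement at all: the sensor may be dead.
            findings.append((tag, priority, "flat-lined"))
        elif priority == "low" and stats["stdev"] > noisy_ratio * abs(stats["mean"]):
            # A low-priority alarm on a sensor that swings widely.
            findings.append((tag, priority, "highly variable"))
    return findings


# Hypothetical rationalized alarms and exported historian samples.
alarm_priorities = {"TI100": "high", "FI200": "low", "PI300": "medium"}
history = {
    "TI100": [350.0, 350.0, 350.0, 350.0, 350.0, 350.0],
    "FI200": [10.0, 40.0, 5.0, 55.0, 20.0, 60.0],
    "PI300": [5.0, 5.1, 4.9, 5.0, 5.05, 4.95],
}

findings = review(alarm_priorities, history)
for tag, priority, note in findings:
    print(f"{tag} (priority {priority}): {note}")
```

The output is only a list of questions to bring to the room, not answers. A flat-lined high-priority sensor or a wildly swinging low-priority one is exactly the kind of thing operations can explain in five minutes once somebody asks.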

With this data in hand, and the process specifics it reveals, intercommunication between these factions of your plant is priceless. As my Canadian friend would point out- it forces communication for the purpose of a project that might not otherwise have happened. It gives people who might rather avoid each other a reason to have to talk. And that's valuable to your operation.

Lastly, let me take this opportunity to also issue a warning. There is a practice being proposed by a service company which they call "focused rationalization" or something like that. RUN AWAY from this methodology. I point to my earlier logic to help you understand the danger of this approach. The method of this madness is to focus rationalization efforts only on the alarms which have activated recently, or ones you designate as important. On the surface, this may sound rational.

Where does it fall apart? If you have alarms which do not activate because they are flat-lined, they may not get rationalized. If you have a sensor which is being used by operations to take the place of that sensor, you may not even place a properly prioritized alarm on that sensor. And even if one is identified as important- is it valuable to have an alarm on a flat-lined sensor?? A review of the historical process data would reveal this. Additionally, we all know that by Murphy's law, it will be the alarm which hasn't activated- and which the operators therefore have little experience with- that will be the one which will cause the next incident- the one we ignore because it never activates. This method must have been hatched by somebody with very little statistical process data experience. It is statistically unsound, and could be very dangerous to your process. Unless you like talking to regulatory bodies and lawyers.
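
The gap is easy to see with a little set arithmetic. The Python sketch below uses invented tag names and assumes the "focused" scope is simply the alarms that activated recently plus the ones somebody designated as important; whatever is configured but falls outside that scope never gets reviewed.

```python
# Hypothetical alarm tags, invented for illustration only.
configured = {"TI100_HI", "FI200_LO", "PI300_HI", "LI400_HIHI", "AI500_HI"}
recently_activated = {"FI200_LO", "PI300_HI"}
designated_important = {"PI300_HI", "AI500_HI"}

# The "focused" approach only looks at these alarms.
focused_scope = recently_activated | designated_important

# Everything else is configured in the DCS but never rationalized.
skipped = configured - focused_scope

print("Reviewed:", sorted(focused_scope))
print("Never reviewed:", sorted(skipped))
```

In this made-up case the high-high level alarm and the temperature alarm are skipped, and nobody asks whether they are silent because the process is healthy or because the sensor is dead.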

TiPS advocates that if you are looking to reduce your rationalization effort, it must be with a consideration of overall process unit risk. And any unit which is slated for rationalization must be FULLY rationalized. You can read more about this in our recent two-part series on Practical Alarm Management. The first part is available on our web site now, and the second part will be released next month. I hope you will read it, and comment.

Thin Clients- Is software history repeating itself??

The historical context:

In the late 80's software was passing from the mainframe domain to the desktop PC. It was an exciting time. Those software companies who had invested huge sums in mainframe programs told their clients that PC's were "just a fad" that would pass.

This was a mistake. They failed to realize the inherent desire of engineers to be in control of their own technical solution domain. They also missed the fact that the mainframe IT organizations were back-charging their departments with horrendous usage costs for each "CPU-based time unit" that they made the mistake of using. This charging method meant that users tried to limit usage (or their boss did) to avoid interdepartmental charges, and that was exactly the opposite of what you wanted engineers doing when they were trying to resolve tough process problems.

Those software companies who saw the writing on the wall invested in the major task of rewriting their software to take advantage of both the new availability the desktop machines offered, and the desire to have it there. Then there were others. They saw the desire, and understood the market change, but didn't want to make the investment, hoping they could continue to milk the investment out of what they had already written. So they wrote "wrappers". These were interface programs- shells essentially- that allowed the user to access program basic setup within the PC environment, and then fed the information in batch form to the mainframe. In some circumstances, the complexity of the application justified this approach, as PC's of the day weren't quite powerful enough to replicate the application. In others it was simply laziness and greed, or perhaps some hope that the naysayers were right- it was just a fad, and they were better off waiting until things changed again.

This pattern is repeating today in traditional process industries.

History repeats itself

Unfortunately, in the process industries, there is not a large appreciation for software standards of practice. This is not true in some other industries- financial, and risk management are two that come to mind. As a result, many of the software vendors who service the industry are hoping that most of their users are not aware of the thin-client revolution. The cost to redesign their applications (again) could be horrendous.

Living around Austin, Texas, we have seen software standards taken to their ultimate test. We were in the center of the dot-com (dot-bomb) revolution. The software platform necessary to support the dot-bomb revolution had four basic requirements. It had to be quick to develop; it had to be easy to install; it had to be easy to use; and it had to be cheap to maintain. Without those elements, you could not expect to sustain the minimal profit margins that these companies were expecting to make up in volume in a quickly evolving market. In many cases, it turned out that the volume was zero, so low percentages of zero made for shuttered companies (and a flood of used Herman Miller chairs on eBay).

Shuttered companies aside, the design techniques and development platforms to support this software revolution are newly available. They are phenomenal in their value and reach. And they enable better software. Those willing to make the herculean effort to redesign on these platforms realize the instant resultant lower TCO (Total Cost of Ownership) for their clients. They also guarantee extended viability for their products as interoperability issues come to the forefront. This is very similar to the revolution caused by the introduction of the PC desktop in the '90's.

I want to sing the praises of the MS ASP.NET architecture. TiPS Inc. has made the large investment over the past few years of redesigning our products to work within this architecture. As a result, we become instantly interoperable with any and all other products built under this framework. Our clients avail themselves of our products over a true thin-client route that requires a dedicated investment to undertake and maintain.

Unfortunately, this architecture allows the use of a ".NET wrapper" just the same as a PC allowed the use of a menu-driven batch file to be created. At any rate, it is only a wrapper- not a true thin-client application.

Buyer Beware

I recently saw a company advertising their "software suite that was built on MS .NET". Hah- the "suite" is a container built on .NET that then allows web-browser interface to these legacy applications. The products they want to sell you under that container are still legacy products built several years ago. I guess the container or shell makes it viable to say the suite is "under" this framework.

In other industries, as mentioned before, the first questions asked of any software salesperson are the right questions. They stop you mid-feature description and ask: Is the APPLICATION itself built on a thin-client architecture? If not, what is it built on? Any other answer causes no further questions to be asked. Yes, they will allow you to finish, and polite ones don't even look at their watches. These users have learned the Total Cost of Ownership (TCO) has very little to do with features of the software. Modern architectures allow any software to add features on request. But a bad architecture will live with you for many years, and make you wish you had asked certain questions up front.

A quick check you can employ: how does the application deploy clients? Do they have to be installed (even if automatically) on each machine, or do you just open up a standard application such as Internet Explorer, and have instant access to the application? Lack of a true thin client is a sure sign of a kludged application infrastructure. True .NET applications make full use of the web services that are integrated into the Microsoft framework.

Other signs to look for are: What about integration with MS SQL, and true real-time access to the SQL architecture? How long does it take to perform a query on real-time, instantaneous results? Are results available in real-time, or does data have to be batched in for occasional review? Does a different application require installation of another module?

What about ease-of-use? Does it seem intuitive to use? Can you just click on things and start to get answers? Is there an effort towards two-click resolution? Recall that this is the model that Microsoft brought to the industry. As an example: Though MS Word will do a lot of things, you can get in quickly and make a letter without ever studying all the intricacies. True industrial applications should offer this same level of ease of use. True .NET software developers strive towards two-click resolution.

Lastly, what about upgrade and maintenance issues? How much disruption is the release of a new revision going to cause when I need to install it throughout my system? Do I go to one point, and I'm done, or do I have to find all users, and upgrade their clients also? Vendors who are short on architecture usually try to lead the argument with features, or point to their strong services surrounding their software.

Review these issues when you are studying software purchases. You'll find that for many, software is a side-line to their main business- normally services. Such companies cannot usually be expected to keep up to date on the latest software development issues. As I have already mentioned in a previous post, that is not important to them if they are also going to be the users of the software. They know the command-line syntax to get around all of their software quirks.

We perform to these standards, but then, we live in the middle of the Austin software think-tank. They'd laugh us out of the neighborhood if we didn't perform to standards.

In sharing this with you, I am revealing the biggest fear of any software company. Someday, somebody will write the operating system that is so easy that 16 year olds will build what we've invested so much in within a three-day period. The tools will just be that easy. It hasn't happened yet, but the new software out there truly does offer levels of development speed and inter-operability that were previously unseen. For those few of us who have invested in this infrastructure, our clients will reap the benefits for many years to come.

For those of you who wish to understand this a little better, study the way vendors advertise their software in other industries (CRM, for instance). Go to their websites, and review their software descriptions. They don't even mention features. They just talk architecture. Sometimes, it's difficult to even figure out just WHAT their software does (feature-wise). That's because their market is matured to the point that several generations of software maintenance have shown where the true costs of software reveal themselves. So, be aware, and ask the right questions, and watch for uneasy squirming. Educate yourself on, and examine, what are termed "Total Cost of Ownership" (TCO) issues. Your company will be glad you did.

Drinking Your Own Bathwater

TiPS as a company is a PURIST in two aspects.

1. We focus 100% on software development and deployment.

2. We focus 100% on alarm management.

Why is this valuable to you?

Using your own software to do project work is somewhat like drinking your own bathwater. You probably can do it occasionally to let you know how bad it is, but you don't want to make a habit of it for fear that you begin to think it's not all that bad.

Software by its nature is like a child to those who develop it. When the first person makes the first remark about the lack of inherent beauty, you are ready to take them to battle. However, it is only those who are willing to listen to the voices of others who ultimately win the war and actually make battle-hardened software. If I'm willing or employed to use it myself, I simply elbow the naysayer aside, and go to work.

Let's examine the difference. If I always use my own software, then I don't really care about its inherent lack of usability. In fact, those of us who have ever used command-line programs know that they are the best. Once you learn to speak the language - you can bypass a whole level of manipulations and go directly to the point of solution you desire. This is great for those of us who want to immerse ourselves in one program, and live with it day-in and day-out. But what about those who have more things to do than just run one program all day long? That portion of society needs software that is designed with ease of use in mind. And for that, there is nothing more telling than having to sit NEXT to somebody else who is using your software, and finding where they get trapped. There's a goal in the software purist side of the industry for "two-click resolution". That means you strive for no more than two clicks to the next drill-down the user needs in any given situation. We're not fully there (nobody is) but it is our goal.

As a software company, think about the trouble we would get into if we just made our software hard enough that our customers had to hire us to use it. Or scared them into the "don't try this at home" model.

At TiPS, we instead try to make a practice of working elbow-to-elbow with our users. In fact, for many years, that was probably our biggest weakness. Rather than convincing the managers to buy our stuff (i.e. in volumes and based on a good PowerPoint presentation), we were convincing the users to buy it. And they did. In fact, we have more alarm management customers than any two of our closest competitors. And that says something. Our software is the clear choice of the end user. That is because of this closeness we've shared with them and our ongoing attempts to be certain that somebody else besides us could use our software.

It is TiPS' continued goal to make our software more automated, easier to use, and more adaptable with every line of code we write. It's not good enough to just perform a function, and add that to the laundry list. It's only good if the end user can easily access that function, and it fits with everything else he has to use the product to do (i.e. he doesn't have to load up another application to do it. Does this sound like anybody you know??).

Don’t get me wrong. We use our own software; we just don’t solicit project work with it. But we do run through terabytes of data, and set up demos for people on our servers with their data, so they can see how easily it works. And we do come up with new and easier ways to analyze data by the continuous analysis we perform on data sets.

That leads to the second point. Because we focus 100% on alarm management, it's what we do best. It means the conversations in our halls always revolve around how we can do something better for our clients. And we don't worry about how well that fits with our latest driver sales or advanced controller, or whatever. It simply has to be able to resolve your alarm management issues, and do that task better than anybody else.

Two recent major corporate contracts signed with companies who did extensive product analysis (both were approximately 6 months in duration) tell us that we're doing something right. We intend to continue to do those things right, and continue to do them better than anybody else.

If anybody asks you if TiPS will come out and perform a job with their software, just tell them, "No. They automated the task for me, and I was able to do it myself. And I used less effort than if they came on site, and I had to teach them all the ins and outs of my plant." If you do need extra manpower to perform a full rationalization, there are others who provide that service, and do it quite well.

Oh - one last point. We will NOT take your data and share it with the world. Though we've reviewed more alarm data than any other company, I don’t know ten customers who want us showing ANYBODY how bad their alarm system used to be.