What is RCM?



RCM has been successfully applied to ground industries for over two decades
RCM2 is a logical discipline for the development of efficient scheduled (proactive) maintenance programs for complex assets, and the ongoing management of such programs. These programs are called reliability-centered maintenance (RCM2) programs because they are centered on achieving the inherent safety and reliability capabilities of assets at a minimum cost. RCM2 is used to define maintenance strategy. A central problem addressed with RCM2 is how to determine which maintenance tasks, if any, should be applied to an item and how frequently assigned tasks should be accomplished. The net result is a structured blend of experience, judgment, and operational information to identify the right work at the right time. Today more than ever there is a burning need to increase asset performance while reducing costs. The need itself is not new: a typical asset undergoes between 30 and 50 reliability, maintenance and production optimization initiatives per year. A key to ensuring success is to shift from the conventional repair and modification approaches to a focus on failure consequence mitigation - the core principle of RCM2.
In the past 20 years, we have applied RCM to over 2000 sites in 80 countries, in every major ground industry, across six continents.
Read Chapter 1 of our RCM book

Humanity depends more and more on the wealth generated by the continued operation of highly mechanized and automated businesses and services. More than ever, these depend in turn on maintaining continued integrity of physical assets.

Economic demand for greater mechanization and automation means that reliability and availability are increasingly important to the bottom line. It also means that more and more types of failure affect our ability to sustain quality standards. More and more failures have serious safety or environmental consequences, and society's demand for better standards in these areas has created situations where the survival of a corporation or industry depends on the integrity of its physical assets. At the same time as our dependence on physical assets is growing, so is their cost - to operate and to own. To secure the maximum return on investment, they must be kept working efficiently for as long as we want them to. As a result, the cost of maintenance has been increasing in absolute terms, and in many industries it is now the highest operating cost, so there is pressure to reduce spending on maintenance. How do we rationally balance all these conflicting demands?

We are more and more dependent on the wealth generated by the continued operation of highly mechanised and automated processes. We are also becoming increasingly dependent upon services taken for granted, such as the uninterrupted supply of electricity, trains which run on time, clean water and efficient waste treatment. It is amazing how much these in turn have come to depend on the continued integrity of physical assets: assets which, when they fail, not only erode our wealth and upset and inconvenience us, but sometimes threaten our very survival. Equipment failures that lead to serious accidents not only affect the equipment operators, but are the leading cause of major environmental incidents - incidents whose names have become bywords, such as Amoco Cadiz, Chernobyl, Bhopal and the Piper Alpha oil platform in the North Sea.

More and more corporations are beginning to recognise the extent to which society as a whole can be affected by these incidents, in addition to their potentially catastrophic financial implications. Not only did Piper Alpha kill 167 people, but insurance payments amounted to HK$1.2 billion and one quarter of North Sea oil production was shut down for several months - all because one crucial piece of equipment was not functional at one critical moment. As a result, businessmen everywhere are starting to place an increasingly high priority on learning more about what must really be done to manage equipment failures. This is especially true in Hong Kong following the tragedies and potential disasters which have occurred in recent months.

In all major corporations, maintenance is now more than keeping equipment running: it is 'managing equipment failures'. What is more, most people believe that the more often and the more thoroughly something is maintained, the safer it will be. In recent years, this has led to the development of preventive maintenance programmes which place great emphasis on comprehensive equipment overhauls at regular intervals.

However, following intensive research into the real causes of failure, the international civil aviation industry established some years ago that a surprisingly high proportion of the failures experienced by aircraft occurred either soon after they were put into service or soon after a major overhaul. Similar findings in other industries have shown that regular overhauls very often - but not always - do more harm than good.

However, it is equally true to say that if equipment does not receive the right maintenance at the right time, it can also become unreliable and sometimes very dangerous.

This apparent contradiction led to a thirty-year search, beginning in the early 1960s, for a way to establish exactly what is meant by the term “maintenance”.

In the early stages of this search, it seemed that there were more questions than answers. Firstly, it emerged that there are several different methods of managing failure, each of which is appropriate in different circumstances. This led to a further search for a simple way of deciding when to use which method. When a universal decision framework had been developed - this alone took twenty-five years - it then became necessary to decide who was in the best position to make the decisions.

Once all of these questions had been answered, a process began to emerge which enabled people who use it to transform the effectiveness of the maintenance function in a remarkably short space of time. This process is known as “Reliability-centred Maintenance”, or RCM. To understand RCM in more detail, it is worth looking more closely at how it was developed.

The First Step

was the discovery that there are a large number of legitimate ways of managing failures. These can be divided into five broad categories.

The first category includes well-known routine overhauls, where equipment is taken to pieces and reassembled at fixed intervals, and cases where components are replaced at fixed intervals without checking their condition. This category is generally known as preventive maintenance, and is occasionally still appropriate, but far less often than first thought.

The second category covers a vast array of techniques known as predictive or condition-based maintenance. This entails checking equipment regularly to find out if it is in the process of failing, and taking corrective action only if it is needed. We do this when we check the oil level in our cars - if the oil level is OK we take no action, but if the level is low we add oil. In industry, if sufficient warning that a failure is about to occur can be obtained, then this method provides engineers with time to collect the necessary spares and plan remedial action in a way which minimises disruption.

The third category covers a special but increasingly common class of equipment. These are protective devices which usually can fail in such a way that no-one knows that they have failed unless someone makes a special point of checking. These checks differ from predictive maintenance because they entail checking whether an item has failed, rather than checking whether it is failing. This is an apparently subtle difference, but one which profoundly affects both the nature of the check and the frequency with which it has to be done. These checks are known as detective maintenance. Typical examples include periodic checks on smoke detectors and burglar alarms to find out if they have failed.

The fourth broad category of failure management techniques is known as corrective maintenance. As the name suggests, this entails fixing items either when it is immediately evident that they have failed of their own accord (normally referred to as a 'breakdown'), or when they are found to be failing following a predictive maintenance check, or if they are found to be failed after a detective maintenance check.

The fifth and final way of dealing with failures is to change the design of the asset in such a way that the failure no longer occurs, or if it does occur, to change the consequences of the failure in such a way that the failure no longer matters. Redesign is usually the last resort. This is because the engineer on duty today has to maintain the equipment as it exists today, and not what should be there or what might be there at some time in the future. Since redesigns take time, we only challenge the design when a cost-effective maintenance programme cannot be found.
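The five categories can be summarised in a few lines of code. What follows is only a minimal sketch: the names `Category` and `classify_task` and the four yes/no questions are illustrative assumptions, not part of any RCM standard or tool. It simply restates the distinctions drawn above, using the examples already mentioned (routine overhauls, the oil level check, the smoke detector check and the breakdown repair).

```python
from enum import Enum


class Category(Enum):
    """The five broad categories of failure management described above."""
    PREVENTIVE = "preventive"   # overhaul or replacement at fixed intervals
    PREDICTIVE = "predictive"   # check whether the item is in the process of failing
    DETECTIVE = "detective"     # check whether a protective device has already failed
    CORRECTIVE = "corrective"   # fix the item once it has failed or is found failing
    REDESIGN = "redesign"       # change the design so the failure stops occurring or mattering


def classify_task(changes_design: bool, routine: bool,
                  checks_condition: bool, looks_for_failed_state: bool) -> Category:
    """Map a task onto a category (hypothetical helper, simplified on purpose)."""
    if changes_design:
        return Category.REDESIGN
    if not routine:
        # Work done only in response to a failure or to a finding from a check.
        return Category.CORRECTIVE
    if not checks_condition:
        # Routine work done at fixed intervals without checking condition.
        return Category.PREVENTIVE
    if looks_for_failed_state:
        return Category.DETECTIVE
    return Category.PREDICTIVE


examples = {
    "overhaul at a fixed interval": classify_task(False, True, False, False),
    "check the oil level": classify_task(False, True, True, False),
    "test the smoke detector": classify_task(False, True, True, True),
    "repair after a breakdown": classify_task(False, False, False, False),
}
for task, category in examples.items():
    print(f"{task}: {category.value}")
```

The order of the questions matters only in that redesign and corrective work are separated out first; the remaining three categories are all routine tasks and differ in whether, and how, the condition of the item is checked.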

Once the early researchers had gained a clear understanding of the strengths and weaknesses of these different approaches, the next step was to develop a sensible basis for deciding when to use which one. The first point which became clear was that it is too simplistic - and too dangerous - to choose any one of these approaches haphazardly. Different approaches are needed for different machines, or even for different parts of the same machine. Think again about a private car. We check the brakes and tyres for wear on a regular basis - both forms of predictive maintenance - while changing the oil and the spark plugs at fixed intervals is preventive maintenance. On the other hand, checking periodically to see that the hazard warning flashers are still working is detective maintenance. Then again, we would probably do no routine maintenance at all on (say) the cigarette lighter or the electric window winders, and only arrange for them to be repaired if they fail.

In trying to establish a sensible framework for making these choices, the aviation industry realised that the reason why we worry about failures at all is because they have consequences. Sometimes failures only cost the price of the repair job. More often, they interfere with production or operations, in which case they usually cost rather more than the direct cost of repair. In the most serious cases, they can lead to serious environmental incidents or kill people directly.

Clearly, the more serious the consequences of a failure, the more time and effort should be spent on trying to prevent it. This led the airlines to use a formal evaluation of failure consequences as a basis for deciding what maintenance should be done. The decision-making framework which they developed to do this is at the heart of Reliability-centred Maintenance.

The application of RCM starts with an analysis of the functions of the equipment. This is done in close consultation with the equipment users. Maintenance is all about preserving the function of physical assets, so a comprehensive review of equipment functions enables everyone to agree on what maintenance is trying to 'maintain'. It is also surprising - sometimes very surprising - how much people learn about how the equipment is supposed to work.

Once the functions of the equipment have been agreed, the next step is to analyse all the ways in which it is reasonably likely to fail, and what would happen if each failure actually did occur. This step is called failure modes and effects analysis, and it enables everyone to agree on what they are actually trying to prevent when they do 'preventive' maintenance.

The Next Step

is to assess the consequences of each failure in a strict sequence. Depending on both the nature and the severity of the consequences, the final step is to select the most appropriate of the five types of maintenance listed above for dealing with the failure.
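As a rough illustration of what selecting a strategy from the consequences can look like, the sketch below walks through a fixed sequence of questions for each failure mode. It is a deliberate simplification under stated assumptions, not the actual RCM decision framework: the names `Consequence`, `FailureMode` and `select_strategy`, the three consequence levels (taken from the description of consequences earlier in this chapter) and the order of the questions are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Consequence(Enum):
    REPAIR_COST_ONLY = 1       # the failure only costs the repair job
    OPERATIONAL = 2            # the failure interferes with production or operations
    SAFETY_ENVIRONMENTAL = 3   # the failure can hurt people or the environment


@dataclass
class FailureMode:
    description: str
    consequence: Consequence
    hidden: bool = False         # nobody knows it has failed unless someone checks
    gives_warning: bool = False  # sufficient warning of the failure can be obtained
    age_related: bool = False    # assumption: fixed-interval work suits only such failures


def select_strategy(fm: FailureMode) -> str:
    """Ask the questions in a fixed order and return one of the five categories."""
    if fm.hidden:
        return "detective"
    if fm.consequence is Consequence.REPAIR_COST_ONLY:
        # Simplification: minor failures are just repaired when they occur.
        return "corrective"
    if fm.gives_warning:
        return "predictive"
    if fm.age_related:
        return "preventive"
    if fm.consequence is Consequence.SAFETY_ENVIRONMENTAL:
        # Last resort: no suitable routine task was found for a serious failure.
        return "redesign"
    return "corrective"


failure_modes = [
    FailureMode("smoke detector fails silently", Consequence.SAFETY_ENVIRONMENTAL, hidden=True),
    FailureMode("brake pads wear out", Consequence.SAFETY_ENVIRONMENTAL, gives_warning=True),
    FailureMode("cigarette lighter stops working", Consequence.REPAIR_COST_ONLY),
]
for fm in failure_modes:
    print(f"{fm.description}: {select_strategy(fm)}")
```

A real analysis weighs the cost and technical feasibility of each task against the consequences it is meant to avoid; the boolean flags here stand in for those judgments, which in practice are made by the people who know the equipment.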

This process leads to much more tightly focused and far more effective maintenance programmes. RCM has contributed to a massive decrease in the number of civil aircraft accidents over the past thirty years - from more than sixty crashes per million takeoffs in 1960 to less than two per million takeoffs in 1988. Such results led to its widespread adoption in ground-based industries around the world. RCM has been, and is being, used to initiate step changes in maintenance effectiveness - often in the space of a few months - by some of the world's leading companies in fields such as petrochemicals, food manufacture, pharmaceuticals, railways, electricity generation and distribution, mass housing, steel making, water distribution, military undertakings and automobile manufacture.

Given the power of the RCM process and the speed with which it can produce results, the final questions concern how and by whom it should be applied. Most people tend to believe that the equipment supplier is in the best position to provide a viable maintenance programme for his equipment. However, no supplier - indeed no outsider - can possibly appreciate all the unique features of the user's business which will affect the machine throughout its life, nor can they fully appreciate all the ways in which the failure of their equipment might affect the user's business. This means that although the supplier can - and in the case of new equipment, should - play a part in developing such a programme, there is no way that he can do it all.

In practice, the organisations which have derived the most value from the RCM process are those which have developed the capability to apply it themselves. They do so using multi-disciplinary teams of people trained to use RCM under the guidance of even more highly trained individuals known as RCM facilitators. These teams usually include members of the operations department, because they understand most clearly what the equipment is to be used for and what the consequences are if it fails. The equipment users are the 'customers' of the maintenance department, and it is wholly in keeping with all the principles of Total Quality that the customers should be involved in specifying the kind of service they expect. The teams also include members of the maintenance function, because they tend to have the clearest understanding of what fails and what can be done to prevent or repair it.

Getting members of operations and maintenance departments to apply RCM together in this way not only ensures that the 'best' maintenance programme is drawn up with all the available information. It also causes two notoriously hostile sections of the business to start working together as a team to fulfill common, clearly understood objectives. This feature of RCM alone is causing it to be widely adopted by some of the world's largest multinational manufacturing organisations.

RCM is transforming the world of industrial maintenance from what is often seen as an expensive and erratic drain on resources into a truly proactive contributor to world-class manufacturing excellence.