
Risk Management in the Oil and Gas Industry

Testimony of Professor Nancy Leveson before the United States Senate Committee on Energy and Natural Resources

Nancy Leveson

I thank you for inviting me here today to speak on risk management in the offshore oil and gas industry. To provide some background, I have been practicing, teaching, and doing research in system safety engineering for over 30 years. Although I am a professor of aerospace engineering, I have experience in almost all industries, including aerospace, defense, transportation (automobiles, trains, air traffic control), oil and gas, chemicals, nuclear power, medical devices, and healthcare. I have been involved in the investigation of many major accidents; most recently I served on the Baker Panel investigation of the BP safety culture after the 2005 Texas City oil refinery explosion and was a consultant to both the Columbia Accident Investigation Board and the Presidential Oil Spill Commission. I am also a co-owner of a 20-year-old company that provides safety engineering services.

System safety engineering (which should not be confused with occupational safety) has existed as a system engineering discipline for at least 50 years. In the process industries, this engineering discipline is called process safety engineering. Much is known about how to engineer and operate safer systems and to manage safety risks successfully. The low accident rates in industries that apply these principles, such as commercial aviation, nuclear power, and defense systems, are a testament to their effectiveness. The recent accidents and subsequent investigations in the offshore oil industry make it clear that at least some players in this industry are not using basic and appropriate safety engineering technologies and practices.

Commercial aviation is an example of an industry that decided early that safety paid. After World War II, Boeing wanted to create a commercial airline industry but, because of the high accident rate (there were 18 airplane crashes in 1955 despite a relatively small number of flights), only 20 percent of the public was willing to fly. Today the commercial aircraft accident rate is astoundingly low, particularly considering that there are about 10 million commercial airplane flights per year in the U.S. and over 18 million worldwide. In 2010, for example, U.S. air carriers flew 17.5 million flight hours with only one major accident.

Another surprisingly safe industry is defense. We have never, for example, accidentally detonated a nuclear weapon in the 60 years they have been in existence. The nuclear Navy, which prior to 1963 suffered the loss of a submarine on average every two to three years, instituted a wildly successful safety program (called SUBSAFE) after the loss of the Thresher nuclear submarine in 1963. No U.S. submarine has been lost in the 48 years since that program was created. Nuclear power in the U.S., after the wakeup call of Three Mile Island, has also had an extremely successful safety record.

These success stories show that even inherently very dangerous technologies can be designed, operated, and managed in ways that result in very low accident rates. Accidents are not inevitable nor are they the price of productivity. Risk can be managed successfully without reducing profits long-term, but some effort must be expended to do so. We know how to do this and the costs are surprisingly low when done right.

Common Factors in Major Accidents

Major accidents share some common factors:

Two additional common factors in accidents are primarily found only in the process (chemical, oil, and gas) industry:

Safety as a Control Problem

Traditionally, safety has been considered to be a system component failure problem. Preventing accidents then simply requires making each individual component very reliable. This approach, however, oversimplifies the accident process and cannot prevent accidents created by interactions among components that have not failed. A new, systems approach to accidents instead considers safety to be a control problem [Leveson, 2011]. In this conception, accidents result from a lack of enforcement of constraints on safe behavior. For example, the O-ring did not control the release of propellant gas by sealing the gap in the field joint of the Challenger Space Shuttle. The design and operation of Deepwater Horizon did not adequately control the release of hydrocarbons (high-pressure gas) from the Macondo well. The financial system did not adequately control the use of dangerous financial instruments in our recent financial crisis.

Figure 1: The Operational Safety Control Structure for the Macondo Well

Behavioral safety constraints are enforced by the safety control structure of the organization or industry. Figure 1 shows the control structure for operations at the Macondo well in particular and offshore oil drilling in general. The system-level hazard is uncontrolled methane gas surging up the well. Similar control structures, not shown, exist for engineering development and licensing of the well equipment and for emergency response.

Each component in this structure plays a different role and has different responsibilities for ensuring safe behavior of the physical process and of the organizational components of the structure. Between components there are feedback control loops in which control actions are used to achieve the system and component goals (see Figure 2). Feedback provides information about how successful the control actions have been. For example, the cementer pours cement and receives feedback about how the process is proceeding.

Decisions about providing control actions are partly based on a model the controller has of the controlled process. Every controller must contain a model of the process it is controlling. For human controllers, this model is usually called a mental model. Accidents often result from the process models being inconsistent with the actual state of the process. For example, managers may use occupational safety data to draw conclusions about the state of process safety, or an engineering manager may believe the cementing process was effective and issue a command to remove the mud.

Figure 2: The General Form for a Control Loop in the Safety Control Structure
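To make the idea of a process model concrete, the following sketch (hypothetical Python written for this illustration, not part of the testimony; names such as WellProcess and CementingController are invented) shows a controller that chooses its control action from its internal model of the process rather than from the actual process state, so an inaccurate model can make an unsafe command, such as removing the mud, look perfectly reasonable to the controller.

```python
# Hypothetical sketch of a control loop: a controller holds a model of the
# controlled process, issues control actions based on that model, and
# updates the model from feedback. All names are invented for illustration;
# they do not come from the testimony.

class WellProcess:
    """The controlled physical process, reduced to a single state variable."""
    def __init__(self):
        self.cement_barrier_sound = False            # the actual process state

    def feedback(self):
        # Real feedback (e.g., a negative-pressure test) can be ambiguous or
        # misinterpreted; here it simply reports the true state.
        return {"cement_barrier_sound": self.cement_barrier_sound}


class CementingController:
    """A controller whose decisions depend on its (possibly wrong) process model."""
    def __init__(self):
        self.model = {"cement_barrier_sound": True}  # optimistic belief

    def update_model(self, feedback):
        self.model.update(feedback)                  # feedback corrects the model

    def control_action(self):
        # The action is chosen from the model, not from the process itself.
        if self.model["cement_barrier_sound"]:
            return "remove mud"                      # unsafe if the model is wrong
        return "hold mud and investigate"


well = WellProcess()                                 # actual barrier is not sound
controller = CementingController()
print(controller.control_action())                   # -> remove mud (model inconsistent with process)
controller.update_model(well.feedback())
print(controller.control_action())                   # -> hold mud and investigate
```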

Control decisions are also influenced by the social and environmental context in which the controller operates. To understand individual behavior requires understanding the pressures and influences of the environment in which that behavior occurs as well as the model of the process that was used.

Losses occur when this control structure does not enforce appropriate behavioral safety constraints to prevent the hazard. In Figure 1, there are physical controls on the well such as the blowout preventer, mud, and cement. Each of the other components of the safety control structure has assigned responsibilities related to the overall system hazard and controls it can exercise to implement those responsibilities. These controls may involve physical design, technical processes, social (cultural, regulatory, industry, company) processes, or individual self-interest. For example, part of the responsibility of the Minerals Management Service (MMS) was to approve plans and issue drilling permits. Partial control over the safety of operations in the Gulf of Mexico could, at least theoretically, be implemented by appropriate use of the approval and permitting processes.

Determining why an accident occurred requires understanding what role each part of the safety control structure played in the events. Accidents can result from poor design of the control structure, individual components not implementing their responsibilities (which may involve oversight of the behavior of other components), communication flaws, conflicts between multiple controllers controlling the same component, systemic environmental factors influencing the behavior of the individual components, etc. Major accidents, such as the Deepwater Horizon explosion and oil spill, usually result from flawed behavior of most of the system components.

Preventing accidents requires designing an effective safety control structure that eliminates or reduces such adverse events.

An important consideration in preventing accidents is that the control structure itself and the individual behavior of its components are very likely to change over time, often in ways that weaken the safety controls. For example, a common occurrence is for people to assume that risk is decreasing after a period in which nothing unsafe occurs. As a result, they may change their behavior to respond to other, conflicting goals. Migration toward states of higher risk may also occur due to financial and competitive pressures. Controls must be established to prevent such migration or at least to detect when it has occurred.
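As one hedged illustration of what such a detection control might look like, the sketch below (again hypothetical Python; the indicator, the 5 percent limit, and the three-month persistence rule are assumptions made for this example, not recommendations from the testimony) flags migration toward higher risk when a simple leading indicator drifts above an agreed limit for several consecutive periods.

```python
# Hypothetical sketch: detecting migration toward higher risk by tracking a
# leading indicator over time. The indicator (fraction of operations where a
# required safety step was deferred), the 5% limit, and the three-month
# persistence rule are all invented for illustration.

def detect_risk_migration(deferred_step_fraction_by_month, limit=0.05, persistence=3):
    """Return the months of the current sustained drift above the limit, else []."""
    run = []
    for month, fraction in deferred_step_fraction_by_month.items():
        if fraction > limit:
            run.append(month)
        else:
            run = []          # the drift must be sustained, not a single bad month
    return run if len(run) >= persistence else []

history = {"Jan": 0.01, "Feb": 0.06, "Mar": 0.02, "Apr": 0.06, "May": 0.07, "Jun": 0.09}
print(detect_risk_migration(history))                 # -> ['Apr', 'May', 'Jun']
```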

There is not just one correct or best safety control structure. Responsibilities may be assigned to different components depending on the culture of the industry, its history, or even politics. It is important to note that not all responsibility for safety need rest with the government or a regulatory authority. Because the lower levels of the structure can more directly affect the behavior of the controlled process, it is much more effective for primary safety responsibility to be assigned to the companies, with the regulatory authorities providing oversight to ensure that proper safety practices are being used. In some industries, however, the companies are unable or unwilling to shoulder the bulk of the safety control responsibilities, and the regulatory authorities must provide more control.

The safety control structure as defined here is often called the safety management system.

Establishing Controls to Prevent Future Oil Spills

Given this system and control view of safety, we can identify the flaws in the safety control structure that allowed the Deepwater Horizon accident to occur and what can be done to strengthen the overall offshore oil and gas industry safety control structure. Many of the recommendations below appear in the Presidential Oil Spill Commission report, which is not surprising as I played a role in writing it, particularly Chapter 8. The general key to preventing such occurrences in the future is to provide better information for decision making, not just for the government regulators but for those operating the oil rigs.

There are many changes that would be useful in strengthening the safety control structure and preventing future oil spills. Focus should not be only on BOEMRE (the Bureau of Ocean Energy Management, Regulation and Enforcement) but on all the components of the control structure. Some general recommendations follow.

There is one recommendation in the Presidential Oil Spill Commission report about which I have some reservations, and that is the use of safety cases. While what is in a safety case will determine its efficacy, the common definition of a safety case as an argument for why the system will be safe has some serious drawbacks. There is surprisingly little evidence for the efficacy of the safety case approach to regulation. In fact, the use of safety cases has been highlighted in accident reports as a major causal factor, most notably in the Nimrod accident mentioned earlier [Haddon-Cave, 2009]. A major problem with safety cases is what psychologists call “confirmation bias.” In simple terms, people look for evidence that supports the goal they are trying to achieve. So when making a safety case, the focus is on evidence that supports that goal and the safety of the system. People do not usually look for evidence that contradicts the goal, and they often ignore such contradictory evidence when it presents itself. A paperwork, compliance-oriented culture can be created in which safety efforts focus on proving the system is safe rather than designing it to be safe. Safety must be designed into a system from the beginning; it cannot be argued in after the fact.

References:

Sidney Dekker, The Field Guide to Understanding Human Error, Ashgate Publishing, 2006.

Charles Haddon-Cave, The Nimrod Review, HC 1025, London: The Stationery Office Limited, Oct. 28, 2009.

Nancy Leveson, Safeware, Addison-Wesley Publishers, 1995.

Nancy Leveson, Engineering a Safer World, MIT Press, to appear fall 2011. In the meantime, a final draft can be downloaded from http://sunnyday.mit.edu/safer-world.

Mike Martin and Roland Schinzinger, Ethics in Engineering, McGraw-Hill Book Company, 1989.

