Operational resilience is defined as the ability of firms, industries, and sectors as a whole to prevent, respond to, recover from, and learn from operational disruption. It is a set of techniques that allows people, processes, and information systems to adapt operations in the face of changing business conditions.
Enterprises that are operationally resilient have the organizational competencies to ramp up or slow down operations in a way that provides a competitive edge and enables quick and local process modification.
A resilient enterprise is able to recover its key business services from a significant unplanned disruption, protecting its customers, shareholders, and reputation—and, ultimately, the integrity of the financial system. But enterprise operational resilience is about more than just protecting the resilience of systems; it also covers governance, strategy, business services, information security, change management, run processes, and disaster recovery. Avoiding disruption to a particular system that supports a business service contributes to operational resilience.
Thus, operational resilience is an outcome. Operational risk, meanwhile, is a threat to that outcome if it is not properly controlled. To achieve operational resilience, therefore, a firm must first manage operational risk effectively.
The Operating Environment and the Influence of Regulators
The operating environment for financial firms has changed significantly in recent years, with many adverse and material events becoming a near certainty. Regulators now want operational resilience to be a process that boards and senior managers are directly engaged with and responsible for through governance and assurance models.
Regulators are promoting the principles that foster effective resilience programs and their benefits for firms, customers, and markets. In July 2018, the U.K.’s financial services regulators (the Bank of England, the Prudential Regulation Authority, and the Financial Conduct Authority) brought the concept of operational resilience into the limelight with the publication of a joint discussion paper, “Building the UK Financial Sector’s Operational Resilience.”
The key requirements noted in the discussion paper include the following:
- Governance: The paper emphasized the role of the boardroom in operational resilience. Accountabilities and responsibilities for senior management need to be clearly defined and set against an unambiguous chain of command.
- Business operating model: The model must be properly understood, including key business services and the people, systems, processes, and third parties that support them, with accountabilities agreed to.
- Risk appetite and tolerances: Organizations need to understand and clearly articulate their operational risk appetite and impact tolerance for disruptions to key business services through the lenses of impacts to markets, consumers, and business viability.
- Planning and communications: Organizations need to have meaningful plans that are tested not only by the organizations themselves but in partnership with their contributing stakeholders.
- Culture: There must be a shift in mindset toward service continuity and a continuous improvement approach. That can be achieved by embedding a “resilience culture” that reinforces and promotes resilient behaviors.
What It All Means
The Bank of England and other central banks are likely to be more interested in system-wide scenarios of disruption and common vulnerabilities (for example, firms relying on third parties), while individual firms will often focus on and test firm-specific scenarios.
Central banks may wish to test whether firms collectively have adequate resources to deal with a severe operational disruption and whether firms may be undertaking their contingency planning without the availability of common resources.
This is especially relevant to the payments system, where firms may need to share payment capability if one firm’s systems are compromised. The idea of sharing a competitor’s payment platform may seem absurd, but serving the greater good may outweigh an individual firm’s vested interests.
The Bank of England’s approach is built around two key concepts: impact tolerances and business services.
Impact tolerance is defined as a firm’s tolerance for disruption in the form of a specific outcome or metric. Crucially, tolerance is built on the assumption that disruption will occur and that the tolerance remains the same irrespective of the precise nature of the shock. The tolerance is cause-agnostic. So, rather than concentrating risk mitigation solely on minimizing the probability of a disruptive event, impact tolerance focuses the board and senior management on minimizing the impact of a disruption. Impact tolerance thus provides a focus for response, recovery, and contingency planning alongside traditional operational risk management.
Impact tolerance is then linked to a business service. Doing so provides a clear focus for firms’ efforts to enhance their operational resilience, which may include, for example, plans to upgrade IT systems, business continuity exercises, and communication plans. Importantly, the focus is on business services—not IT systems.
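To make the two concepts concrete, the short Python sketch below models an impact tolerance as a single metric limit attached to a business service and checks disruptions against it. It is only an illustration under stated assumptions: the class names, the "hours of outage" metric, the four-hour limit, and the sample events are all hypothetical and do not come from the discussion paper. The point it shows is that the cause of a disruption is recorded but never enters the breach test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """A tolerance set for one business service (not for an IT system)."""
    service: str   # hypothetical business service name
    metric: str    # hypothetical metric, e.g. hours of outage
    limit: float   # point beyond which disruption is no longer tolerable

    def is_breached(self, observed: float) -> bool:
        # Only the observed outcome matters; the cause is not an input.
        return observed > self.limit


@dataclass(frozen=True)
class Disruption:
    service: str
    cause: str       # kept for reporting, ignored by the breach test
    observed: float  # observed value of the tolerance metric


def breaches(tolerance: ImpactTolerance, disruption: Disruption) -> bool:
    """Return True if the disruption exceeds the service's tolerance."""
    if disruption.service != tolerance.service:
        raise ValueError("tolerance and disruption refer to different services")
    return tolerance.is_breached(disruption.observed)


# Illustrative values only.
payments = ImpactTolerance("retail payments", "hours of outage", 4.0)
events = [
    Disruption("retail payments", "cyberattack", 6.0),
    Disruption("retail payments", "data center power loss", 6.0),
    Disruption("retail payments", "failed software release", 2.5),
]
results = [breaches(payments, event) for event in events]
print(results)
```

The first two events have different causes but the same six-hour outcome, so both are treated identically against the tolerance; the third stays within it.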
What Will Your Institution’s Approach Be?
Firms should be taking six critical actions to support and evolve their approach to operational resilience:
- Identify critical services: This is the discovery phase. The enterprise should begin by documenting its business services and mapping them to the underlying technology (cloud infrastructure, data centers, applications, etc.) and business processes (disaster recovery, cyberincident response plans, etc.).
- Understand impact tolerance: In this phase, the underlying technologies and processes are then assessed against key performance indicators or key risk indicators. This assessment is used to create a risk score for each business service, which is then reviewed against agreed-to impact tolerances. Through the use of scenarios, firms need to estimate the extent of disruption to a business service that could be tolerated. Scenarios should be severe but plausible and assume that a failure of a system or process has occurred. Firms must then decide their tolerance for disruption—that is, the point at which disruption is no longer tolerable.
- Know your environment: Using the assessment, the firm develops a remediation plan that gives priority to the business services with the largest disparity between risk score and acceptable impact tolerance. Having been communicated to the regulators and aligned with their expectations, the remediation plan is then funded and executed, and the business service is reassessed for resilience. This should incorporate third parties, which are the second-largest root cause of operational outages after missteps in change management.
- Operationalize the program: The operational resilience program must be able to evolve with the business. Firms should understand which external or internal factors could change over time and which trends could impact the key business services identified, then adjust their resilience plans accordingly. An important step in the process is testing, which is also prioritized by the risk materiality of key business services. Tests such as the simulation of disruption events can advance the enterprise from having informed assessments to demonstrating capabilities to stakeholders and regulators.
- Robust and coherent reporting: For boards and senior management, risk metrics and reporting provide an important insight into the effectiveness of the operational resilience program. Having a robust communication policy and strategy that uses all forms of media and engages with all stakeholders is essential to any resilience program.
- Collaboration: Firms should work together, pool resources, and share information—in short, develop noncompetitive solutions to a shared threat—to the extent possible.
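The prioritization described in the second and third actions can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: it assumes that each service’s risk score and its agreed impact tolerance are expressed on the same numeric scale, which the text does not specify, and the service names and figures are hypothetical. Services are ranked by the gap between risk score and tolerance, largest first, and services already within tolerance are left out of the remediation list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAssessment:
    name: str          # hypothetical business service
    risk_score: float  # assumed: derived from KPI/KRI assessment
    tolerance: float   # assumed: agreed impact tolerance, same scale

    @property
    def gap(self) -> float:
        """Disparity between the risk score and the acceptable tolerance."""
        return self.risk_score - self.tolerance


def remediation_order(assessments: list[ServiceAssessment]) -> list[ServiceAssessment]:
    """Services exceeding tolerance, ordered by largest gap first."""
    over_tolerance = [a for a in assessments if a.gap > 0]
    return sorted(over_tolerance, key=lambda a: a.gap, reverse=True)


# Illustrative figures only.
assessments = [
    ServiceAssessment("retail payments", risk_score=8.0, tolerance=5.0),
    ServiceAssessment("mortgage origination", risk_score=6.0, tolerance=5.5),
    ServiceAssessment("online account access", risk_score=4.0, tolerance=6.0),
    ServiceAssessment("trade settlement", risk_score=9.0, tolerance=4.0),
]

priority = [a.name for a in remediation_order(assessments)]
print(priority)
```

After remediation is funded and executed, the same function can be rerun on the reassessed scores to confirm that the gaps have closed and to reorder any remaining work.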
Operational resilience is essentially an upgrade that moves operational risk management from passive to active. Operational risk management, once the poor sibling of credit and market risk management, has stepped into the limelight because its importance can no longer be overlooked. That being the case, it needs upscaling and upgrades of both resources and vision to bring ORM programs to a more resilient state. Given the number of pressing regulatory programs, firms must weave operational resilience into their infrastructure and mindset.