Operational Risk
robmoffat committed Jan 2, 2025
1 parent 215dab3 commit c8e38cb
Showing 7 changed files with 2,193 additions and 26 deletions.
1 change: 1 addition & 0 deletions dictionary.txt
@@ -371,3 +371,4 @@ automakers
pinto
uptime
nokia
outsourcing
2 changes: 2 additions & 0 deletions docs/practices/Deployment-And-Operations/Automation.md
@@ -20,6 +20,8 @@ practice:
- tag: Schedule Risk
reason: "Automating laborious tasks clears the schedule for higher-value work."
attendant:
- tag: Operational Risk
reason: "Automated processes may be less observable than manual ones."
- tag: Complexity Risk
reason: "Introducing automation adds to the complexity of a project"
- tag: Feature Fit Risk
9 changes: 6 additions & 3 deletions docs/risks/Environmental-Risks/Environmental-Risks.md
@@ -16,12 +16,15 @@ tags:

In this section we're going to start considering the realities of running software systems in the real world.

There is a lot to this subject, so this section is just a taster: we're going to set the scene by looking at what constitutes an [Operational Risk](/tags/Operational-Risk), and then consider just two specific further types of environmental risk, [Security Risk](/tags/Security-Risk) and [Legal Risk](/tags/Legal-Risk).
Software always operates within a context. Whether it's a product offered by a startup, a utility downloaded from an app store, or a large government or enterprise deployment, that context matters, so the risks it presents are relevant to the overall risk position of the software itself.

## PEST / PESTLE

# PESTLE.
One useful technique for environmental analysis is [PEST or PESTLE](https://en.wikipedia.org/wiki/PEST_analysis), which breaks down the environment into specific components: Political, Economic, Social, Technological, Legal and Ecological. Other frameworks suggest looking at Demographic, Geographic or Military elements too.

## Types Of Feature Risk
There is a lot to this subject, so this section is just a taster: we're going to consider just two specific types of environmental risk, [Security Risk](/tags/Security-Risk) and [Legal Risk](/tags/Legal-Risk). We'll then cap off the taxonomy of risks by looking at [Operational Risk](/tags/Operational-Risk), which really encompasses the others.

## Types Of Environmental Risk

<TagList tag="Environmental Risk" />

84 changes: 61 additions & 23 deletions docs/risks/Environmental-Risks/Operational-Risk/Operational-Risk.md
@@ -1,14 +1,14 @@
---
title: Operational Risk
description: Risks of losses or reputational damage caused by failing processes or real-world events.
description: The risk of loss resulting from inadequate or failed internal processes, people and systems or from external events.

slug: /risks/Operational-Risk


featured:
class: c
element: '<risk class="operational" />'
sidebar_position: 1
sidebar_position: 4
tweet: yes
tags:
- Risks
@@ -19,36 +19,74 @@ tags:

<RiskIntro fm={frontMatter} />

> "The risk of loss resulting from inadequate or failed internal processes, people and systems or from external events." - [Operational Risk, _Wikipedia_](https://en.wikipedia.org/wiki/Operational_risk#Definition)
When building software, it's tempting to take a very narrow view of the dependencies of a system, but [Operational Risks](/tags/Operational-Risk) are often caused by dependencies we _don't_ consider - i.e. the **Operational Context** within which the system exists and the way that events and changes in that context can impact the smooth running of the system.

## Worked Example

![Difficulty of providing good support](/img/generated/risks/posters/operational-risk.svg)

## Operational Risks
Firm A is operating an online service and has customers all around the world in many different time zones. When there are issues with the service, customers can call in or use an online chat tool to get help. However, firm A is struggling with the volume of support, the wide variety of requests and the need for staff in multiple different time zones.

When building software, it's tempting to take a very narrow view of the dependencies of a system, but [Operational Risks](/tags/Operational-Risk) are often caused by dependencies we _don't_ consider - i.e. the **Operational Context** within which the system is operating.<!-- tweet-end --> Here are some examples:
They decide to [outsource](/tags/Outsourcing) their support function to a low-cost provider who promises to handle the staffing issues for them. However, the quality of the support declines and customers are less happy. Furthermore, because the support function is now at arm's length, firm A has broken a key feedback loop between its customers and its product development staff. Operational Risks mount up - customer retention declines and issues go unfixed, leading to reputational problems.

- **[Staff Risks](/tags/Staff-Risk)**:
- Freak weather conditions affecting ability of staff to get to work, interrupting the development and support teams.
- Reputational damage caused when staff are rude to the customers.
## Example Threats / Intersection With Other Types of Risk

Any of the risks we've covered so far can have a knock-on effect and become an [Operational Risk](/tags/Operational-Risk) - that's one of the main reasons we've left this until the end. Below is a non-exhaustive look at some of the ways these intersect.

In reality, there is a long laundry-list of everything that can go wrong due to operating in "The Real World". Effective [Operational Risk Management](/tags/Operational-Risk) means we have to consider that these dependencies will fail in any number of unusual ways, and we can't be ready for all of them. Preparing for this comes under the umbrella of [Operations Management](Operations-Mmanagement).

### 1. Process Risk

Process Risk looks at the risks of having processes at all. But processes are a key part of running a successful operation, so Process Risk "rolls up" into Operational Risk.

**Threat:** Insufficient controls, meaning you don't notice when some transactions are failing, leaving you out of pocket.

**Threat:** Data loss because of bugs introduced during an untested release.

**Threat:** Regulatory change (itself a [Legal Risk](/tags/Legal-Risk)), which means you have to adapt your business model and change your processes.

**See:** [Process Risk](/tags/Process-Risk)

### 2. Agency Risk

An operation is sure to involve decision making by other parties, staff or software systems.

**Threat:** Key staff leaving.

**Threat:** Suppliers changing their terms-of-service.

**Threat:** Reputational damage caused when staff are rude to the customers.

**See:** [Agency Risk](/tags/Agency-Risk)

### 3. Reliability Risk

- **[Reliability Risks](/tags/Reliability-Risk)**:
- A data-centre going off-line, causing your customers to lose access.
- A power cut causing backups to fail.
- Not having enough desks for everyone to sit at.
Your operation is certain to be built on top of key dependencies, many of which you take for granted in terms of their reliability.

**Threat:** A data-centre going off-line, causing your customers to lose access.

**Threat:** A power cut causing backups to fail.

**Threat:** Freak weather conditions affecting the ability of staff to get to work, interrupting the development and support teams.

**See:** [Reliability Risk](/tags/Reliability-Risk)

- **[Process Risks](/tags/Process-Risk)**:
- Regulatory change, which means you have to adapt your business model.
- Insufficient controls which means you don't notice when some transactions are failing, leaving you out-of-pocket.
- Data loss because of bugs introduced during an untested release.
### 4. Security Risk

Your operation has to secure itself against malicious actors that might try to disable it or exploit it in some way.

- **[Software Dependency Risk](/tags/Software-Dependency-Risk)**:
- Hackers exploit weaknesses in a piece of 3rd party software, bringing your service down.
**Threat:** Hackers exploit weaknesses in a piece of 3rd party software, bringing your service down.

- **[Agency Risk](/tags/Agency-Risk)**:
- Workers going on strike.
- Employees trying to steal from the company (bad actors).
- Other crime, such as hackers stealing data.

This is a long laundry-list of everything that can go wrong due to operating in "The Real World". Although we've spent a lot of time looking at the varieties of [Dependency Risk](/tags/Dependency-Risk) on a software project, with [Operational Risk](/tags/Operational-Risk) we have to consider that these dependencies will fail in any number of unusual ways, and we can't be ready for all of them. Preparing for this comes under the umbrella of [Operations Management](#operations-management).
:::tip Anecdote Corner

In 2022, The Bank of England fined TSB Bank £48m for operational risk management failures, [citing poor governance and failure to manage outsourcing risks](https://www.bankofengland.co.uk/news/2022/december/tsb-fined-for-operational-resilience-failings). This was as a result of a 2018 effort by TSB to migrate customer accounts onto a new IT platform.

All of TSB's branches and a great number of its 5m customers were affected. As well as the fine, TSB had to pay out £32m in compensation to customers.

In the UK, financial services are regulated by two bodies: the Financial Conduct Authority (FCA) and the Prudential Regulation Authority (PRA), who investigated the outage. They both issued fines to TSB, citing a lack of planning and inadequate operational risk management.

:::



39 changes: 39 additions & 0 deletions src/images/generated/risks/posters/operational-risk.adl
@@ -0,0 +1,39 @@
<?xml version="1.0" encoding="UTF-8"?>
<diagram xmlns="http://www.kite9.org/schema/adl"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xslt="http://www.kite9.org/schema/xslt" id="diagram-113"
xslt:template="/public/templates/risk-first/risk-first-template.xsl">
<container bordered="true" id="c"
style=" --kite9-vertical-sizing: maximize; ">

<mitigated>
<risk class="agency" />
</mitigated>

<mitigated>
<risk class="complexity" />
</mitigated>

<mitigated>
<risk class="funding" />
</mitigated>

<label id="id_16">Firm A is struggling to provide
a support function, given the complexity
of quality support across multiple locations
</label>
</container>
<group
style="--kite9-layout: down; --kite9-horizontal-align: left;">
<action style="--kite9-horizontal-align: left;">Outsourcing</action>
</group>
<container id="d"
style="--kite9-vertical-sizing: maximize; ">
<risk class="operational" style="--kite9-vertical-align: top; " />
<label id="id_16-kg">But there are downsides to
outsourcing support too, and
quality of service declines
</label>
</container>
</diagram>