Banking | UK Systemically Important Bank (G-SIB) | January 31 - February 2, 2025 (72-hour outage)

Barclays Three-Day Mainframe Outage: GBP 12.5M in Compensation and the Case for DORA Art. 11

On January 31, 2025, a software problem in Barclays' UK mainframe locked millions of customers out of their accounts for three days — coinciding with payday and the UK tax deadline.

Pillar I: ICT Risk Management | Pillar II: Incident Management | Pillar III: Resilience Testing

Published April 9, 2025

Key Metrics

Outage Duration

72 hours

was: RTO target: 4-24h

3-18x over typical DORA-aligned RTO

Login Failure Rate

5% of all login attempts

was: Normal (<0.1%)

50x normal failure rate

Payment Failure Rate

56% of attempted payments

was: Normal (<0.5%)

Over half of payments failed

Direct Compensation

GBP 5-7.5M (this incident)

was: N/A

GBP 12.5M total over 2 years

Regulatory Response

Parliamentary inquiry + FCA/PRA engagement

was: Standard monitoring

Escalated to political accountability

The Situation

Three Days Without Banking

The operational impact of the Barclays mainframe failure was both severe and prolonged, extending far beyond what would be acceptable under any modern operational resilience framework.

According to data disclosed by Barclays in subsequent regulatory filings and press statements, the failure metrics were stark. Approximately 5% of all login attempts to Barclays digital banking services failed outright during the outage period. This figure, while appearing modest as a percentage, represents hundreds of thousands of failed access attempts given Barclays' customer base. Of the customers who did manage to log in, 17% attempted to make payments — a figure that reflects the urgency of the payday and tax deadline context. Of those attempted payments, 56% failed. The compounding effect was devastating: customers who needed to move money on the single most time-sensitive day of the month found that more than half their payment attempts were unsuccessful.
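The compounding effect described above can be made concrete with a little arithmetic. The following is a minimal sketch, not Barclays' own methodology: it takes the three percentages quoted in the text and assumes a simple funnel in which every login attempt is independent and each logged-in customer makes at most one payment attempt. The function names are hypothetical.

```python
# Figures quoted in the text above. The funnel model itself is an assumption.
LOGIN_FAILURE_RATE = 0.05     # share of login attempts that failed outright
PAYMENT_ATTEMPT_RATE = 0.17   # share of logged-in customers who tried to pay
PAYMENT_FAILURE_RATE = 0.56   # share of attempted payments that failed


def failed_payment_share(login_fail: float, pay_attempt: float, pay_fail: float) -> float:
    """Share of all login attempts that ended in a failed payment."""
    logged_in = 1.0 - login_fail
    return logged_in * pay_attempt * pay_fail


def blocked_share(login_fail: float, pay_attempt: float, pay_fail: float) -> float:
    """Share of login attempts that hit a failed login or a failed payment."""
    return login_fail + failed_payment_share(login_fail, pay_attempt, pay_fail)


payment_failures = failed_payment_share(
    LOGIN_FAILURE_RATE, PAYMENT_ATTEMPT_RATE, PAYMENT_FAILURE_RATE
)
any_failure = blocked_share(
    LOGIN_FAILURE_RATE, PAYMENT_ATTEMPT_RATE, PAYMENT_FAILURE_RATE
)

print(f"Login attempts ending in a failed payment: {payment_failures:.2%}")
print(f"Login attempts hitting any failure:        {any_failure:.2%}")
```

Under these assumptions, roughly 9% of all login attempts ended in a failed payment, and about 14% ran into a failure of one kind or the other. Customers who retried would shift these shares, so the result is an illustration of the funnel rather than an estimate of affected customers.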

The timeline of the outage underscores the severity of the recovery failure. The initial disruption began on Friday, January 31. The system was not fully restored until Sunday, February 2 — a 72-hour outage for a Tier 1 bank's core transaction processing capability. During this period, customers reported being unable to see salary credits in their accounts, even when employers confirmed payments had been dispatched. Direct debits for mortgage payments, utility bills, and credit card payments failed, potentially triggering late payment fees from creditors and affecting customers' credit records. Small business customers relying on Barclays for daily cash management found themselves unable to pay suppliers or employees.

The financial consequences were substantial. Barclays' total IT failure compensation over the preceding two years reached GBP 12.5 million, with this single outage expected to account for GBP 5-7.5 million in direct compensation. These figures represent only the direct remediation costs — customer refunds, penalty reimbursements, and goodwill payments. They do not capture the reputational damage, the regulatory scrutiny costs, or the competitive losses from customers migrating to rival banks.

The incident triggered formal regulatory engagement. The Financial Conduct Authority (FCA) and the Prudential Regulation Authority (PRA) both engaged with Barclays on the operational resilience implications. UK Members of Parliament used the incident to renew calls for stronger accountability mechanisms for bank IT failures, including personal liability for senior executives under the Senior Managers Regime.

The customer experience dimension deserves particular attention. Social media reports and press coverage documented customers unable to pay tax bills before the HMRC deadline, unable to access wages deposited by employers, unable to make mortgage payments, and unable to purchase essentials. For vulnerable customers living paycheck to paycheck, a three-day loss of access to their own money was not an inconvenience — it was a genuine hardship. The UK's consumer advocacy organizations noted that IT outages disproportionately harm lower-income customers who lack alternative banking relationships or cash reserves to bridge the gap.

The Challenge

The Worst Possible Timing

On January 31, 2025, Barclays, one of the United Kingdom's four systemically important banks, serving approximately 48 million customers globally, experienced a critical software failure in its UK mainframe infrastructure. The timing was catastrophic. In 2025, January 31 was both a primary UK payday (the last business day of the month, when the majority of salaried workers receive their wages) and the self-assessment tax filing deadline for His Majesty's Revenue and Customs (HMRC). It is, by any measure, one of the two or three most transaction-intensive days of the year for UK retail banking.

The failure was not caused by a cyberattack. According to Barclays' public statements and subsequent reporting by the Financial Times, BBC, and other outlets, the root cause was a software problem within a critical UK mainframe module. Mainframe systems — IBM Z-series and their descendants — remain the backbone of transaction processing at most large UK banks, handling the high-volume, low-latency workloads that underpin current account operations, payment processing, and interbank settlement. These systems are decades old in architectural lineage, and while they are extraordinarily reliable in normal operation, failures in mainframe environments tend to be severe precisely because of the concentration of processing they represent.

The outage manifested progressively. Initial reports indicated intermittent failures in the Barclays mobile banking application and online banking portal. Within hours, it became clear that the problem was systemic: the mainframe module responsible for processing customer transactions had entered a degraded state that could not be resolved through standard operational procedures. Payments were not processing. Salary credits were delayed or invisible. Direct debits and standing orders failed. Customers attempting to pay their tax bills on the final day before penalties applied found their accounts inaccessible.

The political dimension amplified the operational crisis. UK Members of Parliament demanded that bank chief executives explain the pattern of IT failures across the banking sector. The House of Commons Treasury Committee had already been scrutinizing bank IT resilience following a series of high-profile outages across UK banks in 2023 and 2024. Barclays' three-day failure became the focal point for broader questions about whether UK banks were investing sufficiently in their core technology infrastructure, or whether decades of cost optimization had created a fragility that was now manifesting as customer harm.

For DORA observers, the Barclays outage was significant not because it was unusual — UK bank IT failures are distressingly common — but because it demonstrated precisely the kind of operational resilience failure that DORA Art. 11 was designed to prevent: a critical system failure with inadequate recovery capabilities, causing extended customer impact during a period of peak demand.

The Approach

What DORA Would Have Required

Although the UK is no longer an EU member state and DORA does not apply directly to UK-headquartered institutions, the Barclays outage provides a precise case study for several DORA articles that address exactly this failure pattern. Moreover, Barclays maintains significant EU operations through its Irish subsidiary, meaning DORA's requirements are directly relevant to a substantial portion of the group's activities.

Art. 11 — Response and Recovery (The Core Failure)

DORA Art. 11 requires financial entities to establish comprehensive ICT business continuity policies and ICT response and recovery plans. Specifically, Art. 11(1) mandates that these plans be "tested, reviewed and updated at least yearly" and that they cover "all functions and ICT assets" of the entity. Art. 11(3) requires that recovery plans include "adequate recovery targets, including recovery time objectives (RTO)" and that entities "ensure that they are able to implement the plans in a timely manner."

The Barclays outage lasted 72 hours. For context, DORA-aligned recovery time objectives for critical banking services typically range from 4 hours (for payment processing) to 24 hours (for non-critical customer channels). A 72-hour outage of core transaction processing is not a marginal failure — it is a categorical failure of recovery capability. Under DORA, this would trigger immediate supervisory engagement and likely a formal investigation into whether the entity's ICT risk management framework (Art. 5-6) was adequate.
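The size of the miss can be expressed as a simple ratio of actual recovery time to the recovery time objective. The sketch below is illustrative only: the 4-hour and 24-hour targets are the typical range quoted in this case study, not figures disclosed by Barclays, and `rto_overrun_factor` is a hypothetical helper.

```python
OUTAGE_HOURS = 72            # outage duration reported in this case study
RTO_RANGE_HOURS = (4, 24)    # typical RTO range quoted above (an assumption)


def rto_overrun_factor(actual_hours: float, rto_hours: float) -> float:
    """Return how many times the actual recovery time exceeded the RTO."""
    if rto_hours <= 0:
        raise ValueError("rto_hours must be positive")
    return actual_hours / rto_hours


# Overrun factor for each end of the assumed RTO range.
factors = {rto: rto_overrun_factor(OUTAGE_HOURS, rto) for rto in RTO_RANGE_HOURS}

for rto, factor in factors.items():
    print(f"RTO {rto:>2}h -> outage was {factor:.0f}x the target")
```

This reproduces the "3-18x" range shown in the key metrics: 72 hours is 18 times a 4-hour target and 3 times a 24-hour target.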

The mainframe architecture is particularly relevant. Mainframe systems consolidate processing in a way that creates single-point-of-failure risk. DORA Art. 9(2) requires institutions to maintain "sound, resilient and updated ICT systems" with mechanisms to "minimize the impact of ICT risk." A mainframe configuration where a single software fault can disable transaction processing for 72 hours is inconsistent with this requirement unless the institution can demonstrate that adequate redundancy, failover, and recovery mechanisms were in place and tested.

Art. 12 — Backup Policies and Restoration

Art. 12 requires financial entities to maintain backup policies that "clearly set out the scope of the data" subject to backup and "the minimum frequency of the backup." The 72-hour recovery timeline suggests that either backup/restoration procedures were inadequate for the specific failure mode that occurred, or that the restoration process itself was untested under realistic conditions.

Art. 17-19 — Incident Classification and Reporting

Under DORA's incident classification framework, the Barclays outage would unambiguously qualify as a major ICT-related incident. The classification criteria include: (a) the number of clients affected (millions), (b) the duration of the incident (72 hours), (c) the impact on transactions (56% payment failure rate), and (d) the economic impact (GBP 5-7.5M direct, significantly more indirect). DORA Art. 19 requires initial notification to the competent authority within 4 hours of classification as major. An intermediate report must follow within 72 hours, and a final report within one month.
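The reporting clock described above can be sketched as a small deadline calculator. This is a simplified illustration under stated assumptions: all three deadlines are measured from the moment of classification, and "one month" is approximated as 30 days. The applicable technical standards define the exact reference points for each report, so the real deadlines may differ. The classification timestamp in the example is hypothetical and does not come from any Barclays disclosure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ReportingDeadlines:
    initial: datetime        # initial notification
    intermediate: datetime   # intermediate report
    final: datetime          # final report


def reporting_deadlines(classified_at: datetime) -> ReportingDeadlines:
    """Compute report deadlines from the time an incident is classified as major.

    Assumptions: every clock starts at classification, and one month is
    treated as 30 days.
    """
    return ReportingDeadlines(
        initial=classified_at + timedelta(hours=4),
        intermediate=classified_at + timedelta(hours=72),
        final=classified_at + timedelta(days=30),
    )


# Hypothetical classification time, chosen only to illustrate the calculation.
classified = datetime(2025, 1, 31, 10, 0, tzinfo=timezone.utc)
deadlines = reporting_deadlines(classified)

print("Initial notification due:", deadlines.initial.isoformat())
print("Intermediate report due: ", deadlines.intermediate.isoformat())
print("Final report due:        ", deadlines.final.isoformat())
```

With a 72-hour outage, the intermediate report in this simplified model would fall due at about the same time service was restored, which shows how little slack the reporting timeline leaves during a prolonged incident.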

Art. 14 — Board-Level Reporting

Art. 14 requires the management body to be adequately informed about ICT risk. A 72-hour outage coinciding with the most critical transaction day of the month raises questions about whether the board had been presented with realistic assessments of mainframe failure risk, recovery capability, and the potential customer and financial impact. If the board's risk appetite assumed maximum 24-hour recovery and the actual recovery took 72 hours, the risk framework materially understated the institution's exposure.

Art. 24-25 — Resilience Testing

DORA requires financial entities to maintain a resilience testing programme. Art. 25 specifies that testing must include "scenario-based testing" covering "severe but plausible scenarios." A software failure in a critical mainframe module during peak transaction volume is not a remote scenario — it is a straightforward operational risk. The question is whether Barclays' resilience testing programme had tested this specific failure mode, and if so, whether the recovery performance matched the test results.

The Results

Compensation, Accountability, and Structural Lessons

The financial and regulatory consequences of the Barclays outage were significant and multi-dimensional, providing a concrete example of what DORA-era accountability looks like in practice.

Direct Financial Impact

Barclays' total IT failure compensation over the two-year period encompassing this outage reached GBP 12.5 million. The January 2025 mainframe outage alone was expected to generate GBP 5-7.5 million in compensation payments. These figures, while substantial, almost certainly understate the true cost when factoring in:

Regulatory remediation costs: The FCA and PRA engagement following the outage required Barclays to conduct internal reviews, produce reports, and implement remediation measures — activities that consume significant management bandwidth and specialist consultancy fees.
Technology remediation investment: The mainframe failure exposed architectural weaknesses that required investment in redundancy, failover capabilities, and modernization — costs that likely dwarf the direct compensation figure.
Customer attrition: While difficult to quantify precisely, industry data suggests that major IT outages drive measurable customer switching, particularly among digitally active customers who have the lowest friction in moving to competitors.
Reputational capital: For a bank that positions itself as a technology-forward institution, a 72-hour mainframe failure is a credibility event that affects commercial relationships, institutional client confidence, and talent acquisition.

Parliamentary and Regulatory Response

The incident triggered a renewed push from UK Members of Parliament for stronger accountability mechanisms. The Treasury Committee's scrutiny of bank IT resilience intensified, with members questioning whether the Senior Managers Regime — which assigns personal accountability to named individuals for specific operational areas — was being applied effectively to IT resilience failures.

The FCA's operational resilience framework required UK banks to identify "important business services" and set "impact tolerances" by March 2022, and to be able to remain within those tolerances by March 2025, so this incident tested it directly. The critical question is whether Barclays had set an impact tolerance for its payment processing service, and if so, whether a 72-hour outage exceeded that tolerance. If the tolerance was set at, say, 4 hours, as would be typical for a payment processing service, the Barclays outage exceeded it by a factor of 18.

Structural Lessons for DORA Implementation

The Barclays outage provides five transferable lessons for financial institutions implementing DORA:

1. Mainframe concentration risk is real. Institutions that consolidate critical processing on mainframe platforms must treat the mainframe as a single point of failure and implement recovery capabilities commensurate with the criticality of the functions it supports. Under DORA Art. 11, this means tested, validated recovery plans with RTOs that reflect the business impact of extended outage.

2. Recovery capability must be tested under realistic conditions. A recovery plan that has not been tested against the specific failure mode that actually occurs is not a recovery plan — it is a hypothesis. DORA Art. 24-25 requires scenario-based testing, and the Barclays incident demonstrates that mainframe software failure during peak load is a scenario that must be tested.

3. Timing matters as much as duration. The same outage occurring on a quiet Sunday would have had a fraction of the customer impact. DORA-aligned resilience testing should include "worst-case timing" scenarios — failures during peak processing periods, regulatory deadlines, and seasonal spikes.

4. Compensation costs are a lagging indicator. The GBP 12.5M compensation figure captures direct remediation but not the structural cost of inadequate resilience. Under DORA, supervisory penalties (Art. 50-64) would add a regulatory cost layer that makes under-investment in resilience significantly more expensive.

5. Political accountability is accelerating. The UK parliamentary response to bank IT failures signals that operational resilience is no longer a purely technical or regulatory matter — it is a political issue. DORA's reporting requirements (Art. 19) and board accountability provisions (Art. 14) anticipate this trend by ensuring that major incidents are visible to supervisors and that management bodies cannot claim ignorance.
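Lesson 3 suggests building "worst-case timing" into test scenarios. One way to start is to flag calendar days on which several peak drivers coincide. The following is a minimal sketch under simplifying assumptions: it treats the last weekday of the month as payday, treats January 31 as the self-assessment deadline, and ignores bank holidays. The function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def last_business_day(year: int, month: int) -> date:
    """Return the last weekday of the month (bank holidays are ignored)."""
    day = date(year, month, calendar.monthrange(year, month)[1])
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def peak_reasons(day: date) -> list[str]:
    """List the assumed peak-demand drivers that fall on the given day."""
    reasons = []
    if day == last_business_day(day.year, day.month):
        reasons.append("month-end payday")
    if (day.month, day.day) == (1, 31):
        reasons.append("self-assessment tax deadline")
    return reasons


# The first and last days of the outage described in this case study.
for day in (date(2025, 1, 31), date(2025, 2, 2)):
    reasons = peak_reasons(day)
    label = ", ".join(reasons) if reasons else "no peak driver flagged"
    print(f"{day.isoformat()}: {label}")
```

Days that return more than one reason are natural candidates for the peak-load failure scenarios described in lessons 2 and 3. A production version would need a proper holiday calendar and the institution's own transaction volume data.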

Lessons Learned

1. DORA Art. 11 recovery requirements demand tested, validated RTOs for critical services. A 72-hour recovery for payment processing is a categorical failure, not a marginal miss, and would trigger immediate supervisory intervention under DORA.
2. DORA Art. 12 backup and restoration procedures must cover specific failure modes including mainframe software faults during peak load. Generic backup policies that have not been tested against realistic scenarios provide false assurance.
3. DORA Art. 17-19 incident classification would categorize this as a major ICT-related incident requiring 4-hour initial notification, 72-hour intermediate report, and 1-month final report to the NCA.
4. DORA Art. 14 board reporting obligations mean that management bodies must receive realistic assessments of mainframe failure risk, including worst-case timing scenarios, not sanitized presentations that minimize the probability of extended outage.
5. DORA Art. 24-25 resilience testing must include mainframe software failure during peak processing as a standard scenario. Institutions still running critical workloads on mainframe infrastructure cannot treat these systems as inherently reliable without periodic validation.
6. The GBP 12.5M compensation cost would be substantially amplified under DORA Art. 50-64 penalty provisions, making the business case for resilience investment significantly stronger in the DORA era.

Tags: banking, mainframe, recovery, Art-11, Art-12, Art-17, Art-14, UK, compensation, payday, G-SIB

Disclaimer: This case study is based on anonymized data from real-world DORA compliance programmes. Names, specific figures, and identifying details have been changed to protect confidentiality. The outcomes described are specific to the institution's context and may not be directly replicable.
