CrowdStrike/Microsoft Global Outage: The Concentration Risk DORA Was Designed to Prevent
Infrastructure | Global Financial Infrastructure (Cross-Sector) | July 19, 2024 (single-day event; multi-day recovery)

On July 19, 2024, a faulty CrowdStrike Falcon sensor update crashed approximately 8.5 million Windows devices globally, disrupting banks, payment processors, and insurers worldwide.

Key Metrics

  • Devices affected: ~8.5M globally (publicly reported by Microsoft)
  • Time to revert update: ~78 minutes (per CrowdStrike PIR)
  • Recovery duration: 12-16 hours for critical systems (per press reports)
  • Estimated global insured losses: ~USD 1.5B (Parametrix/Reuters estimate)

The Situation

The Systemic Context

The CrowdStrike outage exposed what regulators and risk professionals had long warned about: ICT concentration risk at the infrastructure layer. According to publicly available market data, CrowdStrike held a significant share of the enterprise endpoint detection and response (EDR) market at the time of the incident, with deployment across a large proportion of Fortune 500 companies.

The financial services sector's exposure was structural, not incidental. Regulatory requirements around endpoint protection, threat detection, and continuous monitoring had driven widespread adoption of a small number of EDR vendors. Institutions had implemented CrowdStrike's Falcon platform precisely because it was considered industry-leading — creating a paradox where compliance-driven adoption amplified concentration risk.

Key structural factors publicly reported:

  • Automatic update propagation: CrowdStrike's content updates deployed automatically to all managed endpoints without staged rollout or customer-controlled gating. This design — optimized for rapid threat response — meant that a single defective update could affect all customers simultaneously.
  • Kernel-level access: As publicly documented, the Falcon sensor operates at the kernel level of the Windows operating system, meaning a fault in the sensor could crash the entire operating system rather than just the application.
  • Cross-sector dependency: Financial institutions shared their EDR vendor with airlines, healthcare providers, government agencies, and other critical infrastructure operators, creating correlated failure risk across sectors.
  • Manual recovery requirement: Remediation required physical or remote console access to boot each affected device into Safe Mode, which ruled out recovery through normal remote management channels. For institutions with thousands of endpoints, recovery took 12-16 hours or longer according to press reports.
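
The contrast between ungated propagation and a staged rollout can be sketched in a few lines. The following Python is a minimal illustration under stated assumptions, not a description of CrowdStrike's deployment mechanism: the ring layout, the `apply_update` and `is_healthy` callbacks, and the `max_failure_rate` gate are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    updated: List[str]    # hosts that received the update
    halted_at_ring: int   # index of the ring that failed the gate, or -1 if none


def staged_rollout(
    rings: Sequence[Sequence[str]],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> RolloutResult:
    """Apply an update ring by ring; stop once a ring's failure rate exceeds the gate."""
    updated: List[str] = []
    for index, ring in enumerate(rings):
        # Update every host in the current ring (hypothetical callback).
        for host in ring:
            apply_update(host)
            updated.append(host)
        # Gate: check the ring's health before touching the next ring.
        failures = sum(1 for host in ring if not is_healthy(host))
        if ring and failures / len(ring) > max_failure_rate:
            return RolloutResult(updated=updated, halted_at_ring=index)
    return RolloutResult(updated=updated, halted_at_ring=-1)
```

In this sketch, a defective update that leaves every updated host unhealthy halts the rollout after the first ring, so only the canary hosts are exposed rather than the whole fleet. Where the gate sits (vendor side, customer side, or both) is a design choice the sketch does not settle.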

The European Supervisory Authorities (ESAs) had already flagged ICT concentration risk as a key concern in their DORA-related publications. The CrowdStrike incident provided real-world evidence of the scenario that DORA's Pillar IV provisions (Art. 28-44) were specifically designed to address: a critical ICT third-party provider whose failure could affect the operational resilience of multiple financial entities simultaneously.

The Challenge

The Incident

On July 19, 2024, at approximately 04:09 UTC, CrowdStrike deployed a content configuration update to its Falcon sensor for Windows systems. According to CrowdStrike's publicly released Preliminary Post Incident Review (PIR), the update contained a logic error in Channel File 291 that triggered an out-of-bounds memory read, causing Windows systems running the sensor to crash with a Blue Screen of Death (BSOD).

The impact was immediate and global. According to Microsoft's published assessment, approximately 8.5 million Windows devices were affected. The faulty update propagated through CrowdStrike's content delivery infrastructure before the issue was identified and the update was reverted approximately 78 minutes later. However, affected systems required manual remediation — each machine needed to be booted into Safe Mode and have the faulty channel file deleted — making the recovery process labor-intensive and prolonged.
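
The per-device fix described above amounts to locating and deleting one family of files. As a rough sketch: the file pattern below follows CrowdStrike's public remediation guidance for Channel File 291, while the function name, the `dry_run` flag, and the directory argument are assumptions made for illustration. It would still have to be run on each machine after booting into Safe Mode or a recovery environment, which is why recovery scaled so poorly.

```python
from pathlib import Path
from typing import List

# File pattern from CrowdStrike's public remediation guidance for Channel File 291.
# On affected Windows hosts the files live under
# C:\Windows\System32\drivers\CrowdStrike (passed in as driver_dir here).
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(driver_dir: Path, dry_run: bool = True) -> List[Path]:
    """Return matching channel files; delete them only when dry_run is False."""
    matches = sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Defaulting to a dry run is a deliberate choice in this sketch: an operator can confirm what would be deleted on one machine before repeating the step across thousands.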

Financial services institutions were disproportionately affected due to their heavy reliance on endpoint detection and response (EDR) solutions for regulatory compliance:

  • Payment processing disruptions were publicly reported across multiple networks. According to press coverage, several major banks experienced issues with customer-facing services, ATM networks, and internal operations.
  • Trading platforms experienced outages during the European and Asian market opens.
  • Insurance claims processing was disrupted at multiple carriers.
  • Airline and travel payment systems — which financial institutions depend upon for corporate card networks — were severely affected, with thousands of flights canceled globally.

The incident demonstrated a systemic vulnerability: a single software update from a single vendor, distributed automatically to millions of endpoints, could simultaneously disable critical systems across an entire sector.

The Approach

DORA's Relevance: What the Regulation Would Have Required

The CrowdStrike outage occurred six months before DORA's application date of January 17, 2025. Analyzing the incident through the lens of DORA's requirements illustrates why the regulation was designed as it was.

Pillar IV: ICT Third-Party Risk Management (Art. 28-44)

DORA's third-party risk provisions directly address the structural vulnerabilities this incident exposed:

  • Art. 28(1): Financial entities must manage ICT third-party risk as an integral component of their ICT risk management framework. Institutions would need to have identified endpoint protection as an ICT service supporting critical or important functions and maintained a documented risk assessment of their dependency on CrowdStrike.
  • Art. 28(3): The register of information on ICT third-party arrangements must capture the dependency, the nature of the service, and the criticality of the functions supported. This would have made the concentration visible.
  • Art. 28(8): Financial entities must define and implement exit strategies for critical ICT services. Institutions relying solely on CrowdStrike for endpoint protection would need documented alternatives and transition plans.
  • Art. 29(1): A preliminary concentration risk assessment is mandatory. Before concluding a contractual arrangement for ICT services supporting critical or important functions, institutions must weigh whether it would mean contracting a provider that is not easily substitutable, or holding multiple arrangements with the same or closely connected providers. The homogeneity of CrowdStrike deployments across the sector would have been flagged.
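
How a register of information makes concentration visible can be illustrated with a small calculation. This is a minimal sketch, assuming a simplified register: the `RegisterEntry` fields, the vendor names, and the 50% threshold are hypothetical and are not taken from DORA or its technical standards, which do not prescribe a numeric threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class RegisterEntry:
    function: str   # business function supported by the service
    provider: str   # ICT third-party service provider
    service: str    # nature of the ICT service
    critical: bool  # True if the function is critical or important


def provider_shares(register: List[RegisterEntry]) -> Dict[str, float]:
    """Fraction of critical or important functions that depend on each provider."""
    critical_functions: Set[str] = {e.function for e in register if e.critical}
    if not critical_functions:
        return {}
    functions_by_provider: Dict[str, Set[str]] = {}
    for entry in register:
        if entry.critical:
            functions_by_provider.setdefault(entry.provider, set()).add(entry.function)
    total = len(critical_functions)
    return {p: len(funcs) / total for p, funcs in functions_by_provider.items()}


def flag_concentration(register: List[RegisterEntry], threshold: float = 0.5) -> List[str]:
    """Providers whose share of critical functions meets or exceeds the threshold."""
    shares = provider_shares(register)
    return sorted(p for p, share in shares.items() if share >= threshold)


# Illustrative register: one vendor underpins three of four critical functions.
register = [
    RegisterEntry("payments", "VendorA", "EDR", True),
    RegisterEntry("trading", "VendorA", "EDR", True),
    RegisterEntry("claims", "VendorA", "EDR", True),
    RegisterEntry("treasury", "VendorB", "EDR", True),
    RegisterEntry("intranet", "VendorC", "hosting", False),
]
print(flag_concentration(register))  # ['VendorA']
```

The share-of-critical-functions metric is only one possible measure; an institution might equally weight by endpoint count or by recovery time objectives.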

Pillar I: ICT Risk Management (Art. 5-16)

  • Art. 9(2): Institutions must maintain "sound, resilient and updated ICT systems, protocols and tools" with mechanisms to "minimize the impact of ICT risk." Automatic, ungated kernel-level updates from a third party are difficult to reconcile with this requirement unless compensating controls are in place.
  • Art. 11(1): ICT business continuity policies must cover scenarios including "severe business disruptions" and "the failure of ICT third-party service providers." The specific scenario of a globally correlated EDR failure should appear in continuity planning.
  • Art. 12(1): Backup policies must "clearly set out the scope of the data that is subject to the backup and the minimum frequency of the backup." Where the manual fix was impractical, recovery from the CrowdStrike incident depended on backup images or known-good system states.

Pillar V: Information Sharing (Art. 45)

  • Art. 45(1): Financial entities "may exchange amongst themselves cyber threat information and intelligence." The rapid sharing of remediation procedures (boot to Safe Mode, delete Channel File 291) across institutions would be facilitated by formal information-sharing arrangements.

The Results

Publicly Reported Outcomes and Industry Response

According to publicly reported information, the financial services impact of the CrowdStrike outage was significant but varied by institution:

  • Recovery timeline: Most affected financial institutions reportedly restored critical systems within 12-16 hours, with full remediation of all endpoints taking several days for large organizations. According to CrowdStrike's remediation guidance, each affected device required manual intervention.
  • Payment network resilience: According to press coverage, major payment networks experienced disruptions but activated contingency procedures. The event highlighted the difference between institutions with tested business continuity plans and those without.
  • Insured losses: According to estimates published by Parametrix Solutions and reported by Reuters and other outlets, global insured losses from the CrowdStrike outage were approximately USD 1.5 billion, with total economic losses significantly higher. Financial services was among the most affected sectors.
  • Regulatory response: The European Supervisory Authorities referenced the incident in subsequent DORA-related communications as evidence supporting the importance of ICT concentration risk oversight. The incident reinforced the rationale for DORA's oversight framework for critical ICT third-party service providers (Art. 31-44).

Industry actions publicly reported following the incident:

  • CrowdStrike published a Root Cause Analysis and announced changes to its content update deployment process, including staged rollouts, additional testing layers, and customer-controlled deployment options.
  • Microsoft published guidance on resilience improvements and announced changes to Windows kernel access policies for security vendors.
  • Several financial regulators issued guidance on ICT concentration risk assessment in the wake of the incident.
  • Industry associations reported accelerated timelines for DORA compliance programmes, with the CrowdStrike incident cited as a catalyst.

Lessons Learned

  1. DORA Art. 29(1) concentration risk assessment would have quantified the sector-wide dependency on a single EDR vendor, potentially driving diversification or compensating controls before the incident occurred.
  2. DORA Art. 28(8) exit strategy requirements would have compelled institutions to maintain documented alternatives to CrowdStrike, reducing the sense of being "locked in" during the crisis.
  3. DORA Art. 9(2) requirements for resilient ICT systems imply that automatic, ungated kernel-level updates from third parties should be subject to institutional change management controls, not deployed on faith.
  4. DORA Art. 11 business continuity planning should explicitly model the correlated failure of a widely adopted ICT service provider, not just the failure of a single institution's systems.
  5. DORA Art. 45 information-sharing arrangements would have accelerated cross-institutional remediation knowledge transfer during the recovery phase.
  6. The incident demonstrates that concentration risk extends beyond cloud infrastructure providers to any ubiquitous software component with privileged system access.
Tags: concentration-risk, third-party, business-continuity, edr, kernel-level, global-outage, pillar-iv

Disclaimer: This case study is based on anonymized data from real-world DORA compliance programmes. Names, specific figures, and identifying details have been changed to protect confidentiality. The outcomes described are specific to the institution's context and may not be directly replicable.
