DORA Article 26: Why Threat-Led Penetration Testing Changes Everything
Not Your Annual Pentest
Every financial institution does penetration testing. It is a well-understood discipline: hire a firm, scope the engagement, test for vulnerabilities, receive a report, remediate findings. It is valuable, necessary, and for DORA compliance, it is not enough.
Article 24 of DORA establishes the general requirements for digital operational resilience testing, and Article 25 covers the testing of ICT tools and systems. Article 26 takes it further by introducing threat-led penetration testing (TLPT) as a distinct, advanced requirement for entities identified by competent authorities. Under Article 26(1), those entities "shall carry out at least every 3 years advanced testing by means of TLPT." TLPT under DORA is aligned with the TIBER-EU framework and its national implementations (TIBER-FR, TIBER-DE, TIBER-NL, TIBER-LU). It represents a fundamentally different approach to testing operational resilience, and the orchestration complexity it introduces is unlike anything most institutions have managed before.
Understanding this difference is not academic. It determines whether your testing programme satisfies regulatory expectations under Art. 24-27 or generates supervisory findings.
What Makes TLPT Different
Threat intelligence drives the scope. In a traditional penetration test, the scope is defined by the institution: "test our internet-facing applications" or "assess our network segmentation." In TLPT, the scope is driven by current threat intelligence. A threat intelligence provider analyzes the institution's threat landscape, identifies the most realistic attack scenarios based on active threat actors targeting the financial sector, and produces a Targeted Threat Intelligence Report (TTIR). The testing scenarios are derived from this intelligence, not from a generic testing checklist.
This means the tests reflect what actual adversaries would do, not what is convenient to test. Art. 3(17) defines TLPT as a framework "that mimics the tactics, techniques and procedures of real-life threat actors perceived as posing a genuine cyber threat."
Testing hits live production. TLPT is conducted against live production systems and real business processes — specifically those supporting critical or important functions as defined in Art. 3(22). Not a staging environment. Not a sanitized copy. Production. This creates a tension that traditional pentesting does not face: the test must be realistic enough to validate resilience while controlled enough to avoid causing the very disruption it is testing for.
Managing this tension requires rigorous planning, clear escalation procedures, real-time monitoring of test activities, and kill-switch mechanisms. It also requires coordination with operational teams who may not know a test is in progress (the "white team" managing the exercise is deliberately small to maintain realism).
Three-team structure. TLPT operates with a formal three-team structure:
- White team: A small group within the institution that manages the exercise, coordinates with regulators, and has visibility into the test plan. Typically 2-4 people. Art. 26(5) requires effective risk management controls throughout the test, which in practice calls for white team members with appropriate seniority.
- Red team: External threat intelligence and penetration testing providers who plan and execute the attack scenarios. They operate with the knowledge and tactics of the threat actors identified in the TTIR. Art. 26(8) requires testers to be contracted in accordance with Art. 27, which sets the requirements for testers, including certification by an accreditation body or adherence to formal codes of conduct or ethical frameworks.
- Blue team: The institution's own security and operations teams who respond to the test as they would to a real attack. Critically, they do not know a test is underway.
This structure means the test validates not just technical controls but the human response: detection speed, escalation procedures, communication under pressure, decision-making quality. The ECB's 2024 cyber resilience stress test across 109 banks examined similar dimensions — the ability to detect, respond to, and recover from a cyber incident — and found that response and recovery capabilities are areas where most institutions need improvement, according to the ECB's July 2024 press release.
Regulatory involvement. TLPT is not a private exercise. The competent authority (or designated TLPT authority) is involved in scoping, receives progress updates, and validates the results. Art. 26(2) requires that the competent authority validate the scope of the test. The final purple team exercise, where red and blue teams share findings collaboratively, must produce a remediation plan that the authority reviews. This level of regulatory engagement transforms TLPT from a security activity into a governance activity.
The Orchestration Problem
Here is where TLPT becomes genuinely difficult: orchestration.
A typical TLPT engagement spans 6 to 12 months or more from initial scoping to remediation closure; the phase ranges below add up to roughly 32 to 60 weeks if run strictly in sequence. During that time, the institution must manage:
Phase 1 — Scoping and preparation (8-12 weeks)
- Engage threat intelligence provider
- Receive and validate the TTIR
- Define test scope with competent authority (Art. 26(2))
- Select and onboard red team provider (meeting the tester requirements in Art. 27)
- Establish white team with strict need-to-know controls
- Define rules of engagement, escalation procedures, and kill-switch mechanisms
- Set up secure communication channels
Phase 2 — Active testing (8-16 weeks)
- Red team executes attack scenarios against production
- White team monitors progress and manages risks
- Evidence must be collected at every stage: what was attempted, what succeeded, what was detected, what was missed
- Real-time risk management: if the red team discovers a critical vulnerability being actively exploited by real threat actors, the exercise may need to pause for emergency remediation
Phase 3 — Purple team and reporting (4-8 weeks)
- Red and blue teams conduct collaborative analysis
- Full report produced covering: attack paths, control effectiveness, detection gaps, response quality
- Remediation plan with prioritized findings, owners, and timelines
- Competent authority review and attestation (Art. 26(6)-(7))
Phase 4 — Remediation and verification (12-24 weeks)
- Each finding requires a corrective action with clear ownership
- Remediation must be tracked to completion with evidence
- Some findings may require architectural changes that span multiple quarters
- Retesting to verify remediation effectiveness
- Final closure report to competent authority
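As a rough planning aid, the phase ranges above can be summed to get an end-to-end envelope. The sketch below assumes the four phases run strictly back to back, which is a simplification: in practice phases can overlap, and remediation often continues after the engagement is formally reported.

```python
# Phase duration ranges in weeks, copied from the four phases listed above.
PHASES = {
    "Scoping and preparation": (8, 12),
    "Active testing": (8, 16),
    "Purple team and reporting": (4, 8),
    "Remediation and verification": (12, 24),
}

WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a calendar month


def total_weeks(phases):
    """Return (shortest, longest) total duration in weeks for sequential phases."""
    shortest = sum(low for low, _ in phases.values())
    longest = sum(high for _, high in phases.values())
    return shortest, longest


shortest, longest = total_weeks(PHASES)
print(
    f"{shortest}-{longest} weeks, about "
    f"{shortest / WEEKS_PER_MONTH:.1f}-{longest / WEEKS_PER_MONTH:.1f} months"
)
# prints: 32-60 weeks, about 7.4-13.8 months
```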
Each of these phases generates evidence that must be preserved with integrity, linked to specific findings, attributed to specific actors, and available for regulatory review years later. The evidence chain must be unbroken from the initial threat intelligence report through to the final remediation verification.
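One common way to make such a chain tamper-evident is a hash chain, where each evidence record commits to the hash of the record before it. The following is a minimal sketch of that idea, not a description of any particular platform: the `EvidenceChain` class, its field names, and the sample identifiers are all hypothetical, and a real system would add signatures, access control, and durable storage.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _hash_record(body: dict) -> str:
    """Hash a record using a canonical JSON encoding so the result is stable."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class EvidenceChain:
    """Append-only log; each record includes the hash of the previous record."""
    records: list = field(default_factory=list)

    def append(self, finding_id: str, actor: str, content: bytes, collected_at: str) -> dict:
        prev_hash = self.records[-1]["record_hash"] if self.records else GENESIS
        body = {
            "finding_id": finding_id,      # links evidence to a specific finding
            "actor": actor,                # attributes evidence to who collected it
            "collected_at": collected_at,
            "content_sha256": hashlib.sha256(content).hexdigest(),
            "prev_hash": prev_hash,
        }
        body["record_hash"] = _hash_record(body)  # computed before the key is added
        self.records.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered record breaks the chain."""
        prev = GENESIS
        for rec in self.records:
            body = {k: v for k, v in rec.items() if k != "record_hash"}
            if rec["prev_hash"] != prev or _hash_record(body) != rec["record_hash"]:
                return False
            prev = rec["record_hash"]
        return True


chain = EvidenceChain()
chain.append("F-001", "red-team", b"attack path notes", "2025-01-15T09:00:00Z")
chain.append("F-001", "white-team", b"detection timeline", "2025-02-03T14:30:00Z")
print(chain.verify())
```

Because each record embeds the previous record's hash, changing any earlier record invalidates every later one, which is what lets a reviewer check the chain years afterwards without trusting the storage layer.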
Why Manual Orchestration Fails at Scale
Consider what managing this lifecycle looks like with traditional tools.
The white team maintains a shared folder with restricted access. Test plans live in Word documents. Evidence is scattered across file shares, email attachments, and the red team's reporting portal. Findings are tracked in a spreadsheet. Remediation assignments are communicated via email. Status updates require manually polling each workstream owner. The audit pack at the end is assembled by hand from multiple sources, with no guarantee of completeness.
Now multiply this by the DORA requirement: under Art. 26(1), identified entities "shall carry out at least every 3 years advanced testing by means of TLPT," and under Art. 26(2) each test must cover several or all critical or important functions. For a large institution with numerous critical functions, this can mean multiple concurrent or overlapping TLPT engagements, each with its own intelligence report, red team, evidence chain, and remediation pipeline.
At this scale, manual orchestration does not just become difficult. It becomes a compliance risk in itself. If you cannot demonstrate a complete, integrity-verified evidence chain for each TLPT engagement, the testing programme itself becomes a finding.
Purpose-built operational resilience platforms orchestrate the entire testing lifecycle from campaign creation through evidence collection to deviation tracking and closure, with every step producing an auditable, integrity-verified record. The campaign workflow enforces the governance gates: test plan approval, readiness checks, evidence completeness validation, and 4-eyes closure (separation of duties: the person who executes cannot be the person who signs off). Findings automatically create deviations with assigned owners and SLA-driven remediation timelines. Evidence is stored with SHA-256 integrity verification, so any alteration after collection is detectable. The entire chain is exportable as an audit pack in seconds, not days.
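To make the gate sequence and the 4-eyes rule concrete, here is a small sketch of how such a workflow could be enforced in code. It is an illustration under assumptions, not any vendor's implementation: the `Campaign` class, the gate names, and the user names are hypothetical.

```python
from dataclasses import dataclass, field

# Gate order follows the paragraph above; the identifiers are illustrative.
GATES = ["plan_approved", "readiness_checked", "evidence_complete", "closed"]


class GateError(Exception):
    """Raised when a gate is taken out of order or separation of duties is violated."""


@dataclass
class Campaign:
    name: str
    executor: str                       # the person who runs the campaign
    passed: list = field(default_factory=list)

    def pass_gate(self, gate: str, approver: str) -> None:
        # Gates must be passed strictly in order; none may be skipped.
        expected = GATES[len(self.passed)] if len(self.passed) < len(GATES) else None
        if gate != expected:
            raise GateError(f"expected gate {expected!r}, got {gate!r}")
        # 4-eyes closure: the executor cannot also sign off.
        if gate == "closed" and approver == self.executor:
            raise GateError("4-eyes rule: executor cannot sign off closure")
        self.passed.append(gate)


campaign = Campaign("tlpt-cycle-1", executor="alice")
campaign.pass_gate("plan_approved", approver="bob")
campaign.pass_gate("readiness_checked", approver="bob")
campaign.pass_gate("evidence_complete", approver="bob")
campaign.pass_gate("closed", approver="carol")  # a different person signs off
```

The point of encoding the gates rather than documenting them is that a skipped step or a self-approved closure fails loudly at the time it happens instead of surfacing during an audit.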
The Remediation Trap
TLPT findings are not like traditional pentest findings. They tend to be deeper, more systemic, and harder to fix.
A traditional pentest might find a missing patch, a misconfigured firewall rule, or an SQL injection in a web application. These are technical findings with clear, bounded remediation paths.
TLPT findings often reveal organizational weaknesses: the SOC did not detect lateral movement for an extended period. The escalation procedure broke down when the incident crossed departmental boundaries. The business continuity plan assumed an infrastructure failure, not a sophisticated adversary. The third-party SIEM vendor's detection rules did not cover the attack technique used. These findings touch Art. 10 (detection capabilities), Art. 11 (response and recovery), and Art. 13 (lessons learned) simultaneously.
These findings require remediation that spans teams, budgets, and quarters. They require corrective action plans with milestones, evidence of progress, and verification that the remediation actually works. They require someone to own the finding through to closure, which might be 12 months away.
Tracking this remediation lifecycle through email and spreadsheets is where institutions lose control. Findings slip through the cracks. Remediation timelines extend without accountability. Evidence of fixes is not collected. By the time the next TLPT cycle begins, the institution cannot demonstrate that it addressed the findings from the previous cycle. That outcome is particularly damaging given that Art. 26(6) requires the institution to provide the authority with a summary of the relevant findings and the remediation plans.
This is why deviation management, with enforced ownership, SLA tracking, evidence requirements, and retest discipline, is not optional for TLPT. It is the mechanism that turns testing insights into measurable resilience improvement. See our glossary entry on TLPT for a detailed breakdown of the regulatory definition.
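The four elements named above (ownership, SLA tracking, evidence requirements, retest discipline) map naturally onto a small data structure. The sketch below is illustrative only: the `Deviation` class and the SLA windows per severity are hypothetical values, not figures from DORA or TIBER-EU, and should be replaced by your own policy.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical SLA windows in days per severity; set these from your own policy.
SLA_DAYS = {"critical": 30, "high": 90, "medium": 180}


@dataclass
class Deviation:
    finding_id: str
    owner: str
    severity: str
    opened: date
    evidence: list = field(default_factory=list)  # references to remediation evidence
    retest_passed: bool = False

    @property
    def due(self) -> date:
        """Remediation deadline derived from the severity's SLA window."""
        return self.opened + timedelta(days=SLA_DAYS[self.severity])

    def can_close(self) -> bool:
        # Closure needs an owner, at least one evidence item, and a passed retest.
        return bool(self.owner) and bool(self.evidence) and self.retest_passed

    def is_overdue(self, today: date) -> bool:
        # A deviation that meets its closure conditions is never reported as overdue.
        return not self.can_close() and today > self.due


dev = Deviation("F-007", owner="soc-lead", severity="high", opened=date(2025, 1, 10))
print(dev.due, dev.is_overdue(date(2025, 4, 11)), dev.can_close())
```

Keeping the closure conditions in one method means a finding cannot be marked done on the strength of an email alone; evidence and a retest result have to exist first.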
The Cost of Unpreparedness
Institutions that approach TLPT without adequate preparation face several risks beyond compliance findings:
Budget overruns. TLPT engagements involve significant expenditure for threat intelligence, red team services, and regulatory coordination. The cost varies considerably by institution size, scope complexity, and market (threat intelligence providers and accredited red teams command premium rates). Without structured orchestration, scope creep and extended timelines increase costs materially.
Reputational risk. TLPT findings are shared with the competent authority. Institutions that demonstrate weak detection, poor response, or inadequate remediation face reputational consequences in their supervisory relationship.
Cascading findings. TLPT findings often trigger secondary compliance gaps: a detection failure reveals an Art. 10 gap; an inadequate response reveals an Art. 11 gap; a missing post-incident review reveals an Art. 13 gap. Each cascading finding requires its own remediation cycle.
Preparing for TLPT
For institutions preparing for their first TLPT engagement, the priorities are:
- Establish the white team early. Identify 2-4 senior people who can manage the exercise with strict confidentiality. They need authority to make decisions under pressure and familiarity with Art. 26 requirements.
- Invest in the governance infrastructure. Before the first red team engagement begins, you need the platform infrastructure to manage evidence, track findings, enforce remediation, and produce audit packs. Building this during the engagement is too late.
- Practice the coordination. Run a tabletop TLPT exercise before the real one. Walk through the lifecycle: intelligence, planning, testing, evidence collection, purple team, remediation, closure. Identify where your processes break down.
- Plan for remediation capacity. TLPT engagements typically produce multiple significant findings that require months of remediation effort. Budget the engineering capacity before you start, not after you see the findings.
- Engage the competent authority early. Regulatory engagement is part of the TLPT process, starting with scope validation under Art. 26(2). Building a constructive relationship with your authority before the formal engagement begins makes the entire process smoother.
TLPT is not optional for entities identified by competent authorities under Article 26. It is not a box to check. It is a rigorous, regulatory-supervised exercise that will reveal things about your organization that you did not know. The question is not whether to do it. The question is how you orchestrate it.