9.16. Advisory TFV-16 (CVE-2026-0995)
| Title | SME erratum in C1-Pro means memory accesses from the SME unit can remain outstanding after another CPU issues TLBI+DSB |
|---|---|
| CVE ID | CVE-2026-0995 |
| Date | Reported on 23 September 2025 |
| Versions Affected | TF-A version from v2.10 onwards |
| Configurations Affected | Arm C1-Pro prior to r1p2 |
| Impact | SME can access memory after it has been re-allocated, potentially overwriting the new owner’s data. |
| Fix Version | Gerrit topic #gr/CVE-2026-0995. Also see mitigation guidance in the Official Arm Advisory. |
| Credit | Arm |
9.16.1. C1-Pro / CME CVE-2026-0995 Workaround
9.16.2. Overview
C1-Pro/CME CVE-2026-0995 affects CPUs implementing SME and
Streaming mode. Under specific micro-architectural conditions, a
TLBI + DSB sequence performed on one CPU (PE1) may not guarantee
completion of certain in-flight memory accesses performed on another CPU
(PE0). As a result, those accesses may complete after translation
changes have taken effect, potentially resulting in memory accesses
outside the expected translation or privilege boundaries.
To ensure architectural correctness, all affected CPUs must execute a
local DSB whenever any CPU performs TLB maintenance. The TF-A
workaround provides a coordinated EL3 mechanism that guarantees this
synchronisation across all online C1-Pro CPUs.
9.16.3. Erratum Status
This erratum applies to C1-Pro multi-core configurations with NUM_CME != 0 and (per TF-A runtime checks) to C1-Pro revisions up to r1p2 (inclusive).
The erratum is mitigated in software through EL3
coordination.
9.16.4. Erratum Description
TLBI + DSB might fail to ensure completion of memory accesses caused by FP/SIMD, SVE, and SME instructions while in Streaming mode, and by LDR/STR ZA/ZT0 instructions.
9.16.5. Description
A TLBI + DSB sequence executed on PE1 may fail to ensure
completion of some memory accesses on PE0 associated with:
LDR/STR to or from ZA or ZT0.
FP/SIMD, SVE, or SME memory accesses while in Streaming mode.
9.16.6. Configurations Affected
The erratum affects:
All multi-core configurations
Where NUM_CME != 0 (i.e., systems using a CME complex)
9.16.7. Conditions
The erratum occurs when all of the following are true:
PE0 executes:
* LDR/STR to/from ZA or ZT0, or
* Memory accesses tied to FP/SIMD, SVE, or SME execution while in Streaming mode.
PE1 performs:
* A TLB invalidate instruction affecting a page used by PE0’s memory accesses, followed by
* A DSB instruction.
Complex micro-architectural timing conditions occur.
9.16.8. Implications
If the above conditions are met:
The DSB on PE1 may complete before certain affected memory accesses on PE0, even though those accesses are architecturally in scope for PE1’s TLBI.
As a result, stale or incomplete memory accesses may occur after translation changes have taken effect.
This can lead to memory being accessed outside the expected translation or privilege boundaries, depending on the software context.
9.16.9. Enabling the Workaround in TF-A
Support for CVE-2026-0995 is build-time selectable and must be enabled by the platform.
To enable the workaround, the platform must:
Set WORKAROUND_CVE_2026_0995=1.
Include the C1-Pro workaround source in BL31_SOURCES.
Include the CPU service sources used by the SMC interface.
For example, a platform that contains affected C1-Pro CPUs should add:
WORKAROUND_CVE_2026_0995 := 1
ifeq (${WORKAROUND_CVE_2026_0995},1)
BL31_SOURCES += lib/cpus/aarch64/c1_pro_pubsub.c \
${CPU_SVC_SRCS}
endif
If the option is not enabled, or if the platform does not include the workaround source and CPU service sources, TF-A will not mitigate this erratum.
9.16.10. Why the Workaround Must Be Implemented in EL3
Due to the interaction of Linux, pKVM, and GIC security states:
The non-secure world cannot reliably deliver SGIs without races.
A CPU entering CPU_OFF may be unable to receive SGIs without violating PSCI rules.
Interrupt masking and CPU power-down sequences may interfere with SGI delivery.
EL3 is the only domain capable of:
Issuing secure SGIs.
Tracking CPU on/off and suspend/resume transitions.
Ensuring correct ordering during secure world entry/exit.
Avoiding SGI interference during CPU power-down.
9.16.11. Workaround Mechanism
The TF-A workaround ensures that:
Every affected CPU performs a local DSB whenever another CPU performs TLB maintenance.
A coordinated, EL3-managed SGI rendezvous ensures all online C1-Pro CPUs participate.
The mechanism uses:
A global atomic counter incremented by the SMC caller.
Local counters on each CPU tracking participation in each epoch.
Secure SGIs sent to all active C1-Pro CPUs.
The reference implementation uses the EL3 secure SGI number ARM_IRQ_SEC_SGI_6.
A wait-for-completion loop ensuring all CPUs have executed the mitigation before returning from SMC.
9.16.12. Global Counter and Ordering
Each SMC caller performs:
atomic_inc_return(global_counter)
A barrier ensuring visibility before SGIs: dmbish
The returned counter value is treated as that caller’s deadline: all CPUs must update their local counter to at least this value.
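The increment-then-barrier step can be sketched in portable C11. This is only an illustration under stated assumptions: `take_deadline` is a hypothetical name, the counter is a plain C11 atomic rather than TF-A's own data structure, and `atomic_thread_fence` stands in for the `dmbish` barrier named above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the workaround's global epoch counter. */
static _Atomic uint64_t global_counter;

/*
 * Increment the global counter and return the new value, which the
 * caller then treats as its deadline.
 */
static uint64_t take_deadline(void)
{
	/* atomic_fetch_add returns the old value; add 1 for "inc_return". */
	uint64_t deadline = atomic_fetch_add_explicit(&global_counter, 1,
						      memory_order_relaxed) + 1;

	/* A 64-bit epoch counter is assumed never to wrap back to zero. */
	assert(deadline != 0);

	/* Portable stand-in for dmbish: order the increment before SGIs. */
	atomic_thread_fence(memory_order_seq_cst);

	return deadline;
}
```

Because every caller receives a distinct, monotonically increasing value, two concurrent callers never share a deadline, and a CPU whose local counter has reached the later value has also satisfied the earlier one.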
9.16.13. SGI Rendezvous
After incrementing the counter:
The caller sends a secure SGI to all online C1-Pro CPUs.
Receivers execute:
* A local DSB (the mitigation)
* Update their local counter from the global counter
The caller waits until all CPUs have reached its deadline.
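The rendezvous can be modelled in a few lines of single-threaded C, with SGI delivery replaced by a direct function call. This is a sketch only: `NUM_CPUS`, `sgi_receive`, and `rendezvous_done` are hypothetical names, and the real handler runs in EL3 interrupt context on each target CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPUS 4U /* hypothetical core count for this model */

static uint64_t global_counter;          /* epoch counter */
static uint64_t local_counter[NUM_CPUS]; /* per-CPU progress */

/* Receiver side: what a CPU does when it takes the secure SGI. */
static void sgi_receive(unsigned int cpu)
{
	assert(cpu < NUM_CPUS);
	/* On hardware, the local DSB (the mitigation) would execute here. */
	local_counter[cpu] = global_counter;
}

/* Caller side: true once every CPU in target_mask has reached deadline. */
static bool rendezvous_done(uint64_t target_mask, uint64_t deadline)
{
	for (unsigned int cpu = 0; cpu < NUM_CPUS; cpu++) {
		if ((target_mask & (UINT64_C(1) << cpu)) != 0 &&
		    local_counter[cpu] < deadline) {
			return false;
		}
	}
	return true;
}
```

In this model a receiver copies the global counter rather than the caller's deadline, so a single SGI can satisfy several callers whose requests arrived close together.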
9.16.14. Handling PSCI CPU_OFF Races
SGIs may race with CPUs powering down.
To avoid violating PSCI semantics:
The caller sends a secure SGI to all online C1-Pro CPUs.
Receivers execute the local mitigation sequence, which:
* May include a DSB if required by architectural configuration (for example, when SCTLR_EL3.IESB is not set).
* Updates the local counter from the global counter.
The caller waits until all CPUs have reached its deadline value.
9.16.15. Tracking Active C1-Pro CPUs
TF-A maintains:
A per-core signed bytemap tracking whether a C1-Pro CPU is currently active (reference count 0/1).
Per-CPU MPIDRs for SGI targeting.
The implementation supports up to 64 cores (PLATFORM_CORE_COUNT <= 64);
the SMC handler uses a 64-bit mask to track which CPUs were sent SGIs.
EL3 updates this information on:
psci_cpu_on_finish
psci_cpu_off_finish
psci_suspend_pwrdown_start/finish
cm_entering_secure_world/exited_secure_world
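The bookkeeping described here can be illustrated as follows. This is a minimal sketch, not the TF-A implementation: `active_map`, `set_active`, and `sgi_target_mask` are hypothetical names, the core count of 8 is an assumed value, and whether the real code excludes the calling CPU from the mask is not stated in this advisory.

```c
#include <assert.h>
#include <stdint.h>

#define PLATFORM_CORE_COUNT 8U /* assumed value; must not exceed 64 */

_Static_assert(PLATFORM_CORE_COUNT <= 64U, "SGI mask is 64 bits wide");

/* 1 while the core is an active C1-Pro CPU, 0 otherwise. */
static int8_t active_map[PLATFORM_CORE_COUNT];

/* Called from the power-state and world-switch hooks in this model. */
static void set_active(unsigned int core, int8_t active)
{
	assert(core < PLATFORM_CORE_COUNT);
	active_map[core] = active;
}

/* Build the 64-bit mask of cores that must take part in a rendezvous. */
static uint64_t sgi_target_mask(void)
{
	uint64_t mask = 0;

	for (unsigned int core = 0; core < PLATFORM_CORE_COUNT; core++) {
		if (active_map[core] > 0) {
			mask |= UINT64_C(1) << core;
		}
	}
	return mask;
}
```

Recording which cores were actually sent an SGI lets the caller wait only on those cores, so a CPU that powers down after the mask is built does not stall the rendezvous indefinitely.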
9.16.16. Memory Ordering Requirements
Strong ordering is required to make the SGI rendezvous reliable across cores and across PSCI/world-switch transitions:
dsb()
Drains outstanding memory accesses. On C1-Pro, the workaround performs a local dsb only when SCTLR_EL3.IESB is not set; if SCTLR_EL3.IESB is set, taking an exception to EL3 is sufficient and the explicit dsb can be skipped.
dmbish
Ensures ordering/visibility between:
* the atomic increment of global_counter and subsequent SGI delivery, and
* updates to the active-core bytemap before using it to decide who must participate in a rendezvous.
isb
Used in the SGI handler to ensure the counter load occurs after interrupt acknowledge (prevents stale speculative loads if SGIs merge).
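The IESB-dependent choice can be expressed as a small predicate over the register value. This is a sketch under assumptions: `needs_explicit_dsb` is a hypothetical helper, the register value is passed in as a plain integer instead of being read with a system-register accessor, and the bit position (21) follows the Arm architecture definition of SCTLR_EL3.IESB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SCTLR_EL3.IESB (implicit error synchronization event) is bit 21. */
#define SCTLR_EL3_IESB_BIT (UINT64_C(1) << 21)

/*
 * Return true when the SGI handler must issue an explicit DSB, that is,
 * when IESB is clear and exception entry alone is not relied upon.
 */
static bool needs_explicit_dsb(uint64_t sctlr_el3)
{
	return (sctlr_el3 & SCTLR_EL3_IESB_BIT) == 0;
}
```

On real hardware the result would gate a `dsb` instruction in the handler; here it only shows the decision.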
9.16.17. Summary
CVE-2026-0995 allows stale SME/SIMD/SVE memory accesses to persist beyond TLBI + DSB, violating architectural expectations and potentially compromising security.
The TF-A mitigation:
Ensures every active C1-Pro CPU drains outstanding accesses.
Coordinates CPUs via secure SGIs and atomic counters.
Handles PSCI races safely in EL3.
Is enabled at build time via:
WORKAROUND_CVE_2026_0995=1
The result is a robust, race-free mitigation suitable for systems deploying any security-sensitive workloads.
9.16.18. OS Coordination Requirement
Full mitigation of CVE-2026-0995 requires coordinated updates in both:
Trusted Firmware-A (EL3) — this advisory
The Operating System
The EL3 implementation provides a secure SGI-based rendezvous mechanism. The Operating System must invoke the defined SMC interface when performing affected TLB maintenance operations to ensure architectural ordering across all C1-Pro CPUs.
Deploying only one side of the mitigation is insufficient to guarantee architectural ordering and full protection.
For further information, affected CPUs, and detailed guidance, refer to the full Official Arm Advisory.