My article was originally scheduled for release on December 17, 2025, but it was released ahead of schedule!
Preface: PCIe is one of several protocols that can be used to transfer data across a SerDes.
SerDes (Serializer/Deserializer) blocks are critical physical-layer components in both PCIe Gen 5 and Gen 2, converting between parallel data and a high-speed serial stream. Gen 5 runs at 32 GT/s per lane, 6.4 times Gen 2's 5 GT/s. With NRZ signaling, that puts the Nyquist frequency at 16 GHz rather than 2.5 GHz, so Gen 5 links need more advanced signal-integrity measures (such as stronger equalization) and, on longer channels, retimers. Gen 5 remains backward compatible with earlier generations, which eases integration and matters for modern AI/ML and data-intensive applications.
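As a rough cross-check of the per-lane numbers, the sketch below computes the Nyquist frequency and usable bit rate for each generation. It assumes NRZ signaling (one bit per transfer) and the standard line encodings, 8b/10b for Gen 2 and 128b/130b for Gen 5, which the text above does not state. The function and dictionary names are illustrative only.

```python
# Per-lane, per-direction figures for PCIe Gen 2 and Gen 5.
# Assumption: NRZ signaling, so the Nyquist frequency is half the transfer rate.
GENERATIONS = {
    "Gen 2": {"rate_gt_s": 5.0, "payload_bits": 8, "total_bits": 10},      # 8b/10b
    "Gen 5": {"rate_gt_s": 32.0, "payload_bits": 128, "total_bits": 130},  # 128b/130b
}


def lane_figures(rate_gt_s, payload_bits, total_bits):
    """Return (Nyquist frequency in GHz, usable Gb/s) for a single lane."""
    nyquist_ghz = rate_gt_s / 2
    usable_gbps = rate_gt_s * payload_bits / total_bits
    return nyquist_ghz, usable_gbps


for name, params in GENERATIONS.items():
    nyquist, usable = lane_figures(**params)
    print(f"{name}: {params['rate_gt_s']} GT/s, "
          f"Nyquist {nyquist} GHz, about {usable:.2f} Gb/s usable")
```

Under these assumptions Gen 2 works out to 2.5 GHz and 4 Gb/s per lane, and Gen 5 to 16 GHz and roughly 31.5 Gb/s per lane, before any packet-level protocol overhead.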
Background: Recent AMD advisories (CVE-2025-9612/9613/9614) highlight vulnerabilities in PCIe Integrity and Data Encryption (IDE) and IOCTL-based SERDES control. These flaws allow attackers with low-level or physical access to manipulate PCIe traffic or exploit driver logic, potentially leading to:
- Data Integrity Violations
- Privilege Escalation
- Denial of Service (DoS)
Google’s Ironwood TPU architecture uses PCIe Gen5 (host) and Gen2 (gBMC) links, both relying on SERDES for high-speed signaling. If TPU drivers expose IOCTL interfaces for SERDES configuration without strict validation, similar risks apply:
- Direct Hardware Access: IOCTLs can alter PLL settings, lane equalization, and data rates.
- Privilege Escalation: Weak access control could let non-root processes manipulate SERDES.
- Kernel Attack Surface: Complex IOCTL handlers may enable buffer overflow or arbitrary code execution.
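To make "strict validation" concrete, here is a minimal user-space sketch of how a handler might parse an IOCTL argument defensively. The argument layout, the `SERDES_ARG` name, and the 16-lane limit are all hypothetical, invented for illustration; they are not taken from any real TPU or AMD driver. The point is only that the payload size is checked against a fixed layout before any field is read, and that indices are range-checked before use.

```python
import struct

# Hypothetical argument layout (not from any real driver):
# lane index (u8), 3 padding bytes, data rate in MT/s (u32), little-endian.
SERDES_ARG = struct.Struct("<B3xI")


class InvalidRequest(ValueError):
    """Raised when an IOCTL payload fails validation."""


def parse_serdes_arg(payload: bytes, num_lanes: int = 16):
    """Parse a fixed-size payload, rejecting anything malformed."""
    # Reject short or oversized buffers before touching their contents.
    if len(payload) != SERDES_ARG.size:
        raise InvalidRequest(
            f"expected {SERDES_ARG.size} bytes, got {len(payload)}"
        )
    lane, rate_mt_s = SERDES_ARG.unpack(payload)
    # Range-check the lane index before it is used to select hardware state.
    if lane >= num_lanes:
        raise InvalidRequest(f"lane {lane} out of range (0..{num_lanes - 1})")
    return lane, rate_mt_s
```

For example, `parse_serdes_arg(SERDES_ARG.pack(3, 32000))` returns the lane and rate, while a payload with extra trailing bytes raises `InvalidRequest` instead of being copied into a fixed-size buffer.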
Impact on Cloud TPU Deployments: Ironwood's scale (9,216 chips per pod, 1.77 PB of shared memory, and 9.6 Tb/s of interconnect bandwidth) amplifies the consequences of link instability or compromised isolation. Multi-tenant environments increase exposure to insider or advanced persistent threats.
Recommended Mitigations
- Restrict IOCTL Access: Enforce root-only privileges for SERDES configuration commands, and implement capability checks for sensitive operations (e.g., SERDES_SET_DATA_RATE).
- Validate Input Parameters: Enforce strict bounds on PLL, lane equalization, and data rate settings.
- Apply PCIe IDE Erratum #1 Guidance: Update firmware to address the IDE vulnerabilities per PCI-SIG recommendations.
- Enable Hardware Security Features: Use the IOMMU and apply Zero-Trust principles to PCIe endpoints.
- Continuous Monitoring: Deploy anomaly detection for PCIe link training and SERDES state changes.
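The access-control, input-validation, and monitoring items above can be sketched in a few lines. This is a user-space illustration under stated assumptions, not driver code: the `Caller` type, the choice of `CAP_SYS_ADMIN` as the required capability, the allowed-rate table, the preset range, and the retrain threshold are all hypothetical values chosen for the example.

```python
from collections import deque
from dataclasses import dataclass

# Assumed policy values, for illustration only.
ALLOWED_RATES_GT_S = (2.5, 5.0, 8.0, 16.0, 32.0)  # per-lane rates, Gen 1 to Gen 5
EQ_PRESETS = range(0, 11)                          # assumed preset indices 0..10


@dataclass(frozen=True)
class Caller:
    uid: int
    capabilities: frozenset


def check_set_data_rate(caller: Caller, rate_gt_s: float, eq_preset: int) -> None:
    """Authorize and validate a hypothetical SERDES_SET_DATA_RATE request."""
    # Root-only, plus an explicit capability check (capability name is assumed).
    if caller.uid != 0 or "CAP_SYS_ADMIN" not in caller.capabilities:
        raise PermissionError("SERDES configuration requires a privileged caller")
    # Accept only known-good values rather than arbitrary numbers.
    if rate_gt_s not in ALLOWED_RATES_GT_S:
        raise ValueError(f"unsupported data rate: {rate_gt_s} GT/s")
    if eq_preset not in EQ_PRESETS:
        raise ValueError(f"equalization preset out of range: {eq_preset}")


class RetrainMonitor:
    """Flag a link that retrains more than max_events times within window_s."""

    def __init__(self, max_events: int = 3, window_s: float = 60.0):
        self.max_events = max_events
        self.window_s = window_s
        self._events = deque()

    def record(self, timestamp_s: float) -> bool:
        """Record one retrain event; return True if the rate looks anomalous."""
        self._events.append(timestamp_s)
        # Drop events that have aged out of the sliding window.
        while timestamp_s - self._events[0] > self.window_s:
            self._events.popleft()
        return len(self._events) > self.max_events
```

Using an allow-list of rates instead of a numeric range means a request for an in-between value is rejected outright. The sliding-window counter is about the simplest possible anomaly signal; a production system would likely correlate it with other link-state telemetry.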