Do PCIe IOCTL and SERDES pose a risk to Google Ironwood TPU? (15th Dec 2025)

Originally scheduled for release on December 17, 2025, this article was released ahead of schedule!

Preface: PCIe is one of several protocols that can carry data across a SerDes.

SerDes (Serializer/Deserializer) blocks are critical physical-layer components in both PCIe Gen 5 and Gen 2: they convert parallel data into a high-speed serial stream and back. Gen 5 runs at 32 GT/s per lane, 6.4 times Gen 2’s 5 GT/s. With NRZ signaling, that puts the Nyquist frequency at 16 GHz for Gen 5 versus 2.5 GHz for Gen 2, so Gen 5 links need more advanced signal-integrity techniques (such as DSP and error correction) and, on longer channels, sophisticated retimers. Gen 5 also remains backward compatible with earlier generations, which eases integration in modern AI/ML and other data-intensive systems.
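To make the gap between the two generations concrete, here is a small sketch of the per-lane arithmetic. It assumes the standard line encodings (8b/10b for Gen 2, 128b/130b for Gen 5) and NRZ signaling, where the Nyquist frequency is half the transfer rate. The function and variable names are illustrative only.

```python
# Per-lane figures for PCIe Gen 2 and Gen 5, derived from the raw transfer
# rate and the line encoding each generation uses.
GENERATIONS = {
    # name: (raw rate in GT/s, payload bits per block, total bits per block)
    "Gen 2": (5.0, 8, 10),      # 8b/10b encoding
    "Gen 5": (32.0, 128, 130),  # 128b/130b encoding
}


def lane_figures(raw_gt_s, payload_bits, block_bits):
    """Return (usable Gb/s, usable GB/s, Nyquist GHz) for one lane, one direction."""
    usable_gbps = raw_gt_s * payload_bits / block_bits
    usable_gbytes = usable_gbps / 8
    # NRZ carries one bit per symbol, so the Nyquist frequency is half the rate.
    nyquist_ghz = raw_gt_s / 2
    return usable_gbps, usable_gbytes, nyquist_ghz


for name, params in GENERATIONS.items():
    gbps, gbytes, nyquist = lane_figures(*params)
    print(f"{name}: {gbps:.2f} Gb/s usable, {gbytes:.3f} GB/s, Nyquist {nyquist:.1f} GHz")
# Gen 2: 4.00 Gb/s usable, 0.500 GB/s, Nyquist 2.5 GHz
# Gen 5: 31.51 Gb/s usable, 3.938 GB/s, Nyquist 16.0 GHz
```

These figures ignore protocol overhead above the physical layer (packet headers, flow control), so real payload throughput is somewhat lower.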

Background: Recent AMD advisories (CVE-2025-9612/9613/9614) highlight vulnerabilities in PCIe Integrity and Data Encryption (IDE) and IOCTL-based SERDES control. These flaws allow attackers with low-level or physical access to manipulate PCIe traffic or exploit driver logic, potentially leading to:

  • Data Integrity Violations
  • Privilege Escalation
  • Denial of Service (DoS)

Google’s Ironwood TPU architecture uses PCIe Gen5 (host) and Gen2 (gBMC) links, both relying on SERDES for high-speed signaling. If TPU drivers expose IOCTL interfaces for SERDES configuration without strict validation, similar risks apply:

  • Direct Hardware Access: IOCTLs can alter PLL settings, lane equalization, and data rates.
  • Privilege Escalation: Weak access control could let non-root processes manipulate SERDES.
  • Kernel Attack Surface: Complex IOCTL handlers may enable buffer overflow or arbitrary code execution.

Impact on Cloud TPU Deployments: Ironwood’s scale—9,216 chips per pod, 1.77 PB shared memory, and 9.6 Tb/s interconnect bandwidth—amplifies the consequences of link instability or compromised isolation. Multi-tenant environments increase exposure to insider or advanced persistent threats.

Recommended Mitigations

Restrict IOCTL Access: Enforce root-only privileges for SERDES configuration commands, and implement capability checks for sensitive operations (e.g., SERDES_SET_DATA_RATE).
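The access-control idea can be sketched as a default-deny authorization step that runs before any command is dispatched. This is a user-space model in Python, not driver code: SERDES_SET_DATA_RATE is the name used in this article, while the other command names, the CAP_SERDES_ADMIN capability, and the Caller type are hypothetical and do not describe any real TPU driver.

```python
from dataclasses import dataclass

# SERDES_SET_DATA_RATE comes from the article; every other name here is
# hypothetical and used only for illustration.
SERDES_GET_STATUS = "SERDES_GET_STATUS"
SERDES_SET_DATA_RATE = "SERDES_SET_DATA_RATE"
SERDES_SET_EQUALIZATION = "SERDES_SET_EQUALIZATION"
CAP_SERDES_ADMIN = "CAP_SERDES_ADMIN"  # hypothetical capability name

READ_ONLY_COMMANDS = {SERDES_GET_STATUS}
PRIVILEGED_COMMANDS = {SERDES_SET_DATA_RATE, SERDES_SET_EQUALIZATION}


@dataclass(frozen=True)
class Caller:
    uid: int
    capabilities: frozenset = frozenset()


def authorize(caller, command):
    """Raise unless `caller` may issue `command`; unknown commands are rejected."""
    if command in READ_ONLY_COMMANDS:
        return
    if command not in PRIVILEGED_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    if caller.uid != 0:
        raise PermissionError(f"{command} requires root")
    if CAP_SERDES_ADMIN not in caller.capabilities:
        raise PermissionError(f"{command} requires {CAP_SERDES_ADMIN}")


operator = Caller(uid=0, capabilities=frozenset({CAP_SERDES_ADMIN}))
tenant = Caller(uid=1000)

authorize(operator, SERDES_SET_DATA_RATE)  # allowed: root with the capability
authorize(tenant, SERDES_GET_STATUS)       # allowed: read-only command
try:
    authorize(tenant, SERDES_SET_DATA_RATE)
except PermissionError as exc:
    print(f"denied: {exc}")
```

Requiring both root and a dedicated capability means a root process that has dropped the capability still cannot reconfigure the link, which narrows the set of processes able to reach the sensitive handlers.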

Validate Input Parameters: Ensure strict bounds on PLL, lane equalization, and data rate settings.
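A minimal sketch of what strict bounds could look like, again as a Python model instead of kernel code. The allowed data rates are the standard PCIe rates through Gen 5 and the preset range follows the PCIe P0 to P10 equalization presets. The lane limit, the function name, and the request layout are assumptions made for illustration.

```python
# Standard PCIe transfer rates in GT/s, Gen 1 through Gen 5.
PCIE_RATES_GT_S = (2.5, 5.0, 8.0, 16.0, 32.0)
MAX_LANES = 16                 # assumed link width, for illustration
EQ_PRESETS = range(0, 11)      # transmitter presets P0..P10


def validate_request(lane, rate_gt_s, eq_preset, link_max_gt_s=32.0):
    """Validate a hypothetical SerDes configuration request.

    Returns a normalized (lane, rate, preset) tuple or raises ValueError.
    """
    # `type(...) is int` also rejects bool, which is a subclass of int.
    if type(lane) is not int or not 0 <= lane < MAX_LANES:
        raise ValueError(f"lane out of range: {lane!r}")
    if rate_gt_s not in PCIE_RATES_GT_S:
        raise ValueError(f"unsupported data rate: {rate_gt_s!r}")
    if rate_gt_s > link_max_gt_s:
        raise ValueError(f"rate {rate_gt_s} exceeds link maximum {link_max_gt_s}")
    if type(eq_preset) is not int or eq_preset not in EQ_PRESETS:
        raise ValueError(f"equalization preset out of range: {eq_preset!r}")
    return lane, float(rate_gt_s), eq_preset


print(validate_request(0, 32.0, 4))  # accepted on a Gen 5 host link
try:
    # The same request on a Gen 2 link is rejected.
    validate_request(0, 32.0, 4, link_max_gt_s=5.0)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Checking against an allowlist of known-good values, and not only against a numeric range, rejects in-range but meaningless settings such as 7.3 GT/s.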

Apply PCIe IDE Erratum #1 Guidance: Update firmware to address IDE vulnerabilities per PCI-SIG recommendations.

Enable Hardware Security Features: Use IOMMU and Zero-Trust principles for PCIe endpoints.

Continuous Monitoring: Deploy anomaly detection for PCIe link training and SERDES state changes.
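One simple form of such monitoring is a sliding-window count of how often each link re-enters the LTSSM Recovery state. The sketch below assumes state-change events arrive as (timestamp, link ID, state) values from some telemetry source; the class name, thresholds, and link IDs are hypothetical, and a production detector would likely need richer signals than a single counter.

```python
from collections import defaultdict, deque


class RetrainMonitor:
    """Flag links that enter Recovery too often within a time window."""

    def __init__(self, window_s=60.0, max_retrains=3):
        self.window_s = window_s
        self.max_retrains = max_retrains
        self._events = defaultdict(deque)  # link_id -> Recovery timestamps

    def observe(self, timestamp, link_id, state):
        """Record a state change; return True if the link looks anomalous."""
        if state != "Recovery":
            return False
        events = self._events[link_id]
        events.append(timestamp)
        # Drop Recovery entries that have fallen out of the window.
        while events and timestamp - events[0] > self.window_s:
            events.popleft()
        return len(events) > self.max_retrains


monitor = RetrainMonitor(window_s=60.0, max_retrains=3)
for t in (0, 10, 20, 30):
    flagged = monitor.observe(t, "host-link-0", "Recovery")
    print(t, flagged)
# The fourth Recovery entry within 60 seconds exceeds the threshold of 3,
# so only the final call returns True.
```

Tracking each link separately keeps one noisy link from masking or triggering alerts on another.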

– End of article –
