The advent of Peripheral Component Interconnect Express (PCIe) 5.0 and related protocols such as Compute Express Link (CXL) underscores a trend in the data center industry toward heterogeneous computing topologies and computation-intensive workloads. Industry heavyweights and aspiring startups alike are developing semiconductor integrated circuits (ICs) to process an ever-increasing quantity of data. These purpose-built ICs have two overarching requirements in common: high bandwidth and low latency.
The low-latency requirement is fairly intuitive: when multiple processing elements work on the same data sets, they need a low-latency connection between them to maintain data coherency efficiently.
The high-bandwidth requirement is also an obvious one, but its implications for system design complexity and cost are far-reaching. By 2021, the number of workloads and compute instances in cloud data centers is estimated to nearly triple (2.7-fold) compared to 2016; over the same period, compute density (workloads and compute instances per physical server) will increase by 50%¹. With more servers deployed, each with greater compute density, the challenges of moving data cost-effectively between processor, networking and storage nodes are exploding, and signal integrity (SI) will be the primary pain point for these densely packed systems. Solving the SI problem will require a careful balance between signal retimers and low-loss printed circuit board (PCB) materials.
With Higher Bandwidth Comes Heartburn
Moving data from point A to point B in a server is no simple task. Figure 1 shows an 8 gigatransfers per second (GT/s) PCIe 3.0 server channel topology. A channel spanning 10 inches of motherboard, a 1-inch riser card and a 4-inch add-in card (AIC), built with mainstream PCB material (such as FR4 TU-862), can still meet the 22-dB insertion loss budget across temperature without requiring a signal retimer.
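As a rough illustration of how this budget math works, the sketch below totals the loss contributions of that hypothetical 15-inch Gen 3 channel. The per-inch trace loss and the lumped loss for packages, vias and the connector are illustrative assumptions chosen for the example, not values from the PCIe specification or a material datasheet.

```python
# Rough insertion-loss budget check for a PCIe 3.0 (8 GT/s) channel.
# All loss numbers are illustrative assumptions, not spec or datasheet values.

TRACE_LOSS_DB_PER_INCH = 1.0   # assumed mainstream FR4 loss near the 4-GHz Nyquist frequency
FIXED_LOSS_DB = 6.0            # assumed lump sum for packages, vias and the CEM connector
GEN3_BUDGET_DB = 22.0          # PCIe 3.0 end-to-end insertion-loss budget

segments_in = {"motherboard": 10.0, "riser card": 1.0, "add-in card": 4.0}

trace_loss = sum(segments_in.values()) * TRACE_LOSS_DB_PER_INCH
total_loss = trace_loss + FIXED_LOSS_DB

print(f"Trace loss: {trace_loss:.1f} dB, total loss: {total_loss:.1f} dB "
      f"(budget {GEN3_BUDGET_DB:.0f} dB)")
print("Within budget" if total_loss <= GEN3_BUDGET_DB else
      "Over budget -> retimer or lower-loss material needed")
```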
For 16-GT/s PCIe 4.0, the data transfer speed per lane doubles, and this same topology now exceeds Gen 4's 28-dB budget by 3 dB. Closing the gap requires either a retimer or a redriver (typically placed on the riser card), or a lower-loss PCB material. Read the blog post “PCI-Express Retimers vs. Redrivers: An Eye-Popping Difference” to understand the difference between retimers and redrivers.
For 32-GT/s PCIe 5.0, the speed doubles again. The same topology with mainstream PCB materials now violates Gen 5’s 36-dB budget by 16 dB when accounting for temperature and humidity effects (more on this later). In fact, the insertion loss budget is exceeded before the signal ever leaves the motherboard, and a retimer is necessary on the motherboard for any Card Electromechanical (CEM) slot greater than 5.5 inches away, with or without a riser card (see Table 1).
Figure 1: Signal integrity challenges of a common server topology and the use of retimers.
Table 1: PCIe channel reach for standard topologies using mid-loss PCB material.
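To see how quickly each speed bump consumes the budget, the sketch below checks the same hypothetical 15-inch topology against each generation's channel budget. The per-inch loss values are illustrative assumptions for mid-loss material at each generation's Nyquist frequency, picked only so the totals land in the neighborhood of the overages described above; they are not characterized material data.

```python
# Same hypothetical 15-inch topology checked against each generation's budget.
# Per-inch trace-loss values are illustrative assumptions for mid-loss PCB material.

CHANNEL_LENGTH_IN = 15.0   # 10" motherboard + 1" riser + 4" add-in card
FIXED_LOSS_DB = 6.0        # assumed packages, vias and connector losses

generations = {
    # name: (data rate in GT/s, channel budget in dB, assumed trace loss in dB/inch)
    "PCIe 3.0": (8, 22.0, 1.0),
    "PCIe 4.0": (16, 28.0, 1.7),
    "PCIe 5.0": (32, 36.0, 3.0),
}

for name, (rate, budget_db, db_per_inch) in generations.items():
    total_db = CHANNEL_LENGTH_IN * db_per_inch + FIXED_LOSS_DB
    margin_db = budget_db - total_db
    verdict = "OK" if margin_db >= 0 else "over budget -> retimer or better material"
    print(f"{name} ({rate} GT/s): {total_db:.1f} dB vs {budget_db:.0f} dB budget, "
          f"margin {margin_db:+.1f} dB ({verdict})")
```

Because a retimer re-transmits a clean signal, placing one mid-channel splits an over-budget channel into two shorter link segments, each of which is judged against the full budget; that is what makes the longer topologies in Table 1 workable at 32 GT/s.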
Upgrading to an ultra-low-loss PCB material (such as Megtron-6) is an option, but this can be an expensive proposition depending on board size, layer count and volume²,³. Even with ultra-low-loss material, many common topologies still exceed the total channel budget: multiconnector channels, captive channels longer than 14.9 inches in total, and standard CEM slot channels with more than 12.1 inches on the base board (see Table 2). In such cases, a retimer will still be necessary to ensure low error rates and robust link performance.
Table 2: PCIe 5.0 channel reach for different topologies using ultra-low-loss PCB material.
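Another way to read Table 2 is to solve for reach: given the Gen 5 budget, an assumed fixed loss and a material's per-inch loss, how much copper fits before a retimer becomes mandatory? The helper below does that back-of-the-envelope calculation for a captive (chip-to-chip) channel; the loss values are illustrative assumptions, not characterized FR4 or Megtron-6 data.

```python
# Back-of-the-envelope reach estimate: longest total trace length that still fits
# under the PCIe 5.0 budget. Fixed-loss and per-inch values are assumptions for a
# captive (chip-to-chip) channel with no CEM connector.

def max_reach_inches(budget_db: float, fixed_loss_db: float, trace_db_per_inch: float) -> float:
    """Maximum total trace length (in inches) that stays within the loss budget."""
    return max(0.0, (budget_db - fixed_loss_db) / trace_db_per_inch)

GEN5_BUDGET_DB = 36.0
FIXED_LOSS_DB = 9.0   # assumed: two package escapes plus vias

materials_db_per_inch = {
    "mid-loss PCB (assumed)": 3.0,
    "ultra-low-loss PCB (assumed)": 1.8,
}

for name, db_per_inch in materials_db_per_inch.items():
    reach = max_reach_inches(GEN5_BUDGET_DB, FIXED_LOSS_DB, db_per_inch)
    print(f"{name}: ~{reach:.1f} inches of trace before a retimer is required")
```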