Optical Computing

Designing Data Centers with Co-Packaged Optics and Disaggregated Resources

Explore the transition from pluggable transceivers to co-packaged optics to achieve terabit-per-second bandwidth and lower latency in cloud clusters.

Emerging Tech · Advanced · 12 min read

The Looming Electrical Wall in Data Center Interconnects

As data centers scale to meet the demands of large language model training and real-time inference, the physical limits of traditional copper interconnects have become a primary bottleneck. Electrical signals traveling through standard copper traces on a printed circuit board encounter significant resistance and capacitance, which degrades the signal and dissipates heat. The problem is most acute at the frequencies required for 112G and 224G signaling, where channel loss grows rapidly with trace length and the energy required to transmit each bit climbs accordingly.

The current architectural response relies on sophisticated serializer/deserializer (SerDes) components to manage signal integrity across these copper lanes. These components consume a large share of the total system power budget just to compensate for the loss of signal over a few inches of PCB. We are reaching a point where the energy spent moving data between chips rivals the energy spent processing it, an unsustainable trajectory for data center efficiency.

Python: Modeling Energy Cost per Bit

def calculate_link_efficiency(bit_rate_gbps, power_mw, link_length_mm):
    # Energy per bit in picojoules (pJ/bit)
    # A typical goal for optical is < 1 pJ/bit
    energy_per_bit = power_mw / bit_rate_gbps

    # Efficiency factor drops as link length increases for copper
    if link_length_mm > 100:
        efficiency_penalty = 1.5
    else:
        efficiency_penalty = 1.0

    return energy_per_bit * efficiency_penalty

# Scenario: traditional electrical SerDes at 112 Gbps
print(f'Energy/Bit: {calculate_link_efficiency(112, 560, 150):.2f} pJ/bit')

To solve this problem, we must shift from electrons to photons for high-speed communication within the chassis, not just between racks. By converting electrical signals to light as close to the processor as possible, we can drastically reduce the power required for data transport. This transition requires a fundamental change in how we package optics and silicon together.

The Physics of Copper Attenuation

At high frequencies, copper acts more like an antenna than a controlled conductor, causing electromagnetic interference and crosstalk between adjacent lanes. The skin effect also forces current to flow in a thin layer at the conductor's surface, which increases the effective resistance and generates more heat. These physical properties mean that high-speed electrical signals can travel only very short distances before they become unreadable without heavy signal processing.

Modern switch designs mitigate this with expensive low-loss PCB laminates such as Megtron 6, or with twinaxial cable assemblies that bypass the board entirely. However, these solutions add significant complexity and cost to the hardware manufacturing process. Eventually, the physical dimensions of the front panel and the routing complexity of the PCB traces impose a hard ceiling on total system bandwidth.
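To make the skin effect concrete, here is a small sketch using the textbook skin-depth formula and copper's bulk resistivity, showing how thin the conducting layer becomes at serial-link frequencies:

```python
import math

def skin_depth_um(freq_hz, resistivity_ohm_m=1.68e-8, mu_r=1.0):
    """Skin depth in micrometers for a conductor (copper by default)."""
    mu = 4 * math.pi * 1e-7 * mu_r  # permeability (H/m)
    depth_m = math.sqrt(resistivity_ohm_m / (math.pi * freq_hz * mu))
    return depth_m * 1e6

# At the ~28 GHz Nyquist frequency of a 112G PAM4 lane, current
# crowds into a layer well under half a micrometer thick, so
# effective resistance (and heat) climbs with frequency.
for f_ghz in (1, 10, 28):
    print(f"{f_ghz} GHz: skin depth ≈ {skin_depth_um(f_ghz * 1e9):.2f} µm")
```

The shrinking skin depth is why loss per inch keeps rising as lane rates double, independent of how good the dielectric is.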

SerDes Overhead and Power Density

SerDes engines are the workhorses of modern networking, but they are becoming the largest heat generators on the silicon die. As we move from NRZ to PAM4 signaling, the digital signal processing required to recover the data grows more complex. The result is a thermal feedback loop: more power is burned driving signals, which generates more heat, which demands still more power for cooling.
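The DSP burden follows from simple signal math: at the same voltage swing, PAM4's four levels shrink each eye to one third of the NRZ eye. A quick sketch of that penalty:

```python
import math

def pam_snr_penalty_db(levels):
    """Extra SNR (dB) needed vs NRZ for a comparable error rate,
    from the (levels - 1)x smaller eye opening at a fixed swing."""
    return 20 * math.log10(levels - 1)

# PAM4 halves the symbol rate for a given bit rate, but each eye
# is one third the height, costing roughly 9.5 dB of SNR that the
# equalizers and FEC must claw back -- at a power cost.
print(f"PAM4 penalty vs NRZ: {pam_snr_penalty_db(4):.1f} dB")
```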

The Transition from Pluggable Transceivers to CPO

For decades the industry has relied on pluggable optical transceivers, such as the QSFP and OSFP form factors, to handle long-distance networking. These modules are convenient because they are field-replaceable and allow a modular approach to bandwidth allocation. However, the physical distance between the switch ASIC and the front-panel pluggable module introduces significant electrical loss that must be overcome by retimers.

Co-Packaged Optics (CPO) solves this by mounting the optical engine directly on the same substrate or package as the ASIC. This proximity makes the electrical connection between the silicon and the optics extremely short, eliminating the need for power-hungry retimers. This architectural shift represents the biggest change in high-performance computing hardware in the last twenty years.

  • Reduced Trace Length: Shortens the electrical path from several inches to a few millimeters.
  • Improved Bandwidth Density: Enables more terabits per second per millimeter of the package edge.
  • Power Efficiency: Reduces the energy per bit by up to thirty percent by removing redundant signal processing.
  • Lower Latency: Bypasses multiple stages of retiming and signal recovery.
The transition to Co-Packaged Optics is not just an upgrade; it is a survival necessity for hyperscale networking as the electrical signal integrity margin for 224G lanes approaches zero.
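As a back-of-the-envelope illustration of the power claim above, compare the stages a bit traverses in each architecture. The per-stage pJ/bit figures below are assumptions chosen for illustration, not vendor data:

```python
# Assumed per-stage energy costs in pJ/bit (illustrative only;
# real figures vary by process node and vendor).
PLUGGABLE_PATH = {
    "asic_serdes": 4.0,     # long-reach SerDes driving the PCB
    "retimer": 2.0,         # mid-board signal cleanup
    "module_optics": 9.0,   # DSP + laser + modulator in the module
}
CPO_PATH = {
    "xsr_serdes": 1.0,      # mm-scale die-to-optics link
    "optics": 9.0,          # co-packaged optical engine
    "laser_coupling": 0.5,  # remote laser source overhead
}

def path_energy(stages):
    return sum(stages.values())

pluggable = path_energy(PLUGGABLE_PATH)
cpo = path_energy(CPO_PATH)
print(f"Pluggable: {pluggable:.1f} pJ/bit, CPO: {cpo:.1f} pJ/bit")
print(f"Savings: {100 * (1 - cpo / pluggable):.0f}%")  # → 30%
```

With these assumed numbers the savings land at roughly thirty percent, consistent with the figure above; the structural point is that CPO deletes entire stages rather than optimizing them.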

The Front Panel Bottleneck

The front panel of a 1U rack switch has limited surface area, which caps the number of pluggable modules that can be physically installed. As bandwidth demands increase, we cannot simply add more ports: the connectors are too large and concentrate too much heat in one area. CPO moves the bulk of the optical processing away from the faceplate and closer to the cooling solutions of the central processor.
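The faceplate ceiling is simple arithmetic. Assuming an illustrative 1U panel that fits 32 pluggable cages:

```python
# Rough faceplate arithmetic for a 1U switch (assumed figures).
ports = 32               # pluggable cages that fit on a 1U panel
gbps_per_port = 800      # an 800G-class pluggable module

faceplate_tbps = ports * gbps_per_port / 1000
print(f"Faceplate ceiling: {faceplate_tbps:.1f} Tbps")

# A 51.2 Tbps switch ASIC would need 64 such ports -- double what
# the panel holds -- which is exactly the wall CPO sidesteps by
# taking the optics off the faceplate.
ports_needed = int(51.2 * 1000 / gbps_per_port)
print(f"Ports needed for a 51.2T ASIC: {ports_needed}")
```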

Eliminating the Retimer Layer

Retimers are essentially signal repeaters that clean up and boost electrical signals as they travel across a PCB. In a system using CPO, the electrical link is so short that the signal reaches the optical modulator without needing to be retimed or amplified. This saves significant power and reduces the overall component count on the motherboard, which improves long-term reliability.
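Whether a retimer is needed comes down to an insertion-loss budget. The loss rates and budgets below are illustrative assumptions, not measured values:

```python
def channel_loss_db(trace_in, connectors=2,
                    loss_db_per_in=1.2, db_per_connector=1.5):
    """Assumed insertion loss at ~28 GHz Nyquist: trace loss plus
    a fixed penalty per connector (illustrative figures)."""
    return trace_in * loss_db_per_in + connectors * db_per_connector

LR_BUDGET_DB = 28.0   # assumed long-reach electrical SerDes budget
XSR_BUDGET_DB = 10.0  # assumed extra-short-reach budget

# Pluggable path: ~10 inches of PCB plus two connector crossings.
pluggable = channel_loss_db(10)
# CPO path: a few millimetres (~0.2 in) on the package substrate.
cpo = channel_loss_db(0.2, connectors=0)

print(f"Pluggable channel: {pluggable:.1f} dB, CPO channel: {cpo:.2f} dB")
```

When the channel loss plus crosstalk margin eats the SerDes budget, a retimer must split the link in two; at CPO distances even a lightweight XSR-class SerDes has margin to spare.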

Architectural Implementation of Co-Packaged Optics

Implementing CPO involves a complex integration of silicon photonics, in which optical components are manufactured using standard CMOS processes. This allows modulators and detectors to be integrated on a silicon die alongside traditional logic; lasers, which silicon cannot emit efficiently, are typically supplied externally. The challenge lies in managing the different environmental requirements of silicon logic and optical components, especially their temperature sensitivity.

Most CPO designs use a remote laser source (RLS) to keep the heat-sensitive laser components away from the high-temperature environment of the main ASIC. The light is piped into the package via optical fiber, where it is modulated by the optical engine using data from the processor. This decoupled architecture ensures that the laser operates at peak efficiency while the processor runs at its thermal limit.

C++: High-Level Hardware Interface Simulation

// Simulation of an optical engine driver in a CPO environment
class OpticalEngine {
public:
    void configureModulator(int lane_id, float power_target) {
        // Set the target drive voltage for the silicon photonics modulator
        lanes[lane_id].voltage = power_target * 0.85f;
        lanes[lane_id].enabled = true;
    }

    bool checkLaserLock() const {
        // Remote laser source must be frequency-locked before transmission
        return external_laser.is_stable();
    }

private:
    struct LaserSource {
        bool locked = false;
        bool is_stable() const { return locked; }
    };

    struct LaneConfig {
        float voltage = 0.0f;
        bool enabled = false;
    };

    LaserSource external_laser;
    LaneConfig lanes[128]; // Supporting 128 lanes of 100G for 12.8 Tbps
};

The packaging itself uses advanced techniques like Through-Silicon Vias (TSVs) to connect the electrical die to the optical die. This vertical stacking allows thousands of high-speed connections in a very small footprint, which is impossible with traditional side-by-side wire bonding. The result is a compact, high-performance unit that can handle terabits of data per second with minimal overhead.
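The packaging metric that matters here is shoreline (edge) bandwidth density: how many gigabits leave the package per millimetre of its perimeter. A sketch with an assumed package size, using the 128-lane, 100G-per-lane engine as the example:

```python
# Edge ("shoreline") bandwidth density of a CPO package.
# The 80 mm usable-edge figure is an assumption for illustration.
def edge_density_gbps_per_mm(lanes, gbps_per_lane, edge_mm):
    return lanes * gbps_per_lane / edge_mm

density = edge_density_gbps_per_mm(lanes=128, gbps_per_lane=100, edge_mm=80)
print(f"Shoreline density: {density:.0f} Gbps/mm")
```

Pushing this number up is the whole point of stacking dies vertically: every escape route that moves off the board edge and into a TSV frees shoreline for more optical lanes.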

Silicon Photonics Modulators

The modulator is the heart of the optical engine, responsible for converting electrical bits into optical pulses. Mach-Zehnder interferometers (MZIs) and micro-ring resonators are the two primary modulator types in silicon photonics today. While MZIs are more stable across temperature ranges, micro-ring resonators offer much higher density and lower power consumption, which makes them ideal for CPO.
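The thermal trade-off can be quantified. Silicon's thermo-optic effect shifts a micro-ring's resonance by roughly 0.07 to 0.1 nm per kelvin; the exact coefficient below is an assumed mid-range figure:

```python
# Micro-ring resonators are dense and efficient but thermally
# sensitive: silicon's thermo-optic effect drifts the resonance
# wavelength with temperature, so rings need active heater tuning.
DRIFT_NM_PER_K = 0.08  # assumed mid-range thermo-optic drift

def ring_drift_nm(delta_t_k, drift_nm_per_k=DRIFT_NM_PER_K):
    return delta_t_k * drift_nm_per_k

# A 10 K swing next to a hot ASIC moves the resonance well clear
# of a typical ring linewidth -- enough to break the link without
# closed-loop thermal tuning.
print(f"10 K swing: {ring_drift_nm(10):.2f} nm resonance shift")
```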

The Role of External Laser Sources

Lasers are inherently inefficient, generating significant heat relative to the light they produce. By placing the laser in a separate module on the front panel, we can replace the light source without replacing the expensive switch ASIC. This modularity is a critical requirement for data center operators who need high availability and ease of maintenance.

The Software and System Impact

From a software perspective, the move to CPO is largely transparent, but it enables topologies that were previously impossible. Higher bandwidth density means we can build flatter network hierarchies with fewer hops between compute nodes. This directly translates to lower tail latency in distributed applications such as large-scale neural network training, where synchronization is the primary bottleneck.

Developers working on low-level drivers and network stacks will see more integrated telemetry from the optical components. Instead of basic link-state reporting, CPO modules provide detailed metrics on laser power, modulation error ratios, and thermal margins. This data allows for more intelligent traffic routing and predictive failure analysis in large-scale clusters.
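A hypothetical telemetry record built around those metrics might look like the following; all field names and thresholds are illustrative, not any vendor's actual API:

```python
from dataclasses import dataclass

@dataclass
class OpticalLaneTelemetry:
    """Hypothetical per-lane telemetry from a CPO optical engine."""
    lane_id: int
    laser_power_dbm: float
    modulation_error_ratio_db: float
    thermal_margin_c: float

def needs_attention(t: OpticalLaneTelemetry,
                    min_mer_db=20.0, min_thermal_margin_c=5.0):
    """Flag lanes drifting toward failure for predictive rerouting.
    Thresholds are assumed values for illustration."""
    return (t.modulation_error_ratio_db < min_mer_db
            or t.thermal_margin_c < min_thermal_margin_c)

lane = OpticalLaneTelemetry(0, -1.5, 18.2, 12.0)
print(needs_attention(lane))  # True: MER below the assumed floor
```

A control plane polling records like this can drain traffic off a degrading lane before the link actually drops, rather than reacting to a hard failure.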

As we integrate optics more deeply into the compute fabric we move closer to the concept of optical circuit switching. This would allow software to dynamically reconfigure the physical network topology based on the workload patterns. Such flexibility would represent a paradigm shift from static fat-tree networks to dynamic demand-driven interconnects.

Reducing Distributed Training Latency

In a distributed training environment, the time spent in All-Reduce operations can consume up to fifty percent of total execution time. By using CPO to increase the bandwidth between GPU nodes, we can significantly reduce the communication overhead and improve scaling efficiency. This allows larger models to be trained on the same physical hardware footprint.
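The standard ring All-Reduce cost model shows why link bandwidth matters: each node exchanges 2(N-1)/N times the gradient size over its links. A sketch with assumed cluster parameters:

```python
# Ring All-Reduce communication time: each of N nodes transfers
# 2*(N-1)/N of the payload M over per-link bandwidth B.
def allreduce_seconds(payload_gbytes, nodes, link_gbps):
    bits = payload_gbytes * 8e9
    return 2 * (nodes - 1) / nodes * bits / (link_gbps * 1e9)

# Doubling per-link bandwidth (e.g. 400G -> 800G links) halves the
# communication term for a 10 GB gradient exchange across 64 nodes.
for gbps in (400, 800):
    t = allreduce_seconds(payload_gbytes=10, nodes=64, link_gbps=gbps)
    print(f"{gbps}G links: {t:.3f} s per All-Reduce")
```

The model ignores latency and overlap with compute, but it captures the first-order effect: communication time scales inversely with per-link bandwidth, which is exactly the knob CPO turns.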

Reliability and Repairability Challenges

One major trade-off of CPO is that if an optical component inside the package fails, the entire switch ASIC assembly may need to be replaced. This is why the industry is focusing heavily on reliability testing and the use of redundant optical lanes. Engineers must weigh the performance benefits of integration against the potential increase in the cost of hardware failures.
