At Nvidia GTC 2026, the spotlight is on pairing the Rubin NVL72 with the Groq 3 LPX to form a new inference paradigm. Nvidia splits the inference pipeline into stages: GPUs handle preprocessing (the prefill stage) and attention computation, while the LPX handles real-time decoding, which significantly improves efficiency. Groq's LPU architecture reduces latency, making it well suited to real-time AI applications, and the LPX rack, with 256 LPUs, raises inference performance further.
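The staged split described above can be illustrated with a deliberately simplified sketch. It assumes a plain two-stage handoff in which one tier processes the prompt and a second tier generates tokens one at a time. Every name here (`KVCache`, `prefill_stage`, `decode_stage`, `toy_next_token`, `run_disaggregated`) is hypothetical, the "model" is a toy arithmetic rule, and nothing below reflects an actual Nvidia or Groq API or how attention work is really divided between the devices.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KVCache:
    """Hypothetical stand-in for the state handed from one stage to the next."""
    tokens: List[int] = field(default_factory=list)


def prefill_stage(prompt: List[int]) -> KVCache:
    """Throughput-oriented stage: consume the whole prompt in a single pass."""
    return KVCache(tokens=list(prompt))


def toy_next_token(cache: KVCache) -> int:
    """Toy rule (an assumption, not a real model): sum of cached tokens mod 100."""
    return sum(cache.tokens) % 100


def decode_stage(cache: KVCache, max_new_tokens: int) -> List[int]:
    """Latency-oriented stage: emit one token per step and extend the cache."""
    generated: List[int] = []
    for _ in range(max_new_tokens):
        token = toy_next_token(cache)
        cache.tokens.append(token)
        generated.append(token)
    return generated


def run_disaggregated(prompt: List[int], max_new_tokens: int) -> List[int]:
    """Run prefill on one (assumed) tier, then hand the cache to the decode tier."""
    cache = prefill_stage(prompt)               # would run on the GPU tier
    return decode_stage(cache, max_new_tokens)  # would run on the decode tier


if __name__ == "__main__":
    print(run_disaggregated([1, 2, 3], max_new_tokens=3))
```

The point of the split is that the two stages have different bottlenecks: processing the prompt is a large batch of parallel work, while decoding is a sequence of small, latency-sensitive steps, so each can be placed on hardware suited to it.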