Performance of C4 when CPU bound
Back in November 2025, when doing tests of C4 on a loopback address, I observed that C4 achieved lower data rates than Cubic or even BBR. Since “performance under loopback” was not a high priority scenario, I filed that in the long pile of issues to deal with later. Then, in early January 2026, I read a preprint of a paper by Kathrin Elmenhorst and Nils Aschenbruck titled “2BRobust – Overcoming TCP BBR Performance Degradation in Virtual Machines under CPU Contention” (see: https://arxiv.org/abs/2601.05665). In that paper, they point out that BBR achieves lower than nominal performance when running on VMs under high CPU load, and trace that to pacing issues. Pacing in an application process involves periodically waiting until the pacing system acquires enough tokens. If the CPU is highly loaded, the system call can take longer than the specified maximum wait time, and pacing thus slows the connection more than expected. In such conditions, they suggest increasing the programmed pacing rate above the nominal rate, and show that it helps performance.
www.privateoctopus.com