I never really thought about how processing a single TCP ACK packet (tcp_clean_rtx_queue -> tcp_rtx_queue_unlink_and_free) might involve a bunch of main memory accesses: traversing the retransmit queue, freeing the SKBs in it, and freeing the data pages that those SKBs point to... I guess in the optimal (and normal?) case each SKB covers something like 68 KiB, so the rbtree walk wouldn't be that long, and most of the work would be freeing order-0 frag pages. Those are pointed to by an array in the skb_shared_info, which probably makes the memory accesses for that somewhat more parallel?

