@vbabka @oleksandr @penguin42 Ah, you mean that RDTSCP forces all older instructions to retire before reading the counter. Well, that's not serializing, and based on what I observed in the Retired Instructions PMU, I don't think I could learn anything new from such an experiment.

Anyway, this has been communicated to the customer, who has agreed to follow up on the issue with AMD. Thank you all for your suggestions, but I'm no longer working on it.

@vbabka @oleksandr @penguin42 As a side note, the RDTSCP instruction was added to the AMD64 instruction set based on early feedback from SUSE on the architecture white paper.

The reason was that the TSC should be readable from user space (to allow implementing clock_gettime(2) as a vsyscall), but the TSC was (intentionally!) not synchronized across cores, so a TSC value was useless unless you knew which CPU it was read on. However, if you execute a CPUID + RDTSC pair in user space, your process may be preempted right between the two instructions and migrated to another CPU. The RDTSCP instruction solves the problem by being atomic: a single instruction returns both the counter and a CPU identifier (the TSC_AUX value).
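To make the atomicity point concrete, here is a minimal sketch in C, assuming an x86-64 CPU with RDTSCP and a compiler that provides the `__rdtscp` intrinsic in `<x86intrin.h>` (GCC and Clang do). The struct and the helper names are made up for illustration. Decoding the CPU number from the low 12 bits of TSC_AUX is the Linux convention, not something the hardware defines.

```c
#include <stdint.h>
#include <x86intrin.h>

/* Hypothetical helper type: one TSC sample plus the TSC_AUX value
 * that the same RDTSCP instruction returned alongside it. */
struct tsc_sample {
    uint64_t tsc;  /* time-stamp counter value */
    uint32_t aux;  /* raw TSC_AUX contents, set up by the OS */
};

/* Read the counter and TSC_AUX with a single instruction, so the task
 * cannot be migrated between reading the two values. */
static struct tsc_sample read_tsc_with_aux(void)
{
    struct tsc_sample s;
    unsigned int aux;

    s.tsc = __rdtscp(&aux);
    s.aux = aux;
    return s;
}

/* Assumption: Linux stores the CPU number in the low 12 bits of
 * TSC_AUX. Other operating systems may use a different layout. */
static unsigned int sample_cpu_linux(struct tsc_sample s)
{
    return s.aux & 0xfffu;
}
```

With a CPUID + RDTSC pair, the two values can come from different CPUs if the scheduler steps in between them. Here, `s.tsc` and `s.aux` always describe the same core.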

Then, a few years later, the TSC was decoupled from the instruction clock. Among other things, this change made it possible to read the same TSC value from all cores. Had this been the case from the beginning, there wouldn't be an RDTSCP instruction.
