On Thu, Jan 07, 2016 at 02:32:23PM -0800, H. Peter Anvin wrote:
> On 01/07/16 14:29, H. Peter Anvin wrote:
> > I would be very interested in knowing if replacing the final clflushopt with a clflush would resolve your problems (in which case the last mb() shouldn't be necessary either.)
>
> Never mind. CLFLUSH is not ordered with regard to CLFLUSHOPT to the same cache line.
>
> Could you add a sync_cpu(); call to the end (can replace the final mb()) and see if that helps your case?
s/sync_cpu()/sync_core()/
No. I still see failures on Baytrail and Braswell (Pineview is not affected) with the final mb() replaced with sync_core(). I can reproduce failures on Pineview by tweaking the clflush_cache_range() parameters, so I am fairly confident that it is validating the current code.
iirc sync_core() is cpuid, a heavy serialising instruction, an alternative to mfence. Is there anything else that I can infer about the nature of my bug from this result?
-Chris
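
For reference, the pattern under discussion (fence, flush every cache line in the range, fence) can be sketched in user space. This is only an illustration under stated assumptions, not the kernel's clflush_cache_range(): flush_range() and CACHE_LINE are hypothetical names, the 64-byte line size is assumed rather than queried from the CPU, and it uses the SSE2 CLFLUSH intrinsic in place of CLFLUSHOPT, with MFENCE standing in for mb(). On builds without SSE2 the flushes and fences compile away and only the line count remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush(), _mm_mfence() */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0          /* non-x86 build: count lines only */
#endif

#define CACHE_LINE 64u          /* assumption: 64-byte cache lines */

/*
 * Hypothetical helper: flush every cache line overlapping
 * [addr, addr + size) and return how many lines were visited.
 */
static size_t flush_range(const void *addr, size_t size)
{
    uintptr_t p, end;
    size_t lines = 0;

    if (size == 0)
        return 0;
    assert(addr != NULL);

    /* Round the start down to a cache-line boundary. */
    p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    end = (uintptr_t)addr + size;

#if HAVE_CLFLUSH
    _mm_mfence();               /* order earlier stores before the flushes */
#endif
    for (; p < end; p += CACHE_LINE) {
#if HAVE_CLFLUSH
        _mm_clflush((const void *)p);
#endif
        lines++;
    }
#if HAVE_CLFLUSH
    _mm_mfence();               /* the "final mb()" discussed above */
#endif
    return lines;
}
```

The second fence is the one the thread proposes swapping for a serialising instruction. A buffer that starts one byte past a line boundary touches one more line than its size alone suggests, which is why the start address is rounded down.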