On Tue, 2014-11-11 at 20:46 +0100, Borislav Petkov wrote:
On Tue, Nov 11, 2014 at 12:40:00PM -0700, Ross Zwisler wrote:
Yep, it's weird, I know. :)
But sure, saving opcode space makes sense to me.
Btw, I'd still be interested in this:
+static inline void clwb(volatile void *__p)
+{
alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
Any particular reason for using 0x3e as the prefix to keep the insns the same size, or is it simply because CLFLUSH can stomach it?
Ah, sorry, I was still responding to your first mail. :) Response copied here to save searching:
Essentially, we need one additional byte at the beginning of the clflush so that we can flip it into a clflushopt by changing that byte into a 0x66 prefix. The two options are to insert a 1-byte ASM_NOP1 or to add a 1-byte NOP_DS_PREFIX. Neither has any functional effect on the plain clflush, but I've been told that executing a clflush with a prefix should be faster than executing a clflush plus a NOP.
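
To make the byte-level picture concrete, here is a rough userspace sketch (not the kernel's alternatives code) of the encodings involved. It assumes a plain (%rax) memory operand, so the ModRM byte is 0x38 for /7 and 0x30 for /6; the array names and both helper functions are made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DS_SEGMENT_PREFIX   0x3e /* the NOP_DS_PREFIX byte from the patch */
#define OPERAND_SIZE_PREFIX 0x66 /* prefix that selects clflushopt / clwb */
#define INSN_LEN            4

/* Assumed encodings for a (%rax) operand; names are hypothetical. */
static const uint8_t clflush_padded[INSN_LEN] = { 0x3e, 0x0f, 0xae, 0x38 }; /* ds clflush (%rax) */
static const uint8_t clflushopt[INSN_LEN]     = { 0x66, 0x0f, 0xae, 0x38 }; /* clflushopt (%rax) */
static const uint8_t clwb[INSN_LEN]           = { 0x66, 0x0f, 0xae, 0x30 }; /* clwb (%rax) */

/* Count the positions at which two equal-length encodings differ. */
size_t count_differing_bytes(const uint8_t *a, const uint8_t *b, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (a[i] != b[i])
			n++;
	return n;
}

/* Turn a DS-prefixed clflush into clflushopt by rewriting only the prefix. */
void flip_to_clflushopt(uint8_t *insn)
{
	assert(insn[0] == DS_SEGMENT_PREFIX);
	insn[0] = OPERAND_SIZE_PREFIX;
}
```

Under these assumptions, the padded clflush and clflushopt differ only in the first byte, while clwb additionally differs in the ModRM byte (/6 instead of /7). All three are the same length, which is the property the prefix byte buys us.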
I agree, this is useful info - I'll add it to the patch comments for v2.
Thank you for the feedback.
- Ross