On Tue, 2014-11-11 at 20:12 +0100, Borislav Petkov wrote:
On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
Add support for the new clwb instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with reference number 319433-022.
https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
Here are some things of note:
As with the clflushopt patches before this, I'm assuming that the addressing mode gcc generates for the original clflush instruction will be equally valid when that encoding is reused for clflushopt (clflush with a 0x66 prefix) and for clwb (xsaveopt with a 0x66 prefix). For all the test cases that I've come up with and for the new clwb code generated by this patch series, this has proven to be true on my test machine.
According to the SDM, xsaveopt also has a form that takes a REX.W prefix. I believe gcc will not generate that prefix for x86_64 kernel code, so I don't believe I need to account for the extra prefix byte in the assembly generated for clwb. Please correct me if I'm wrong.
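To spell out the encoding relationship these notes rely on, here is an illustrative sketch (not part of the patch; clwb_raw() is a made-up name used only to show the raw bytes, and it is only safe to execute on a CPU that actually reports CLWB support):

/*
 * Encodings per the ISA extensions reference (319433-022), using (%rax)
 * as the memory operand:
 *
 *   clflush    (%rax)        0f ae 38    (0f ae /7)
 *   clflushopt (%rax)     66 0f ae 38    (66 0f ae /7)
 *   xsaveopt   (%rax)        0f ae 30    (0f ae /6)
 *   clwb       (%rax)     66 0f ae 30    (66 0f ae /6)
 *
 * The 0x66 prefix turns clflush into clflushopt and xsaveopt into clwb;
 * the ModRM/addressing bytes are untouched, which is what the assumption
 * above depends on.
 */
static inline void clwb_raw(volatile void *p)
{
	asm volatile(".byte 0x66, 0x0f, 0xae, 0x30"	/* clwb (%rax) */
		     : "+m" (*(volatile char *)p)
		     : "a" (p));
}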
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H. Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 10 ++++++++++
 2 files changed, 11 insertions(+)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index b3e6b89..fbbed34 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -227,6 +227,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 1709a2e..a328460 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,16 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
+static inline void clwb(volatile void *__p)
+{
+	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
Any particular reason for using 0x3e as a prefix to have the insns be the same size, or is it simply because CLFLUSH can stomach it?
:-)
Essentially we need one additional byte at the beginning of the clflush so that we can flip it into a clflushopt by changing that byte into a 0x66 prefix. The two options are to either insert a 1-byte ASM_NOP1 in front of it, or to add a 1-byte NOP_DS_PREFIX (0x3e). Neither has any functional effect on the plain clflush, but I've been told that executing a clflush with a prefix should be faster than executing a clflush plus a separate NOP.
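For reference, here is roughly what the full helper looks like with the alternatives spelled out. The first line is the one quoted above; the remaining arguments are my reconstruction based on the existing clflushopt() implementation, so the exact operand syntax and alternative_io_2() argument order here are an assumption, not a verbatim quote of the patch:

static inline void clwb(volatile void *__p)
{
	/*
	 * All three variants are a one-byte prefix plus the same 0f ae
	 * opcode and addressing bytes, so they are the same length and
	 * can be patched over one another at boot:
	 *
	 *   3e 0f ae /7   ds-prefixed clflush   (default)
	 *   66 0f ae /7   clflushopt            (X86_FEATURE_CLFLUSHOPT)
	 *   66 0f ae /6   clwb                  (X86_FEATURE_CLWB)
	 */
	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
			 ".byte 0x66; clflush %P0",	/* clflushopt */
			 X86_FEATURE_CLFLUSHOPT,
			 ".byte 0x66; xsaveopt %P0",	/* clwb */
			 X86_FEATURE_CLWB,
			 "+m" (*(volatile char __force *)__p));
}

Since all three encodings are the same length, the alternatives machinery can patch the replacement in place without padding NOPs, provided gcc emits the same addressing-mode bytes for the prefixed clflush and xsaveopt forms as for the plain clflush, which is exactly the assumption called out in the changelog.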