On Sun, Oct 18, 2015 at 02:07:13PM +0100, Chris Wilson wrote:
I couldn't spot the difference either. I am beginning to suspect it is gcc as
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 6743ff7..c9097b5 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -130,11 +130,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 {
 #if defined(CONFIG_X86)
 	if (cpu_has_clflush) {
 		const int size = boot_cpu_data.x86_clflush_size;
-		void *end = addr + length;
+		void *end = addr + length - 1;
 		addr = (void *)(((unsigned long)addr) & -size);
 		mb();
-		for (; addr < end; addr += size)
+		for (; addr <= end; addr += size)
 			clflushopt(addr);
 		mb();
 		return;
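
For anyone wanting to check the arithmetic: as far as I can tell the two loop forms in the diff should visit exactly the same cache lines, since for byte addresses "addr < start + length" and "addr <= start + length - 1" are the same condition (ignoring wraparound). Below is a minimal userspace sketch of that comparison. The helper names lines_exclusive() and lines_inclusive() are made up for illustration, and plain integers stand in for the kernel pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the cache lines visited by the original loop,
 * i.e. end = addr + length and "addr < end". */
static size_t lines_exclusive(uintptr_t addr, size_t length, uintptr_t size)
{
	uintptr_t end = addr + length;
	size_t n = 0;

	/* size must be a power of two for the alignment mask to be valid */
	assert(size != 0 && (size & (size - 1)) == 0);

	/* round down to the start of the cache line; same mask as "& -size" */
	addr &= ~(size - 1);
	for (; addr < end; addr += size)
		n++;
	return n;
}

/* Hypothetical helper: count the cache lines visited by the patched loop,
 * i.e. end = addr + length - 1 and "addr <= end". */
static size_t lines_inclusive(uintptr_t addr, size_t length, uintptr_t size)
{
	uintptr_t end = addr + length - 1;
	size_t n = 0;

	assert(size != 0 && (size & (size - 1)) == 0);

	addr &= ~(size - 1);
	for (; addr <= end; addr += size)
		n++;
	return n;
}
```

If both helpers agree for every start address and length you throw at them, the change in bounds is a no-op at the source level, which is what makes the compiler look suspicious here.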
s/clflushopt/clflush/ works just as well.
Plot thickens. Current guess is that gcc doesn't see the constraints underneath the alternative()?
Adding a barrier() after clflushopt() in the loop is sufficient as well. Almost certain that alternative() is confusing gcc.
-Chris
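
For reference, a rough userspace sketch of what "a barrier() after clflushopt() in the loop" looks like. The compiler_barrier() macro mirrors the kernel's gcc definition of barrier() (an empty asm with a "memory" clobber). Everything else is an assumption for illustration only: flush_line_stub() is a hypothetical stand-in for clflushopt() that merely counts calls, and flush_range_sketch() is not the actual drm_clflush_virt_range().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Empty asm statement with a "memory" clobber: emits no instructions, but
 * the compiler may not cache or reorder memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for clflushopt(); it only counts how often it runs. */
static size_t flushed_lines;

static void flush_line_stub(const void *p)
{
	(void)p;
	flushed_lines++;
}

/* Illustrative copy of the loop shape, with the barrier added per iteration. */
static void flush_range_sketch(const void *addr, size_t length, size_t size)
{
	uintptr_t p, end;

	/* size must be a power of two for the alignment mask to be valid */
	assert(size != 0 && (size & (size - 1)) == 0);

	p = (uintptr_t)addr & ~((uintptr_t)size - 1);
	end = (uintptr_t)addr + length;

	for (; p < end; p += size) {
		flush_line_stub((const void *)p);
		compiler_barrier();	/* keep gcc from moving memory accesses across the flush */
	}
}
```

The barrier only constrains the compiler, not the CPU, so the mb() calls around the loop in the real function would still be needed for ordering at the hardware level.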