On Sat, Oct 17, 2015 at 11:03:19PM +0300, Imre Deak wrote:
On Fri, 2015-10-16 at 20:55 +0100, Chris Wilson wrote:
Fixes regression from
commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 10 15:58:01 2015 +0100
drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()
I'm stumped. Looking at the loop we should be iterating over every cache line until we reach the start of the cacheline after the end of the virtual range. Evidence says otherwise.
More bizarrely, I stored the last address to be clflushed and found it to be equal to the start of the cacheline containing the last byte. Doubly perplexed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92501
Testcase: gem_tiled_partial_pwrite_pread/reads
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
 drivers/gpu/drm/drm_cache.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 6743ff7dccfa..7c909bc8b68a 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -131,10 +131,13 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 #if defined(CONFIG_X86)
 	if (cpu_has_clflush) {
 		const int size = boot_cpu_data.x86_clflush_size;
-		void *end = addr + length;
-		addr = (void *)(((unsigned long)addr) & -size);
+		void *end;
+
+		end = (void *)(((unsigned long)addr + length - 1) & -size);
+		addr = (void *)((unsigned long)addr & -size);
+
 		mb();
-		for (; addr < end; addr += size)
+		for (; addr <= end; addr += size)
Hm, I can't see how this could make any difference. The old way still looks ok to me, and the new version would flush the exact same cache lines as the old one, using the same addresses (the beginning of each cache line).
I couldn't spot the difference either. I am beginning to suspect it is gcc as
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 6743ff7..c9097b5 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -130,11 +130,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 {
 #if defined(CONFIG_X86)
 	if (cpu_has_clflush) {
 		const int size = boot_cpu_data.x86_clflush_size;
-		void *end = addr + length;
+		void *end = addr + length - 1;
 		addr = (void *)(((unsigned long)addr) & -size);
 		mb();
-		for (; addr < end; addr += size)
+		for (; addr <= end; addr += size)
 			clflushopt(addr);
 		mb();
 		return;
also fixes gem_tiled_partial_pwrite (on byt and bsw).
-Chris