On Mon, May 19, 2014 at 7:03 PM, Thierry Reding <thierry.reding@gmail.com> wrote:
On Mon, May 19, 2014 at 11:22:11AM +0200, Lucas Stach wrote:
On Monday, 2014-05-19 at 11:02 +0200, Thierry Reding wrote:
On Mon, May 19, 2014 at 04:10:58PM +0900, Alexandre Courbot wrote:
Some architectures (e.g. ARM) need the CPU buffers to be explicitly flushed for a memory write to take effect. Not doing so results in synchronization issues, especially after writing to BOs.
It seems to me that the above is generally true for all architectures, not just ARM.
No, on PCI coherent arches, like x86 and some PowerPCs, the GPU will snoop the CPU caches and therefore an explicit cache flush is not required.
I was criticizing the wording in the commit message. Perhaps it could be enhanced with what you just said.
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
[...]
index 0886f47e5244..b9c9729c5733 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -437,8 +437,10 @@ nouveau_bo_wr16(struct nouveau_bo *nvbo, unsigned index, u16 val)
 	mem = &mem[index];
 	if (is_iomem)
 		iowrite16_native(val, (void __force __iomem *)mem);
-	else
+	else {
 		*mem = val;
+		nv_cpu_cache_flush_area(mem, 2);
+	}
 }
 
 u32
@@ -461,8 +463,10 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val)
 	mem = &mem[index];
 	if (is_iomem)
 		iowrite32_native(val, (void __force __iomem *)mem);
-	else
+	else {
 		*mem = val;
+		nv_cpu_cache_flush_area(mem, 4);
+	}
This looks rather like a sledgehammer to me. Effectively this turns the nvbo into an uncached buffer, with the additional overhead of constantly flushing caches. Wouldn't it make more sense to locate the places where these are called and flush the cache once after all the writes have completed?
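For illustration, the kind of call-site change being suggested might look roughly like this. This is only a sketch: nv_cpu_cache_flush_area() is the macro from the patch, but the helper name, the 'vaddr' mapping and the caller are made up here.

/*
 * Hypothetical sketch of "flush once after all the writes have completed".
 * Only nv_cpu_cache_flush_area() comes from the series under review.
 */
static void
write_words_then_flush(u32 *vaddr, const u32 *data, unsigned count)
{
	unsigned i;

	/* plain cached stores, no per-write flush */
	for (i = 0; i < count; i++)
		vaddr[i] = data[i];

	/* one flush covering the whole range that was just written */
	nv_cpu_cache_flush_area(vaddr, count * sizeof(*data));
}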
I don't think the explicit flushing for those things makes sense. I think it is a lot more effective to just map the BOs write-combined on PCI non-coherent arches. This way any writes will be buffered. Reads will be slow, but I don't think nouveau is reading back a lot from those buffers. Using the write-combining buffer doesn't need any additional synchronization as it will get flushed on pushbuf kickoff anyway.
Sounds good to me.
I will need to wrap my head around TTM some more to understand how to do this the right way, but it is true that brute-forcing in-memory BO mappings to be WC makes the addition of nv_cpu_cache_flush_area() unneeded. Is that the direction we want to take with this?
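Something along these lines, perhaps? A rough sketch only, not the actual change: TTM_PL_FLAG_WC and TTM_PL_FLAG_CACHED are the standard TTM caching flags, but the coherency predicate and the helper name below are hypothetical.

/*
 * Sketch: pick a TTM caching flag per platform so that system-memory BOs
 * get mapped write-combined on non-coherent arches.
 */
static u32
nouveau_bo_cache_flags(struct nouveau_drm *drm)
{
	/* hypothetical coherency check, named here for illustration only */
	if (nouveau_device_is_cpu_coherent(drm))
		return TTM_PL_FLAG_CACHED;	/* GPU snoops CPU caches (x86, some PPC) */

	/* ARM and other non-coherent arches: buffer writes, flushed on pushbuf kickoff */
	return TTM_PL_FLAG_WC;
}

The resulting flag would then presumably be OR'd into the placement for system-memory buffers, somewhere around nouveau_bo_placement_set(), but I still need to check where TTM expects this to happen.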