On Mon, May 19, 2014 at 6:31 PM, Lucas Stach l.stach@pengutronix.de wrote:
On Monday, 19.05.2014, at 16:10 +0900, Alexandre Courbot wrote:
From: Lucas Stach dev@lynxeye.de
Signed-off-by: Lucas Stach dev@lynxeye.de
[acourbot@nvidia.com: make conditional and platform-friendly]
Signed-off-by: Alexandre Courbot acourbot@nvidia.com

 drivers/gpu/drm/nouveau/nouveau_bo.c  | 32 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_bo.h  | 20 ++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_gem.c |  8 +++++++-
 3 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index b6dc85c614be..0886f47e5244 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -407,6 +407,8 @@ nouveau_bo_validate(struct nouveau_bo *nvbo, bool interruptible,
 {
 	int ret;

+	nouveau_bo_sync_for_device(nvbo);

 	ret = ttm_bo_validate(&nvbo->bo, &nvbo->placement, interruptible, no_wait_gpu);
 	if (ret)
@@ -487,6 +489,36 @@ nouveau_bo_invalidate_caches(struct ttm_bo_device *bdev, uint32_t flags)
 	return 0;
 }
+#ifdef NOUVEAU_NEED_CACHE_SYNC
I don't like this ifdef at all. I know calling these functions will add a little overhead on x86, where it isn't strictly required, but I think it's negligible.
When I looked at them, the dma_sync_single_for_[device|cpu] functions which are called here map out to just a drain of the PCI store buffer on x86, which should be fast enough to be done unconditionally. They won't do any time-consuming cache synchronization on PCI coherent arches.
If Ben agrees with it I am also perfectly fine with getting rid of this #ifdef.
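To make the trade-off discussed above concrete, here is a minimal userspace sketch of calling the sync helper unconditionally instead of guarding it with an #ifdef. Everything in it is a hypothetical stand-in: `mock_device`, `mock_bo`, `mock_dma_sync_for_device` and `mock_bo_sync_for_device` are not part of nouveau or the kernel DMA API, and the counters only model the claim that a coherent platform pays for a cheap barrier while a non-coherent one pays for real cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real nouveau or DMA API types. */
struct mock_device {
	bool dma_coherent;      /* assumed true for x86-style coherent PCI */
	unsigned barriers;      /* cheap operations (e.g. store-buffer drain) */
	unsigned cache_flushes; /* expensive cache maintenance operations */
};

struct mock_bo {
	struct mock_device *dev;
	size_t npages;
};

/*
 * Mock of a per-page sync: always issues a cheap barrier, and only performs
 * cache maintenance when the device is not DMA coherent.
 */
static void mock_dma_sync_for_device(struct mock_device *dev)
{
	dev->barriers++;
	if (!dev->dma_coherent)
		dev->cache_flushes++;
}

/* Called unconditionally by the caller; no #ifdef at the call site. */
static void mock_bo_sync_for_device(struct mock_bo *bo)
{
	assert(bo != NULL && bo->dev != NULL);
	for (size_t i = 0; i < bo->npages; i++)
		mock_dma_sync_for_device(bo->dev);
}
```

In this model the call site is identical on every platform, and the per-platform cost is decided inside the sync helper rather than by the preprocessor. Whether the real overhead on x86 is in fact negligible is the open question in this thread, not something the sketch demonstrates.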