Move the implementation from ttm to the amdgpu driver (suggested by Christian). The per-vm-lru exists because of per-vm-bo: a per-vm-bo has no chance to refresh its lru position, and the negative effect is that game performance isn't stable. So all per-vm-bos should have a default order: every per-vm-bo has a priority that relies on its creation index. When doing CS, if any normal bo is used, then all per-vm-bos should count as used too, so per-vm-bo priority >= normal bo priority.
Above is per-vm-lru starting point.
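The ordering rule described above can be illustrated with a small user-space sketch. This is not the implementation from the series: `idx_vm`, `idx_bo` and the helper names are hypothetical, and a sorted singly linked list stands in for whatever structure the driver ends up using. Each BO gets a creation index once, and re-inserting it restores its original position rather than moving it to the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-VM buffer object. */
struct idx_bo {
	uint64_t index;        /* creation index, assigned exactly once */
	struct idx_bo *next;   /* list is kept sorted by ascending index */
};

/* Hypothetical per-VM state: a creation counter plus the ordered list. */
struct idx_vm {
	uint64_t next_index;
	struct idx_bo *head;
};

/* Insert a BO at the position its creation index dictates. */
static void idx_vm_insert(struct idx_vm *vm, struct idx_bo *bo)
{
	struct idx_bo **pos = &vm->head;

	while (*pos && (*pos)->index < bo->index)
		pos = &(*pos)->next;
	bo->next = *pos;
	*pos = bo;
}

/* Give a new BO the next creation index and add it to the VM's order. */
static void idx_vm_create_bo(struct idx_vm *vm, struct idx_bo *bo)
{
	bo->index = vm->next_index++;
	idx_vm_insert(vm, bo);
}

/* Unlink a BO; its index is kept so a later insert restores its place. */
static void idx_vm_remove(struct idx_vm *vm, struct idx_bo *bo)
{
	struct idx_bo **pos = &vm->head;

	while (*pos && *pos != bo)
		pos = &(*pos)->next;
	if (*pos) {
		*pos = bo->next;
		bo->next = NULL;
	}
}
```

Because the position depends only on the creation index, removing and re-adding a BO does not reshuffle the order. Which end of such an order gets evicted first is a policy choice this sketch does not model.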
Chunming Zhou (13):
  ttm: abstruct evictable bo
  ttm: allow driver has own lru policy
  drm/amdgpu: add lru backend for amdgpu driver
  drm/amdgpu: init/fini vm lru
  drm/amdgpu: pass vm lru to buffer object
  drm/amdgpu: add amdgpu lru implementation
  drm/ttm: export ttm_bo_ref_bug
  drm/amdgpu: use RB tree instead of link list
  drm/amdgpu: add bo index counter
  drm/amdgpu: bulk move per vm bo
  ttm: export ttm_transfered_destroy
  drm/amdgpu: transferred bo doesn't use vm lru
  drm/amdgpu: free vm lru when vm fini
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c     |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |   9 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 242 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h     |  25 +++
 drivers/gpu/drm/ttm/ttm_bo.c               |  92 +++++++----
 drivers/gpu/drm/ttm/ttm_bo_util.c          |   3 +-
 include/drm/ttm/ttm_bo_driver.h            |  52 +++++++
 12 files changed, 419 insertions(+), 37 deletions(-)
Change-Id: Ie81985282fab1e564fc2948109fae2173613b465
Signed-off-by: Chunming Zhou david1.zhou@amd.com
---
 drivers/gpu/drm/ttm/ttm_bo.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 98e06f8bf23b..15506682a0be 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -704,22 +704,20 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 	return ret;
 }

-static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-			       uint32_t mem_type,
-			       const struct ttm_place *place,
-			       struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object *
+ttm_mem_get_evictable_bo(struct ttm_bo_device *bdev,
+			 uint32_t mem_type,
+			 const struct ttm_place *place,
+			 struct ttm_operation_ctx *ctx,
+			 bool *locked)
 {
-	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
 	struct ttm_buffer_object *bo = NULL;
-	bool locked = false;
-	unsigned i;
-	int ret;
+	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+	int i;

-	spin_lock(&glob->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		list_for_each_entry(bo, &man->lru[i], lru) {
-			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
+			if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked))
 				continue;

 			if (place && !bdev->driver->eviction_valuable(bo,
@@ -738,6 +736,21 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 		bo = NULL;
 	}

+	return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+			       uint32_t mem_type,
+			       const struct ttm_place *place,
+			       struct ttm_operation_ctx *ctx)
+{
+	struct ttm_bo_global *glob = bdev->glob;
+	struct ttm_buffer_object *bo = NULL;
+	bool locked = false;
+	int ret;
+
+	spin_lock(&glob->lru_lock);
+	bo = ttm_mem_get_evictable_bo(bdev, mem_type, place, ctx, &locked);
 	if (!bo) {
 		spin_unlock(&glob->lru_lock);
 		return -EBUSY;
All of those changes are including a Change-Id that has no bearing in upstream patches and are missing a proper commit description explaining why a specific change is done.
Regards, Lucas
On Wednesday, 09.05.2018, at 14:45 +0800, Chunming Zhou wrote:
On Wed, May 09, 2018 at 10:34:51AM +0200, Lucas Stach wrote:
> All of those changes are including a Change-Id that has no bearing in upstream patches and are missing a proper commit description explaining why a specific change is done.
Imo the Change-Id: is ok if it makes people happy wrt internal tracking. Linus might blow up, but there's lots of random nonsense that Linus blows up on, so whatever.
Lack of real commit message that explains stuff is the real thing here I'd say.
-Daniel
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2018-05-09 17:50, Daniel Vetter wrote:
> On Wed, May 09, 2018 at 10:34:51AM +0200, Lucas Stach wrote:
>> All of those changes are including a Change-Id that has no bearing in upstream patches and are missing a proper commit description explaining why a specific change is done.
> Imo the Change-Id: is ok if it makes people happy wrt internal tracking. Linus might blow up, but there's lots of random nonsense that Linus blows up on, so whatever.
Yeah, the Change-Id is only used internally; it is removed when upstreaming. Alex, right? I'm not clear on how you handle that when you upstream our internal patches.
> Lack of real commit message that explains stuff is the real thing here I'd say.
Agreed, the lack of commit messages is really bad. This is a big feature and I was busy implementing it before. If Christian agrees with this idea, I will add a fuller commit message to every patch when sending again.
Thanks,
David Zhou
On Wed, May 9, 2018 at 6:06 AM, zhoucm1 zhoucm1@amd.com wrote:
> On 2018-05-09 17:50, Daniel Vetter wrote:
>> On Wed, May 09, 2018 at 10:34:51AM +0200, Lucas Stach wrote:
>>> All of those changes are including a Change-Id that has no bearing in upstream patches and are missing a proper commit description explaining why a specific change is done.
>> Imo the Change-Id: is ok if it makes people happy wrt internal tracking. Linus might blow up, but there's lots of random nonsense that Linus blows up on, so whatever.
> Yeah, the Change-Id is only used internally; it is removed when upstreaming. Alex, right? I'm not clear on how you handle that when you upstream our internal patches.
Yes, I strip them off when we upstream the patches, but we use them internally. For patch review just ignore them.
Alex
amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
The general ttm lru cannot satisfy the amdgpu per-vm-bo requirement, so we have to adapt it, at least in the amdgpu driver.
Change-Id: I92b2286ef507c2e055ad9101cf31279d5f8db475
Signed-off-by: Chunming Zhou david1.zhou@amd.com
---
 drivers/gpu/drm/ttm/ttm_bo.c    | 54 ++++++++++++++++++++++++++++++-----------
 include/drm/ttm/ttm_bo_driver.h | 49 +++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 15506682a0be..98da2cf63c9b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -164,6 +164,8 @@ void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)

 	reservation_object_assert_held(bo->resv);

+	if (bdev->driver->add_to_lru)
+		return bdev->driver->add_to_lru(bo);
 	if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
 		BUG_ON(!list_empty(&bo->lru));

@@ -188,6 +190,8 @@ static void ttm_bo_ref_bug(struct kref *list_kref)

 void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 {
+	struct ttm_bo_device *bdev = bo->bdev;
+
 	if (!list_empty(&bo->swap)) {
 		list_del_init(&bo->swap);
 		kref_put(&bo->list_kref, ttm_bo_ref_bug);
@@ -201,6 +205,8 @@ void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 	 * TODO: Add a driver hook to delete from
 	 * driver-specific LRU's here.
 	 */
+	if (bdev->driver->del_from_lru)
+		return bdev->driver->del_from_lru(bo);
 }

 void ttm_bo_del_sub_from_lru(struct ttm_buffer_object *bo)
@@ -215,10 +221,14 @@ EXPORT_SYMBOL(ttm_bo_del_sub_from_lru);

 void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo)
 {
+	struct ttm_bo_device *bdev = bo->bdev;
+
 	reservation_object_assert_held(bo->resv);

 	ttm_bo_del_from_lru(bo);
 	ttm_bo_add_to_lru(bo);
+	if (bdev->driver->move_to_lru_tail)
+		return bdev->driver->move_to_lru_tail(bo);
 }
 EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);

@@ -685,8 +695,8 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  *
  * b. Otherwise, trylock it.
  */
-static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-			struct ttm_operation_ctx *ctx, bool *locked)
+bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
+				    struct ttm_operation_ctx *ctx, bool *locked)
 {
 	bool ret = false;

@@ -703,6 +713,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,

 	return ret;
 }
+EXPORT_SYMBOL(ttm_bo_evict_swapout_allowable);

 static struct ttm_buffer_object *
 ttm_mem_get_evictable_bo(struct ttm_bo_device *bdev,
@@ -736,6 +747,9 @@ ttm_mem_get_evictable_bo(struct ttm_bo_device *bdev,
 		bo = NULL;
 	}

+	if (!bo && bdev->driver->get_evictable_bo)
+		bo= bdev->driver->get_evictable_bo(bdev, mem_type, place,
+						   ctx, locked);
 	return bo;
 }

@@ -1311,6 +1325,21 @@ int ttm_bo_create(struct ttm_bo_device *bdev,
 }
 EXPORT_SYMBOL(ttm_bo_create);

+bool ttm_lru_empty(struct ttm_bo_device *bdev, unsigned mem_type)
+{
+	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+	int i;
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		if (!list_empty(&man->lru[i]))
+			return false;
+	}
+	if (bdev->driver->lru_empty)
+		return bdev->driver->lru_empty(bdev, mem_type);
+
+	return true;
+}
+
 static int ttm_bo_force_list_clean(struct ttm_bo_device *bdev,
 				   unsigned mem_type)
 {
@@ -1323,21 +1352,18 @@ static int ttm_bo_force_list_clean(struct ttm_bo_device *bdev,
 	struct ttm_bo_global *glob = bdev->glob;
 	struct dma_fence *fence;
 	int ret;
-	unsigned i;

 	/*
 	 * Can't use standard list traversal since we're unlocking.
 	 */

 	spin_lock(&glob->lru_lock);
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		while (!list_empty(&man->lru[i])) {
-			spin_unlock(&glob->lru_lock);
-			ret = ttm_mem_evict_first(bdev, mem_type, NULL, &ctx);
-			if (ret)
-				return ret;
-			spin_lock(&glob->lru_lock);
-		}
+	while (!ttm_lru_empty(bdev, mem_type)) {
+		spin_unlock(&glob->lru_lock);
+		ret = ttm_mem_evict_first(bdev, mem_type, NULL, &ctx);
+		if (ret)
+			return ret;
+		spin_lock(&glob->lru_lock);
 	}
 	spin_unlock(&glob->lru_lock);

@@ -1533,9 +1559,9 @@ int ttm_bo_device_release(struct ttm_bo_device *bdev)
 		pr_debug("Delayed destroy list was clean\n");

 	spin_lock(&glob->lru_lock);
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
-		if (list_empty(&bdev->man[0].lru[0]))
-			pr_debug("Swap list %d was clean\n", i);
+	for (i = 0; i < TTM_NUM_MEM_TYPES; ++i)
+		if (ttm_lru_empty(bdev, i))
+			pr_debug("lru list %d was clean\n", i);
 	spin_unlock(&glob->lru_lock);

 	drm_vma_offset_manager_destroy(&bdev->vma_manager);
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 3234cc322e70..29339b0a2fd6 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -284,6 +284,52 @@ struct ttm_bo_driver {
 	 */
 	bool (*eviction_valuable)(struct ttm_buffer_object *bo,
 				  const struct ttm_place *place);
+
+	/**
+	 * struct ttm_bo_driver member get_evictable_bo
+	 *
+	 * @bdev: the buffer object device.
+	 * @mem_type: memory type
+	 * @place: placement we need room for
+	 * @ctx: context for this evict with parameters
+	 * @locked: return if the evictable bo is already locked.
+	 *
+	 * return an evictable bo for evicting.
+	 */
+	struct ttm_buffer_object *(*get_evictable_bo)(struct ttm_bo_device *bdev,
+						      uint32_t mem_type,
+						      const struct ttm_place *place,
+						      struct ttm_operation_ctx *ctx,
+						      bool *locked);
+
+	/**
+	 * struct ttm_bo_driver member add_to_lru
+	 *
+	 * @bo: the buffer object to be add
+	 *
+	 * add bo to driver specific lru
+	 */
+	void (*add_to_lru)(struct ttm_buffer_object *bo);
+
+	/**
+	 * struct ttm_bo_driver member del_from_lru
+	 *
+	 * @bo: the buffer object to be add
+	 *
+	 * delete bo from driver specific lru
+	 */
+	void (*del_from_lru)(struct ttm_buffer_object *bo);
+
+	/**
+	 * struct ttm_bo_driver member move_to_lru_tail
+	 *
+	 * @bo: the buffer object to be add
+	 *
+	 * move to driver specific lru tail
+	 */
+	void (*move_to_lru_tail)(struct ttm_buffer_object *bo);
+
+	bool (*lru_empty)(struct ttm_bo_device *bdev, unsigned mem_type);
 	/**
 	 * struct ttm_bo_driver member evict_flags:
 	 *
@@ -760,6 +806,9 @@ int ttm_mem_io_reserve(struct ttm_bo_device *bdev,
 		       struct ttm_mem_reg *mem);
 void ttm_mem_io_free(struct ttm_bo_device *bdev,
 		     struct ttm_mem_reg *mem);
+bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
+				    struct ttm_operation_ctx *ctx,
+				    bool *locked);
 /**
  * ttm_bo_move_ttm
  *
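The fallback pattern this patch adds to ttm_mem_get_evictable_bo() and ttm_lru_empty() can be reduced to a small user-space sketch: the generic LRU is consulted first, and an optional driver hook is asked only when the generic side has nothing. All `hook_*` names below are hypothetical, and a single pointer stands in for the per-priority generic lists; locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object. */
struct hook_bo {
	int id;
};

/* Optional driver callbacks; a NULL pointer means "not implemented". */
struct hook_driver {
	struct hook_bo *(*get_evictable_bo)(void *priv);
	bool (*lru_empty)(void *priv);
	void *priv;                       /* driver-private LRU state */
};

struct hook_device {
	struct hook_bo *generic_lru;      /* NULL: generic LRU holds nothing */
	const struct hook_driver *driver;
};

/* Generic LRU first; fall back to the driver hook if nothing was found. */
static struct hook_bo *hook_get_evictable_bo(struct hook_device *dev)
{
	struct hook_bo *bo = dev->generic_lru;

	if (!bo && dev->driver->get_evictable_bo)
		bo = dev->driver->get_evictable_bo(dev->driver->priv);
	return bo;
}

/* Empty only if the generic LRU and the driver LRU (if any) are empty. */
static bool hook_lru_empty(struct hook_device *dev)
{
	if (dev->generic_lru)
		return false;
	if (dev->driver->lru_empty)
		return dev->driver->lru_empty(dev->driver->priv);
	return true;
}

/* Example driver hooks: priv points at the single BO the driver tracks. */
static struct hook_bo *hook_drv_get_evictable_bo(void *priv)
{
	return priv;
}

static bool hook_drv_lru_empty(void *priv)
{
	return priv == NULL;
}
```

A driver that leaves the hooks NULL keeps the generic behaviour unchanged, which is the property the NULL checks in the patch rely on.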
Change-Id: I4ee2abf1ddf5c0fe59c5803da51e99bb57388d05
Signed-off-by: Chunming Zhou david1.zhou@amd.com
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  9 +++++++++
 3 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dfd22db13fb1..0bbb1dfdceff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1279,6 +1279,10 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 	.invalidate_caches = &amdgpu_invalidate_caches,
 	.init_mem_type = &amdgpu_init_mem_type,
 	.eviction_valuable = amdgpu_ttm_bo_eviction_valuable,
+	.get_evictable_bo = &amdgpu_vm_get_evictable_bo,
+	.add_to_lru = &amdgpu_vm_add_to_lru,
+	.del_from_lru = &amdgpu_vm_del_from_lru,
+	.move_to_lru_tail = &amdgpu_vm_move_to_lru_tail,
 	.evict_flags = &amdgpu_evict_flags,
 	.move = &amdgpu_bo_move,
 	.verify_access = &amdgpu_verify_access,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8e71d3984016..cc6093233ae7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -124,6 +124,31 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
 	spin_unlock(&vm->status_lock);
 }

+struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
+						     uint32_t mem_type,
+						     const struct ttm_place *place,
+						     struct ttm_operation_ctx *ctx,
+						     bool *locked)
+{
+
+}
+
+void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo)
+{
+
+}
+
+void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo)
+{
+
+}
+
+void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo)
+{
+
+}
+
+
 /**
  * amdgpu_vm_level_shift - return the addr shift for each level
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 30f080364c97..0c965683faba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -256,6 +256,15 @@ struct amdgpu_vm_manager {
 	spinlock_t pasid_lock;
 };

+struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
+						     uint32_t mem_type,
+						     const struct ttm_place *place,
+						     struct ttm_operation_ctx *ctx,
+						     bool *locked);
+void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo);
+void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo);
+void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo);
+
 void amdgpu_vm_manager_init(struct amdgpu_device *adev);
 void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
Change-Id: Icba45a329e2e2094581ad6c4b8b9028a2e5c5faa
Signed-off-by: Chunming Zhou david1.zhou@amd.com
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 37 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h     | 14 +++++++++++
 5 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2d7500921c0b..f186c8f29774 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1532,6 +1532,8 @@ struct amdgpu_device {
 	dma_addr_t dummy_page_addr;
 	struct amdgpu_vm_manager vm_manager;
 	struct amdgpu_vmhub vmhub[AMDGPU_MAX_VMHUBS];
+	struct amdgpu_vm_lru kernel_vm_lru;
+	struct list_head vm_lru_list;

 	/* memory management */
 	struct amdgpu_mman mman;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 887f7c9e84e0..feafcfa2633d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2266,6 +2266,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	spin_lock_init(&adev->audio_endpt_idx_lock);
 	spin_lock_init(&adev->mm_stats.lock);

+	INIT_LIST_HEAD(&adev->vm_lru_list);
 	INIT_LIST_HEAD(&adev->shadow_list);
 	mutex_init(&adev->shadow_list_lock);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 0bbb1dfdceff..207f88f38b23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1417,6 +1417,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 		return r;
 	}
 	adev->mman.initialized = true;
+	amdgpu_vm_lru_init(&adev->kernel_vm_lru, adev, NULL);

 	/* We opt to avoid OOM on system pages allocations */
 	adev->mman.bdev.no_retry = true;
@@ -1537,6 +1538,7 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
 		return;

 	amdgpu_ttm_debugfs_fini(adev);
+	amdgpu_vm_lru_fini(&adev->kernel_vm_lru, adev);
 	amdgpu_ttm_fw_reserve_vram_fini(adev);
 	if (adev->mman.aper_base_kaddr)
 		iounmap(adev->mman.aper_base_kaddr);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index cc6093233ae7..72ff2d9c8686 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -124,6 +124,39 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
 	spin_unlock(&vm->status_lock);
 }

+int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev,
+		       struct reservation_object *resv)
+{
+	struct ttm_bo_global *glob = adev->mman.bdev.glob;
+	int i, j;
+
+	INIT_LIST_HEAD(&vm_lru->vm_lru_list);
+	for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
+		for (j = 0; j < TTM_MAX_BO_PRIORITY; j++) {
+			INIT_LIST_HEAD(&vm_lru->fixed_lru[i][j]);
+			INIT_LIST_HEAD(&vm_lru->dynamic_lru[i][j]);
+		}
+	}
+	spin_lock(&glob->lru_lock);
+	list_add_tail(&vm_lru->vm_lru_list, &adev->vm_lru_list);
+	spin_unlock(&glob->lru_lock);
+
+	vm_lru->resv = resv;
+
+	return 0;
+}
+
+int amdgpu_vm_lru_fini(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev)
+{
+	struct ttm_bo_global *glob = adev->mman.bdev.glob;
+
+	spin_lock(&glob->lru_lock);
+	list_del(&vm_lru->vm_lru_list);
+	spin_unlock(&glob->lru_lock);
+
+	return 0;
+}
+
 struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
 						     uint32_t mem_type,
 						     const struct ttm_place *place,
@@ -2413,6 +2446,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	uint64_t flags;
 	int r, i;

+	amdgpu_vm_lru_init(&vm->vm_lru, adev, NULL);
 	vm->va = RB_ROOT_CACHED;
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
 		vm->reserved_vmid[i] = NULL;
@@ -2468,7 +2502,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	r = amdgpu_bo_create(adev, &bp, &root);
 	if (r)
 		goto error_free_sched_entity;
-
+	vm->vm_lru.resv = root->tbo.resv;
 	r = amdgpu_bo_reserve(root, true);
 	if (r)
 		goto error_free_root;
@@ -2672,6 +2706,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 					 adev->vm_manager.root_level);
 		amdgpu_bo_unreserve(root);
 	}
+	amdgpu_vm_lru_fini(&vm->vm_lru, adev);
 	amdgpu_bo_unref(&root);
 	dma_fence_put(vm->last_update);
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 0c965683faba..66ee902614a2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -29,6 +29,7 @@
 #include <linux/rbtree.h>
 #include <drm/gpu_scheduler.h>
 #include <drm/drm_file.h>
+#include <drm/ttm/ttm_bo_driver.h>

 #include "amdgpu_sync.h"
 #include "amdgpu_ring.h"
@@ -135,6 +136,13 @@ enum amdgpu_vm_level {
 	AMDGPU_VM_PTB
 };

+struct amdgpu_vm_lru {
+	struct list_head vm_lru_list;
+	struct list_head fixed_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];
+	struct list_head dynamic_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];
+	struct reservation_object *resv;
+};
+
 /* base structure for tracking BO usage in a VM */
 struct amdgpu_vm_bo_base {
 	/* constant after initialization */
@@ -167,6 +175,7 @@ struct amdgpu_vm {
 	/* tree of virtual addresses mapped */
 	struct rb_root_cached va;

+	struct amdgpu_vm_lru vm_lru;
 	/* protecting invalidated */
 	spinlock_t status_lock;

@@ -256,6 +265,11 @@ struct amdgpu_vm_manager {
 	spinlock_t pasid_lock;
 };

+int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev,
+		       struct reservation_object *resv);
+int amdgpu_vm_lru_fini(struct amdgpu_vm_lru *vm_lru,
+		       struct amdgpu_device *adev);
+
 struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
 						     uint32_t mem_type,
 						     const struct ttm_place *place,
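The init/fini pairing in this patch can be tried out in user space with a minimal doubly linked list. The sketch below is only an approximation under stated assumptions: the `vlru_*` names and the two size constants are placeholders rather than the kernel's definitions, the list helpers imitate but are not `<linux/list.h>`, and the `lru_lock` the patch takes around the device list is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder sizes standing in for TTM_NUM_MEM_TYPES / TTM_MAX_BO_PRIORITY. */
#define VLRU_NUM_MEM_TYPES   8
#define VLRU_MAX_BO_PRIORITY 4

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct vlru_list {
	struct vlru_list *prev, *next;
};

static void vlru_list_init(struct vlru_list *l)
{
	l->prev = l->next = l;
}

static bool vlru_list_empty(const struct vlru_list *l)
{
	return l->next == l;
}

static void vlru_list_add_tail(struct vlru_list *n, struct vlru_list *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void vlru_list_del(struct vlru_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	vlru_list_init(n);
}

/* Per-VM LRU: one fixed and one dynamic list per memory type and priority. */
struct vlru_vm_lru {
	struct vlru_list node;   /* link in the device-wide list of VM LRUs */
	struct vlru_list fixed_lru[VLRU_NUM_MEM_TYPES][VLRU_MAX_BO_PRIORITY];
	struct vlru_list dynamic_lru[VLRU_NUM_MEM_TYPES][VLRU_MAX_BO_PRIORITY];
};

struct vlru_device {
	struct vlru_list vm_lru_list;
};

/* Initialise every list head, then register the LRU with the device. */
static void vlru_init(struct vlru_vm_lru *lru, struct vlru_device *dev)
{
	int i, j;

	vlru_list_init(&lru->node);
	for (i = 0; i < VLRU_NUM_MEM_TYPES; i++) {
		for (j = 0; j < VLRU_MAX_BO_PRIORITY; j++) {
			vlru_list_init(&lru->fixed_lru[i][j]);
			vlru_list_init(&lru->dynamic_lru[i][j]);
		}
	}
	vlru_list_add_tail(&lru->node, &dev->vm_lru_list);
}

/* Unregister the LRU from the device-wide list. */
static void vlru_fini(struct vlru_vm_lru *lru)
{
	vlru_list_del(&lru->node);
}
```

Keeping all VM LRUs on one device-wide list is what would let an eviction path walk every VM in turn; how the series actually walks that list is not shown in this patch.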
Change-Id: I28351ad8e69c13038ccff40fd9f0369ddae91371
Signed-off-by: Chunming Zhou david1.zhou@amd.com
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 14 ++++++++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 ++
 5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f186c8f29774..cec76cda79c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -445,7 +445,8 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 			     int alignment, u32 initial_domain,
 			     u64 flags, enum ttm_bo_type type,
 			     struct reservation_object *resv,
-			     struct drm_gem_object **obj);
+			     struct drm_gem_object **obj,
+			     struct amdgpu_vm_lru *lru);

 int amdgpu_mode_dumb_create(struct drm_file *file_priv,
 			    struct drm_device *dev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
index bc5fd8ebab5d..b2e45e1314eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@@ -146,7 +146,7 @@ static int amdgpufb_create_pinned_object(struct amdgpu_fbdev *rfbdev,
 				       AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 				       AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
 				       AMDGPU_GEM_CREATE_VRAM_CLEARED,
-				       true, NULL, &gobj);
+				       true, NULL, &gobj, NULL);
 	if (ret) {
 		pr_err("failed to allocate framebuffer (%d)\n", aligned_size);
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 7d3dc229fa47..fac20d796db0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -45,7 +45,8 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 			     int alignment, u32 initial_domain,
 			     u64 flags, enum ttm_bo_type type,
 			     struct reservation_object *resv,
-			     struct drm_gem_object **obj)
+			     struct drm_gem_object **obj,
+			     struct amdgpu_vm_lru *vm_lru)
 {
 	struct amdgpu_bo *bo;
 	struct amdgpu_bo_param bp;
@@ -63,6 +64,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 	bp.type = type;
 	bp.resv = resv;
 	bp.preferred_domain = initial_domain;
+	bp.vm_lru = vm_lru;
 retry:
 	bp.flags = flags;
 	bp.domain = initial_domain;
@@ -257,7 +259,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,

 	r = amdgpu_gem_object_create(adev, size, args->in.alignment,
 				     (u32)(0xffffffff & args->in.domains),
-				     flags, false, resv, &gobj);
+				     flags, false, resv, &gobj, &vm->vm_lru);
 	if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
 		if (!r) {
 			struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj);
@@ -285,6 +287,8 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
 {
 	struct ttm_operation_ctx ctx = { true, false };
 	struct amdgpu_device *adev = dev->dev_private;
+	struct amdgpu_fpriv *fpriv = filp->driver_priv;
+	struct amdgpu_vm *vm = &fpriv->vm;
 	struct drm_amdgpu_gem_userptr *args = data;
 	struct drm_gem_object *gobj;
 	struct amdgpu_bo *bo;
@@ -309,7 +313,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,

 	/* create a gem object to contain this object in */
 	r = amdgpu_gem_object_create(adev, args->size, 0, AMDGPU_GEM_DOMAIN_CPU,
-				     0, 0, NULL, &gobj);
+				     0, 0, NULL, &gobj, &vm->vm_lru);
 	if (r)
 		return r;

@@ -747,6 +751,8 @@ int amdgpu_mode_dumb_create(struct drm_file *file_priv,
 			    struct drm_mode_create_dumb *args)
 {
 	struct amdgpu_device *adev = dev->dev_private;
+	struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
+	struct amdgpu_vm *vm = &fpriv->vm;
 	struct drm_gem_object *gobj;
 	uint32_t handle;
 	int r;
@@ -759,7 +765,7 @@ int amdgpu_mode_dumb_create(struct drm_file *file_priv,
 	r = amdgpu_gem_object_create(adev, args->size, 0,
 				     AMDGPU_GEM_DOMAIN_VRAM,
 				     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED,
-				     false, NULL, &gobj);
+				     false, NULL, &gobj, &vm->vm_lru);
 	if (r)
 		return -ENOMEM;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index e62153a86001..a457738c512c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -419,6 +419,11 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,

 	bo->tbo.bdev = &adev->mman.bdev;
 	amdgpu_ttm_placement_from_domain(bo, bp->domain);
+	bo->vm_lru = bp->vm_lru;
+	if (bp->type == ttm_bo_type_kernel) {
+		bo->tbo.priority = 1;
+		bo->vm_lru = &adev->kernel_vm_lru;
+	}

 	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, bp->type,
 				 &bo->placement, page_align, &ctx, acc_size,
@@ -434,9 +439,6 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	else
 		amdgpu_cs_report_moved_bytes(adev, ctx.bytes_moved, 0);

-	if (bp->type == ttm_bo_type_kernel)
-		bo->tbo.priority = 1;
-
 	if (bp->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED &&
 	    bo->tbo.mem.placement & TTM_PL_FLAG_VRAM) {
 		struct dma_fence *fence;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 540e03fa159f..f04fc401327b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -41,6 +41,7 @@ struct amdgpu_bo_param {
 	u64 flags;
 	enum ttm_bo_type type;
 	struct reservation_object *resv;
+	struct amdgpu_vm_lru *vm_lru;
 };
/* bo virtual addresses in a vm */ @@ -97,6 +98,7 @@ struct amdgpu_bo {
struct ttm_bo_kmap_obj dma_buf_vmap; struct amdgpu_mn *mn; + struct amdgpu_vm_lru *vm_lru;
union { struct list_head mn_list;
Change-Id: I023d3dd314e49bc9b1649468a82ecca6043e4317 Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 59 ++++++++++++++++++++++++++++++++++ 1 file changed, 59 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 72ff2d9c8686..27b3fdb6dd46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -163,11 +163,70 @@ struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
 					      struct ttm_operation_ctx *ctx,
 					      bool *locked)
 {
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
+	struct ttm_buffer_object *bo = NULL;
+	struct amdgpu_vm_lru *vm_lru;
+	int i;
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		list_for_each_entry(vm_lru, &adev->vm_lru_list, vm_lru_list) {
+			list_for_each_entry(bo, &vm_lru->dynamic_lru[mem_type][i], lru) {
+				if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked))
+					continue;
+				if (place && !bdev->driver->eviction_valuable(bo, place)) {
+					if (locked)
+						reservation_object_unlock(bo->resv);
+					continue;
+				}
+				break;
+			}
+			/* If the inner loop terminated early, we have our candidate */
+			if (&bo->lru != &vm_lru->dynamic_lru[mem_type][i])
+				break;
+			bo = NULL;
+			list_for_each_entry(bo, &vm_lru->fixed_lru[mem_type][i], lru) {
+				if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked))
+					continue;
+				if (place && !bdev->driver->eviction_valuable(bo, place)) {
+					if (locked)
+						reservation_object_unlock(bo->resv);
+					continue;
+				}
+				break;
+			}
+			/* If the inner loop terminated early, we have our candidate */
+			if (&bo->lru != &vm_lru->fixed_lru[mem_type][i])
+				break;
+			bo = NULL;
+		}
+		if (bo)
+			break;
+	}
+
+	return bo;
 }
 void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo)
 {
+	struct ttm_bo_device *bdev = bo->bdev;
+	struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
+	struct amdgpu_vm_lru *vm_lru = abo->vm_lru;
+
+	if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
+		if (bo->resv == vm_lru->resv)
+			list_add_tail(&bo->lru, &vm_lru->fixed_lru[bo->mem.mem_type][bo->priority]);
+		else
+			list_add_tail(&bo->lru, &vm_lru->dynamic_lru[bo->mem.mem_type][bo->priority]);
+		kref_get(&bo->list_kref);
+
+		if (bo->ttm && !(bo->ttm->page_flags &
+		    (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SWAPPED))) {
+			list_add_tail(&bo->swap,
+				      &bdev->glob->swap_lru[bo->priority]);
+			kref_get(&bo->list_kref);
+		}
+	}
 }
Change-Id: I5b5f36b4c8af422b5c9d0eaf0c2d3b4db4d9cd0b Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/ttm/ttm_bo.c | 3 ++- include/drm/ttm/ttm_bo_driver.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 98da2cf63c9b..e232dadd5f79 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -183,10 +183,11 @@ void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
 }
 EXPORT_SYMBOL(ttm_bo_add_to_lru);

-static void ttm_bo_ref_bug(struct kref *list_kref)
+void ttm_bo_ref_bug(struct kref *list_kref)
 {
 	BUG();
 }
+EXPORT_SYMBOL(ttm_bo_ref_bug);
void ttm_bo_del_from_lru(struct ttm_buffer_object *bo) { diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 29339b0a2fd6..6847d4258db1 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -601,6 +601,8 @@ int ttm_bo_global_init(struct drm_global_reference *ref);
int ttm_bo_device_release(struct ttm_bo_device *bdev);
+void ttm_bo_ref_bug(struct kref *list_kref); + /** * ttm_bo_device_init *
Change-Id: Iaca5cdaccbc5beeb7a37c0f703cdfc97df4ece4f Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 85 +++++++++++++++++++++++++++--- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 3 +- 4 files changed, 82 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h index f04fc401327b..b6396230d30e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h @@ -82,6 +82,8 @@ struct amdgpu_bo { struct ttm_placement placement; struct ttm_buffer_object tbo; struct ttm_bo_kmap_obj kmap; + struct rb_node node; + u64 index; u64 flags; unsigned pin_count; u64 tiling_flags; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 207f88f38b23..a5d8f511b011 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1279,6 +1279,7 @@ static struct ttm_bo_driver amdgpu_bo_driver = { .invalidate_caches = &amdgpu_invalidate_caches, .init_mem_type = &amdgpu_init_mem_type, .eviction_valuable = amdgpu_ttm_bo_eviction_valuable, + .lru_empty = &amdgpu_vm_lru_empty, .get_evictable_bo = &amdgpu_vm_get_evictable_bo, .add_to_lru = &amdgpu_vm_add_to_lru, .del_from_lru = &amdgpu_vm_del_from_lru, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 27b3fdb6dd46..1a09c07bbf20 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -133,7 +133,7 @@ int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev, INIT_LIST_HEAD(&vm_lru->vm_lru_list); for (i = 0; i < TTM_NUM_MEM_TYPES; i++) { for (j = 0; j < TTM_MAX_BO_PRIORITY; j++) { - INIT_LIST_HEAD(&vm_lru->fixed_lru[i][j]); + vm_lru->fixed_lru[i][j] = RB_ROOT; INIT_LIST_HEAD(&vm_lru->dynamic_lru[i][j]); } } @@ -157,6 +157,24 @@ int amdgpu_vm_lru_fini(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev) return 0; }
+bool amdgpu_vm_lru_empty(struct ttm_bo_device *bdev, unsigned mem_type) +{ + struct amdgpu_device *adev = amdgpu_ttm_adev(bdev); + struct amdgpu_vm_lru *vm_lru; + int i; + + for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { + list_for_each_entry(vm_lru, &adev->vm_lru_list, vm_lru_list) { + if (!list_empty(&vm_lru->dynamic_lru[mem_type][i])) + return false; + if (!RB_EMPTY_ROOT(&vm_lru->fixed_lru[mem_type][i])) + return false; + } + } + + return true; +} + struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev, uint32_t mem_type, const struct ttm_place *place, @@ -165,11 +183,13 @@ struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev, { struct amdgpu_device *adev = amdgpu_ttm_adev(bdev); struct ttm_buffer_object *bo = NULL; + struct amdgpu_bo *abo = NULL; struct amdgpu_vm_lru *vm_lru; int i;
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) { list_for_each_entry(vm_lru, &adev->vm_lru_list, vm_lru_list) { + struct rb_node *node; list_for_each_entry(bo, &vm_lru->dynamic_lru[mem_type][i], lru) { if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked)) continue; @@ -184,20 +204,22 @@ struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev, if (&bo->lru != &vm_lru->dynamic_lru[mem_type][i]) break; bo = NULL; - list_for_each_entry(bo, &vm_lru->fixed_lru[mem_type][i], lru) { - if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked)) + for (node = rb_first(&vm_lru->fixed_lru[mem_type][i]); + node; node = rb_next(node)) { + abo = rb_entry(node, struct amdgpu_bo, node); + bo = &abo->tbo; + if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked)) { + bo = NULL; continue; + } if (place && !bdev->driver->eviction_valuable(bo, place)) { if (locked) reservation_object_unlock(bo->resv); + bo = NULL; continue; } break; } - /* If the inner loop terminated early, we have our candidate */ - if (&bo->lru != &vm_lru->fixed_lru[mem_type][i]) - break; - bo = NULL; } if (bo) break; @@ -207,6 +229,26 @@ struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev,
}
+static void amdgpu_vm_bo_add_to_rb(struct amdgpu_bo *bo,
+				   struct rb_root *root)
+{
+	struct rb_node **new = &(root->rb_node), *parent = NULL;
+
+	while (*new) {
+		struct amdgpu_bo *this =
+			container_of(*new, struct amdgpu_bo, node);
+
+		parent = *new;
+		if (bo->index < this->index)
+			new = &((*new)->rb_left);
+		else
+			new = &((*new)->rb_right);
+	}
+
+	rb_link_node(&bo->node, parent, new);
+	rb_insert_color(&bo->node, root);
+}
+
 void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo)
 {
 	struct ttm_bo_device *bdev = bo->bdev;
@@ -215,7 +257,7 @@ void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo)

 	if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
 		if (bo->resv == vm_lru->resv)
-			list_add_tail(&bo->lru, &vm_lru->fixed_lru[bo->mem.mem_type][bo->priority]);
+			amdgpu_vm_bo_add_to_rb(abo, &vm_lru->fixed_lru[bo->mem.mem_type][bo->priority]);
 		else
 			list_add_tail(&bo->lru, &vm_lru->dynamic_lru[bo->mem.mem_type][bo->priority]);
 		kref_get(&bo->list_kref);
@@ -230,9 +272,36 @@ void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo)

 }
+static struct amdgpu_bo *amdgpu_vm_bo_rb_find(struct rb_root *root, u64 index)
+{
+	struct rb_node *node = root->rb_node;
+
+	while (node) {
+		struct amdgpu_bo *bo =
+			container_of(node, struct amdgpu_bo, node);
+
+		if (index < bo->index)
+			node = node->rb_left;
+		else if (index > bo->index)
+			node = node->rb_right;
+		else
+			return bo;
+	}
+
+	return NULL;
+}
+
 void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo)
 {
+	struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
+	struct amdgpu_vm_lru *vm_lru = abo->vm_lru;
+
+	if (amdgpu_vm_bo_rb_find(&vm_lru->fixed_lru[bo->mem.mem_type][bo->priority],
+				 abo->index)) {
+		rb_erase(&abo->node,
+			 &vm_lru->fixed_lru[abo->tbo.mem.mem_type][abo->tbo.priority]);
+		kref_put(&abo->tbo.list_kref, ttm_bo_ref_bug);
+	}
 }
void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index 66ee902614a2..84400673d710 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -138,7 +138,7 @@ enum amdgpu_vm_level {
struct amdgpu_vm_lru { struct list_head vm_lru_list; - struct list_head fixed_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY]; + struct rb_root fixed_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY]; struct list_head dynamic_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY]; struct reservation_object *resv; }; @@ -269,6 +269,7 @@ int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev, struct reservation_object *resv); int amdgpu_vm_lru_fini(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev); +bool amdgpu_vm_lru_empty(struct ttm_bo_device *bdev, unsigned mem_type);
struct ttm_buffer_object *amdgpu_vm_get_evictable_bo(struct ttm_bo_device *bdev, uint32_t mem_type,
Change-Id: Iaec4e12164124c155753fc7aea85f76fde8d1ed6 Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 + 3 files changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index a457738c512c..63faa271a7d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -424,6 +424,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo->tbo.priority = 1; bo->vm_lru = &adev->kernel_vm_lru; } + bo->index = (u64)atomic64_inc_return(&bo->vm_lru->bo_index);
r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, bp->type, &bo->placement, page_align, &ctx, acc_size, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 1a09c07bbf20..5bef4ffa1c87 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -142,6 +142,7 @@ int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev, spin_unlock(&glob->lru_lock);
vm_lru->resv = resv; + atomic64_set(&vm_lru->bo_index, 0);
return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index 84400673d710..773f1bda2b98 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -141,6 +141,7 @@ struct amdgpu_vm_lru { struct rb_root fixed_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY]; struct list_head dynamic_lru[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY]; struct reservation_object *resv; + atomic64_t bo_index; };
/* base structure for tracking BO usage in a VM */
Change-Id: I0d5fa7e5e88568f79e836ff47f9c9132cb7d349e Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 5bef4ffa1c87..537f04d25535 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -307,7 +307,12 @@ void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo)
 void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo)
 {
+	struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
+	struct amdgpu_vm_lru *vm_lru = abo->vm_lru;
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);

+	if (bo->resv == vm_lru->resv)
+		list_move_tail(&vm_lru->vm_lru_list, &adev->vm_lru_list);
 }
The driver will use it to check whether a bo is a transferred bo.
Change-Id: I6a4f3bc00621f9cb3fc24b3bc9d7d7a8ac6cd629 Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 ++- include/drm/ttm/ttm_bo_driver.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index f3bf545a79cf..1dda99b4724a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -457,7 +457,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, } EXPORT_SYMBOL(ttm_bo_move_memcpy);
-static void ttm_transfered_destroy(struct ttm_buffer_object *bo) +void ttm_transfered_destroy(struct ttm_buffer_object *bo) { struct ttm_transfer_obj *fbo;
@@ -465,6 +465,7 @@ static void ttm_transfered_destroy(struct ttm_buffer_object *bo) ttm_bo_unref(&fbo->bo); kfree(fbo); } +EXPORT_SYMBOL(ttm_transfered_destroy);
/** * ttm_buffer_object_transfer diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 6847d4258db1..32cc054dfa99 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -855,6 +855,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, struct ttm_mem_reg *new_mem);
+void ttm_transfered_destroy(struct ttm_buffer_object *bo); /** * ttm_bo_free_old_node *
Change-Id: I1179a21aa3712b095fd50bed6956654e0f72e611 Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 537f04d25535..a425d498f3fc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -255,9 +255,15 @@ void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo) struct ttm_bo_device *bdev = bo->bdev; struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); struct amdgpu_vm_lru *vm_lru = abo->vm_lru; + struct ttm_mem_type_manager *man;
if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { - if (bo->resv == vm_lru->resv) + if (bo->destroy == ttm_transfered_destroy) { + BUG_ON(!list_empty(&bo->lru)); + + man = &bdev->man[bo->mem.mem_type]; + list_add_tail(&bo->lru, &man->lru[bo->priority]); + } else if (bo->resv == vm_lru->resv) amdgpu_vm_bo_add_to_rb(abo, &vm_lru->fixed_lru[bo->mem.mem_type][bo->priority]); else list_add_tail(&bo->lru, &vm_lru->dynamic_lru[bo->mem.mem_type][bo->priority]); @@ -297,6 +303,8 @@ void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo) struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); struct amdgpu_vm_lru *vm_lru = abo->vm_lru;
+ if (bo->destroy == ttm_transfered_destroy) + return; if (amdgpu_vm_bo_rb_find(&vm_lru->fixed_lru[bo->mem.mem_type][bo->priority], abo->index)) { rb_erase(&abo->node, @@ -311,6 +319,8 @@ void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo) struct amdgpu_vm_lru *vm_lru = abo->vm_lru; struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
+ if (bo->destroy == ttm_transfered_destroy) + return; if (bo->resv == vm_lru->resv) list_move_tail(&vm_lru->vm_lru_list, &adev->vm_lru_list); }
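The hunks above identify a transferred bo by comparing its destroy callback with a known function. A minimal userspace sketch of that idiom follows; every sketch_* name is hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_bo;
typedef void (*sketch_destroy_fn)(struct sketch_bo *bo);

/* Hypothetical BO: only the destructor callback and a marker field. */
struct sketch_bo {
	sketch_destroy_fn destroy;
	int destroyed_by; /* set by whichever destructor ran */
};

/* Stand-in for the destructor installed on transferred BOs. */
static void sketch_transferred_destroy(struct sketch_bo *bo)
{
	bo->destroyed_by = 1;
}

/* Stand-in for a driver's ordinary destructor. */
static void sketch_driver_destroy(struct sketch_bo *bo)
{
	bo->destroyed_by = 2;
}

/* A BO counts as transferred when its destructor is the transfer one. */
static bool sketch_is_transferred(const struct sketch_bo *bo)
{
	assert(bo != NULL);
	return bo->destroy == sketch_transferred_destroy;
}
```

The comparison only works if the reference function is visible to the caller, which is presumably why the series exports ttm_transfered_destroy first.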
That means that at vm fini a bo is no longer a per-vm bo; it goes back to being a normal bo.
Change-Id: Ida56abd0351422dd0b4a4393545c9cdb0e1a6818 Signed-off-by: Chunming Zhou david1.zhou@amd.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 50 +++++++++++++++++++++++++++++----- 1 file changed, 43 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index a425d498f3fc..89c2cbbce436 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -150,10 +150,34 @@ int amdgpu_vm_lru_init(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev, int amdgpu_vm_lru_fini(struct amdgpu_vm_lru *vm_lru, struct amdgpu_device *adev) { struct ttm_bo_global *glob = adev->mman.bdev.glob; + struct ttm_buffer_object *bo = NULL; + struct amdgpu_bo *abo = NULL; + struct rb_node *node; + int i, j; + bool locked;
+ locked = reservation_object_trylock(vm_lru->resv); spin_lock(&glob->lru_lock); list_del(&vm_lru->vm_lru_list); + for (i = 0; i < TTM_MAX_BO_PRIORITY; i++) { + for (j = 0; j < TTM_NUM_MEM_TYPES; j++) { + list_for_each_entry(bo, &vm_lru->dynamic_lru[j][i], lru) { + struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); + + abo->vm_lru = NULL; + abo->index = 0; + } + for (node = rb_first(&vm_lru->fixed_lru[j][i]); + node; node = rb_next(node)) { + abo = rb_entry(node, struct amdgpu_bo, node); + abo->vm_lru = NULL; + abo->index = 0; + } + } + } spin_unlock(&glob->lru_lock); + if (locked) + reservation_object_unlock(vm_lru->resv);
return 0; } @@ -253,12 +277,16 @@ static void amdgpu_vm_bo_add_to_rb(struct amdgpu_bo *bo, void amdgpu_vm_add_to_lru(struct ttm_buffer_object *bo) { struct ttm_bo_device *bdev = bo->bdev; - struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); - struct amdgpu_vm_lru *vm_lru = abo->vm_lru; + struct amdgpu_bo *abo; + struct amdgpu_vm_lru *vm_lru = NULL; struct ttm_mem_type_manager *man;
+ if (bo->destroy != ttm_transfered_destroy) { + abo = ttm_to_amdgpu_bo(bo); + vm_lru = abo->vm_lru; + } if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { - if (bo->destroy == ttm_transfered_destroy) { + if (bo->destroy == ttm_transfered_destroy || !vm_lru) { BUG_ON(!list_empty(&bo->lru));
man = &bdev->man[bo->mem.mem_type]; @@ -300,11 +328,15 @@ static struct amdgpu_bo *amdgpu_vm_bo_rb_find(struct rb_root *root, u64 index)
void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo) { - struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); - struct amdgpu_vm_lru *vm_lru = abo->vm_lru; + struct amdgpu_bo *abo; + struct amdgpu_vm_lru *vm_lru;
if (bo->destroy == ttm_transfered_destroy) return; + abo = ttm_to_amdgpu_bo(bo); + vm_lru = abo->vm_lru; + if (!vm_lru) + return; if (amdgpu_vm_bo_rb_find(&vm_lru->fixed_lru[bo->mem.mem_type][bo->priority], abo->index)) { rb_erase(&abo->node, @@ -315,12 +347,16 @@ void amdgpu_vm_del_from_lru(struct ttm_buffer_object *bo)
void amdgpu_vm_move_to_lru_tail(struct ttm_buffer_object *bo) { - struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo); - struct amdgpu_vm_lru *vm_lru = abo->vm_lru; + struct amdgpu_bo *abo; + struct amdgpu_vm_lru *vm_lru; struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
if (bo->destroy == ttm_transfered_destroy) return; + abo = ttm_to_amdgpu_bo(bo); + vm_lru = abo->vm_lru; + if (!vm_lru) + return; if (bo->resv == vm_lru->resv) list_move_tail(&vm_lru->vm_lru_list, &adev->vm_lru_list); }
On 05/09/2018 02:45 PM, Chunming Zhou wrote:
Move the implementation from ttm to the amdgpu driver (suggested by Christian). The per-vm-lru exists because of per-vm-bo, which has no chance to refresh the lru; the negative effect is that game performance isn't stable. So all per-vm-bo should have a default order: every per-vm-bo has its priority, relying on its creation index. When doing CS, if any normal bo is used, then all per-vm-bo should be used, so per-vm-bo priority >= normal bo priority.
Above is per-vm-lru starting point.
What do you think about creating the per-vm bo with priority 1 and the kernel bo with priority 2 accordingly? Would that help to make some improvement?
Jerry
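The ordering described in the cover letter (normal bos are eviction candidates before per-vm bos, and per-vm bos follow their creation index) can be sketched in userspace C roughly as below. The struct, its fields and the sketch_* names are assumptions for illustration only; the posted patches walk kernel lists and an RB tree rather than sorting an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; not the kernel's struct. */
struct sketch_bo {
	bool per_vm;      /* shares the VM's reservation object */
	uint64_t index;   /* creation index, meaningful for per-VM BOs */
	uint64_t lru_pos; /* position in the ordinary LRU, for normal BOs */
};

/* qsort comparator: earlier elements are evicted first. */
static int sketch_evict_cmp(const void *a, const void *b)
{
	const struct sketch_bo *x = a, *y = b;
	uint64_t kx, ky;

	/* Normal BOs are always considered before per-VM BOs. */
	if (x->per_vm != y->per_vm)
		return x->per_vm ? 1 : -1;

	/* Within a class: LRU position for normal, creation index for per-VM. */
	kx = x->per_vm ? x->index : x->lru_pos;
	ky = y->per_vm ? y->index : y->lru_pos;
	return (kx > ky) - (kx < ky);
}

/* Sort n candidates into eviction order. */
static void sketch_evict_order(struct sketch_bo *bos, size_t n)
{
	assert(bos != NULL);
	qsort(bos, n, sizeof(*bos), sketch_evict_cmp);
}
```

Sorting is used here only to make the resulting order visible; it says nothing about how the driver should store the bos.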
On 2018年05月10日 13:07, Zhang, Jerry (Junwei) wrote:
On 05/09/2018 02:45 PM, Chunming Zhou wrote:
Move the implementation from ttm to the amdgpu driver (suggested by Christian). The per-vm-lru exists because of per-vm-bo, which has no chance to refresh the lru; the negative effect is that game performance isn't stable. So all per-vm-bo should have a default order: every per-vm-bo has its priority, relying on its creation index. When doing CS, if any normal bo is used, then all per-vm-bo should be used, so per-vm-bo priority >= normal bo priority.
Above is per-vm-lru starting point.
What do you think about creating the per-vm bo with priority 1 and the kernel bo with priority 2 accordingly?
Yeah, but then how do we fix the per-vm-bo order? I think we need to set a priority for every per-vm-bo. Because there is no bo list, the per-vm-bo lru can only depend on priority.
Regards, David Zhou
Will that help to make some improvement?
Jerry
amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
On 05/10/2018 04:45 PM, zhoucm1 wrote:
On 2018年05月10日 13:07, Zhang, Jerry (Junwei) wrote:
On 05/09/2018 02:45 PM, Chunming Zhou wrote:
Move the implementation from ttm to the amdgpu driver (suggested by Christian). The per-vm-lru exists because of per-vm-bo, which has no chance to refresh the lru; the negative effect is that game performance isn't stable. So all per-vm-bo should have a default order: every per-vm-bo has its priority, relying on its creation index. When doing CS, if any normal bo is used, then all per-vm-bo should be used, so per-vm-bo priority >= normal bo priority.
Above is per-vm-lru starting point.
What do you think about creating the per-vm bo with priority 1 and the kernel bo with priority 2 accordingly?
Yeah, but then how do we fix the per-vm-bo order? I think we need to set a priority for every per-vm-bo. Because there is no bo list, the per-vm-bo lru can only depend on priority.
Hmm, maybe we could use an RB tree for the per-vm bo priority lru. But that would actually be per device rather than per vm, I think.
So it seems we do need a per-vm lru for bo management, as in your patches.
Jerry
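To make the per-vm point concrete: with one ordered tree per VM, keyed by creation index, the first eviction candidate of that VM is simply the leftmost node. The sketch below uses a plain unbalanced binary search tree as a stand-in for the kernel rbtree; all names are hypothetical and no rebalancing is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node; an unbalanced BST stands in for the kernel rbtree. */
struct sketch_node {
	uint64_t index;
	struct sketch_node *left, *right;
};

/* Insert by index; equal keys go right, as in the posted insertion helper. */
static void sketch_insert(struct sketch_node **root, struct sketch_node *n)
{
	struct sketch_node **link = root;

	assert(root != NULL && n != NULL);
	n->left = n->right = NULL;
	while (*link)
		link = (n->index < (*link)->index) ? &(*link)->left
						   : &(*link)->right;
	*link = n;
}

/* Leftmost node = lowest index = first eviction candidate of this VM. */
static struct sketch_node *sketch_first(struct sketch_node *root)
{
	while (root && root->left)
		root = root->left;
	return root;
}
```

A single device-wide tree would interleave the indices of different VMs, which is the concern raised above.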
Regards, David Zhou
Will that help to make some improvement?
Jerry
On 05/10/2018 04:45 PM, zhoucm1 wrote:
On 2018年05月10日 13:07, Zhang, Jerry (Junwei) wrote:
On 05/09/2018 02:45 PM, Chunming Zhou wrote:
Move the implementation from ttm to the amdgpu driver (suggested by Christian). The per-vm-lru exists because of per-vm-bo, which has no chance to refresh the lru; the negative effect is that game performance isn't stable. So all per-vm-bo should have a default order: every per-vm-bo has its priority, relying on its creation index. When doing CS, if any normal bo is used, then all per-vm-bo should be used, so per-vm-bo priority >= normal bo priority.
Above is per-vm-lru starting point.
What do you think about creating the per-vm bo with priority 1 and the kernel bo with priority 2 accordingly?
Yeah, but then how do we fix the per-vm-bo order? I think we need to set a priority for every per-vm-bo. Because there is no bo list, the per-vm-bo lru can only depend on priority.
Hmm, maybe use an RB tree for the BOs in the priority 1 lru. But that would actually work per device rather than per vm.
So it seems we need a per-vm lru, as in the patches, for per-vm bo management.
Jerry
Regards, David Zhou
Will that help to make some improvement?
Jerry