On 23.03.22 at 12:59, Daniel Vetter wrote:
On Mon, Mar 21, 2022 at 02:25:56PM +0100, Christian König wrote:
This way we finally fix the problem that new resources are not immediately evictable after allocation.
That has caused numerous problems, including OOM on GDS handling and not being able to use TTM as a general resource manager.
v2: stop assuming in ttm_resource_fini that res->bo is still valid.
v3: cleanup kerneldoc, add more lockdep annotation
v4: consistently use res->num_pages
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

+/**
+ * struct ttm_lru_bulk_move
+ * @tt: first/last lru entry for resources in the TT domain
+ * @vram: first/last lru entry for resources in the VRAM domain
+ * Helper structure for bulk moves on the LRU list.
+ */
+struct ttm_lru_bulk_move {
+	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
+	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
Not really needed, just a thought: Should we track the associated dma_resv object here to make sure the locking is all done correctly (and also check that the BOs in a bulk move all have the same dma_resv)? It wouldn't really add any overhead for the !CONFIG_LOCKDEP case, and we could sprinkle a lot more dma_resv_held checks all over the place.
You made a similar comment on the last revision and I already tried to play around with that idea a bit.
But I've completely abandoned that idea after realizing that the BOs in the bulk move actually don't need to have the same dma_resv object, nor do they all need to be locked.
It just happens that amdgpu is currently using it that way, but I can't see any technical necessity to restrict the bulk move like that.
Regards, Christian.
-Daniel
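
For illustration only, the idea discussed above (tracking one dma_resv per bulk move and checking every BO against it) could be modeled roughly as follows. This is a userspace sketch, not kernel code: the model_resv, model_bo and model_bulk_move types and the model_bulk_move_add() helper are hypothetical stand-ins invented for this example, and the "held" flag only imitates what a lockdep-backed check would verify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_resv: only models "is it held". */
struct model_resv {
	bool held;
};

/* Hypothetical stand-in for a buffer object pointing at its reservation. */
struct model_bo {
	struct model_resv *resv;
};

/* Bulk move that remembers the reservation of the first BO added to it. */
struct model_bulk_move {
	struct model_resv *resv;	/* NULL until the first BO is added */
	size_t count;			/* number of BOs accepted so far */
};

/*
 * Add a BO to the bulk move. Returns false in the cases where a
 * lockdep-style assertion would fire in this model: the BO's reservation
 * is not held, or it differs from the one the bulk move already tracks.
 */
static bool model_bulk_move_add(struct model_bulk_move *bulk,
				struct model_bo *bo)
{
	if (!bo->resv || !bo->resv->held)
		return false;

	if (!bulk->resv)
		bulk->resv = bo->resv;		/* first BO defines the reservation */
	else if (bulk->resv != bo->resv)
		return false;			/* different reservation: rejected */

	bulk->count++;
	return true;
}
```

In this model a BO with a different, correctly held reservation is rejected as well. That is exactly the restriction the reply argues against, since the BOs in a bulk move do not need to share a dma_resv object.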