Hi all,
I've been looking at dma_buf_v(un)map, with the goal of standardizing locking for at least dynamic dma-buf exporters/importers, most likely by requiring dma_resv_lock. And I have some questions about how this is supposed to work, since a big chunk of drivers seem to entirely lack locking around ttm_bo_kmap. Two big ones:
- ttm_bo_kmap looks at bo->mem to figure out what/where to kmap to get at that buffer. bo->mem is supposed to be protected with dma_resv_lock, but at least amdgpu/nouveau/radeon/qxl don't grab that in their prime vmap function.
- between the vmap and vunmap something needs to make sure the backing storage doesn't move around. I didn't find that anywhere either; ttm_bo_kmap simply seems to set up the mapping, leaving locking and refcounting to callers (rough sketch of what I'd have expected below).
- vram helpers have at least locking, but I'm still missing the refcounting. vmwgfx doesn't bother with vmap.
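Just to make it concrete, here's a rough and completely untested sketch of the shape I'd have expected a TTM-based exporter's prime vmap to have; the my_*() helpers are made-up placeholders for whatever the driver actually uses:

/* Completely untested sketch, only to illustrate the expected shape;
 * my_gem_to_ttm()/my_bo_pin()/my_bo_unpin() are made-up placeholders
 * for whatever the driver really uses, and a real driver would also
 * have to stash 'map' somewhere for vunmap.
 */
static void *my_prime_vmap(struct drm_gem_object *obj)
{
        struct ttm_buffer_object *bo = my_gem_to_ttm(obj);
        struct ttm_bo_kmap_obj map;
        bool is_iomem;
        void *vaddr = NULL;
        int ret;

        ret = dma_resv_lock(obj->resv, NULL);
        if (ret)
                return NULL;

        /* keep the backing storage from moving until vunmap */
        ret = my_bo_pin(bo);
        if (ret)
                goto out_unlock;

        /* bo->mem is only stable while the reservation is held */
        ret = ttm_bo_kmap(bo, 0, bo->num_pages, &map);
        if (ret) {
                my_bo_unpin(bo);
                goto out_unlock;
        }
        vaddr = ttm_kmap_obj_virtual(&map, &is_iomem);

out_unlock:
        dma_resv_unlock(obj->resv);
        return vaddr;
}

The matching vunmap would then do the ttm_bo_kunmap and drop the pin again.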
What am I missing?
Thanks, Daniel
What am I missing?
The assumption is that when you want to create a vmap of a DMA-buf buffer the buffer needs to be pinned somehow.
E.g. without dynamic dma-buf handling you would need to have an active attachment. With dynamic handling the requirements could be lowered to using the pin()/unpin() callbacks.
You also can't lock/unlock inside your vmap callback because you don't have any guarantee that the pointer stays valid as soon as you drop your lock.
BTW: What is vmap() still used for?
Regards, Christian.
On Wed, Nov 20, 2019 at 1:02 PM Christian König christian.koenig@amd.com wrote:
What am I missing?
The assumption is that when you want to create a vmap of a DMA-buf buffer the buffer needs to be pinned somehow.
E.g. without dynamic dma-buf handling you would need to have an active attachment. With dynamic handling the requirements could be lowered to using the pin()/unpin() callbacks.
Yeah right now everyone seems to have an attachment, and that's how we get away with all this. But the interface isn't supposed to work like that, dma_buf_vmap/unmap is supposed to be a stand-alone thing (you can call it directly on the struct dma_buf, no need for an attachment). Also I don't think non-dynamic drivers should ever call pin/unpin, not their job, holding onto a mapping should be enough to get things pinned.
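I.e. from the importer's point of view the whole thing is supposed to be as dumb as the below (sketch only; the helper name is made up and the begin/end_cpu_access calls are left out):

/* Sketch of the intended stand-alone usage: no attachment, no explicit
 * pin/unpin from the importer; keeping the backing storage in place for
 * as long as the mapping exists is the exporter's problem. The helper
 * name is made up, begin/end_cpu_access omitted for brevity.
 */
static int copy_from_dmabuf(struct dma_buf *dmabuf, void *dst, size_t len)
{
        void *vaddr;

        vaddr = dma_buf_vmap(dmabuf);
        if (!vaddr)
                return -ENOMEM;

        memcpy(dst, vaddr, len);

        dma_buf_vunmap(dmabuf, vaddr);
        return 0;
}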
You also can't lock/unlock inside your vmap callback because you don't have any guarantee that the pointer stays valid as soon as you drop your lock.
Well that's why I asked where the pin/unpin pair is. If you lock & pin, then you do know that the pointer will stay around. But it looks like the original patch from Dave for TTM-based drivers played fast & loose here with what should be done.
BTW: What is vmap() still used for?
udl, and a bunch of other things (e.g. a bunch of tiny drivers). Not much, but not stuff we can drop. -Daniel
On Wed, Nov 20, 2019 at 1:09 PM Daniel Vetter daniel.vetter@ffwll.ch wrote:
udl, and a bunch of other things (e.g. a bunch of tiny drivers). Not much, but not stuff we can drop.
If we're unlucky we'll actually have a problem with these now. For some of these the attached device is not dma-capable, so dma_map_sg will go boom. With the cached mapping logic we now have, this might go sideways for dynamic exporters. Did you test your dynamic dma-buf support for amdgpu with udl?

Worst case we need to get rid of the fake attachment, fix the vmap locking/pinning, and maybe some more headaches to sort this out. -Daniel
On 20.11.19 at 13:19, Daniel Vetter wrote:
If we're unlucky we'll actually have a problem with these now. For some of these the attached device is not dma-capable, so dma_map_sg will go boom. With the cached mapping logic we now have, this might go sideways for dynamic exporters. Did you test your dynamic dma-buf support for amdgpu with udl?
Short answer no, not at all. Long one: What the heck is udl? And how is it not dma-capable?
Worst case we need to get rid of the fake attachment, fix the vmap locking/pinning, and maybe some more headaches to sort this out.
Well, offhand we could require that vmap will also pin a DMA-buf and start fixing amdgpu/nouveau/radeon/qxl.
Christian.
On Wed, Nov 20, 2019 at 1:24 PM Christian König christian.koenig@amd.com wrote:
Short answer no, not at all. Long one: What the heck is udl? And how is it not dma-capable?
USB display thing. The data gets shoveled over the wire with CPU vmaps. A bunch of our tiny drivers have similar issues, where the controller might fall back to PIO instead of DMA. Either way the struct device itself is not sitting on a bus that can do DMA directly - I guess the devices could (maybe they do already, didn't check that) attach to the bus master, but sometimes that's fairly opaque and hidden in the subsystem plumbing.
An entirely different case is the fully fake stuff like vgem/vkms. If that still works we should be good (at least on x86; the dma-api works differently on other platforms). And again you'd need to test with your dynamic dma-buf exporter to verify, otherwise I don't think we'll hit the critical path.
Worst case we need to get rid of the fake attachment, fix the vmap locking/pinning, and maybe some more headaches to sort this out.
Well, offhand we could require that vmap will also pin a DMA-buf and start fixing amdgpu/nouveau/radeon/qxl.
Yeah, that's what I meant with the lock/pin fixing. But it's only one side, we'd also need to get rid of the dma_map_sg for all these drivers that don't do DMA (whether virtual or on a bus without DMA). -Daniel
On 11/20/19 1:24 PM, Christian König wrote:
Well, offhand we could require that vmap will also pin a DMA-buf and start fixing amdgpu/nouveau/radeon/qxl.
Perhaps with dynamic dma-bufs it might be possible to do something similar to what the vmwgfx (and recently others'?) fbdev code does:
The map is cached, but may be invalidated as soon as we release dma_resv / unpin (move_notify() unmaps if needed).
So each time it's needed we make sure we're locked / pinned and then call a map_validate() function. Typically the map is still around; if it isn't, map_validate() sets it up again.
That saves a bunch of vmap() calls, or the need for persistent pinning, for performance reasons.
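Very hand-wavy sketch of what I mean; the names and struct layout are made up and not the actual vmwgfx code:

/* Hand-wavy illustration only; names and layout are made up and not
 * the actual vmwgfx code.
 */
struct cached_vmap {
        struct ttm_bo_kmap_obj map;
        void *vaddr;                    /* NULL when the cache is invalid */
};

/* Call with the bo reserved (dma_resv held) and pinned. */
static void *map_validate(struct ttm_buffer_object *bo,
                          struct cached_vmap *cache)
{
        bool is_iomem;

        if (cache->vaddr)               /* typically still around */
                return cache->vaddr;

        if (ttm_bo_kmap(bo, 0, bo->num_pages, &cache->map))
                return NULL;

        cache->vaddr = ttm_kmap_obj_virtual(&cache->map, &is_iomem);
        return cache->vaddr;
}

/* Called from move_notify() (or at unpin time) to drop the cached map. */
static void map_invalidate(struct cached_vmap *cache)
{
        if (!cache->vaddr)
                return;

        ttm_bo_kunmap(&cache->map);
        cache->vaddr = NULL;
}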
/Thomas
Hi
On 20.11.19 at 12:47, Daniel Vetter wrote:
Hi all,
I've been looking at dma_buf_v(un)map, with the goal of standardizing locking for at least dynamic dma-buf exporters/importers, most likely by requiring dma_resv_lock. And I have some questions about how this is supposed to work, since a big chunk of drivers seem to entirely lack locking around ttm_bo_kmap. Two big ones:
- ttm_bo_kmap looks at bo->mem to figure out what/where to kmap to get
at that buffer. bo->mem is supposed to be protected with dma_resv_lock, but at least amdgpu/nouveau/radeon/qxl don't grab that in their prime vmap function.
- between the vmap and vunmap something needs to make sure the backing
storage doesn't move around. I didn't find that either anywhere, ttm_bo_kmap simply seems to set up the mapping, leaving locking and refcounting to callers.
You have to pin and unpin storage before and after mapping.
- vram helpers have at least locking, but I'm still missing the
refcounting. vmwgfx doesn't bother with vmap.
There's a ref counter for pinning [1] and there's a counter for mapping [2]. Are you looking for something else?
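The pattern is roughly the one below (simplified illustration only, not the actual helper code; see [1] and [2] for the real fields):

/* Simplified illustration of the two counters, not the actual helper
 * code; see [1] and [2] for the real fields. Both functions assume the
 * bo's dma_resv lock is held; the decrement paths are omitted.
 */
struct vram_bo {
        struct ttm_buffer_object bo;
        struct ttm_bo_kmap_obj kmap;
        unsigned int pin_count;         /* [1] */
        unsigned int kmap_use_count;    /* [2] */
};

static int vram_bo_pin_locked(struct vram_bo *vbo)
{
        if (vbo->pin_count > 0) {       /* already pinned, just count */
                vbo->pin_count++;
                return 0;
        }

        /* ... ttm_bo_validate() with a pinned placement goes here ... */

        vbo->pin_count = 1;
        return 0;
}

static void *vram_bo_kmap_locked(struct vram_bo *vbo)
{
        bool is_iomem;

        if (vbo->kmap_use_count > 0) {  /* mapping already set up, reuse */
                vbo->kmap_use_count++;
                return ttm_kmap_obj_virtual(&vbo->kmap, &is_iomem);
        }

        if (ttm_bo_kmap(&vbo->bo, 0, vbo->bo.num_pages, &vbo->kmap))
                return NULL;

        vbo->kmap_use_count = 1;
        return ttm_kmap_obj_virtual(&vbo->kmap, &is_iomem);
}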
Best regards Thomas
[1] https://cgit.freedesktop.org/drm/drm-tip/tree/include/drm/drm_gem_vram_helpe...
[2] https://cgit.freedesktop.org/drm/drm-tip/tree/include/drm/drm_gem_vram_helpe...
On Thu, Nov 21, 2019 at 8:59 AM Thomas Zimmermann tzimmermann@suse.de wrote:
You have to pin and unpin storage before and after mapping.
Yeah, but you = the exporter, not the importer. I.e. the vmap callback should pin and vunmap should unpin. It's not a huge deal since currently all users of dma_buf_vmap I've found do a dma_buf_attach (even if they never use the attachment directly), and with the drm prime helpers the attach results in a pin. Or that's at least the story I told myself to believe this is all not a total fireworks show right now :-)
- vram helpers have at least locking, but I'm still missing the
refcounting. vmwgfx doesn't bother with vmap.
There's a ref counter for pinning [1] and there's a counter for mapping [2]. Are you looking for something else?
vram helpers actually looked pretty good in this regard. If you look at other callers of ttm_bo_kmap, they're a lot more fast & loose here. -Daniel