Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
Thank you,
Oleksandr
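To make the idea a bit more concrete, the allocation path I have in mind is roughly the following (just a sketch, not actual driver code; xen_gem_object and xen_gem_create are illustrative names). drm_gem_object_init() gives the object a shmem backing, and drm_gem_get_pages() then fills the page array, internally via shmem_read_mapping_page_gfp, so the pages do not have to be physically contiguous:

#include <linux/slab.h>
#include <drm/drmP.h>
#include <drm/drm_gem.h>

/* Illustrative driver-private wrapper around a shmem-backed GEM object. */
struct xen_gem_object {
        struct drm_gem_object base;
        struct page **pages;    /* possibly non-contiguous backing pages */
};

static struct xen_gem_object *xen_gem_create(struct drm_device *dev, size_t size)
{
        struct xen_gem_object *xen_obj;
        int ret;

        xen_obj = kzalloc(sizeof(*xen_obj), GFP_KERNEL);
        if (!xen_obj)
                return ERR_PTR(-ENOMEM);

        /* Sets up the shmem file backing the object. */
        ret = drm_gem_object_init(dev, &xen_obj->base, round_up(size, PAGE_SIZE));
        if (ret) {
                kfree(xen_obj);
                return ERR_PTR(ret);
        }

        /* Populate the page array from shmem; pages may live anywhere in RAM. */
        xen_obj->pages = drm_gem_get_pages(&xen_obj->base);
        if (IS_ERR(xen_obj->pages)) {
                ret = PTR_ERR(xen_obj->pages);
                drm_gem_object_release(&xen_obj->base);
                kfree(xen_obj);
                return ERR_PTR(ret);
        }

        return xen_obj;
}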
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
BR, -R
Hi, Rob
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory,
Yes, this is the case, but I can "map" buffers between the host and guests.
then TTM might be useful.
I was looking into it, but it seems to be overkill in my case. And isn't it that GEM should be used for new drivers, not TTM?
Otherwise this sounds like the right approach.
Thank you. Actually, I am playing with alloc_pages + remap_pfn_range now, but what DRM provides (_get_pages + shmem_read) seems to be more portable and generic. So, I'll probably stick to it.
BR, -R
Thank you for helping, Oleksandr Andrushchenko
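For comparison, the alloc_pages + remap_pfn_range experiment mentioned above boils down to something like this (again only a sketch with illustrative names; the buffer is a single physically contiguous high-order allocation, and the mapping keeps the default cached pgprot):

#include <linux/gfp.h>
#include <linux/mm.h>

/* One physically contiguous high-order allocation. split_page() would be
 * needed if the pages are to be freed/refcounted individually later on. */
static struct page *xen_contig_alloc(size_t size)
{
        return alloc_pages(GFP_KERNEL, get_order(size));
}

/* Map the whole contiguous buffer into user space at mmap time. */
static int xen_contig_mmap(struct page *first_page, struct vm_area_struct *vma)
{
        unsigned long size = vma->vm_end - vma->vm_start;

        /* vm_page_prot is left as-is, i.e. a cached mapping. */
        return remap_pfn_range(vma, vma->vm_start, page_to_pfn(first_page),
                               size, vma->vm_page_prot);
}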
On Sat, Mar 18, 2017 at 9:25 AM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hi, Rob
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory,
Yes, this is the case, but I can "map" buffers between the host and guests.
if you need to physically copy (transfer), like a discrete gpu with vram, then TTM makes sense. If you can map the pages directly into the guest then TTM is probably overkill.
then TTM might be useful.
I was looking into it, but it seems to be overkill in my case. And isn't it that GEM should be used for new drivers, not TTM?
Not really, it's just that (other than amdgpu which uses TTM) all of the newer drivers have been unified memory. A driver for a new GPU that had vram of some sort should still use TTM.
BR, -R
Otherwise this sounds like the right approach.
Thank you. Actually, I am playing with alloc_pages + remap_pfn_range now, but what DRM provides (_get_pages + shmem_read) seems to be more portable and generic. So, I'll probably stick to it.
BR, -R
Thank you for helping, Oleksandr Andrushchenko
On 03/18/2017 04:06 PM, Rob Clark wrote:
On Sat, Mar 18, 2017 at 9:25 AM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hi, Rob
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory,
Yes, this is the case, but I can "map" buffers between the host and guests.
if you need to physically copy (transfer), like a discrete gpu with vram, then TTM makes sense. If you can map the pages directly into the guest then TTM is probably overkill.
We have zero-copy from guest to host/HW; this is why I'm not considering TTM.
then TTM might be useful.
I was looking into it, but it seems to be overkill in my case. And isn't it that GEM should be used for new drivers, not TTM?
Not really, it's just that (other than amdgpu which uses TTM) all of the newer drivers have been unified memory.
Good to know, thank you
A driver for a new GPU that had vram of some sort should still use TTM.
Our virtual GPU support is done at the hypervisor level, so there are no changes to existing GPU drivers. So, the only thing to care about is that the buffers our DRM driver provides can be imported and used by that GPU (there are other memory-related issues, e.g. whether the real GPU/firmware can see the memory of the guest, but that is another story).
BR, -R
Otherwise this sounds like the right approach.
Thank you. Actually, I am playing with alloc_pages + remap_pfn_range now, but what DRM provides (_get_pages + shmem_read) seems to be more portable and generic. So, I'll probably stick to it.
BR, -R
Thank you for helping, Oleksandr Andrushchenko
Ok, then I'll drop my alloc_pages + remap_pfn_range in favor of drm_gem_get_pages + shmem_read_mapping_page_gfp
Thank you
On Sat, Mar 18, 2017 at 10:44 AM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
then TTM might be useful.
I was looking into it, but it seems to be overkill in my case. And isn't it that GEM should be used for new drivers, not TTM?
Not really, it's just that (other than amdgpu which uses TTM) all of the newer drivers have been unified memory.
Good to know, thank you
A driver for a new GPU that had vram of some sort should still use TTM.
Our virtual GPU support is done at the hypervisor level, so there are no changes to existing GPU drivers. So, the only thing to care about is that the buffers our DRM driver provides can be imported and used by that GPU (there are other memory-related issues, e.g. whether the real GPU/firmware can see the memory of the guest, but that is another story).
jfwiw, it might be useful to have a look at the intel GVT stuff.. they have recently (4.10) added para-virt support to i915
BR, -R
On 03/18/2017 04:50 PM, Rob Clark wrote:
On Sat, Mar 18, 2017 at 10:44 AM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
then TTM might be useful.
I was looking into it, but it seems to be overkill in my case. And isn't it that GEM should be used for new drivers, not TTM?
Not really, it's just that (other than amdgpu which uses TTM) all of the newer drivers have been unified memory.
Good to know, thank you
A driver for a new GPU that had vram of some sort should still use TTM.
Our virtual GPU support is done at the hypervisor level, so there are no changes to existing GPU drivers. So, the only thing to care about is that the buffers our DRM driver provides can be imported and used by that GPU (there are other memory-related issues, e.g. whether the real GPU/firmware can see the memory of the guest, but that is another story).
jfwiw, it might be useful to have a look at the intel GVT stuff.. they have recently (4.10) added para-virt support to i915
Hm, thank you, I'll have a look at it (what's more, when I'm not using ARM I'm playing with x86 + i915, so it can come in handy).
BR, -R
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
BR, -R
Thank you, Oleksandr
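For reference, the .fault path that modetest ends up exercising looks roughly like this (a sketch rather than the actual driver code; the two-argument .fault prototype is the one used around 4.10 and changed in later kernels, and xen_gem_object is just an illustrative wrapper around the GEM object):

/* Insert the already-allocated shmem-backed page for the faulting address.
 * Assumes drm_gem_mmap() stored the GEM object in vma->vm_private_data and
 * that the page array was populated at object creation time. */
static int xen_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
        struct drm_gem_object *obj = vma->vm_private_data;
        struct xen_gem_object *xen_obj =
                container_of(obj, struct xen_gem_object, base);
        pgoff_t pgoff = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
        int ret;

        ret = vm_insert_page(vma, vmf->address, xen_obj->pages[pgoff]);
        switch (ret) {
        case 0:
        case -EBUSY:
                return VM_FAULT_NOPAGE;
        case -ENOMEM:
                return VM_FAULT_OOM;
        default:
                return VM_FAULT_SIGBUS;
        }
}

static const struct vm_operations_struct xen_gem_vm_ops = {
        .fault = xen_gem_fault,
        .open  = drm_gem_vm_open,
        .close = drm_gem_vm_close,
};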
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them.. modetest "renders" the frame on the CPU but kmscube does it on gpu. So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
BR, -R
BR, -R
Thank you, Oleksandr
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
BR, -R
BR, -R
Thank you, Oleksandr
On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
so, I suppose with pvr there is a whole host of potential pain... *but*..
if alloc_pages path actually works, then perhaps the issue is the deferred allocation. Ie. most drivers don't drm_gem_get_pages() until the buffer is passed to hw or until it is faulted in. You should make sure it ends up getting called (if it hasn't been called already) somewhere in gem_prime_pin.
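Something along these lines, as a sketch only (xen_gem_object is the illustrative wrapper used earlier in the thread; the point is just that the pages exist before the dma-buf is exported):

/* Make sure the backing pages exist by the time the buffer is pinned for
 * dma-buf export; a no-op if they were already taken at dumb_create time. */
static int xen_gem_prime_pin(struct drm_gem_object *obj)
{
        struct xen_gem_object *xen_obj =
                container_of(obj, struct xen_gem_object, base);
        struct page **pages;

        if (xen_obj->pages)
                return 0;

        pages = drm_gem_get_pages(obj);
        if (IS_ERR(pages))
                return PTR_ERR(pages);

        xen_obj->pages = pages;
        return 0;
}

/* hooked up in struct drm_driver as: .gem_prime_pin = xen_gem_prime_pin, */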
BR, -R
On 03/20/2017 08:17 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
so, I suppose with pvr there is a whole host of potential pain... *but*..
if alloc_pages path actually works, then perhaps the issue is the deferred allocation. Ie. most drivers don't drm_gem_get_pages() until the buffer is passed to hw or until it is faulted in. You should make sure it ends up getting called (if it hasn't been called already) somewhere in gem_prime_pin.
I call drm_gem_get_pages as part of dumb creation, because I need to pass the pages to the host OS. So, probably, this is not because of the late allocation, but something else
BR, -R
Thank you!
On Mon, Mar 20, 2017 at 2:25 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 08:17 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
so, I suppose with pvr there is a whole host of potential pain... *but*..
if alloc_pages path actually works, then perhaps the issue is the deferred allocation. Ie. most drivers don't drm_gem_get_pages() until the buffer is passed to hw or until it is faulted in. You should make sure it ends up getting called (if it hasn't been called already) somewhere in gem_prime_pin.
I call drm_gem_get_pages as part of dumb creation, because I need to pass the pages to the host OS. So, probably, this is not because of the late allocation, but something else
hmm, well all the pvr GPUs that I've had to deal with in the past have MMUs, so there shouldn't be any specific issue with where the pages come from. But I guess you have to poke around the kernel module to see where things go wrong with dmabuf import (or if it even gets that far)
BR, -R
BR, -R
Thank you!
On 03/20/2017 08:52 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:25 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 08:17 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
so, I suppose with pvr there is a whole host of potential pain... *but*..
if alloc_pages path actually works, then perhaps the issue is the deferred allocation. Ie. most drivers don't drm_gem_get_pages() until the buffer is passed to hw or until it is faulted in. You should make sure it ends up getting called (if it hasn't been called already) somewhere in gem_prime_pin.
I call drm_gem_get_pages as part of dumb creation, because I need to pass the pages to the host OS. So, probably, this is not because of the late allocation, but something else
hmm, well all the pvr GPUs that I've had to deal with in the past have MMUs, so there shouldn't be any specific issue with where the pages come from.
that is true in my case as well (I am accessing that MMU from hypervisor code to virtualize GPU)
But I guess you have to poke around the kernel module to see where things go wrong with dmabuf import (or if it even gets that far)
The strange thing is that if I use the DRM CMA helpers in my driver (I have an option to either use CMA or cook the buffers myself), then kmscube does work. So, probably, the problem is not on the pvr side with respect to dma-buf handling. Anyway, I'll dig deeper into pvr to see if I can get any further.
BR, -R
BR, -R
Thank you!
Thank you, Oleksandr
On 03/20/2017 08:52 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:25 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 08:17 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/20/2017 07:38 PM, Rob Clark wrote:
On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
On 03/18/2017 02:22 PM, Rob Clark wrote:
On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko andr2000@gmail.com wrote:
Hello, I am writing a para-virtualized DRM driver for the Xen hypervisor. It now works with the DRM CMA helpers, but I would also like to make it work with non-contiguous memory: the virtual machine that the driver runs in can't guarantee that CMA is actually physically contiguous (that is not a problem because of the IPMMU and other means; the only constraint I have is that I cannot mmap with pgprot == noncached). So, I am planning to use *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate memory for GEM objects (scanout buffers + dma-bufs shared with the virtual GPU).
Do you think this is the right approach to take?
I guess if you had some case where you needed to "migrate" buffers between host and guest memory, then TTM might be useful. Otherwise this sounds like the right approach.
Tried that today (drm_gem_get_pages), the result is interesting:
1. modetest
1.1. Runs, I can see page flips
1.2. vm_operations_struct.fault is called, I can vm_insert_page
2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
2.1. Cannot initialize EGL
2.2. vm_operations_struct.fault is NOT called
jfwiw, pages will only get faulted in when CPU accesses them..
indeed, good catch
modetest "renders" the frame on the CPU but kmscube does it on gpu.
yes, I have already learned that modetest only renders once and then just flips
So not seeing vm_operations_struct.fault is normal. The EGL fail is not..
In both cases two dumb buffers are created and successfully mmapped; in the case of kmscube there are also handle_to_fd IOCTLs issued, and no DRM errors are observed. No DMA-BUF mmap attempt is seen.
I re-checked 2) with alloc_pages + remap_pfn_range and it works (it cannot unmap cleanly, but it could be because I didn't call split_pages after alloc_pages), thus the setup is still good
Can it be that the buffer allocated with drm_gem_get_pages doesn't suit PowerVR for some reason?
I've no idea what the state of things is w/ pvr as far as gbm support (not required/used by modetest, but anything that uses the gpu on "bare metal" needs it). Or what the state of dmabuf-import is with pvr.
Do you think there could be DMA-related problems with the buffer allocated with drm_gem_get_pages and its DMA mapping/use, so that the GPU is not able to handle it?
The only source of knowledge I have at the moment is the publicly available pvrsrvkm kernel module. But there are other unknowns, e.g. the user-space libraries and firmware, which are in binary form: thus the kernel driver is mostly a bridge between the FW and the libs. That being said, do you think I have to dig deeper into the GPU use-case, or should I switch back to alloc_pages + remap_pfn_range? ;)
so, I suppose with pvr there is a whole host of potential pain... *but*..
if alloc_pages path actually works, then perhaps the issue is the deferred allocation. Ie. most drivers don't drm_gem_get_pages() until the buffer is passed to hw or until it is faulted in. You should make sure it ends up getting called (if it hasn't been called already) somewhere in gem_prime_pin.
I call drm_gem_get_pages as part of dumb creation, because I need to pass the pages to the host OS. So, probably, this is not because of the late allocation, but something else
hmm, well all the pvr GPUs that I've had to deal with in the past have MMUs, so there shouldn't be any specific issue with where the pages come from. But I guess you have to poke around the kernel module to see where things go wrong with dmabuf import (or if it even gets that far)
Well, if I do vm_insert_page in .mmap for the whole buffer, then everything is OK for both the GPU and the CPU, so probably I'll leave it that way. I also removed the .fault handler, as it doesn't seem to be needed if we mmap the whole thing at once.
BR, -R
BR, -R
Thank you!
Thank you for helping!
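For completeness, the "insert everything at mmap time" variant described above is roughly the following (a sketch, not the actual driver code; it assumes drm_gem_mmap() has already set up the VMA and reuses the illustrative xen_gem_object with its populated page array):

/* Insert all backing pages up front in the mmap path, so no .fault handler
 * is needed afterwards. */
static int xen_gem_mmap_all(struct xen_gem_object *xen_obj,
                            struct vm_area_struct *vma)
{
        unsigned long addr = vma->vm_start;
        unsigned long n_pages = min_t(unsigned long,
                                      (vma->vm_end - vma->vm_start) >> PAGE_SHIFT,
                                      xen_obj->base.size >> PAGE_SHIFT);
        unsigned long i;
        int ret;

        for (i = 0; i < n_pages; i++) {
                ret = vm_insert_page(vma, addr, xen_obj->pages[i]);
                if (ret)
                        return ret;
                addr += PAGE_SIZE;
        }

        return 0;
}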