Hello all, is the kernel driver configured to support reads/writes to LLC (last level cache i.e. L3) on SNB? Cheers, Ben
On 2010.12.06 15:18:02 -0800, Segovia, Benjamin wrote:
Hello all, is the kernel driver configured to support reads/writes to LLC (last level cache i.e. L3) on SNB?
For now it is only in limited use, for buffers that are sure to be cached, e.g. the hw status page. The code lives in drivers/char/agp/intel-gtt.c.
Thanks for the answer
I actually now have a rather simple prototype doing shared memory between CPU and GPU. Basically:
1/ I create a buffer B
2/ I map B and get a pointer to it
3/ I exec a batch buffer with B still mapped (ugly) and B pointed to via a surface state internally
4/ CPU and GPU are communicating through B
To make things cleaner, I would like to let a non-root user pin the buffer in GTT (so change the ioctl rights) and write one more ioctl to make the CPU pages unevictable in shmfs (if I am right).
I would like to write an external module on top of i915 for these two functions. However, I am not sure I can get symbols from i915 properly. Is an i915 patch the only solution?
Cheers, Ben
________________________________________ From: Zhenyu Wang [zhenyuw@linux.intel.com] Sent: Monday, December 06, 2010 11:00 PM To: Segovia, Benjamin Cc: DRI Subject: Re: Intel DRM driver for SNB
-- Open Source Technology Center, Intel ltd.
$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
I have kept reading the DRM driver. I am not sure I need to modify the driver (I would like to avoid it).
The application does the following:
1/ I pin a buffer in GTT space
2/ I map the buffer in user space using the MMAP ioctl or dri_bo_map
3/ I directly use the GTT offset, with no relocation, when I reference the buffer in any state
4/ I exec a buffer with some surface states pointing to that mapped and pinned buffer
* Now, is there anything bad that may happen if I read and write this mapped and pinned buffer from the CPU?
* Can I be sure that both CPU and GPU will read and write the same _physical_ piece of memory while Gen is processing the batch buffer?
The software I write for both EUs and CPUs will carefully handle memory coherency (flush of Gen caches in particular in Gen code). I just need to be sure that both are reading the same physical bits during the time the GPU is processing the batch buffer.
Is this approach correct?
Ben
-----Original Message----- From: dri-devel-bounces+benjamin.segovia=intel.com@lists.freedesktop.org [mailto:dri-devel-bounces+benjamin.segovia=intel.com@lists.freedesktop.org] On Behalf Of Segovia, Benjamin Sent: Friday, December 10, 2010 8:15 PM To: Zhenyu Wang Cc: DRI Subject: RE: Intel DRM driver for SNB
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
To be more explicit, my concern is that I read that Chris Wilson proposed a patch preventing the VM from swapping pages that are still in the GTT. I have not seen any trace of this patch in mainline yet.
So is it, in any case, a bug that will be fixed at some point? Can I be sure that once pinned in GTT and mapped in user space, my buffer can be safely used by both CPUs and GPUs and that the VM will never swap it?
Ben
----Original Message----- From: Segovia, Benjamin Sent: Monday, December 13, 2010 8:20 PM To: Segovia, Benjamin; Zhenyu Wang Cc: DRI Subject: RE: Intel DRM driver for SNB
On Mon, 13 Dec 2010 20:32:42 -0800, "Segovia, Benjamin" benjamin.segovia@intel.com wrote:
To be more explicit, my concern is that I read that Chris Wilson proposed a patch preventing the VM to swap pages still in GTT. I did not see any trace of this patch in the main line yet.
No, because the mm maintainers corrected me by pointing out that a page will not be swapped whilst a reference is held and it remains off the zone LRU lists. (That said there remains a persistent nagging doubt that something smells fishy...)
But as for what you want to do, it is possible with the current module. However as it deliberately evades the minimal security provisions we have, it should only be done from a privileged application.
In principle, there should be sufficient knowledge within the kernel to maintain coherency for you. I'm interested in knowing how you plan to manage coherency to make sure we have not missed something. (Give or take the fixes for completely handling coherency of uncached/cached CPU mappings.)
With ppGTT, we can be much more permissive and allow individual apps to reserve portions of the GTT without impacting other users of the system. -Chris
Perfect. It is still a good temporary solution. For coherency, my goal is to ensure it without the need for MI_FLUSH or pipe control, so the EUs must be able to do it themselves.
For ILK
* on CPU, I used _mm_clflush (as the kernel does, btw) to go through memory
* on EUs, I used the render cache flush command
For SNB
* nothing special on CPU, since the LLC snoops the cores' caches
* on SNB, the render cache flush command is gone. I therefore handled the flush manually, abusing the policy and the associativity of the render cache
As for cacheability on Gen, you can override page descriptors using surface descriptors.
Cheers, Ben
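For readers following the ILK part of the message above, here is a minimal sketch of what "using _mm_clflush to go through memory" can look like on the CPU side. It is not code from the thread or from i915. The helper name clflush_range and the 64-byte cache line size are assumptions made for illustration, and the fences around the loop are a conservative choice. It covers only the CPU cache; anything needed on the chipset or GPU side is outside its scope.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_clflush, _mm_mfence */
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (typical for the x86 CPUs discussed here). */
#define CACHE_LINE ((uintptr_t)64)

/*
 * Hypothetical helper: flush every CPU cache line that overlaps
 * [addr, addr + len) and return how many lines were flushed.
 */
static size_t clflush_range(const void *addr, size_t len)
{
    if (len == 0)
        return 0;

    /* Round the start down to a cache line boundary. */
    uintptr_t p = (uintptr_t)addr & ~(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    _mm_mfence();               /* order earlier stores before the flushes */
    for (; p < end; p += CACHE_LINE) {
        _mm_clflush((const void *)p);
        lines++;
    }
    _mm_mfence();               /* make sure the flushes have completed */

    return lines;
}
```

A caller would write into the mapped buffer, call clflush_range() on the bytes it touched, and only then let the GPU consume them. The return value is there mainly to make the line arithmetic easy to check.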
________________________________________ From: Chris Wilson [chris@chris-wilson.co.uk] Sent: Tuesday, December 14, 2010 2:59 AM To: Segovia, Benjamin; Segovia, Benjamin; Zhenyu Wang Cc: DRI Subject: RE: Intel DRM driver for SNB
-- Chris Wilson, Intel Open Source Technology Centre
On Tue, 14 Dec 2010 12:09:49 -0800, "Segovia, Benjamin" benjamin.segovia@intel.com wrote:
Perfect. It is still a good temporary solution. For coherency, my goal is to ensure it without the need of MI_FLUSH or pipe control. So, EUs must be able to do it themselves
For ILK
- on CPU, I used _mm_clflush (as the kernel does btw) to go through memory
- on EUs, I used the render cache flush command
For ILK:
Note that for writes, if you're poking into memory (drm_intel_bo_map), then clflush is not sufficient; you also need to flush the GWB. Check out the agp chipset flush code. If you're going through the GTT (drm_intel_gem_bo_map_gtt), then you don't need clflush, just a posting read.
If you're trying to get streaming write performance, the best way is probably the GTT. If you're trying to get streaming read performance, the best way is likely also the GTT with movntdqa.
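As an illustration of the movntdqa suggestion above, the following is a rough sketch of a streaming-read copy loop using the SSE4.1 intrinsic _mm_stream_load_si128, which compiles to MOVNTDQA. The function name stream_read_copy is hypothetical and nothing here comes from libdrm or i915. It assumes GCC or Clang on x86 (for the target attribute), 16-byte aligned pointers, and a length that is a multiple of 16. The non-temporal hint only pays off when the source is a write-combining mapping such as a GTT map; on ordinary cached memory the loop still works but behaves like a normal load.

```c
#include <assert.h>
#include <smmintrin.h>  /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy `bytes` bytes from `src` to `dst` using
 * streaming (non-temporal) 16-byte loads. Both pointers must be 16-byte
 * aligned and `bytes` must be a multiple of 16.
 */
__attribute__((target("sse4.1")))
static void stream_read_copy(void *dst, void *src, size_t bytes)
{
    assert(((uintptr_t)dst & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);
    assert((bytes & 15) == 0);

    __m128i *d = (__m128i *)dst;
    __m128i *s = (__m128i *)src;

    for (size_t i = 0; i < bytes / 16; i++) {
        __m128i v = _mm_stream_load_si128(s + i);  /* MOVNTDQA load */
        _mm_store_si128(d + i, v);                 /* ordinary aligned store */
    }
}
```

In the scenario discussed in the thread, src would be the pointer obtained from a GTT mapping and dst an ordinary malloc'ed staging buffer; that pairing is an assumption about usage, not something the thread spells out.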
Also, does i915 actually use uncacheable pages for CPU memory? When going to cpu -> gpu domain for gem buffers, I saw explicit clflush but nothing seems to show use of uncacheable pages. Did I miss something?
Ben