Hi all,
drm/i915 has write/read paths to upload/download data to/from gpu buffer objects. For a bunch of reasons we have special fastpaths with decent setup costs, so when we fall back to the slow-path we don't fully recover to the fastest fast-path when grabbing our locks again. This is also in parts due to that we have multiple fallbacks to slower paths, so control-flow in our code would get really ugly.
As part of a larger rewrite and re-tuning of these functions we've noticed that the prefault helpers in pagemap.h only prefault up to PAGE_SIZE because that's all the other users need. The follow-up patch extends this to abritary sizes so that we can fully exploit our special fastpaths in more cases (we typically see reads and writes of a few pages up to a few mb).
I'd like to get this in for 3.4. There's no functional dependency between this patch and the drm/i915 rework (only performance effects), so this can go in through -mm without causing merge issues.
If this or something similar isn't acceptable, plan B is to hand-roll these 2 functions in drm/i915. But I don't like nih these things in driver code much.
Comments highly welcome.
Yours, Daniel
For reference, the drm/i915 read/write rework is avaialable at:
http://cgit.freedesktop.org/~danvet/drm/log/?h=pwrite-pread
Unfortunately cgit.fd.o is currently on hiatus.
Daniel Vetter (1): mm: extend prefault helpers to fault in more than PAGE_SIZE
include/linux/pagemap.h | 28 ++++++++++++++++++---------- 1 files changed, 18 insertions(+), 10 deletions(-)
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
Cc: linux-mm@kvack.org Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch --- include/linux/pagemap.h | 28 ++++++++++++++++++---------- 1 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index cfaaa69..689527d 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); static inline int fault_in_pages_writeable(char __user *uaddr, int size) { int ret; + char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0; @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */ - ret = __put_user(0, uaddr); + while (uaddr <= end) { + ret = __put_user(0, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } if (ret == 0) { - char __user *end = uaddr + size - 1; - /* * If the page was already mapped, this will get a cache miss * for sure, so try to avoid doing it. */ - if (((unsigned long)uaddr & PAGE_MASK) != + if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK)) - ret = __put_user(0, end); + ret = __put_user(0, end); } return ret; } @@ -435,17 +439,21 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size) { volatile char c; int ret; + const char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
- ret = __get_user(c, uaddr); + while (uaddr <= end) { + ret = __get_user(c, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } if (ret == 0) { - const char __user *end = uaddr + size - 1; - - if (((unsigned long)uaddr & PAGE_MASK) != + if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK)) { - ret = __get_user(c, end); + ret = __get_user(c, end); (void)c; } }
On Thu, Feb 16, 2012 at 8:01 PM, Daniel Vetter daniel.vetter@ffwll.ch wrote:
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
- ret = __put_user(0, uaddr);
- if (ret != 0)
- return ret;
- uaddr += PAGE_SIZE;
- }
What if uaddr & ~PAGE_MASK == PAGE_SIZE -3 && end & ~PAGE_MASK == 2
On Thu, Feb 16, 2012 at 09:32:08PM +0800, Hillf Danton wrote:
On Thu, Feb 16, 2012 at 8:01 PM, Daniel Vetter daniel.vetter@ffwll.ch wrote:
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
- ret = __put_user(0, uaddr);
- if (ret != 0)
- return ret;
- uaddr += PAGE_SIZE;
- }
What if uaddr & ~PAGE_MASK == PAGE_SIZE -3 && end & ~PAGE_MASK == 2
I don't quite follow - can you elaborate upon which issue you're seeing? -Daniel
On Thu, 16 Feb 2012 13:01:36 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
...
--- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); static inline int fault_in_pages_writeable(char __user *uaddr, int size) { int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
The callsites in filemap.c are pretty hot paths, which is why this thing remains explicitly inlined. I think it would be worth adding a bit of code here to avoid adding a pointless test-n-branch and larger cache footprint to read() and write().
A way of doing that is to add another argument to these functions, say "bool multipage". Change the code to do
if (multipage) { while (uaddr <= end) { ... } }
and change the callsites to pass in constant "true" or "false". Then compile it up and manually check that the compiler completely removed the offending code from the filemap.c callsites.
Wanna have a think about that? If it all looks OK then please be sure to add code comments explaining why we did this.
if (ret == 0) {
char __user *end = uaddr + size - 1;
- /*
*/
- If the page was already mapped, this will get a cache miss
- for sure, so try to avoid doing it.
if (((unsigned long)uaddr & PAGE_MASK) !=
if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK))
Maybe I'm having a dim day, but I don't immediately see why != got turned into ==.
Once we have this settled I'd suggest that the patch be carried in whatever-git-tree-needs-it.
On Thu, Feb 23, 2012 at 02:36:58PM -0800, Andrew Morton wrote:
On Thu, 16 Feb 2012 13:01:36 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
...
--- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); static inline int fault_in_pages_writeable(char __user *uaddr, int size) { int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
The callsites in filemap.c are pretty hot paths, which is why this thing remains explicitly inlined. I think it would be worth adding a bit of code here to avoid adding a pointless test-n-branch and larger cache footprint to read() and write().
A way of doing that is to add another argument to these functions, say "bool multipage". Change the code to do
if (multipage) { while (uaddr <= end) { ... } }
and change the callsites to pass in constant "true" or "false". Then compile it up and manually check that the compiler completely removed the offending code from the filemap.c callsites.
Wanna have a think about that? If it all looks OK then please be sure to add code comments explaining why we did this.
I wasn't really happy with the added branch either, but failed to come up with a trick to avoid it. Imho adding new _multipage variants of these functions instead of adding a constant argument is simpler because the functions don't really share much thanks to the block below. I'll see what it looks like (and obviously add a comment explaining what's going on).
if (ret == 0) {
char __user *end = uaddr + size - 1;
- /*
*/
- If the page was already mapped, this will get a cache miss
- for sure, so try to avoid doing it.
if (((unsigned long)uaddr & PAGE_MASK) !=
if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK))
Maybe I'm having a dim day, but I don't immediately see why != got turned into ==.
Because of the loop uaddr will now point one page beyond the last prefaulted page. To check whether end spilled into a new page we therefore need to check whether uaddr and end are in the same pfn. Before uaddr wasn't changed and hence the checking for a different pfn worked correctly.
Once we have this settled I'd suggest that the patch be carried in whatever-git-tree-needs-it.
Thanks for the comments.
Yours, Daniel
On Fri, 24 Feb 2012 14:34:31 +0100 Daniel Vetter daniel@ffwll.ch wrote:
--- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); static inline int fault_in_pages_writeable(char __user *uaddr, int size) { int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
The callsites in filemap.c are pretty hot paths, which is why this thing remains explicitly inlined. I think it would be worth adding a bit of code here to avoid adding a pointless test-n-branch and larger cache footprint to read() and write().
A way of doing that is to add another argument to these functions, say "bool multipage". Change the code to do
if (multipage) { while (uaddr <= end) { ... } }
and change the callsites to pass in constant "true" or "false". Then compile it up and manually check that the compiler completely removed the offending code from the filemap.c callsites.
Wanna have a think about that? If it all looks OK then please be sure to add code comments explaining why we did this.
I wasn't really happy with the added branch either, but failed to come up with a trick to avoid it. Imho adding new _multipage variants of these functions instead of adding a constant argument is simpler because the functions don't really share much thanks to the block below. I'll see what it looks like (and obviously add a comment explaining what's going on).
well... that's just syntactic sugar:
static inline int __fault_in_pages_writeable(char __user *uaddr, int size, bool multipage) { ... }
static inline int fault_in_pages_writeable(char __user *uaddr, int size) { return __fault_in_pages_writeable(uaddr, size, false); }
static inline int fault_in_multipages_writeable(char __user *uaddr, int size) { return __fault_in_pages_writeable(uaddr, size, true); }
which I don't think is worth bothering with given the very small number of callsites.
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed.
Cc: linux-mm@kvack.org Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 4 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- fs/pipe.c | 4 +- fs/splice.c | 2 +- include/linux/pagemap.h | 39 ++++++++++++++++++--------- mm/filemap.c | 4 +- 6 files changed, 34 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 544e528..9b200f4e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -436,7 +436,7 @@ i915_gem_shmem_pread(struct drm_device *dev, mutex_unlock(&dev->struct_mutex);
if (!prefaulted) { - ret = fault_in_pages_writeable(user_data, remain); + ret = fault_in_pages_writeable(user_data, remain, true); /* Userspace is tricking us, but we've already clobbered * its pages with the prefault and promised to write the * data up to the first fault. Hence ignore any errors @@ -823,7 +823,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, return -EFAULT;
ret = fault_in_pages_readable((char __user *)(uintptr_t)args->data_ptr, - args->size); + args->size, true); if (ret) return -EFAULT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 81687af..5f0b685 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -955,7 +955,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, if (!access_ok(VERIFY_WRITE, ptr, length)) return -EFAULT;
- if (fault_in_pages_readable(ptr, length)) + if (fault_in_pages_readable(ptr, length, true)) return -EFAULT; }
diff --git a/fs/pipe.c b/fs/pipe.c index a932ced..b29f71c 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -167,7 +167,7 @@ static int iov_fault_in_pages_write(struct iovec *iov, unsigned long len) unsigned long this_len;
this_len = min_t(unsigned long, len, iov->iov_len); - if (fault_in_pages_writeable(iov->iov_base, this_len)) + if (fault_in_pages_writeable(iov->iov_base, this_len, false)) break;
len -= this_len; @@ -189,7 +189,7 @@ static void iov_fault_in_pages_read(struct iovec *iov, unsigned long len) unsigned long this_len;
this_len = min_t(unsigned long, len, iov->iov_len); - fault_in_pages_readable(iov->iov_base, this_len); + fault_in_pages_readable(iov->iov_base, this_len, false); len -= this_len; iov++; } diff --git a/fs/splice.c b/fs/splice.c index 1ec0493..e919d78 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -1491,7 +1491,7 @@ static int pipe_to_user(struct pipe_inode_info *pipe, struct pipe_buffer *buf, * See if we can use the atomic maps, by prefaulting in the * pages and doing an atomic copy */ - if (!fault_in_pages_writeable(sd->u.userptr, sd->len)) { + if (!fault_in_pages_writeable(sd->u.userptr, sd->len, false)) { src = buf->ops->map(pipe, buf, 1); ret = __copy_to_user_inatomic(sd->u.userptr, src + buf->offset, sd->len); diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index cfaaa69..60ac5c5 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -403,11 +403,14 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); * Fault a userspace page into pagetables. Return non-zero on a fault. * * This assumes that two userspace pages are always sufficient. That's - * not true if PAGE_CACHE_SIZE > PAGE_SIZE. + * not true if PAGE_CACHE_SIZE > PAGE_SIZE. If more than PAGE_SIZE needs to be + * prefaulted, set multipage to true. */ -static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int fault_in_pages_writeable(char __user *uaddr, int size, + bool multipage) { int ret; + char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0; @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */ - ret = __put_user(0, uaddr); - if (ret == 0) { - char __user *end = uaddr + size - 1; + do { + ret = __put_user(0, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } while (multipage && uaddr <= end);
+ if (ret == 0) { /* * If the page was already mapped, this will get a cache miss * for sure, so try to avoid doing it. */ - if (((unsigned long)uaddr & PAGE_MASK) != + if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK)) - ret = __put_user(0, end); + ret = __put_user(0, end); } return ret; }
-static inline int fault_in_pages_readable(const char __user *uaddr, int size) +static inline int fault_in_pages_readable(const char __user *uaddr, int size, + bool multipage) { volatile char c; int ret; + const char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
- ret = __get_user(c, uaddr); - if (ret == 0) { - const char __user *end = uaddr + size - 1; + do { + ret = __get_user(c, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } while (multipage && uaddr <= end);
- if (((unsigned long)uaddr & PAGE_MASK) != + if (ret == 0) { + if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK)) { - ret = __get_user(c, end); + ret = __get_user(c, end); (void)c; } } diff --git a/mm/filemap.c b/mm/filemap.c index 97f49ed..af2cad5 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1317,7 +1317,7 @@ int file_read_actor(read_descriptor_t *desc, struct page *page, * Faults on the destination of a read are common, so do it before * taking the kmap. */ - if (!fault_in_pages_writeable(desc->arg.buf, size)) { + if (!fault_in_pages_writeable(desc->arg.buf, size, false)) { kaddr = kmap_atomic(page, KM_USER0); left = __copy_to_user_inatomic(desc->arg.buf, kaddr + offset, size); @@ -2138,7 +2138,7 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes) { char __user *buf = i->iov->iov_base + i->iov_offset; bytes = min(bytes, i->iov->iov_len - i->iov_offset); - return fault_in_pages_readable(buf, bytes); + return fault_in_pages_readable(buf, bytes, false); } EXPORT_SYMBOL(iov_iter_fault_in_readable);
On Wed, 29 Feb 2012 15:03:31 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed.
I don't think this produced a very good result :(
...
-static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int fault_in_pages_writeable(char __user *uaddr, int size,
bool multipage)
{ int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- if (ret == 0) {
char __user *end = uaddr + size - 1;
do {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
} while (multipage && uaddr <= end);
if (ret == 0) { /*
- If the page was already mapped, this will get a cache miss
- for sure, so try to avoid doing it.
*/
if (((unsigned long)uaddr & PAGE_MASK) !=
if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK))
ret = __put_user(0, end);
} return ret;ret = __put_user(0, end);
}
One effect of this change for the filemap.c callsite is that `uaddr' now gets incremented by PAGE_SIZE. That happens to not break anything because we then mask `uaddr' with PAGE_MASK, and if gcc were really smart, it could remove that addition. But it's a bit ugly.
Ideally the patch would have no effect upon filemap.o size, but with an allmodconfig config I'm seeing
text data bss dec hex filename 22876 118 7344 30338 7682 mm/filemap.o (before) 22925 118 7392 30435 76e3 mm/filemap.o (after)
so we are adding read()/write() overhead, and bss mysteriously got larger.
Can we improve on this? Even if it's some dumb
static inline int fault_in_pages_writeable(char __user *uaddr, int size, bool multipage) { if (multipage) { do-this } else { do-that } }
the code duplication between do-this and do-that is regrettable, but at least it's all in the same place in the same file, so we won't accidentally introduce skew later on.
Alternatively, add a separate fault_in_multi_pages_writeable() to pagemap.h. I have a bad feeling this is what your original patch did!
(But we *should* be able to make this work! Why did this version of the patch go so wrong?)
On Wed, Feb 29, 2012 at 03:01:46PM -0800, Andrew Morton wrote:
On Wed, 29 Feb 2012 15:03:31 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed.
I don't think this produced a very good result :(
And I halfway expected this mail here ;-)
...
-static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int fault_in_pages_writeable(char __user *uaddr, int size,
bool multipage)
{ int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- if (ret == 0) {
char __user *end = uaddr + size - 1;
do {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
} while (multipage && uaddr <= end);
if (ret == 0) { /*
- If the page was already mapped, this will get a cache miss
- for sure, so try to avoid doing it.
*/
if (((unsigned long)uaddr & PAGE_MASK) !=
if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK))
ret = __put_user(0, end);
} return ret;ret = __put_user(0, end);
}
One effect of this change for the filemap.c callsite is that `uaddr' now gets incremented by PAGE_SIZE. That happens to not break anything because we then mask `uaddr' with PAGE_MASK, and if gcc were really smart, it could remove that addition. But it's a bit ugly.
Yep, gcc is not clever enough to reap the addl on uaddr (and change the check for 'do we need to fault the 2nd page to' from jne to je again). I've checked that before submitting - maybe should have mentioned this.
Ideally the patch would have no effect upon filemap.o size, but with an allmodconfig config I'm seeing
text data bss dec hex filename 22876 118 7344 30338 7682 mm/filemap.o (before) 22925 118 7392 30435 76e3 mm/filemap.o (after)
so we are adding read()/write() overhead, and bss mysteriously got larger.
Can we improve on this? Even if it's some dumb
static inline int fault_in_pages_writeable(char __user *uaddr, int size, bool multipage) { if (multipage) { do-this } else { do-that } }
the code duplication between do-this and do-that is regrettable, but at least it's all in the same place in the same file, so we won't accidentally introduce skew later on.
Alternatively, add a separate fault_in_multi_pages_writeable() to pagemap.h. I have a bad feeling this is what your original patch did!
(But we *should* be able to make this work! Why did this version of the patch go so wrong?)
Well, I couldn't reconcile the non-multipage with the multipage versions of these functions - at least not without changing them slightly (like this patch here does). Which is why I've asked you whether I should just add a new multipage version of these. I personally deem your proposal of using and if (multipage) with no shared code too ugly. But you've shot at it a bit, so I've figured that this version here is what you want.
I'll redo this patch by adding _multipage versions of these 2 functions for i915.
Yours, Daniel
On Thu, 1 Mar 2012 00:14:53 +0100 Daniel Vetter daniel@ffwll.ch wrote:
I'll redo this patch by adding _multipage versions of these 2 functions for i915.
OK, but I hope "for i915" doesn't mean "private to"! Put 'em in pagemap.h, for maintenance reasons and because they are generic.
Making them inline is a bit sad, but that's OK for now - they have few callsites.
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
Add new functions in filemap.h to make that possible.
Also kill a copy&pasted spurious space in both functions while at it.
v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed.
v3: Becaus I couldn't find a way around adding a uaddr += PAGE_SIZE to the filemap.c hotpaths (that the compiler couldn't remove again), let's go with separate new functions for the multipage use-case.
Cc: linux-mm@kvack.org Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 6 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- include/linux/pagemap.h | 62 +++++++++++++++++++++++++++- 3 files changed, 64 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 544e528..3e631fc 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -436,7 +436,7 @@ i915_gem_shmem_pread(struct drm_device *dev, mutex_unlock(&dev->struct_mutex);
if (!prefaulted) { - ret = fault_in_pages_writeable(user_data, remain); + ret = fault_in_multipages_writeable(user_data, remain); /* Userspace is tricking us, but we've already clobbered * its pages with the prefault and promised to write the * data up to the first fault. Hence ignore any errors @@ -822,8 +822,8 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, args->size)) return -EFAULT;
- ret = fault_in_pages_readable((char __user *)(uintptr_t)args->data_ptr, - args->size); + ret = fault_in_multipages_readable((char __user *)(uintptr_t)args->data_ptr, + args->size); if (ret) return -EFAULT;
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 81687af..ef87f52 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -955,7 +955,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, if (!access_ok(VERIFY_WRITE, ptr, length)) return -EFAULT;
- if (fault_in_pages_readable(ptr, length)) + if (fault_in_multipages_readable(ptr, length)) return -EFAULT; }
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index cfaaa69..0a91375 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -426,7 +426,7 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) */ if (((unsigned long)uaddr & PAGE_MASK) != ((unsigned long)end & PAGE_MASK)) - ret = __put_user(0, end); + ret = __put_user(0, end); } return ret; } @@ -445,13 +445,71 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size)
if (((unsigned long)uaddr & PAGE_MASK) != ((unsigned long)end & PAGE_MASK)) { - ret = __get_user(c, end); + ret = __get_user(c, end); (void)c; } } return ret; }
+/* Multipage variants of the above prefault helpers, useful if more than + * PAGE_SIZE of date needs to be prefaulted. These are separate from the above + * functions (which only handle up to PAGE_SIZE) to avoid clobbering the + * filemap.c hotpaths. */ +static inline int fault_in_multipages_writeable(char __user *uaddr, int size) +{ + int ret; + const char __user *end = uaddr + size - 1; + + if (unlikely(size == 0)) + return 0; + + /* + * Writing zeroes into userspace here is OK, because we know that if + * the zero gets there, we'll be overwriting it. + */ + while (uaddr <= end) { + ret = __put_user(0, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } + + /* Check whether the range spilled into the next page. */ + if (((unsigned long)uaddr & PAGE_MASK) == + ((unsigned long)end & PAGE_MASK)) + ret = __put_user(0, end); + + return ret; +} + +static inline int fault_in_multipages_readable(const char __user *uaddr, + int size) +{ + volatile char c; + int ret; + const char __user *end = uaddr + size - 1; + + if (unlikely(size == 0)) + return 0; + + while (uaddr <= end) { + ret = __get_user(c, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } + + /* Check whether the range spilled into the next page. */ + if (((unsigned long)uaddr & PAGE_MASK) == + ((unsigned long)end & PAGE_MASK)) { + ret = __get_user(c, end); + (void)c; + } + + return ret; +} + int add_to_page_cache_locked(struct page *page, struct address_space *mapping, pgoff_t index, gfp_t gfp_mask); int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
On Thu, 1 Mar 2012 20:22:59 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
Add new functions in filemap.h to make that possible.
Also kill a copy&pasted spurious space in both functions while at it.
...
+/* Multipage variants of the above prefault helpers, useful if more than
- PAGE_SIZE of date needs to be prefaulted. These are separate from the above
- functions (which only handle up to PAGE_SIZE) to avoid clobbering the
- filemap.c hotpaths. */
Like this please:
/* * Multipage variants of the above prefault helpers, useful if more than * PAGE_SIZE of date needs to be prefaulted. These are separate from the above * functions (which only handle up to PAGE_SIZE) to avoid clobbering the * filemap.c hotpaths. */
and s/date/data/
+static inline int fault_in_multipages_writeable(char __user *uaddr, int size) +{
- int ret;
- const char __user *end = uaddr + size - 1;
- if (unlikely(size == 0))
return 0;
- /*
* Writing zeroes into userspace here is OK, because we know that if
* the zero gets there, we'll be overwriting it.
*/
Yeah, like that.
- while (uaddr <= end) {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
- /* Check whether the range spilled into the next page. */
- if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK))
ret = __put_user(0, end);
- return ret;
+}
+static inline int fault_in_multipages_readable(const char __user *uaddr,
int size)
+{
- volatile char c;
- int ret;
- const char __user *end = uaddr + size - 1;
- if (unlikely(size == 0))
return 0;
- while (uaddr <= end) {
ret = __get_user(c, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
- /* Check whether the range spilled into the next page. */
- if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK)) {
ret = __get_user(c, end);
(void)c;
- }
- return ret;
+}
Please merge it via the DRI tree.
On Thu, Mar 01, 2012 at 12:15:57PM -0800, Andrew Morton wrote:
On Thu, 1 Mar 2012 20:22:59 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
Add new functions in filemap.h to make that possible.
Also kill a copy&pasted spurious space in both functions while at it.
...
+/* Multipage variants of the above prefault helpers, useful if more than
- PAGE_SIZE of date needs to be prefaulted. These are separate from the above
- functions (which only handle up to PAGE_SIZE) to avoid clobbering the
- filemap.c hotpaths. */
Like this please:
/*
- Multipage variants of the above prefault helpers, useful if more than
- PAGE_SIZE of date needs to be prefaulted. These are separate from the above
- functions (which only handle up to PAGE_SIZE) to avoid clobbering the
- filemap.c hotpaths.
*/
and s/date/data/
...
Please merge it via the DRI tree.
Ok, I've queued this up 3.5 (it missed the 3.4 merge because a few of the drm/i915 patches from that series haven't been reviewed in time) with the comment fixed up and your Acked-by on the commit message. I hope the later is ok, otherwise please yell.
Thanks for reviewing this. -Daniel
On Thu, Mar 1, 2012 at 20:22, Daniel Vetter daniel.vetter@ffwll.ch wrote:
+/* Multipage variants of the above prefault helpers, useful if more than
- PAGE_SIZE of date needs to be prefaulted. These are separate from the above
- functions (which only handle up to PAGE_SIZE) to avoid clobbering the
- filemap.c hotpaths. */
+static inline int fault_in_multipages_writeable(char __user *uaddr, int size) +{
- int ret;
- const char __user *end = uaddr + size - 1;
Please drop the const.
- if (unlikely(size == 0))
- return 0;
- /*
- * Writing zeroes into userspace here is OK, because we know that if
- * the zero gets there, we'll be overwriting it.
- */
- while (uaddr <= end) {
- ret = __put_user(0, uaddr);
- if (ret != 0)
- return ret;
- uaddr += PAGE_SIZE;
- }
- /* Check whether the range spilled into the next page. */
- if (((unsigned long)uaddr & PAGE_MASK) ==
- ((unsigned long)end & PAGE_MASK))
- ret = __put_user(0, end);
include/linux/pagemap.h:483:3: error: read-only location '*end' used as 'asm' output
Now in -next:
http://kisskb.ellerman.id.au/kisskb/buildresult/6100650/ http://kisskb.ellerman.id.au/kisskb/buildresult/6100673/ http://kisskb.ellerman.id.au/kisskb/buildresult/6100860/
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
This regression has been introduced in
commit f56f821feb7b36223f309e0ec05986bb137ce418 Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Sun Mar 25 19:47:41 2012 +0200
mm: extend prefault helpers to fault in more than PAGE_SIZE
I have failed to notice this because x86 asm seems to happily compile things as-is.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org Signed-Off-by: Daniel Vetter daniel.vetter@ffwll.ch --- include/linux/pagemap.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index c93a9a9..efa26b4 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -461,7 +461,7 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size) static inline int fault_in_multipages_writeable(char __user *uaddr, int size) { int ret; - const char __user *end = uaddr + size - 1; + char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
dri-devel@lists.freedesktop.org