On Thu, Feb 23, 2012 at 02:36:58PM -0800, Andrew Morton wrote:
On Thu, 16 Feb 2012 13:01:36 +0100 Daniel Vetter daniel.vetter@ffwll.ch wrote:
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes.
I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE.
Also kill a copy&pasted spurious space in both functions while at it.
...
--- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); static inline int fault_in_pages_writeable(char __user *uaddr, int size) { int ret;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0)) return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */
- ret = __put_user(0, uaddr);
- while (uaddr <= end) {
ret = __put_user(0, uaddr);
if (ret != 0)
return ret;
uaddr += PAGE_SIZE;
- }
The callsites in filemap.c are pretty hot paths, which is why this thing remains explicitly inlined. I think it would be worth adding a bit of code here to avoid adding a pointless test-n-branch and larger cache footprint to read() and write().
A way of doing that is to add another argument to these functions, say "bool multipage". Change the code to do
if (multipage) { while (uaddr <= end) { ... } }
and change the callsites to pass in constant "true" or "false". Then compile it up and manually check that the compiler completely removed the offending code from the filemap.c callsites.
Wanna have a think about that? If it all looks OK then please be sure to add code comments explaining why we did this.
I wasn't really happy with the added branch either, but failed to come up with a trick to avoid it. Imho adding new _multipage variants of these functions instead of adding a constant argument is simpler because the functions don't really share much thanks to the block below. I'll see what it looks like (and obviously add a comment explaining what's going on).
if (ret == 0) {
char __user *end = uaddr + size - 1;
- /*
*/
- If the page was already mapped, this will get a cache miss
- for sure, so try to avoid doing it.
if (((unsigned long)uaddr & PAGE_MASK) !=
if (((unsigned long)uaddr & PAGE_MASK) == ((unsigned long)end & PAGE_MASK))
Maybe I'm having a dim day, but I don't immediately see why != got turned into ==.
Because of the loop uaddr will now point one page beyond the last prefaulted page. To check whether end spilled into a new page we therefore need to check whether uaddr and end are in the same pfn. Before uaddr wasn't changed and hence the checking for a different pfn worked correctly.
Once we have this settled I'd suggest that the patch be carried in whatever-git-tree-needs-it.
Thanks for the comments.
Yours, Daniel