In
commit ebc0808fa2da0548a78e715858024cb81cd732bc Author: Chris Wilson chris@chris-wilson.co.uk Date: Tue Oct 18 13:02:51 2016 +0100
drm/i915: Restrict pagefault disabling to just around copy_from_user()
we entirely missed that there's a slow path call to eb_relocate_entry (or i915_gem_execbuffer_relocate_entry as it was called back then) which was left fully wrapped by pagefault_disable/enable() calls. Previously any issues with blocking calls where handled by the following code:
/* we can't wait for rendering with pagefaults disabled */ if (pagefault_disabled() && !object_is_idle(obj)) return -EFAULT;
Now at this point the prefaulting was still around, which means in normal applications it was very hard to hit this bug. No idea why the regressions in igts weren't caught.
Now this all changed big time with 2 patches merged closely together.
First
commit 2889caa9232109afc8881f29a2205abeb5709d0c Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:19 2017 +0100
drm/i915: Eliminate lots of iterations over the execobjects array
removes the prefaulting from the first relocation path, pushing it into the first slowpath (of which this patch added a total of 3 escalation levels). This would have really quickly uncovered the above bug, were it not for immediate adding a duct-tape on top with
commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:24 2017 +0100
drm/i915: Async GPU relocation processing
by pushing all all the relocation patching to the gpu if the buffer was busy, which avoided all the possible blocking calls.
The entire slowpath was then furthermore ditched in
commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d Author: Chris Wilson chris@chris-wilson.co.uk Date: Wed Mar 11 16:03:10 2020 +0000
drm/i915/gem: Drop relocation slowpath
and resurrected in
commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871 Author: Maarten Lankhorst maarten.lankhorst@linux.intel.com Date: Wed Aug 19 16:08:43 2020 +0200
Revert "drm/i915/gem: Drop relocation slowpath".
but this did not further impact what's going on.
Since pagefault_disable/enable is an atomic section, any sleeping in there is prohibited, and we definitely do that without gpu relocations since we have to wait for the gpu usage to finish before we can patch up the relocations.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: "Thomas Hellström" thomas.hellstrom@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: Dave Airlie airlied@redhat.com Cc: Jason Ekstrand jason@jlekstrand.net --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6539b82dda54..7ff2fc3c0b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
list_for_each_entry(ev, &eb->relocs, reloc_link) { if (!have_copy) { - pagefault_disable(); err = eb_relocate_vma(eb, ev); - pagefault_enable(); if (err) break; } else {
On 18/06/2021 22:45, Daniel Vetter wrote:
In
commit ebc0808fa2da0548a78e715858024cb81cd732bc Author: Chris Wilson chris@chris-wilson.co.uk Date: Tue Oct 18 13:02:51 2016 +0100
drm/i915: Restrict pagefault disabling to just around copy_from_user()
we entirely missed that there's a slow path call to eb_relocate_entry (or i915_gem_execbuffer_relocate_entry as it was called back then) which was left fully wrapped by pagefault_disable/enable() calls. Previously any issues with blocking calls where handled by the following code:
/* we can't wait for rendering with pagefaults disabled */ if (pagefault_disabled() && !object_is_idle(obj)) return -EFAULT;
Now at this point the prefaulting was still around, which means in normal applications it was very hard to hit this bug. No idea why the regressions in igts weren't caught.
Now this all changed big time with 2 patches merged closely together.
First
commit 2889caa9232109afc8881f29a2205abeb5709d0c Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:19 2017 +0100
drm/i915: Eliminate lots of iterations over the execobjects array
removes the prefaulting from the first relocation path, pushing it into the first slowpath (of which this patch added a total of 3 escalation levels). This would have really quickly uncovered the above bug, were it not for immediate adding a duct-tape on top with
commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:24 2017 +0100
drm/i915: Async GPU relocation processing
by pushing all all the relocation patching to the gpu if the buffer was busy, which avoided all the possible blocking calls.
The entire slowpath was then furthermore ditched in
commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d Author: Chris Wilson chris@chris-wilson.co.uk Date: Wed Mar 11 16:03:10 2020 +0000
drm/i915/gem: Drop relocation slowpath
and resurrected in
commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871 Author: Maarten Lankhorst maarten.lankhorst@linux.intel.com Date: Wed Aug 19 16:08:43 2020 +0200
Revert "drm/i915/gem: Drop relocation slowpath".
but this did not further impact what's going on.
Since pagefault_disable/enable is an atomic section, any sleeping in there is prohibited, and we definitely do that without gpu relocations since we have to wait for the gpu usage to finish before we can patch up the relocations.
Why do we also need the __copy_from_user_inatomic in eb_relocate_vma()?
Reviewed-by: Matthew Auld matthew.auld@intel.com
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: "Thomas Hellström" thomas.hellstrom@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: Dave Airlie airlied@redhat.com Cc: Jason Ekstrand jason@jlekstrand.net
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6539b82dda54..7ff2fc3c0b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
list_for_each_entry(ev, &eb->relocs, reloc_link) { if (!have_copy) {
pagefault_disable(); err = eb_relocate_vma(eb, ev);
} else {pagefault_enable(); if (err) break;
Op 21-06-2021 om 11:33 schreef Matthew Auld:
On 18/06/2021 22:45, Daniel Vetter wrote:
In
commit ebc0808fa2da0548a78e715858024cb81cd732bc Author: Chris Wilson chris@chris-wilson.co.uk Date: Tue Oct 18 13:02:51 2016 +0100
drm/i915: Restrict pagefault disabling to just around copy_from_user()
we entirely missed that there's a slow path call to eb_relocate_entry (or i915_gem_execbuffer_relocate_entry as it was called back then) which was left fully wrapped by pagefault_disable/enable() calls. Previously any issues with blocking calls where handled by the following code:
/* we can't wait for rendering with pagefaults disabled */ if (pagefault_disabled() && !object_is_idle(obj)) return -EFAULT;
Now at this point the prefaulting was still around, which means in normal applications it was very hard to hit this bug. No idea why the regressions in igts weren't caught.
Now this all changed big time with 2 patches merged closely together.
First
commit 2889caa9232109afc8881f29a2205abeb5709d0c Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:19 2017 +0100
drm/i915: Eliminate lots of iterations over the execobjects array
removes the prefaulting from the first relocation path, pushing it into the first slowpath (of which this patch added a total of 3 escalation levels). This would have really quickly uncovered the above bug, were it not for immediate adding a duct-tape on top with
commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:24 2017 +0100
drm/i915: Async GPU relocation processing
by pushing all all the relocation patching to the gpu if the buffer was busy, which avoided all the possible blocking calls.
The entire slowpath was then furthermore ditched in
commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d Author: Chris Wilson chris@chris-wilson.co.uk Date: Wed Mar 11 16:03:10 2020 +0000
drm/i915/gem: Drop relocation slowpath
and resurrected in
commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871 Author: Maarten Lankhorst maarten.lankhorst@linux.intel.com Date: Wed Aug 19 16:08:43 2020 +0200
Revert "drm/i915/gem: Drop relocation slowpath".
but this did not further impact what's going on.
Since pagefault_disable/enable is an atomic section, any sleeping in there is prohibited, and we definitely do that without gpu relocations since we have to wait for the gpu usage to finish before we can patch up the relocations.
Why do we also need the __copy_from_user_inatomic in eb_relocate_vma()?
Reviewed-by: Matthew Auld matthew.auld@intel.com
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: "Thomas Hellström" thomas.hellstrom@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: Dave Airlie airlied@redhat.com Cc: Jason Ekstrand jason@jlekstrand.net
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6539b82dda54..7ff2fc3c0b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, list_for_each_entry(ev, &eb->relocs, reloc_link) { if (!have_copy) { - pagefault_disable(); err = eb_relocate_vma(eb, ev); - pagefault_enable(); if (err) break; } else {
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
On Mon, Jun 21, 2021 at 04:30:50PM +0200, Maarten Lankhorst wrote:
Op 21-06-2021 om 11:33 schreef Matthew Auld:
On 18/06/2021 22:45, Daniel Vetter wrote:
In
commit ebc0808fa2da0548a78e715858024cb81cd732bc Author: Chris Wilson chris@chris-wilson.co.uk Date: Tue Oct 18 13:02:51 2016 +0100
drm/i915: Restrict pagefault disabling to just around copy_from_user()
we entirely missed that there's a slow path call to eb_relocate_entry (or i915_gem_execbuffer_relocate_entry as it was called back then) which was left fully wrapped by pagefault_disable/enable() calls. Previously any issues with blocking calls where handled by the following code:
/* we can't wait for rendering with pagefaults disabled */ if (pagefault_disabled() && !object_is_idle(obj)) return -EFAULT;
Now at this point the prefaulting was still around, which means in normal applications it was very hard to hit this bug. No idea why the regressions in igts weren't caught.
Now this all changed big time with 2 patches merged closely together.
First
commit 2889caa9232109afc8881f29a2205abeb5709d0c Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:19 2017 +0100
drm/i915: Eliminate lots of iterations over the execobjects array
removes the prefaulting from the first relocation path, pushing it into the first slowpath (of which this patch added a total of 3 escalation levels). This would have really quickly uncovered the above bug, were it not for immediate adding a duct-tape on top with
commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jun 16 15:05:24 2017 +0100
drm/i915: Async GPU relocation processing
by pushing all all the relocation patching to the gpu if the buffer was busy, which avoided all the possible blocking calls.
The entire slowpath was then furthermore ditched in
commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d Author: Chris Wilson chris@chris-wilson.co.uk Date: Wed Mar 11 16:03:10 2020 +0000
drm/i915/gem: Drop relocation slowpath
and resurrected in
commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871 Author: Maarten Lankhorst maarten.lankhorst@linux.intel.com Date: Wed Aug 19 16:08:43 2020 +0200
Revert "drm/i915/gem: Drop relocation slowpath".
but this did not further impact what's going on.
Since pagefault_disable/enable is an atomic section, any sleeping in there is prohibited, and we definitely do that without gpu relocations since we have to wait for the gpu usage to finish before we can patch up the relocations.
Why do we also need the __copy_from_user_inatomic in eb_relocate_vma()?
Reviewed-by: Matthew Auld matthew.auld@intel.com
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: "Thomas Hellström" thomas.hellstrom@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: Dave Airlie airlied@redhat.com Cc: Jason Ekstrand jason@jlekstrand.net
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6539b82dda54..7ff2fc3c0b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, list_for_each_entry(ev, &eb->relocs, reloc_link) { if (!have_copy) { - pagefault_disable(); err = eb_relocate_vma(eb, ev); - pagefault_enable(); if (err) break; } else {
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Pushed to drm-intel-gt-next, thanks to both of you for taking a look. -Daniel
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
dri-devel@lists.freedesktop.org