drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper] <snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com --- include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) { + if (!obj) + return; + kref_put(&obj->refcount, drm_gem_object_free); }
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
- if (!obj)
return;
- kref_put(&obj->refcount, drm_gem_object_free); }
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()") -Chris
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Christian.
-Chris
On Wed, May 20, 2020 at 02:54:55PM +0200, Christian König wrote:
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Hm how did we even get to a place where one of the _put functions had a NULL check and the other didn't?
I do expect the compiler to clean up the entire mess, only place where I can think of NULL checks is dumb cleanup code when driver load failed halfway through. In all other places the compiler should have some evidence that the pointer isn't NULL. But would be good to check that's the case and we're not doing something stupid here ... -Daniel
Am 20.05.20 um 15:00 schrieb Daniel Vetter:
On Wed, May 20, 2020 at 02:54:55PM +0200, Christian König wrote:
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Hm how did we even get to a place where one of the _put functions had a NULL check and the other didn't?
No idea.
I do expect the compiler to clean up the entire mess, only place where I can think of NULL checks is dumb cleanup code when driver load failed halfway through. In all other places the compiler should have some evidence that the pointer isn't NULL. But would be good to check that's the case and we're not doing something stupid here ...
Well Nirmoy is blocked, so we need a solution fast. Auditing all call sites is not an option.
Revert or this one here I would say, either one is fine with me.
Christian.
-Daniel
Quoting Christian König (2020-05-20 13:54:55)
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Just to put this in a tiny bit of perspective, for i915.ko
add/remove: 0/0 grow/shrink: 141/20 up/down: 2211/-525 (1686)
We've had flame wars for less. (Because it's easy to argue over little changes.) Now this is just from this patch and not the revert... The revert has no effect on code size. -Chris
On Wed, 20 May 2020 at 14:17, Chris Wilson chris@chris-wilson.co.uk wrote:
Quoting Christian König (2020-05-20 13:54:55)
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Just to put this in a tiny bit of perspective, for i915.ko
add/remove: 0/0 grow/shrink: 141/20 up/down: 2211/-525 (1686)
We've had flame wars for less. (Because it's easy to argue over little changes.) Now this is just from this patch and not the revert... The revert has no effect on code size.
If we play the revert game thing will never end with having it fixed :-( I'd suggest sticking with the NULL check, maybe even a WARN to aid debug the 240 usecases.
For the patch: Fixes: b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()") Reviewed-by: Emil Velikov emil.velikov@collabora.com
-Emil
On Wed, 20 May 2020 at 14:30, Emil Velikov emil.l.velikov@gmail.com wrote:
On Wed, 20 May 2020 at 14:17, Chris Wilson chris@chris-wilson.co.uk wrote:
Quoting Christian König (2020-05-20 13:54:55)
Am 20.05.20 um 14:49 schrieb Chris Wilson:
Quoting Christian König (2020-05-20 13:19:42)
Am 20.05.20 um 14:14 schrieb Nirmoy Das:
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Signed-off-by: Nirmoy Das nirmoy.das@amd.com
Fixes: .... missing here. Apart from that Reviewed-by: Christian König christian.koenig@amd.com.
Please resend with the tag added, then I'm going to push this to drm-misc-next asap.
Christian.
include/drm/drm_gem.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..a13510346a9b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -372,6 +372,9 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) {
if (!obj)
return;
This adds several thousand NULL checks where there were previously none. I doubt the compiler eliminates them all.
I'd suggest reverting b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()")
I didn't looked into this patch in detail, but allowing to call *_put() functions with NULL and nothing bad happens is rather common practice.
On the other hand I agree that this might cause a performance problem. How many sites do we have which could call drm_gem_object_put() with NULL?
Just to put this in a tiny bit of perspective, for i915.ko
add/remove: 0/0 grow/shrink: 141/20 up/down: 2211/-525 (1686)
We've had flame wars for less. (Because it's easy to argue over little changes.) Now this is just from this patch and not the revert... The revert has no effect on code size.
If we play the revert game thing will never end with having it fixed :-( I'd suggest sticking with the NULL check, maybe even a WARN to aid debug the 240 usecases.
For the patch: Fixes: b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()") Reviewed-by: Emil Velikov emil.velikov@collabora.com
Alternatively, if you give me 30 minutes I should be able to finish auditing the users ;-)
-Emil
Some users want to pass NULL to drm_gem_object_put(), but those using __drm_gem_object_put() did not. Compromise, have both and let the compiler sort it out.
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper] <snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Reported-by: Nirmoy Das nirmoy.das@amd.com Fixes: b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Nirmoy Das nirmoy.das@amd.com Cc: Emil Velikov emil.velikov@collabora.com Cc: Christian König christian.koenig@amd.com. Acked-by: Nirmoy Das nirmoy.das@amd.com --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- include/drm/drm_gem.h | 10 +++++++++- 2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index aba7517c2837..2faa481cc18f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -105,7 +105,7 @@ __attribute__((nonnull)) static inline void i915_gem_object_put(struct drm_i915_gem_object *obj) { - drm_gem_object_put(&obj->base); + __drm_gem_object_put(&obj->base); }
#define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv) diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 52173abdf500..2410ff0a8e86 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -363,6 +363,13 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) kref_get(&obj->refcount); }
+__attribute__((nonnull)) +static inline void +__drm_gem_object_put(struct drm_gem_object *obj) +{ + kref_put(&obj->refcount, drm_gem_object_free); +} + /** * drm_gem_object_put - drop a GEM buffer object reference * @obj: GEM buffer object @@ -372,7 +379,8 @@ static inline void drm_gem_object_get(struct drm_gem_object *obj) static inline void drm_gem_object_put(struct drm_gem_object *obj) { - kref_put(&obj->refcount, drm_gem_object_free); + if (obj) + __drm_gem_object_put(obj); }
void drm_gem_object_put_locked(struct drm_gem_object *obj);
On Wed, 20 May 2020 at 15:24, Chris Wilson chris@chris-wilson.co.uk wrote:
Some users want to pass NULL to drm_gem_object_put(), but those using __drm_gem_object_put() did not. Compromise, have both and let the compiler sort it out.
drm_gem_fb_destroy() calls drm_gem_object_put() with NULL obj causing: [ 11.584209] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 11.584213] #PF: supervisor write access in kernel mode [ 11.584215] #PF: error_code(0x0002) - not-present page [ 11.584216] PGD 0 P4D 0 [ 11.584220] Oops: 0002 [#1] SMP NOPTI [ 11.584223] CPU: 7 PID: 1571 Comm: gnome-shell Tainted: G E 5.7.0-rc1-1-default+ #27 [ 11.584225] Hardware name: Micro-Star International Co., Ltd. MS-7A31/X370 XPOWER GAMING TITANIUM (MS-7A31), BIOS 1.MR 12/03/2019 [ 11.584237] RIP: 0010:drm_gem_fb_destroy+0x28/0x70 [drm_kms_helper]
<snip> [ 11.584256] Call Trace: [ 11.584279] drm_mode_rmfb+0x189/0x1c0 [drm] [ 11.584299] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584314] drm_ioctl_kernel+0xaa/0xf0 [drm] [ 11.584329] drm_ioctl+0x1ff/0x3b0 [drm] [ 11.584347] ? drm_mode_rmfb+0x1c0/0x1c0 [drm] [ 11.584421] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [ 11.584427] ksys_ioctl+0x87/0xc0 [ 11.584430] __x64_sys_ioctl+0x16/0x20 [ 11.584434] do_syscall_64+0x5f/0x240 [ 11.584438] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.584440] RIP: 0033:0x7f0ef80f7227
Reported-by: Nirmoy Das nirmoy.das@amd.com Fixes: b5d250744ccc ("drm/gem: fold drm_gem_object_put_unlocked and __drm_gem_object_put()") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Nirmoy Das nirmoy.das@amd.com Cc: Emil Velikov emil.velikov@collabora.com Cc: Christian König christian.koenig@amd.com. Acked-by: Nirmoy Das nirmoy.das@amd.com
Reviewed-by: Emil Velikov emil.velikov@collabora.com
I'm half way through checking the callers and I've noticed a handful of bugs. Will send the series in due time, although your patch is a perfect intermediate solution.
Thank you Emil
Quoting Emil Velikov (2020-05-20 15:25:05)
Reviewed-by: Emil Velikov emil.velikov@collabora.com
I'm half way through checking the callers and I've noticed a handful of bugs. Will send the series in due time, although your patch is a perfect intermediate solution.
Pushed the compromise patch. That should keep us all happy within our own little bubbles for the moment. Have fun! -Chris
dri-devel@lists.freedesktop.org