Hi,
we have a report of WARNING from 3.7.6 in nouveau at drivers/gpu/drm/nouveau/core/core/mm.c:242 here: https://bugzilla.novell.com/show_bug.cgi?id=802347#c11
There is an order 4 allocation failure in nouveau_drm_open -> nouveau_vm_create, i.e. this one failed: vm->pgt = kcalloc(vm->lpde - vm->fpde + 1, sizeof(*vm->pgt), GFP_KERNEL);
Then, on the error path, still in nouveau_drm_open, it is followed by a call to nouveau_cli_destroy. But that one calls nouveau_vm_ref -> nouveau_mm_fini -> nouveau_vm_del -> nouveau_mm_fini which triggers the warning.
Any ideas?
thanks,
On Mon, Feb 18, 2013 at 11:27:43AM +0100, Jiri Slaby wrote:
> Hi,
> we have a report of WARNING from 3.7.6 in nouveau at drivers/gpu/drm/nouveau/core/core/mm.c:242 here: https://bugzilla.novell.com/show_bug.cgi?id=802347#c11
> There is an order 4 allocation failure in nouveau_drm_open -> nouveau_vm_create, i.e. this one failed: vm->pgt = kcalloc(vm->lpde - vm->fpde + 1, sizeof(*vm->pgt), GFP_KERNEL);
> Then, on the error path, still in nouveau_drm_open, it is followed by a call to nouveau_cli_destroy. But that one calls nouveau_vm_ref -> nouveau_mm_fini -> nouveau_vm_del -> nouveau_mm_fini which triggers the warning.
> Any ideas?
Crash/warning should be fixed by commit cfd376b6bfccf33782a0748a9c70f7f752f8b869 "drm/nouveau/vm: fix memory corruption when pgt allocation fails".
Tomorrow I'll post a patch for page allocation failure.
Marcin
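For readers following along, here is a rough userspace sketch of the general class of problem described above: a create function fails partway through, and a later destroy call then operates on whatever the caller's pointer was left holding. This is only an illustration, not the code from the commit named above. The names toy_vm, toy_vm_create, toy_vm_destroy and the fail_pgt switch are all made up for the example. The idea shown is to publish the object to the caller only once every allocation has succeeded, so a failed create leaves nothing for the cleanup path to trip over.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a VM object; not the nouveau structure. */
struct toy_vm {
	void *pgt;
};

/*
 * Create a toy_vm. The caller's pointer is written only after every
 * allocation has succeeded, so a failed create leaves *pvm untouched.
 * fail_pgt simulates the page-table allocation failing.
 * entries is assumed to be greater than zero.
 */
static int toy_vm_create(struct toy_vm **pvm, size_t entries, int fail_pgt)
{
	struct toy_vm *vm = calloc(1, sizeof(*vm));

	if (!vm)
		return -1;

	vm->pgt = fail_pgt ? NULL : calloc(entries, sizeof(void *));
	if (!vm->pgt) {
		free(vm);
		return -1;	/* *pvm was never set, so nothing dangles */
	}

	*pvm = vm;
	return 0;
}

/* Cleanup that tolerates a VM that was never created. */
static void toy_vm_destroy(struct toy_vm **pvm)
{
	struct toy_vm *vm = *pvm;

	if (!vm)
		return;
	free(vm->pgt);
	free(vm);
	*pvm = NULL;
}
```

With this shape, calling toy_vm_destroy after a failed toy_vm_create is a no-op as long as the caller initialised its pointer to NULL.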
On 02/19/2013 12:23 AM, Marcin Slusarz wrote:
> Crash/warning should be fixed by commit cfd376b6bfccf33782a0748a9c70f7f752f8b869 "drm/nouveau/vm: fix memory corruption when pgt allocation fails".
Oh, thanks for the pointer. Could that bug cause real "memory corruption"? As we're hunting one there...
Isn't this a stable-3.7 candidate?
> Tomorrow I'll post a patch for page allocation failure.
What do you mean -- what kind of patch?
On Tue, 2013-02-19 at 00:43 +0100, Jiri Slaby wrote:
> On 02/19/2013 12:23 AM, Marcin Slusarz wrote:
> > On Mon, Feb 18, 2013 at 11:27:43AM +0100, Jiri Slaby wrote:
> > > Hi,
> > > we have a report of WARNING from 3.7.6 in nouveau at drivers/gpu/drm/nouveau/core/core/mm.c:242 here: https://bugzilla.novell.com/show_bug.cgi?id=802347#c11
> > > There is an order 4 allocation failure in nouveau_drm_open -> nouveau_vm_create, i.e. this one failed: vm->pgt = kcalloc(vm->lpde - vm->fpde + 1, sizeof(*vm->pgt), GFP_KERNEL);
Hi Jiri,
I had the order 4 allocation failure and nouveau crash back in November on next-20121129. Bugzillas here:
Allocation failure: https://bugzilla.kernel.org/show_bug.cgi?id=51301
Nouveau bug: https://bugzilla.kernel.org/show_bug.cgi?id=51291 https://bugs.freedesktop.org/show_bug.cgi?id=58087
IMO, the 32k allocation failure is the more serious bug. Check out the slab info from your report:
Feb 06 13:16:15 desdemona kernel: Node 0 DMA32: 13378*4kB 5026*8kB 1823*16kB 135*32kB 5*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 129576kB
Feb 06 13:16:15 desdemona kernel: Node 0 Normal: 1946*4kB 831*8kB 3*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 16496kB
The pages are there: why did the allocation fail?
I think this is related to all that kswapd mess. In my case, the machine really was OOM -- which made no sense. Completely out of page blocks larger than 32k on a 10gb machine with a bunch of emacs and terminal windows open for 3 days, just doing code, build, code, build, code, build?
Regards, Peter Hurley
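To make the free-list lines quoted above easier to read, here is a small sketch that tallies them. It assumes 4kB pages and the eleven block sizes shown in the log (4kB up to 4096kB, i.e. orders 0 to 10). The helper names free_kb and free_blocks_at_or_above are invented for this example. Note that the page allocator also applies per-zone watermark and reserve checks, so free blocks appearing in a list does not by itself mean a request would have been granted.

```c
/* Number of block sizes printed per zone in the log: 4kB ... 4096kB. */
#define NR_ORDERS 11

/* Total free memory in kB, given the per-order free block counts. */
static unsigned long free_kb(const unsigned long counts[NR_ORDERS],
			     unsigned long page_kb)
{
	unsigned long kb = 0;

	for (unsigned int i = 0; i < NR_ORDERS; i++)
		kb += counts[i] * (page_kb << i);
	return kb;
}

/* Number of free blocks whose order is at least the requested one. */
static unsigned long free_blocks_at_or_above(const unsigned long counts[NR_ORDERS],
					     unsigned int order)
{
	unsigned long total = 0;

	for (unsigned int i = order; i < NR_ORDERS; i++)
		total += counts[i];
	return total;
}
```

Fed the counts from the two log lines, free_kb reproduces the printed totals (129576kB and 16496kB), and free_blocks_at_or_above with order 4 gives 6 blocks for DMA32 and 5 for Normal.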
On Tue, Feb 19, 2013 at 12:43:06AM +0100, Jiri Slaby wrote:
> > Crash/warning should be fixed by commit cfd376b6bfccf33782a0748a9c70f7f752f8b869 "drm/nouveau/vm: fix memory corruption when pgt allocation fails".
> Oh, thanks for the pointer. Could that bug cause real "memory corruption"? As we're hunting one there...
Yes.
> Isn't this a stable-3.7 candidate?
Should have been :/.
> > Tomorrow I'll post a patch for page allocation failure.
> What do you mean -- what kind of patch?
A patch which will change pgt allocation to use vmalloc.
Marcin
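As background to the vmalloc suggestion: a kmalloc-family request larger than a page needs a physically contiguous block, and the page allocator hands those out in power-of-two multiples of the page size ("orders"). vmalloc instead maps individually allocated pages into a contiguous virtual range, so it does not need a high-order block. The sketch below shows the rounding only. It assumes 4kB pages, and order_for_size is a name made up for this example rather than a kernel function.

```c
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * Assumes size > 0 and page_size is a power of two.
 */
static unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions a 48kB request rounds up to order 4 (a 64kB block), which matches the order reported in the allocation failure at the top of the thread.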
On Tue, Feb 19, 2013 at 08:07:44AM +0100, Marcin Slusarz wrote:
> A patch which will change pgt allocation to use vmalloc.
---
From: Marcin Slusarz <marcin.slusarz@gmail.com>
Subject: [PATCH] drm/nouveau: use vmalloc for pgt allocation

Page tables on nv50 take 48kB, which can be hard to allocate in one piece.
Let's use vmalloc.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/nouveau/core/subdev/vm/base.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c
index 77c67fc..e66fb77 100644
--- a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c
+++ b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c
@@ -362,7 +362,7 @@ nouveau_vm_create(struct nouveau_vmmgr *vmm, u64 offset, u64 length,
 	vm->fpde = offset >> (vmm->pgt_bits + 12);
 	vm->lpde = (offset + length - 1) >> (vmm->pgt_bits + 12);
 
-	vm->pgt = kcalloc(vm->lpde - vm->fpde + 1, sizeof(*vm->pgt), GFP_KERNEL);
+	vm->pgt = vzalloc((vm->lpde - vm->fpde + 1) * sizeof(*vm->pgt));
 	if (!vm->pgt) {
 		kfree(vm);
 		return -ENOMEM;
@@ -371,7 +371,7 @@ nouveau_vm_create(struct nouveau_vmmgr *vmm, u64 offset, u64 length,
 	ret = nouveau_mm_init(&vm->mm, mm_offset >> 12, mm_length >> 12,
 			      block >> 12);
 	if (ret) {
-		kfree(vm->pgt);
+		vfree(vm->pgt);
 		kfree(vm);
 		return ret;
 	}
@@ -446,7 +446,7 @@ nouveau_vm_del(struct nouveau_vm *vm)
 	}
 
 	nouveau_mm_fini(&vm->mm);
-	kfree(vm->pgt);
+	vfree(vm->pgt);
 	kfree(vm);
 }
--
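A side note on the conversion in the patch above: kcalloc(n, size, flags) checks the n * size product for overflow, while the vzalloc call receives a product that has already been computed. Whether an overflow is reachable here depends on how fpde and lpde are bounded, which this thread does not discuss. The userspace sketch below only shows what such a check looks like in general. checked_array_size is a hypothetical helper, not a kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store n * size in *out and return 0, or return -1 without touching
 * *out if the product would not fit in a size_t.
 */
static int checked_array_size(size_t n, size_t size, size_t *out)
{
	if (size != 0 && n > SIZE_MAX / size)
		return -1;
	*out = n * size;
	return 0;
}
```

A caller would compute the byte count with this helper first and pass the result to the allocator only when it returns 0.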
On 02/19/2013 11:32 PM, Marcin Slusarz wrote:
> > A patch which will change pgt allocation to use vmalloc.
It's still not in -next. Any plans on this?
On 03/13/2013 11:36 AM, Jiri Slaby wrote:
> It's still not in -next. Any plans on this?
Ping...
On 02/19/2013 08:07 AM, Marcin Slusarz wrote:
> > > Crash/warning should be fixed by commit cfd376b6bfccf33782a0748a9c70f7f752f8b869 "drm/nouveau/vm: fix memory corruption when pgt allocation fails".
> > Oh, thanks for the pointer. Could that bug cause real "memory corruption"? As we're hunting one there...
> Yes.
> > Isn't this a stable-3.7 candidate?
> Should have been :/.
Ok, if you have no objections: stable fellows, please consider the commit above for stable inclusion. As far as I can tell this is only applicable for 3.7 -- Marcin?
On Wed, Feb 20, 2013 at 03:47:05PM +0100, Jiri Slaby wrote:
> Ok, if you have no objections: stable fellows, please consider the commit above for stable inclusion. As far as I can tell this is only applicable for 3.7 -- Marcin?
Correct. Please apply it for 3.7.x.
Marcin
On Wed, Feb 20, 2013 at 07:57:32PM +0100, Marcin Slusarz wrote:
> > Ok, if you have no objections: stable fellows, please consider the commit above for stable inclusion. As far as I can tell this is only applicable for 3.7 -- Marcin?
> Correct. Please apply it for 3.7.x.
Now applied, thanks.
greg k-h
dri-devel@lists.freedesktop.org