Hello, Linus.
On 08/01/2010 08:01 PM, Linus Torvalds wrote:
This has a proposed patch. I don't know what the status of it is, though. Jens?
http://marc.info/?l=linux-kernel&m=127950018204029&w=2
Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=16393
Subject		: kernel BUG at fs/block_dev.c:765!
Submitter	: Markus Trippelsdorf <markus@trippelsdorf.de>
Date		: 2010-07-14 13:52 (19 days old)
Message-ID	: <20100714135217.GA1797@arch.tripp.de>
References	: http://marc.info/?l=linux-kernel&m=127911564213748&w=2
This one is interesting. And I think I perhaps see where it's coming from.
bd_start_claiming() (through bd_prepare_to_claim()) has two separate success cases: either there was no holder (bd_claiming is NULL) or the new holder was already claiming it (bd_claiming == holder).
Note in particular the case of the holder _already_ holding it. What happens is:
 - bd_start_claiming() succeeds because we had _already_ claimed it
   with the same holder

 - then some error happens, and we call bd_abort_claiming(), which
   does "whole->bd_claiming = NULL;"

 - the original holder thinks it still holds the bd, but it has been
   released!

 - a new claimer comes in, and succeeds because bd_claiming is now NULL.

 - we now have two "owners" of the bd, but bd_claiming only points to
   the second one.
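The sequence above can be replayed in a small userspace model. This is only a sketch: struct toy_bdev and the toy_* functions are hypothetical stand-ins rather than kernel code, and setting bd_claiming is folded into the prepare step for brevity. The one piece taken from the thread is the pre-fix condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the whole-disk block_device; only the field discussed. */
struct toy_bdev {
	void *bd_claiming;	/* holder inside a claiming block, or NULL */
};

/*
 * Pre-fix check: proceed if nobody is claiming OR the same holder is
 * already claiming.  Returns false where the kernel would wait instead.
 */
bool toy_prepare_to_claim_old(struct toy_bdev *whole, void *holder)
{
	assert(holder != NULL);
	if (whole->bd_claiming && whole->bd_claiming != holder)
		return false;
	whole->bd_claiming = holder;
	return true;
}

/* Abort clears the marker unconditionally, as described in the list. */
void toy_abort_claiming(struct toy_bdev *whole)
{
	whole->bd_claiming = NULL;
}

/*
 * Replays the scenario and returns how many claimers believe they are
 * inside the claiming block at the end.
 */
int toy_replay_old(void)
{
	struct toy_bdev whole = { .bd_claiming = NULL };
	int holder_a, holder_b;	/* addresses serve as holder cookies */
	int inside = 0;

	/* First attempt with holder A enters the claiming block. */
	if (toy_prepare_to_claim_old(&whole, &holder_a))
		inside++;
	/* Second attempt with the same holder is let through as well. */
	if (toy_prepare_to_claim_old(&whole, &holder_a))
		inside++;
	/* The second attempt hits an error and aborts. */
	toy_abort_claiming(&whole);
	inside--;
	/* Holder B now sees bd_claiming == NULL and also gets in. */
	if (toy_prepare_to_claim_old(&whole, &holder_b))
		inside++;

	return inside;
}
```

In this model toy_replay_old() returns 2: the first attempt with holder A and the attempt with holder B both believe they are inside the claiming block, while bd_claiming records only B.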
I think bd_start_claiming() needs to do some kind of refcount for the nested holder case, and bd_abort_claiming() needs to decrement the refcount and only clear the bd_claiming field when it goes down to zero.
I dunno. Maybe there's something else going on, but it does look suspicious, and the above would explain the BUG_ON().
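For illustration, the refcount idea floated above could look roughly like the following userspace sketch. The claim_depth field and the toy_*_rc names are hypothetical and exist neither in the kernel nor in the patch that was eventually applied, which took a different route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; claim_depth is a hypothetical nesting counter. */
struct toy_bdev_rc {
	void *bd_claiming;	/* holder inside a claiming block, or NULL */
	int claim_depth;	/* number of nested claims by that holder */
};

/* Same holder may nest; each successful start bumps the depth. */
bool toy_start_claiming_rc(struct toy_bdev_rc *whole, void *holder)
{
	assert(holder != NULL);
	if (whole->bd_claiming && whole->bd_claiming != holder)
		return false;	/* a different holder would have to wait */
	whole->bd_claiming = holder;
	whole->claim_depth++;
	return true;
}

/* Only the outermost abort actually clears bd_claiming. */
void toy_abort_claiming_rc(struct toy_bdev_rc *whole)
{
	assert(whole->claim_depth > 0);
	if (--whole->claim_depth == 0)
		whole->bd_claiming = NULL;
}
```

With this variant, aborting a nested claim leaves bd_claiming pointing at the original holder, so a second holder cannot slip in until the outermost claim is finished or aborted.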
Yeah, that definitely sounds plausible. I think the condition check in bd_prepare_to_claim() should have been "if (whole->bd_claiming)" instead of "if (whole->bd_claiming && whole->bd_claiming != holder)". It doesn't make much sense to allow multiple parallel claiming operations anyway and the comment above already says - "This function fails if @bdev is already claimed by another holder and waits if another claiming is in progress."
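A minimal userspace sketch of the stricter check, assuming a toy structure in place of struct block_device (the toy_* names are hypothetical, and the real function sleeps on a wait queue and retries where this model simply returns false):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the whole-disk block_device. */
struct toy_bdev_fix {
	void *bd_claiming;	/* holder inside a claiming block, or NULL */
};

/*
 * Stricter check: any claim in progress blocks a new attempt, even one
 * made with the same holder.
 */
bool toy_prepare_to_claim_fixed(struct toy_bdev_fix *whole, void *holder)
{
	assert(holder != NULL);
	if (whole->bd_claiming)
		return false;	/* the kernel would wait and retry here */
	whole->bd_claiming = holder;
	return true;
}
```

Because at most one attempt is ever inside the claiming block, an abort that resets bd_claiming to NULL can only undo that attempt's own claim.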
I'll try to build a test case and verify it.
Thank you.
bd_prepare_to_claim() incorrectly allowed multiple attempts for exclusive open to progress in parallel if the attempting holders are identical. This triggered BUG_ON() as reported in the following bug.
https://bugzilla.kernel.org/show_bug.cgi?id=16393
__bd_abort_claiming() is used to finish claiming blocks and doesn't work if multiple openers are inside a claiming block. Allowing multiple parallel open attempts to continue doesn't gain anything as those are serialized down in the call chain anyway. Fix it by always allowing only single open attempt in a claiming block.
This problem can easily be reproduced by adding a delay after bd_prepare_to_claim() and attempting to mount two partitions of a disk.
stable: only applicable to v2.6.35
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: stable@kernel.org
---
 fs/block_dev.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 99d6af8..b3171fb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -681,8 +681,8 @@ retry:
 	if (!bd_may_claim(bdev, whole, holder))
 		return -EBUSY;

-	/* if someone else is claiming, wait for it to finish */
-	if (whole->bd_claiming && whole->bd_claiming != holder) {
+	/* if claiming is already in progress, wait for it to finish */
+	if (whole->bd_claiming) {
 		wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
 		DEFINE_WAIT(wait);
bd_prepare_to_claim() incorrectly allowed multiple attempts for exclusive open to progress in parallel if the attempting holders are identical. This triggered BUG_ON() as reported in the following bug.
https://bugzilla.kernel.org/show_bug.cgi?id=16393
__bd_abort_claiming() is used to finish claiming blocks and doesn't work if multiple openers are inside a claiming block. Allowing multiple parallel open attempts to continue doesn't gain anything as those are serialized down in the call chain anyway. Fix it by always allowing only single open attempt in a claiming block.
This problem can easily be reproduced by adding a delay after bd_prepare_to_claim() and attempting to mount two partitions of a disk.
stable: only applicable to v2.6.35
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: stable@kernel.org
---
Oops, had the wrong reported-by credit.  Updated.

Thanks.

 fs/block_dev.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 99d6af8..b3171fb 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -681,8 +681,8 @@ retry:
 	if (!bd_may_claim(bdev, whole, holder))
 		return -EBUSY;

-	/* if someone else is claiming, wait for it to finish */
-	if (whole->bd_claiming && whole->bd_claiming != holder) {
+	/* if claiming is already in progress, wait for it to finish */
+	if (whole->bd_claiming) {
 		wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
 		DEFINE_WAIT(wait);
On 2010-08-04 17:59, Tejun Heo wrote:
bd_prepare_to_claim() incorrectly allowed multiple attempts for exclusive open to progress in parallel if the attempting holders are identical. This triggered BUG_ON() as reported in the following bug.
https://bugzilla.kernel.org/show_bug.cgi?id=16393
__bd_abort_claiming() is used to finish claiming blocks and doesn't work if multiple openers are inside a claiming block. Allowing multiple parallel open attempts to continue doesn't gain anything as those are serialized down in the call chain anyway. Fix it by always allowing only single open attempt in a claiming block.
This problem can easily be reproduced by adding a delay after bd_prepare_to_claim() and attempting to mount two partitions of a disk.
stable: only applicable to v2.6.35
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: stable@kernel.org
Thanks Tejun, applied.
On Thu, Aug 05, 2010 at 11:02:43AM +0200, Jens Axboe wrote:
Thanks Tejun, applied.
It's already in mainline: e75aa85892b2ee78c79edac720868cbef16e62eb
On 2010-08-05 11:17, Markus Trippelsdorf wrote:
It's already in mainline: e75aa85892b2ee78c79edac720868cbef16e62eb
Irk, had not noticed yet, my for-2.6.36 branch isn't fully merged up yet. Thanks for the heads-up.