On Wed, Jan 30, 2013 at 10:52 PM, Russell King rmk@arm.linux.org.uk wrote:
Also adding Greg and Daniel to this as Daniel introduced the lockdep checking.
This looks extremely horrid to solve - the paths are rather deep where the dependency occurs. The two paths between the locks are:
  console_lock+0x5c/0x70
  register_con_driver+0x44/0x150
  take_over_console+0x24/0x3b4
  fbcon_takeover+0x70/0xd4
  fbcon_event_notify+0x7c8/0x818
  notifier_call_chain+0x4c/0x8c
  __blocking_notifier_call_chain+0x50/0x68
  blocking_notifier_call_chain+0x20/0x28
and
  __blocking_notifier_call_chain+0x34/0x68
  blocking_notifier_call_chain+0x20/0x28
  fb_notifier_call_chain+0x20/0x28
  fb_blank+0x40/0xac
  fbcon_blank+0x1f4/0x29c
  do_blank_screen+0x1b8/0x270
  console_callback+0x74/0x138
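For readers following the two traces, the inversion they describe can be sketched with a toy lock-order tracker. This is only an illustration, not how lockdep is implemented: `LockOrderTracker` is a hypothetical class, the label "fb_notifier" stands in for the lock protecting the fb notifier chain, and the sketch assumes that console_callback already holds console_lock when it reaches the notifier call.

```python
from collections import defaultdict


class LockOrderTracker:
    """Toy validator: records 'held -> acquired' edges and flags inversions."""

    def __init__(self):
        self.edges = defaultdict(set)  # lock -> locks acquired while it was held
        self.held = []                 # locks currently held, in acquisition order
        self.reports = []              # (holder, newly acquired) pairs that close a cycle

    def _reachable(self, src, dst):
        # Depth-first search over the ordering edges recorded so far.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, lock):
        for holder in self.held:
            # Adding holder -> lock closes a cycle if lock already leads to holder.
            if self._reachable(lock, holder):
                self.reports.append((holder, lock))
            self.edges[holder].add(lock)
        self.held.append(lock)

    def release(self, lock):
        self.held.remove(lock)


tracker = LockOrderTracker()

# First trace: the notifier chain is being walked, then console_lock is taken.
tracker.acquire("fb_notifier")
tracker.acquire("console_lock")
tracker.release("console_lock")
tracker.release("fb_notifier")

# Second trace (assumed): console_lock is held, then the notifier chain is walked.
tracker.acquire("console_lock")
tracker.acquire("fb_notifier")
tracker.release("fb_notifier")
tracker.release("console_lock")

print(tracker.reports)
```

Neither path deadlocks on its own in this sketch; the report comes purely from having seen both orderings, which is why a validator can complain on a system that has never actually hung.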
You want Dave Airlie's pile of locking reworks, which fixes all currently known offenders around console_lock and fb_notifier. The patches won't go into 3.9, since it took a few rounds until they stopped causing regressions by making these deadlocks easier to hit.
http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes
The long-term solution would be to abolish the fb_notifier, at least for the purpose of linking fbdevs up with the fbcon, and just replace it with direct function calls. But that requires that we no longer allow fbdev drivers and the fbcon to be loaded in any arbitrary order. Or just force fbcon to be built-in if enabled, imo the sane choice (no one's bothering with CONFIG_VT=m either, after all).
Cheers, Daniel
On Wed, Jan 30, 2013 at 11:07:16PM +0100, Daniel Vetter wrote:
On Wed, Jan 30, 2013 at 10:52 PM, Russell King rmk@arm.linux.org.uk wrote:
Also adding Greg and Daniel to this as Daniel introduced the lockdep checking.
This looks extremely horrid to solve - the paths are rather deep where the dependency occurs. The two paths between the locks are:
  console_lock+0x5c/0x70
  register_con_driver+0x44/0x150
  take_over_console+0x24/0x3b4
  fbcon_takeover+0x70/0xd4
  fbcon_event_notify+0x7c8/0x818
  notifier_call_chain+0x4c/0x8c
  __blocking_notifier_call_chain+0x50/0x68
  blocking_notifier_call_chain+0x20/0x28
and
  __blocking_notifier_call_chain+0x34/0x68
  blocking_notifier_call_chain+0x20/0x28
  fb_notifier_call_chain+0x20/0x28
  fb_blank+0x40/0xac
  fbcon_blank+0x1f4/0x29c
  do_blank_screen+0x1b8/0x270
  console_callback+0x74/0x138
You want Dave Airlie's pile of locking reworks, which fixes all currently known offenders around console_lock and fb_notifier. The patches won't go into 3.9, since it took a few rounds until they stopped causing regressions by making these deadlocks easier to hit.
http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes
The long-term solution would be to abolish the fb_notifier, at least for the purpose of linking fbdevs up with the fbcon, and just replace it with direct function calls. But that requires that we no longer allow fbdev drivers and the fbcon to be loaded in any arbitrary order. Or just force fbcon to be built-in if enabled, imo the sane choice (no one's bothering with CONFIG_VT=m either, after all).
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
On Wed, Jan 30, 2013 at 11:19 PM, Russell King rmk@arm.linux.org.uk wrote:
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
Linus was the guy who shot down the patches for 3.9, since one of the earlier iterations caused instant deadlocks - I've been pushing to get them merged ;-) -Daniel
On Thu, Jan 31, 2013 at 9:19 AM, Russell King rmk@arm.linux.org.uk wrote:
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
Oh, we're extremely aware of it. And it's not a new issue, the locking problems have apparently been around forever, although I'm not sure why the lockdep splat itself started happening only recently.
They'll make it into 3.9, it's 3.8 that won't have them. The patches initially caused way *worse* behavior than just a lockdep splat - they caused actual hard lockups (and that was *after* the initial series of fixes). That got fixed (hopefully for the last case!) fairly recently, and I'm not willing to take the scary patch-series that has had several problem cases.
Linus
On Thu, Jan 31, 2013 at 9:52 AM, Linus Torvalds torvalds@linux-foundation.org wrote:
On Thu, Jan 31, 2013 at 9:19 AM, Russell King rmk@arm.linux.org.uk wrote:
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
Oh, we're extremely aware of it. And it's not a new issue, the locking problems have apparently been around forever, although I'm not sure why the lockdep splat itself started happening only recently.
They'll make it into 3.9, it's 3.8 that won't have them. The patches initially caused way *worse* behavior than just a lockdep splat - they caused actual hard lockups (and that was *after* the initial series of fixes). That got fixed (hopefully for the last case!) fairly recently, and I'm not willing to take the scary patch-series that has had several problem cases.
Well, we didn't have any lock validation support before Daniel added it a couple of kernels back, so instead of the hidden locking problems we've had since time began, we now have lockdep-detectable locking problems.
Dave.
On Thu, Jan 31, 2013 at 10:04:05AM +1000, Dave Airlie wrote:
On Thu, Jan 31, 2013 at 9:52 AM, Linus Torvalds torvalds@linux-foundation.org wrote:
On Thu, Jan 31, 2013 at 9:19 AM, Russell King rmk@arm.linux.org.uk wrote:
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
Oh, we're extremely aware of it. And it's not a new issue, the locking problems have apparently been around forever, although I'm not sure why the lockdep splat itself started happening only recently.
They'll make it into 3.9, it's 3.8 that won't have them. The patches initially caused way *worse* behavior than just a lockdep splat - they caused actual hard lockups (and that was *after* the initial series of fixes). That got fixed (hopefully for the last case!) fairly recently, and I'm not willing to take the scary patch-series that has had several problem cases.
Well, we didn't have any lock validation support before Daniel added it a couple of kernels back, so instead of the hidden locking problems we've had since time began, we now have lockdep-detectable locking problems.
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
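The practical cost of that behavior can be shown with a small sketch. It is an illustration under stated assumptions, not lockdep's actual code: `OneShotValidator` is a hypothetical class, and `lock_a`/`lock_b` are made-up names standing for some unrelated pair of locks elsewhere in the kernel.

```python
class OneShotValidator:
    """Toy checker that turns itself off after its first report."""

    def __init__(self):
        self.enabled = True
        self.order = set()   # observed (first, second) acquisition pairs
        self.reports = []

    def observe(self, first, second):
        if not self.enabled:
            return           # checking is off: nothing is recorded or reported
        if (second, first) in self.order:
            self.reports.append((first, second))
            self.enabled = False  # mirrors the disable-after-first-dump behavior
            return
        self.order.add((first, second))


v = OneShotValidator()
v.observe("fb_notifier", "console_lock")
v.observe("console_lock", "fb_notifier")  # first inversion: reported, checker turns off
v.observe("lock_a", "lock_b")
v.observe("lock_b", "lock_a")             # unrelated inversion: goes unnoticed

print(v.reports, v.enabled)
```

In this sketch the second inversion is a real ordering problem, but it is never reported because the console blank already used up the one report.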
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Linus
On Thu, Jan 31, 2013 at 11:26:53AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Reverting the patch is fine with me. Just let me know so I can queue it up again for 3.9.
thanks,
greg k-h
On Thu, Jan 31, 2013 at 6:40 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 11:26:53AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Yeah, sounds good.
Reverting the patch is fine with me. Just let me know so I can queue it up again for 3.9.
Can you please also pick up the (currently) three locking fixups around fbcon? Just so that we don't repeat the same fun where people complain about lockdep splats, but the fixes are stuck somewhere. And I guess Dave would be happy to not end up as fbcon maintainer ;-) He has a git branch with them at http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes though I have a small bikeshed on his last patch pending. -Daniel
On Thu, Jan 31, 2013 at 09:21:16AM +0100, Daniel Vetter wrote:
On Thu, Jan 31, 2013 at 6:40 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 11:26:53AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Yeah, sounds good.
Reverting the patch is fine with me. Just let me know so I can queue it up again for 3.9.
Can you please also pick up the (currently) three locking fixups around fbcon? Just so that we don't repeat the same fun where people complain about lockdep splats, but the fixes are stuck somewhere. And I guess Dave would be happy to not end up as fbcon maintainer ;-) He has a git branch with them at http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes though I have a small bikeshed on his last patch pending.
Care to just send me the patches through email when you all get done bikeshedding? And for some reason I thought Andrew was going to handle these fbcon patches, but if not, I'll be glad to take them.
thanks,
greg k-h
On Thu, Jan 31, 2013 at 10:21 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Can you please also pick up the (currently) three locking fixups around fbcon? Just so that we don't repeat the same fun where people complain about lockdep splats, but the fixes are stuck somewhere. And I guess Dave would be happy to not end up as fbcon maintainer ;-) He has a git branch with them at http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes though I have a small bikeshed on his last patch pending.
Care to just send me the patches through email when you all get done bikeshedding? And for some reason I thought Andrew was going to handle these fbcon patches, but if not, I'll be glad to take them.
Yeah, I'll annoy people again with patches until they're merged ;-) Iirc Andrew picked them up due to the lack of an fbcon maintainer, and everyone else likes to pass the buck. Having looked through the code a bit, imo understandable ...
btw, I've started to look into how we could fix the locking madness around fbcon for good instead of with duct-tape [1]. I'll try to discuss this with a few fbdev guys at fosdem (some at least should be there). Certainly a long-term thing, but comments from whomever gets volunteered to shepherd fbcon would be great.
Cheers, Daniel
[1] http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg33535.html
On Thu, Jan 31, 2013 at 8:21 PM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 09:21:16AM +0100, Daniel Vetter wrote:
On Thu, Jan 31, 2013 at 6:40 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 11:26:53AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Yeah, sounds good.
Reverting the patch is fine with me. Just let me know so I can queue it up again for 3.9.
Can you please also pick up the (currently) three locking fixups around fbcon? Just so that we don't repeat the same fun where people complain about lockdep splats, but the fixes are stuck somewhere. And I guess Dave would be happy to not end up as fbcon maintainer ;-) He has a git branch with them at http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes though I have a small bikeshed on his last patch pending.
Care to just send me the patches through email when you all get done bikeshedding? And for some reason I thought Andrew was going to handle these fbcon patches, but if not, I'll be glad to take them.
I'll ship them via my tree at this point I think, since I now need to queue a revert of the revert on top.
I have a few vgacon/fbcon fixes that I need to go in this cycle.
Dave.
On Thu, Jan 31, 2013 at 10:51:27PM +1100, Dave Airlie wrote:
On Thu, Jan 31, 2013 at 8:21 PM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 09:21:16AM +0100, Daniel Vetter wrote:
On Thu, Jan 31, 2013 at 6:40 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jan 31, 2013 at 11:26:53AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 11:13 AM, Russell King rmk@arm.linux.org.uk wrote:
Which may or may not be a good thing depending how you look at it; it means that once your kernel blanks, you get a lockdep dump. At that point you lose lockdep checking for everything else because lockdep disables itself after the first dump.
Fair enough, we may want to revert the lockdep checking for console_lock, and make re-enabling it part of the patch-series that fixes the locking.
Daniel/Dave? Does that sound reasonable?
Yeah, sounds good.
Reverting the patch is fine with me. Just let me know so I can queue it up again for 3.9.
Can you please also pick up the (currently) three locking fixups around fbcon? Just so that we don't repeat the same fun where people complain about lockdep splats, but the fixes are stuck somewhere. And I guess Dave would be happy to not end up as fbcon maintainer ;-) He has a git branch with them at http://cgit.freedesktop.org/~airlied/linux/log/?h=fbcon-locking-fixes though I have a small bikeshed on his last patch pending.
Care to just send me the patches through email when you all get done bikeshedding? And for some reason I thought Andrew was going to handle these fbcon patches, but if not, I'll be glad to take them.
I'll ship them via my tree at this point I think, since I now need to queue a revert of the revert on top.
I have a few vgacon/fbcon fixes that I need to go in this cycle.
Ok, I'll gladly let you handle this :)
thanks,
greg k-h
On Thu, Jan 31, 2013 at 10:51:27PM +1100, Dave Airlie wrote:
I'll ship them via my tree at this point I think, since I now need to queue a revert of the revert on top.
I have a few vgacon/fbcon fixes that I need to go in this cycle.
Great, thanks.
On Thu, Jan 31, 2013 at 10:52:51AM +1100, Linus Torvalds wrote:
On Thu, Jan 31, 2013 at 9:19 AM, Russell King rmk@arm.linux.org.uk wrote:
So... what you seem to be telling me is that 3.9 is going to be a release which issues lockdep complaints when the console blanks, and you think that's acceptable?
Adding Linus and Andrew so they're aware of this issue...
Oh, we're extremely aware of it. And it's not a new issue, the locking problems have apparently been around forever, although I'm not sure why the lockdep splat itself started happening only recently.
Well, the reason the splat started happening recently is because of:
commit daee779718a319ff9f83e1ba3339334ac650bb22
Author: Daniel Vetter daniel.vetter@ffwll.ch
Date:   Sat Sep 22 19:52:11 2012 +0200

    console: implement lockdep support for console_lock
Dave Airlie recently discovered a locking bug in the fbcon layer, where a timer_del_sync (for the blinking cursor) deadlocks with the timer itself, since both (want to) hold the console_lock:
https://lkml.org/lkml/2012/8/21/36
which, if I'm looking at the git history right, appears to have come in during the last merge window?
Yes, the locking may be wrong, but we've lived with that locking for a long time without problem.
Can we at least silence these warnings by temporarily disabling the lockdep tracking added by the above commit for this lock, until the fixes for this are merged during the next merge window?