On Tue 2021-11-09 12:06:48, Sultan Alsawaf wrote:
Hi,
I encountered a printk deadlock on 5.13 which appears to still affect the latest kernel. The deadlock occurs due to printk being used while having the current CPU's runqueue locked, and the underlying framebuffer console attempting to lock the same runqueue when printk tries to flush the log buffer.
I'm not sure what the *correct* solution is here (don't use printk while having a runqueue locked? don't use schedule_work() from the fbcon path? tell printk to use one of its lock-less backends?), so I've cc'd all the relevant folks.
At the moment, printk_deferred() could be used here. It defers the console handling via irq_work().
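To illustrate why deferring helps, here is a toy userspace model, not kernel code. It assumes the console flush needs the same lock the caller already holds, which is roughly the fbcon/runqueue situation described above. All names (rq_locked, model_printk(), model_printk_deferred(), console_flush()) are made up for the sketch, and the "deadlock" is modeled as a flush that reports failure instead of actually hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model; every name below is hypothetical. */
#define LOG_SLOTS 8
#define LOG_LEN   64

static bool rq_locked;                    /* stands in for the runqueue lock */
static char log_buf[LOG_SLOTS][LOG_LEN];  /* stands in for the log buffer */
static int log_pending;                   /* stored but not yet flushed */
static int flushed;                       /* messages that reached the console */

/* The console path needs the "runqueue lock", like fbcon via schedule_work(). */
static bool console_flush(void)
{
	if (rq_locked)
		return false;   /* the real kernel would deadlock here */

	for (int i = 0; i < log_pending; i++) {
		fputs(log_buf[i], stdout);
		flushed++;
	}
	log_pending = 0;
	return true;
}

/* Store a message; returns false when the buffer is full. */
static bool log_store(const char *msg)
{
	if (log_pending == LOG_SLOTS)
		return false;
	snprintf(log_buf[log_pending++], LOG_LEN, "%s", msg);
	return true;
}

/* Model of printk(): store, then try to flush the console immediately. */
static bool model_printk(const char *msg)
{
	return log_store(msg) && console_flush();
}

/* Model of printk_deferred(): store only; the flush happens later. */
static bool model_printk_deferred(const char *msg)
{
	return log_store(msg);
}

/* Walk through the scenario: print under the lock, flush after unlocking. */
static void demo(void)
{
	rq_locked = true;
	assert(!model_printk("direct\n"));           /* flush would deadlock */
	assert(model_printk_deferred("deferred\n")); /* safe: no flush attempted */
	rq_locked = false;
	assert(console_flush());                     /* both messages come out now */
}
```

In the model, the later console_flush() call plays the role that irq_work() plays in the kernel: it runs the console handling from a context where the lock is no longer held.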
There is no deferred variant for WARN() at the moment. The following might work:
#define WARN_DEFERRED(condition, format...) ({ \
	unsigned long flags; \
	\
	printk_safe_enter_irqsave(flags); \
	WARN(condition, format); \
	printk_safe_exit_irqrestore(flags); \
})

where printk_safe_enter_irqsave() and printk_safe_exit_irqrestore() are currently used only internally by the printk() code and are defined in the local header kernel/printk/internal.h.
Beware that using the deferred variants is a whack-a-mole approach. There are many printk() callers that might be called indirectly and eventually cause deadlocks.
As already said, the plan is to upstream the -rt solution and offload the console work into kthreads. But it is going slowly. We want to do it in a clean way and prevent regressions as much as possible.
Best Regards, Petr