On 01/10/2021 16:48, Peter Zijlstra wrote:
On Fri, Oct 01, 2021 at 11:32:16AM +0100, Tvrtko Ursulin wrote:
On 01/10/2021 10:04, Tvrtko Ursulin wrote:
Hi Peter,
On 30/09/2021 19:33, Peter Zijlstra wrote:
On Thu, Sep 30, 2021 at 06:15:47PM +0100, Tvrtko Ursulin wrote:
 void set_user_nice(struct task_struct *p, long nice)
 {
 	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
 	struct rq_flags rf;
 	struct rq *rq;
@@ -6913,6 +6945,9 @@ void set_user_nice(struct task_struct *p, long nice)
 	 */
 	p->sched_class->prio_changed(rq, p, old_prio);
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
 out_unlock:
 	task_rq_unlock(rq, p, &rf);
 }
No, we're not going to call out to exported, and potentially unbounded, functions under scheduler locks.
Agreed, that's another good point on why it is even more hairy, as I alluded to in general terms in the cover letter.
Do you have any immediate thoughts on possible alternatives?
For instance, what if I did a queue_work from set_user_nice and then ran the notifier chain asynchronously from a worker? I haven't yet looked at what repercussions that would have in terms of having to cancel pending work when tasks exit. I can try to prototype that and see how it would look.
Hm, or I simply move the notifier chain call to after task_rq_unlock? That would leave it running under the tasklist lock, so probably still quite bad.
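The deferral idea above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the ring buffer, the `rq_locked` flag, and every function name here (`defer_nice_event`, `set_nice_deferred`, `drain_nice_events`, `record_listener`) are hypothetical stand-ins. The locked section only records the event in O(1); listeners run later from a drain step that plays the role of the worker. Cancelling pending events when a task exits, the open question raised above, is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event: which task changed and its new nice value. */
struct nice_event {
	int pid;
	long nice;
};

#define QUEUE_LEN 8

static struct nice_event queue[QUEUE_LEN];
static size_t head, tail;   /* head: next to drain, tail: next free slot */
static bool rq_locked;      /* stands in for the scheduler rq lock */

typedef void (*nice_cb)(int pid, long nice);

/* Example listener that just records what it was told. */
static int listener_calls, last_pid;
static long last_nice;

static void record_listener(int pid, long nice)
{
	listener_calls++;
	last_pid = pid;
	last_nice = nice;
}

/* Runs with the "lock" held: bounded work, never calls a listener. */
static bool defer_nice_event(int pid, long nice)
{
	assert(rq_locked);
	if (tail - head == QUEUE_LEN)
		return false;   /* queue full, event dropped */
	queue[tail % QUEUE_LEN] = (struct nice_event){ .pid = pid, .nice = nice };
	tail++;
	return true;
}

/* Stand-in for set_user_nice(): take the lock, queue the event, unlock. */
static bool set_nice_deferred(int pid, long nice)
{
	bool ok;

	rq_locked = true;
	ok = defer_nice_event(pid, nice);
	rq_locked = false;
	return ok;
}

/* Stand-in for the worker: delivers queued events with no lock held. */
static size_t drain_nice_events(nice_cb cb)
{
	size_t delivered = 0;

	assert(!rq_locked);
	while (head != tail) {
		struct nice_event ev = queue[head % QUEUE_LEN];

		head++;
		cb(ev.pid, ev.nice);
		delivered++;
	}
	return delivered;
}
```

In this model, listeners see nothing until `drain_nice_events(record_listener)` runs, so a slow listener delays only the drain step and never the locked section.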
Hmm? That's for normalize_rt_tasks() only, right? Just don't have it call the notifier in that special case (that's a magic sysrq thing anyway).
You mean my talk about tasklist_lock? No, it is also taken on the syscall path, which is the part I am interested in. The call chain looks like this:
sys_setpriority()
{
	...
	rcu_read_lock();
	read_lock(&tasklist_lock);
	...
	set_one_prio()
		set_user_nice()
		{
			...
			task_rq_lock();
			-> my notifier from this RFC [1]
			task_rq_unlock();
			-> I can move the notifier here for _some_ improvement [2]
		}
	...
	read_unlock(&tasklist_lock);
	rcu_read_unlock();
}
So this RFC had the notifier call chain at [1], which I understood to be the thing you initially pointed out as horrible, being under a scheduler lock.
I can trivially move it to [2], but that still leaves it under the tasklist lock. I don't have a good feel for how much better that would be. If it is not good enough, I will look for a smarter solution with less opportunity for global impact.
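The difference between positions [1] and [2] can be made concrete with a toy userspace model of the call chain. This is a sketch only: locks are modelled as bits in a mask, the RCU read section is left out, and all names (`model_sys_setpriority`, `model_set_user_nice`, `listener`, the `HELD_*` and `POS_*` constants) are hypothetical. It records which "locks" are held at the moment the listener runs.

```c
#include <assert.h>

/* Bits describing which modelled locks are currently held. */
enum { HELD_TASKLIST = 1 << 0, HELD_RQ = 1 << 1 };

/* Where the notifier is called, matching [1] and [2] in the call chain. */
enum notify_pos { POS_1_UNDER_RQ_LOCK, POS_2_AFTER_RQ_UNLOCK };

static unsigned int held;       /* locks held right now */
static unsigned int seen_held;  /* locks held when the listener ran */

/* Stand-in for a notifier callback: remembers the lock state it saw. */
static void listener(long nice)
{
	(void)nice;
	seen_held = held;
}

static void model_set_user_nice(long nice, enum notify_pos pos)
{
	held |= HELD_RQ;                /* task_rq_lock() */
	if (pos == POS_1_UNDER_RQ_LOCK)
		listener(nice);             /* [1]: under the rq lock */
	held &= ~HELD_RQ;               /* task_rq_unlock() */
	if (pos == POS_2_AFTER_RQ_UNLOCK)
		listener(nice);             /* [2]: rq lock already dropped */
}

/* Returns the lock mask the listener observed. */
static unsigned int model_sys_setpriority(long nice, enum notify_pos pos)
{
	held |= HELD_TASKLIST;          /* read_lock(&tasklist_lock) */
	model_set_user_nice(nice, pos);
	held &= ~HELD_TASKLIST;         /* read_unlock(&tasklist_lock) */
	return seen_held;
}
```

At [1] the listener observes both bits set; at [2] it observes only `HELD_TASKLIST`, which is the remaining concern with the "trivial" move.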
Regards,
Tvrtko