From: Tvrtko Ursulin tvrtko.ursulin@intel.com
This is a somewhat early sketch of one of my ideas, intended for early feedback from the core scheduler experts. The first and the last two patches in the series are the most interesting ones for people outside of i915. (Note I did not copy everyone on all patches, just on the cover letter for context; the rest should be available from the mailing list.)
General idea is that the current processing landscape seems to be more and more composed of pipelines where computations are done on multiple hardware devices. Furthermore some of the non-CPU devices, like in this case many GPUs supported by the i915 driver, actually support priority based scheduling, which is currently rather inaccessible to the user (in terms of being able to control it from the outside).
From these two statements a question arises on how to allow for a simple,
effective and consolidated user experience. In other words, why shouldn't the user be able to do something like:
$ nice ffmpeg ...transcode my videos...
$ my-favourite-game
And have the nice hint apply to GPU parts of the transcode pipeline as well?
Another reason why I started thinking about this is that I noticed Chrome browser for instance uses nice to de-prioritise background tabs. So again, having that decision propagate to the GPU rendering pipeline sounds like a big plus to the overall user experience.
This RFC implements this idea with the hairy part being the notifier chain I added to enable dynamic adjustments. It is a global notifier which raises a few questions so I am very curious what experts will think here. Please see the opens in the first patch for more on this.
Last patch ("drm/i915: Connect with the process nice change notifier") demonstrates how i915 can use the notifier, with a bit of notable tracking being required and added in "drm/i915: Keep track of registered clients indexed by task struct".
On a more positive note the thing seems to even work as is. For instance I roughly simulated the above scenario by running a GPU hog at three nice levels and a GfxBench TRex in parallel (as a game proxy). This is what I got:
 GPU hog nice | TRex fps
--------------+----------
  not running |   48.9
            0 |   42.7
           10 |   47.9
          -10 |   29.0
When re-niced, the background GPU hog has a much smaller effect on the performance of the game the user is running in the foreground. So it appears the feature can indeed improve the user experience. The question is just whether people are happy with this method of implementing it.
v2:
 * Moved notifier outside task_rq_lock.
 * Some improvements and restructuring on the i915 side of the series.
Cc: Ingo Molnar mingo@redhat.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Juri Lelli juri.lelli@redhat.com
Cc: Vincent Guittot vincent.guittot@linaro.org
Tvrtko Ursulin (8):
  sched: Add nice value change notifier
  drm/i915: Explicitly track DRM clients
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track all user contexts per client
  drm/i915: Keep track of registered clients indexed by task struct
  drm/i915: Make some recently added vfuncs use full scheduling attribute
  drm/i915: Inherit process nice for context scheduling priority
  drm/i915: Connect with the process nice change notifier
 drivers/gpu/drm/i915/Makefile                 |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  20 +++
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   6 +
 .../drm/i915/gt/intel_execlists_submission.c  |   6 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   3 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 130 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  71 ++++++++++
 drivers/gpu/drm/i915/i915_drv.c               |   6 +
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  21 ++-
 drivers/gpu/drm/i915/i915_request.c           |   2 +-
 drivers/gpu/drm/i915/i915_request.h           |   5 +
 drivers/gpu/drm/i915/i915_scheduler.c         |  16 ++-
 drivers/gpu/drm/i915/i915_scheduler.h         |  14 ++
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  12 +-
 include/linux/sched.h                         |   5 +
 kernel/sched/core.c                           |  37 ++++-
 17 files changed, 346 insertions(+), 18 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Implement a simple notifier chain via which interested parties can track when a process nice value changes. Simple because the chain is global, so each user has to track which tasks it is interested in.
The first intended use case is GPU drivers using task nice as a priority hint when scheduling GPU contexts belonging to the respective clients.
The register_user_nice_notifier and unregister_user_nice_notifier functions are provided for hooking into the chain, and the new nice value plus a pointer to the task_struct being modified are passed to the callbacks.
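For illustration, a minimal consumer (my sketch, not part of the patch itself) would look roughly like this, using the standard notifier callback signature with the nice value passed as the action and the task as the data pointer:

static int my_nice_cb(struct notifier_block *nb, unsigned long nice,
		      void *data)
{
	struct task_struct *p = data;

	/* Match p against tasks the driver tracks and re-prioritise. */

	return NOTIFY_DONE;
}

static struct notifier_block my_nice_nb = {
	.notifier_call = my_nice_cb,
};

The driver would then call register_user_nice_notifier(&my_nice_nb) at init time and unregister_user_nice_notifier(&my_nice_nb) on unload.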
v2:
 * Move the notifier chain outside task_rq_lock. (Peter)
Opens:
 * Security. Would some sort of a per process mechanism be better and
   feasible?
   x Peter Zijlstra thinks it may be passable now that it is outside
     core scheduler locks.
 * Put it all behind kconfig to be selected by interested drivers?
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Cc: Ingo Molnar mingo@redhat.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Juri Lelli juri.lelli@redhat.com
Cc: Vincent Guittot vincent.guittot@linaro.org
---
 include/linux/sched.h |  5 +++++
 kernel/sched/core.c   | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c1a927ddec64..1fcec88e5dbc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2309,4 +2309,9 @@ static inline void sched_core_free(struct task_struct *tsk) { }
 static inline void sched_core_fork(struct task_struct *p) { }
 #endif

+struct notifier_block;
+
+extern int register_user_nice_notifier(struct notifier_block *);
+extern int unregister_user_nice_notifier(struct notifier_block *);
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1bba4128a3e6..fc90b603bb6f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6864,10 +6864,42 @@ static inline int rt_effective_prio(struct task_struct *p, int prio)
 }
 #endif

+ATOMIC_NOTIFIER_HEAD(user_nice_notifier_list);
+
+/**
+ * register_user_nice_notifier - Register function to be called when task nice changes
+ * @nb: Info about notifier function to be called
+ *
+ * Registers a function with the list of functions to be called when task nice
+ * value changes.
+ *
+ * Currently always returns zero, as atomic_notifier_chain_register()
+ * always returns zero.
+ */
+int register_user_nice_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_register(&user_nice_notifier_list, nb);
+}
+EXPORT_SYMBOL(register_user_nice_notifier);
+
+/**
+ * unregister_user_nice_notifier - Unregister previously registered user nice notifier
+ * @nb: Hook to be unregistered
+ *
+ * Unregisters a previously registered user nice notifier function.
+ *
+ * Returns zero on success, or %-ENOENT on failure.
+ */
+int unregister_user_nice_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_unregister(&user_nice_notifier_list, nb);
+}
+EXPORT_SYMBOL(unregister_user_nice_notifier);
+
 void set_user_nice(struct task_struct *p, long nice)
 {
 	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
 	struct rq_flags rf;
 	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
 	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
 }
 EXPORT_SYMBOL(set_user_nice);
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:
1. Decoupled from the fair scheduler. In our use case, high priority tasks often use the rt scheduler.
2. The range of values doesn't need to be bound to -20~19 or 0~139.
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:
We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.
1. Decoupled from the fair scheduler. In our use case, high priority tasks often use the rt scheduler.
Is it possible to tell the GPU about RT as we are telling it about CFS nice?
2. The range of values doesn't need to be bound to -20~19 or 0~139.

We could build a mapping between process and GPU priorities; see the sketch below. It seems not a big deal.
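Something along these lines, say (entirely hypothetical; both the helper name and the GPU priority range are made up for illustration):

/*
 * Map nice [-20, 19] (lower is more important) onto a symmetric
 * GPU priority range, here assumed to be [-1023, 1023] with higher
 * values being more important.
 */
static int nice_to_gpu_prio(long nice)
{
	return (int)(-nice * 1023 / 20);
}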
Thanks barry
Hi,
On 06/10/2021 08:58, Barry Song wrote:
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:

We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.

Thought crossed my mind but I couldn't see the practicality of a 3rd nice concept. I mean, even to start with I struggle a bit with the usefulness of the existing ionice vs nice, like coming up with practical examples of use cases where it makes sense to decouple the two priorities.

From a different angle I did think inheriting CPU nice makes sense for GPU workloads. This is because today, and more so in the future, computations on the same data set do flow from one device to the other.

Like maybe a simple example of batch image processing where the CPU decodes, the GPU does a transform and then the CPU encodes. Or a different mix, it doesn't really matter, since the main point is that it is one computing pipeline from the user's point of view.

In this example perhaps everything could be handled in userspace, so that's another argument to be had. Userspace could query the current scheduling attributes before submitting work to the processing pipeline and adjust using the respective uapi.

Downside would be the inability to react to changes after the work is already running, which may not be too serious a limitation outside the world of multi-minute compute workloads. And the latter are probably special-case enough that they would be configured explicitly.

1. Decoupled from the fair scheduler. In our use case, high priority tasks often use the rt scheduler.

Is it possible to tell the GPU about RT as we are telling it about CFS nice?
Yes of course. We could create a common notification "data packet" which would be sent from both entry points and provide more data than just the nice value. Consumers (of the notifier chain) could then decide for themselves what they want to do with the data.
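Roughly something along these lines, hypothetically speaking (nothing like this exists in the series yet; the names are illustrative only):

/* Payload pushed through the chain instead of the bare nice value. */
struct user_prio_notice {
	struct task_struct *task;
	int policy;		/* SCHED_NORMAL, SCHED_FIFO, ... */
	int nice;		/* meaningful for fair tasks */
	int rt_priority;	/* meaningful for RT tasks */
};

Both set_user_nice() and the sched_setattr() path would then fill one of these and run it through the notifier chain.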
Regards,
Tvrtko
2. The range of values doesn't need to be bound to -20~19 or 0~139.

We could build a mapping between process and GPU priorities. It seems not a big deal.
Thanks barry
On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
Hi,
On 06/10/2021 08:58, Barry Song wrote:
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:

We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.

Thought crossed my mind but I couldn't see the practicality of a 3rd nice concept. I mean, even to start with I struggle a bit with the usefulness of the existing ionice vs nice, like coming up with practical examples of use cases where it makes sense to decouple the two priorities.

From a different angle I did think inheriting CPU nice makes sense for GPU workloads. This is because today, and more so in the future, computations on the same data set do flow from one device to the other.

Like maybe a simple example of batch image processing where the CPU decodes, the GPU does a transform and then the CPU encodes. Or a different mix, it doesn't really matter, since the main point is that it is one computing pipeline from the user's point of view.
I am on it, but I am also seeing two problems here:

1. nice is not global in Linux. For example, if you have two cgroups and cgroup A has more quota than cgroup B, tasks in B won't win even if they have a lower nice; cgroups run a proportional-weight, time-based division of CPU.

2. Historically we had dynamic nice, which was adjusted based on the average sleep/running time; right now we don't have dynamic nice, but virtual time still makes tasks which sleep more preempt other tasks with the same or even a lower nice:

	virtual time += physical time / weight(nice)

So a static nice number doesn't always make sense for deciding preemption (a rough sketch of the actual CFS rule follows below).
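For reference, this is approximately what CFS does today (see calc_delta_fair() and sched_prio_to_weight[] in kernel/sched/: nice 0 corresponds to a weight of 1024 and each nice step scales the weight by roughly 1.25x):

	/* charge runtime inversely proportional to the task's weight */
	vruntime += delta_exec * NICE_0_LOAD / se->load.weight;

A task which sleeps a lot accumulates vruntime slowly and therefore tends to preempt a long-running task even at an equal or lower nice.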
So it seems your patch only works in some simple situations, for example no cgroups and tasks with similar sleep/running time.
In this example perhaps everything could be handled in userspace, so that's another argument to be had. Userspace could query the current scheduling attributes before submitting work to the processing pipeline and adjust using the respective uapi.

Downside would be the inability to react to changes after the work is already running, which may not be too serious a limitation outside the world of multi-minute compute workloads. And the latter are probably special-case enough that they would be configured explicitly.

1. Decoupled from the fair scheduler. In our use case, high priority tasks often use the rt scheduler.

Is it possible to tell the GPU about RT as we are telling it about CFS nice?

Yes of course. We could create a common notification "data packet" which would be sent from both entry points and provide more data than just the nice value. Consumers (of the notifier chain) could then decide for themselves what they want to do with the data.

RT should have the same problem as CFS once we have cgroups.
Regards,
Tvrtko
2. The range of values doesn't need to be bound to -20~19 or 0~139.

We could build a mapping between process and GPU priorities. It seems not a big deal.
Thanks barry
Thanks barry
On 06/10/2021 21:21, Barry Song wrote:
On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
Hi,
On 06/10/2021 08:58, Barry Song wrote:
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:

We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.

Thought crossed my mind but I couldn't see the practicality of a 3rd nice concept. I mean, even to start with I struggle a bit with the usefulness of the existing ionice vs nice, like coming up with practical examples of use cases where it makes sense to decouple the two priorities.

From a different angle I did think inheriting CPU nice makes sense for GPU workloads. This is because today, and more so in the future, computations on the same data set do flow from one device to the other.

Like maybe a simple example of batch image processing where the CPU decodes, the GPU does a transform and then the CPU encodes. Or a different mix, it doesn't really matter, since the main point is that it is one computing pipeline from the user's point of view.

I am on it, but I am also seeing two problems here:

1. nice is not global in Linux. For example, if you have two cgroups and cgroup A has more quota than cgroup B, tasks in B won't win even if they have a lower nice; cgroups run a proportional-weight, time-based division of CPU.

2. Historically we had dynamic nice, which was adjusted based on the average sleep/running time; right now we don't have dynamic nice, but virtual time still makes tasks which sleep more preempt other tasks with the same or even a lower nice:

	virtual time += physical time / weight(nice)

So a static nice number doesn't always make sense for deciding preemption.

So it seems your patch only works in some simple situations, for example no cgroups and tasks with similar sleep/running time.
Yes, I broadly agree with your assessment. Although there are plans for adding cgroup support to i915 scheduling, I doubt control as fine-grained, and semantics as exact, as on the CPU side will happen.

Mostly because the drive seems to be towards more micro-controller managed scheduling, which adds further challenges in connecting the two sides together.

But when you say it is a problem, I would characterize it more as a weakness, in terms of being only a subset of possible control. It is still richer (better?) than what currently exists, and as demonstrated with the benchmarks in my cover letter it can deliver improvements in user experience. If in the mid term future we can extend it with cgroup support, then the concept should still apply and get closer to how you described nice working in the CPU world.

Main question in my mind is whether the idea of adding the sched_attr/priority notifier to the kernel can be justified. Because, as mentioned before, everything apart from adjusting currently running GPU jobs could be done purely in userspace. Stack changes would be quite extensive and all, but that is not usually a good enough reason to put something in the kernel. That's why it is an RFC, an invitation to discuss.

Even ionice inherits from nice (see task_nice_ioprio()), so I think an argument can be made for drivers as well.
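For reference, task_nice_ioprio() is (at the time of writing) a one-liner in include/linux/ioprio.h which folds nice [-20, 19] onto the eight best-effort ioprio levels [0, 7]:

static inline int task_nice_ioprio(struct task_struct *task)
{
	return (task_nice(task) + 20) / 5;
}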
Regards,
Tvrtko
In this example perhaps everything could be handled in userspace, so that's another argument to be had. Userspace could query the current scheduling attributes before submitting work to the processing pipeline and adjust using the respective uapi.

Downside would be the inability to react to changes after the work is already running, which may not be too serious a limitation outside the world of multi-minute compute workloads. And the latter are probably special-case enough that they would be configured explicitly.

1. Decoupled from the fair scheduler. In our use case, high priority tasks often use the rt scheduler.

Is it possible to tell the GPU about RT as we are telling it about CFS nice?

Yes of course. We could create a common notification "data packet" which would be sent from both entry points and provide more data than just the nice value. Consumers (of the notifier chain) could then decide for themselves what they want to do with the data.

RT should have the same problem as CFS once we have cgroups.
Regards,
Tvrtko
2. The range of values doesn't need to be bound to -20~19 or 0~139.

We could build a mapping between process and GPU priorities. It seems not a big deal.
Thanks barry
Thanks barry
On 07/10/2021 09:50, Tvrtko Ursulin wrote:
On 06/10/2021 21:21, Barry Song wrote:
On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
Hi,
On 06/10/2021 08:58, Barry Song wrote:
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
void set_user_nice(struct task_struct *p, long nice)
{
	bool queued, running;
-	int old_prio;
+	int old_prio, ret;
	struct rq_flags rf;
	struct rq *rq;

@@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)

 out_unlock:
	task_rq_unlock(rq, p, &rf);
+
+	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
+	WARN_ON_ONCE(ret != NOTIFY_DONE);
}
How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:

We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.

Thought crossed my mind but I couldn't see the practicality of a 3rd nice concept. I mean, even to start with I struggle a bit with the usefulness of the existing ionice vs nice, like coming up with practical examples of use cases where it makes sense to decouple the two priorities.

From a different angle I did think inheriting CPU nice makes sense for GPU workloads. This is because today, and more so in the future, computations on the same data set do flow from one device to the other.

Like maybe a simple example of batch image processing where the CPU decodes, the GPU does a transform and then the CPU encodes. Or a different mix, it doesn't really matter, since the main point is that it is one computing pipeline from the user's point of view.

I am on it, but I am also seeing two problems here:

1. nice is not global in Linux. For example, if you have two cgroups and cgroup A has more quota than cgroup B, tasks in B won't win even if they have a lower nice; cgroups run a proportional-weight, time-based division of CPU.

2. Historically we had dynamic nice, which was adjusted based on the average sleep/running time; right now we don't have dynamic nice, but virtual time still makes tasks which sleep more preempt other tasks with the same or even a lower nice:

	virtual time += physical time / weight(nice)

So a static nice number doesn't always make sense for deciding preemption.

So it seems your patch only works in some simple situations, for example no cgroups and tasks with similar sleep/running time.
Yes, I broadly agree with your assessment. Although there are plans for adding cgroup support to i915 scheduling, I doubt control as fine-grained, and semantics as exact, as on the CPU side will happen.

Mostly because the drive seems to be towards more micro-controller managed scheduling, which adds further challenges in connecting the two sides together.

But when you say it is a problem, I would characterize it more as a weakness, in terms of being only a subset of possible control. It is still richer (better?) than what currently exists, and as demonstrated with the benchmarks in my cover letter it can deliver improvements in user experience. If in the mid term future we can extend it with cgroup support, then the concept should still apply and get closer to how you described nice working in the CPU world.

Main question in my mind is whether the idea of adding the sched_attr/priority notifier to the kernel can be justified. Because, as mentioned before, everything apart from adjusting currently running GPU jobs could be done purely in userspace. Stack changes would be quite extensive and all, but that is not usually a good enough reason to put something in the kernel. That's why it is an RFC, an invitation to discuss.

Even ionice inherits from nice (see task_nice_ioprio()), so I think an argument can be made for drivers as well.
Now that I wrote this, I had a little bit of a light bulb moment. If I abandon the idea of adjusting the priority of already submitted work items, then I can do much of what I want purely from within the confines of i915.
I simply add code to inherit from current task nice on every new work item submission. This should probably bring the majority of the benefit I measured.
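In code terms, something as simple as this at submission time, roughly (a sketch only; the exact mapping of nice into the i915 priority range is a separate design choice):

	/* Fold the submitting task's nice into the effective priority. */
	prio = ctx->sched.priority - task_nice(current);

with no notifier and no core kernel changes required.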
Regards,
Tvrtko
On Thu, Oct 7, 2021 at 10:09 PM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 07/10/2021 09:50, Tvrtko Ursulin wrote:
On 06/10/2021 21:21, Barry Song wrote:
On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
Hi,
On 06/10/2021 08:58, Barry Song wrote:
On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) john.wanghui@huawei.com wrote:
Hi Tvrtko
On 2021/10/4 22:36, Tvrtko Ursulin wrote:
> void set_user_nice(struct task_struct *p, long nice)
> {
> 	bool queued, running;
> -	int old_prio;
> +	int old_prio, ret;
> 	struct rq_flags rf;
> 	struct rq *rq;
>
> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
>
>  out_unlock:
> 	task_rq_unlock(rq, p, &rf);
> +
> +	ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
> +	WARN_ON_ONCE(ret != NOTIFY_DONE);
> }

How about adding a new "io_nice" to task_struct, and moving the call chain to sched_setattr/getattr? There are two benefits:
We already have ionice for the block io scheduler, and this new io_nice can hardly be generic to all I/O. It seems the patchset is trying to link the process' nice with the GPU's scheduler; to some extent that makes more sense than having a common ionice, because we have a lot of I/O devices in the system and we don't know which I/O the ionice of task_struct should be applied to.

Maybe we could have an ionice dedicated to the GPU, just like the ionice for the CFQ bio/request scheduler.

Thought crossed my mind but I couldn't see the practicality of a 3rd nice concept. I mean, even to start with I struggle a bit with the usefulness of the existing ionice vs nice, like coming up with practical examples of use cases where it makes sense to decouple the two priorities.

From a different angle I did think inheriting CPU nice makes sense for GPU workloads. This is because today, and more so in the future, computations on the same data set do flow from one device to the other.

Like maybe a simple example of batch image processing where the CPU decodes, the GPU does a transform and then the CPU encodes. Or a different mix, it doesn't really matter, since the main point is that it is one computing pipeline from the user's point of view.

I am on it, but I am also seeing two problems here:

1. nice is not global in Linux. For example, if you have two cgroups and cgroup A has more quota than cgroup B, tasks in B won't win even if they have a lower nice; cgroups run a proportional-weight, time-based division of CPU.

2. Historically we had dynamic nice, which was adjusted based on the average sleep/running time; right now we don't have dynamic nice, but virtual time still makes tasks which sleep more preempt other tasks with the same or even a lower nice:

	virtual time += physical time / weight(nice)

So a static nice number doesn't always make sense for deciding preemption.

So it seems your patch only works in some simple situations, for example no cgroups and tasks with similar sleep/running time.
Yes, I broadly agree with your assessment. Although there are plans for adding cgroup support to i915 scheduling, I doubt control as fine-grained, and semantics as exact, as on the CPU side will happen.

Mostly because the drive seems to be towards more micro-controller managed scheduling, which adds further challenges in connecting the two sides together.

But when you say it is a problem, I would characterize it more as a weakness, in terms of being only a subset of possible control. It is still richer (better?) than what currently exists, and as demonstrated with the benchmarks in my cover letter it can deliver improvements in user experience. If in the mid term future we can extend it with cgroup support, then the concept should still apply and get closer to how you described nice working in the CPU world.

Main question in my mind is whether the idea of adding the sched_attr/priority notifier to the kernel can be justified. Because, as mentioned before, everything apart from adjusting currently running GPU jobs could be done purely in userspace. Stack changes would be quite extensive and all, but that is not usually a good enough reason to put something in the kernel. That's why it is an RFC, an invitation to discuss.

Even ionice inherits from nice (see task_nice_ioprio()), so I think an argument can be made for drivers as well.
Now that I wrote this, I had a little bit of a light bulb moment. If I abandon the idea of adjusting the priority of already submitted work items, then I can do much of what I want purely from within the confines of i915.
I simply add code to inherit from current task nice on every new work item submission. This should probably bring the majority of the benefit I measured.
I think the idea of linking the process's nice with the GPU's scheduler makes sense. I have no doubt about this. My question is more about what the best way to implement it is.

Android has a bg_non_interactive cgroup with a much lower weight for background processes. Interactive tasks, on the other hand, are placed in another cgroup with a much higher weight. So Android depends on cgroups to improve user experience.

The Chrome browser in your cover letter uses nice to de-prioritise background tabs. This works perfectly as the whole of Chrome should be in one cgroup, so changing nice will increase/decrease the resources gotten by tasks within that cgroup. But once we have two cgroups, bringing a nice belonging to one cgroup to the global scheduler of the GPU will somehow break the aim.
For example, if we have two cgroups A and B:

/sys/fs/cgroup/cpu$ sudo sh -c 'echo 4096 > A/cpu.shares'
/sys/fs/cgroup/cpu$ sudo sh -c 'echo 512 > B/cpu.shares'
A task in B with a lower nice will get more GPU than a task in A, but group A actually has 8x the weight of B, so the result seems wrong; especially as real users like Android do depend on cgroups. I don't know how to overcome this "weakness", it seems not easy.
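To put rough numbers on it (my arithmetic, assuming both groups stay runnable on the CPU): CFS entitles A to 4096 / (4096 + 512) ≈ 89% of CPU time and B to ≈ 11%, regardless of the nice values of the tasks inside B; a GPU scheduler looking only at nice could order the same work the other way around.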
Regards,
Tvrtko
Thanks barry
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Tracking DRM clients more explicitly will allow later patches to accumulate past and current GPU usage in a centralised place and also consolidate access to owning task pid/name.
A unique client id is also assigned for the purpose of distinguishing/consolidating between multiple file descriptors owned by the same process.
v2: Chris Wilson:
 * Enclose new members into dedicated structs.
 * Protect against failed sysfs registration.

v3:
 * sysfs_attr_init.

v4:
 * Fix for internal clients.

v5:
 * Use cyclic ida for client id. (Chris)
 * Do not leak pid reference. (Chris)
 * Tidy code with some locals.

v6:
 * Use xa_alloc_cyclic to simplify locking. (Chris)
 * No need to unregister individual sysfs files. (Chris)
 * Rebase on top of fpriv kref.
 * Track client closed status and reflect in sysfs.

v7:
 * Make drm_client more standalone concept.

v8:
 * Simplify sysfs show. (Chris)
 * Always track name and pid.

v9:
 * Fix cyclic id assignment.

v10:
 * No need for a mutex around xa_alloc_cyclic.
 * Refactor sysfs into own function.
 * Unregister sysfs before freeing pid and name.
 * Move clients setup into own function.

v11:
 * Call clients init directly from driver init. (Chris)

v12:
 * Do not fail client add on id wrap. (Maciej)

v13 (Lucas): Rebase.

v14:
 * Dropped sysfs bits.

v15:
 * Dropped tracking of pid and name.
 * Dropped RCU freeing of the client object.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk # v11
Reviewed-by: Aravind Iddamsetty aravind.iddamsetty@intel.com # v11
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/Makefile          |  5 +-
 drivers/gpu/drm/i915/i915_drm_client.c | 68 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h | 50 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.c        |  6 +++
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++
 drivers/gpu/drm/i915/i915_gem.c        | 21 ++++++--
 6 files changed, 150 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5c8e022a7383..005b5df425a1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -32,8 +32,9 @@ subdir-ccflags-y += -I$(srctree)/$(src)
 # Please keep these build lists sorted!

 # core driver code
-i915-y += i915_drv.o \
-	  i915_config.o \
+i915-y += i915_config.o \
+	  i915_drm_client.o \
+	  i915_drv.o \
 	  i915_irq.o \
 	  i915_getparam.o \
 	  i915_mitigations.o \
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
new file mode 100644
index 000000000000..e61e9ba15256
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include "i915_drm_client.h"
+#include "i915_gem.h"
+#include "i915_utils.h"
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+			   struct drm_i915_private *i915)
+{
+	clients->i915 = i915;
+	clients->next_id = 0;
+
+	xa_init_flags(&clients->xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
+{
+	struct i915_drm_client *client;
+	struct xarray *xa = &clients->xarray;
+	int ret;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return ERR_PTR(-ENOMEM);
+
+	xa_lock_irq(xa);
+	ret = __xa_alloc_cyclic(xa, &client->id, client, xa_limit_32b,
+				&clients->next_id, GFP_KERNEL);
+	xa_unlock_irq(xa);
+	if (ret < 0)
+		goto err;
+
+	kref_init(&client->kref);
+	client->clients = clients;
+
+	return client;
+
+err:
+	kfree(client);
+
+	return ERR_PTR(ret);
+}
+
+void __i915_drm_client_free(struct kref *kref)
+{
+	struct i915_drm_client *client =
+		container_of(kref, typeof(*client), kref);
+	struct xarray *xa = &client->clients->xarray;
+	unsigned long flags;
+
+	xa_lock_irqsave(xa, flags);
+	__xa_erase(xa, client->id);
+	xa_unlock_irqrestore(xa, flags);
+	kfree(client);
+}
+
+void i915_drm_clients_fini(struct i915_drm_clients *clients)
+{
+	GEM_BUG_ON(!xa_empty(&clients->xarray));
+	xa_destroy(&clients->xarray);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
new file mode 100644
index 000000000000..e8986ad51176
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __I915_DRM_CLIENT_H__
+#define __I915_DRM_CLIENT_H__
+
+#include <linux/kref.h>
+#include <linux/xarray.h>
+
+struct drm_i915_private;
+
+struct i915_drm_clients {
+	struct drm_i915_private *i915;
+
+	struct xarray xarray;
+	u32 next_id;
+};
+
+struct i915_drm_client {
+	struct kref kref;
+
+	unsigned int id;
+
+	struct i915_drm_clients *clients;
+};
+
+void i915_drm_clients_init(struct i915_drm_clients *clients,
+			   struct drm_i915_private *i915);
+
+static inline struct i915_drm_client *
+i915_drm_client_get(struct i915_drm_client *client)
+{
+	kref_get(&client->kref);
+	return client;
+}
+
+void __i915_drm_client_free(struct kref *kref);
+
+static inline void i915_drm_client_put(struct i915_drm_client *client)
+{
+	kref_put(&client->kref, __i915_drm_client_free);
+}
+
+struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients);
+
+void i915_drm_clients_fini(struct i915_drm_clients *clients);
+
+#endif /* !__I915_DRM_CLIENT_H__ */
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1e0e4175b481..5e4d2ea1fae5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -69,6 +69,7 @@
 #include "gt/intel_rc6.h"

 #include "i915_debugfs.h"
+#include "i915_drm_client.h"
 #include "i915_drv.h"
 #include "i915_ioc32.h"
 #include "i915_irq.h"
@@ -345,6 +346,8 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)

 	intel_gt_init_early(&dev_priv->gt, dev_priv);

+	i915_drm_clients_init(&dev_priv->clients, dev_priv);
+
 	i915_gem_init_early(dev_priv);

 	/* This must be called before any calls to HAS_PCH_* */
@@ -364,6 +367,7 @@ static int i915_driver_early_probe(struct drm_i915_private *dev_priv)
 err_gem:
 	i915_gem_cleanup_early(dev_priv);
+	i915_drm_clients_fini(&dev_priv->clients);
 	intel_gt_driver_late_release(&dev_priv->gt);
 	intel_region_ttm_device_fini(dev_priv);
 err_ttm:
@@ -383,6 +387,7 @@ static void i915_driver_late_release(struct drm_i915_private *dev_priv)
 	intel_irq_fini(dev_priv);
 	intel_power_domains_cleanup(dev_priv);
 	i915_gem_cleanup_early(dev_priv);
+	i915_drm_clients_fini(&dev_priv->clients);
 	intel_gt_driver_late_release(&dev_priv->gt);
 	intel_region_ttm_device_fini(dev_priv);
 	vlv_suspend_cleanup(dev_priv);
@@ -999,6 +1004,7 @@ static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;

 	i915_gem_context_close(file);
+	i915_drm_client_put(file_priv->client);

 	kfree_rcu(file_priv, rcu);

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63c939891f1c..de2ba95b81f6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -96,6 +96,7 @@
 #include "intel_wakeref.h"
 #include "intel_wopcm.h"

+#include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_gem_gtt.h"
 #include "i915_gpu_error.h"
@@ -284,6 +285,8 @@ struct drm_i915_file_private {
 	/** ban_score: Accumulated score of all ctx bans and fast hangs. */
 	atomic_t ban_score;
 	unsigned long hang_timestamp;
+
+	struct i915_drm_client *client;
 };

 /* Interface history:
@@ -1238,6 +1241,8 @@ struct drm_i915_private {

 	struct i915_pmu pmu;

+	struct i915_drm_clients clients;
+
 	struct i915_hdcp_comp_master *hdcp_master;
 	bool hdcp_comp_added;

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 981e383d1a5d..3df78babe1a9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1177,25 +1177,40 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv;
-	int ret;
+	struct i915_drm_client *client;
+	int ret = -ENOMEM;

 	DRM_DEBUG("\n");

 	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
 	if (!file_priv)
-		return -ENOMEM;
+		goto err_alloc;
+
+	client = i915_drm_client_add(&i915->clients);
+	if (IS_ERR(client)) {
+		ret = PTR_ERR(client);
+		goto err_client;
+	}

 	file->driver_priv = file_priv;
 	file_priv->dev_priv = i915;
 	file_priv->file = file;
+	file_priv->client = client;

 	file_priv->bsd_engine = -1;
 	file_priv->hang_timestamp = jiffies;

 	ret = i915_gem_context_open(i915, file);
 	if (ret)
-		kfree(file_priv);
+		goto err_context;
+
+	return 0;

+err_context:
+	i915_drm_client_put(client);
+err_client:
+	kfree(file_priv);
+err_alloc:
 	return ret;
 }
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Make GEM contexts keep a reference to i915_drm_client for the whole of their lifetime, which will come in handy in the following patches.
v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list.
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk # v5
Reviewed-by: Aravind Iddamsetty aravind.iddamsetty@intel.com # v5
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 3 +++
 2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8208fd5b72c3..9296d69681d7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -956,6 +956,9 @@ static void i915_gem_context_release_work(struct work_struct *work)
 	if (vm)
 		i915_vm_put(vm);

+	if (ctx->client)
+		i915_drm_client_put(ctx->client);
+
 	mutex_destroy(&ctx->engines_mutex);
 	mutex_destroy(&ctx->lut_mutex);

@@ -1373,6 +1376,8 @@ static void gem_context_register(struct i915_gem_context *ctx,
 	ctx->file_priv = fpriv;

 	ctx->pid = get_task_pid(current, PIDTYPE_PID);
+	ctx->client = i915_drm_client_get(fpriv->client);
+
 	snprintf(ctx->name, sizeof(ctx->name), "%s[%d]", current->comm,
 		 pid_nr(ctx->pid));

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index c4617e4d9fa9..598c57ac5cdf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -277,6 +277,9 @@ struct i915_gem_context {
 	/** @link: place with &drm_i915_private.context_list */
 	struct list_head link;

+	/** @client: struct i915_drm_client */
+	struct i915_drm_client *client;
+
 	/**
 	 * @ref: reference count
 	 *
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
We soon want to start answering questions like how much GPU time contexts belonging to a client which has exited are still using.

To enable this we start tracking all contexts belonging to a client on a separate list.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Reviewed-by: Aravind Iddamsetty aravind.iddamsetty@intel.com
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 12 ++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  3 +++
 drivers/gpu/drm/i915/i915_drm_client.c            |  2 ++
 drivers/gpu/drm/i915/i915_drm_client.h            |  5 +++++
 4 files changed, 22 insertions(+)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9296d69681d7..d1992ba59ed8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1169,6 +1169,7 @@ static void set_closed_name(struct i915_gem_context *ctx)

 static void context_close(struct i915_gem_context *ctx)
 {
+	struct i915_drm_client *client;
 	struct i915_address_space *vm;

 	/* Flush any concurrent set_engines() */
@@ -1205,6 +1206,13 @@ static void context_close(struct i915_gem_context *ctx)
 	list_del(&ctx->link);
 	spin_unlock(&ctx->i915->gem.contexts.lock);

+	client = ctx->client;
+	if (client) {
+		spin_lock(&client->ctx_lock);
+		list_del_rcu(&ctx->client_link);
+		spin_unlock(&client->ctx_lock);
+	}
+
 	mutex_unlock(&ctx->mutex);

 	/*
@@ -1385,6 +1393,10 @@ static void gem_context_register(struct i915_gem_context *ctx,
 	old = xa_store(&fpriv->context_xa, id, ctx, GFP_KERNEL);
 	WARN_ON(old);

+	spin_lock(&ctx->client->ctx_lock);
+	list_add_tail_rcu(&ctx->client_link, &ctx->client->ctx_list);
+	spin_unlock(&ctx->client->ctx_lock);
+
 	spin_lock(&i915->gem.contexts.lock);
 	list_add_tail(&ctx->link, &i915->gem.contexts.list);
 	spin_unlock(&i915->gem.contexts.lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 598c57ac5cdf..b878e1b13b38 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -280,6 +280,9 @@ struct i915_gem_context {
 	/** @client: struct i915_drm_client */
 	struct i915_drm_client *client;

+	/** link: &drm_client.context_list */
+	struct list_head client_link;
+
 	/**
 	 * @ref: reference count
 	 *
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index e61e9ba15256..91a8559bebf7 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -38,6 +38,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
 		goto err;

 	kref_init(&client->kref);
+	spin_lock_init(&client->ctx_lock);
+	INIT_LIST_HEAD(&client->ctx_list);
 	client->clients = clients;

 	return client;
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index e8986ad51176..0207dfad4568 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -7,6 +7,8 @@
 #define __I915_DRM_CLIENT_H__

 #include <linux/kref.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>

 struct drm_i915_private;
@@ -23,6 +25,9 @@ struct i915_drm_client {

 	unsigned int id;

+	spinlock_t ctx_lock; /* For add/remove from ctx_list. */
+	struct list_head ctx_list; /* List of contexts belonging to client. */
+
 	struct i915_drm_clients *clients;
 };
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
A simple hash table of registered clients indexed by the task struct pointer is kept to be used in a following patch.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  2 ++
 drivers/gpu/drm/i915/i915_drm_client.c      | 31 ++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drm_client.h      | 13 +++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d1992ba59ed8..8d4d687ab1d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1932,6 +1932,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}

+	i915_drm_client_update_owner(ext_data.fpriv->client, current);
+
 	ext_data.pc = proto_context_create(i915, args->flags);
 	if (IS_ERR(ext_data.pc))
 		return PTR_ERR(ext_data.pc);
diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 91a8559bebf7..82b9636482ef 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -18,6 +18,9 @@ void i915_drm_clients_init(struct i915_drm_clients *clients,
 	clients->next_id = 0;

 	xa_init_flags(&clients->xarray, XA_FLAGS_ALLOC | XA_FLAGS_LOCK_IRQ);
+
+	rwlock_init(&clients->lock);
+	hash_init(clients->tasks);
 }

 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
@@ -42,6 +45,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
 	INIT_LIST_HEAD(&client->ctx_list);
 	client->clients = clients;

+	i915_drm_client_update_owner(client, current);
+
 	return client;

 err:
@@ -54,9 +59,14 @@ void __i915_drm_client_free(struct kref *kref)
 {
 	struct i915_drm_client *client =
 		container_of(kref, typeof(*client), kref);
-	struct xarray *xa = &client->clients->xarray;
+	struct i915_drm_clients *clients = client->clients;
+	struct xarray *xa = &clients->xarray;
 	unsigned long flags;

+	write_lock(&clients->lock);
+	hash_del(&client->node);
+	write_unlock(&clients->lock);
+
 	xa_lock_irqsave(xa, flags);
 	__xa_erase(xa, client->id);
 	xa_unlock_irqrestore(xa, flags);
@@ -68,3 +78,22 @@ void i915_drm_clients_fini(struct i915_drm_clients *clients)
 	GEM_BUG_ON(!xa_empty(&clients->xarray));
 	xa_destroy(&clients->xarray);
 }
+
+void i915_drm_client_update_owner(struct i915_drm_client *client,
+				  struct task_struct *owner)
+{
+	struct i915_drm_clients *clients;
+
+	if (READ_ONCE(client->owner) == owner)
+		return;
+
+	clients = client->clients;
+	write_lock(&clients->lock);
+	if (READ_ONCE(client->owner) != owner) {
+		if (client->owner)
+			hash_del(&client->node);
+		client->owner = owner;
+		hash_add(clients->tasks, &client->node, (uintptr_t)owner);
+	}
+	write_unlock(&clients->lock);
+}
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 0207dfad4568..42fd79f0558a 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -6,8 +6,11 @@
 #ifndef __I915_DRM_CLIENT_H__
 #define __I915_DRM_CLIENT_H__

+#include <linux/hashtable.h>
 #include <linux/kref.h>
 #include <linux/list.h>
+#include <linux/rwlock.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/xarray.h>

@@ -18,6 +21,9 @@ struct i915_drm_clients {

 	struct xarray xarray;
 	u32 next_id;
+
+	rwlock_t lock;
+	DECLARE_HASHTABLE(tasks, 6);
 };

 struct i915_drm_client {
@@ -28,6 +34,9 @@ struct i915_drm_client {
 	spinlock_t ctx_lock; /* For add/remove from ctx_list. */
 	struct list_head ctx_list; /* List of contexts belonging to client. */

+	struct task_struct *owner; /* No reference kept, never dereferenced. */
+	struct hlist_node node;
+
 	struct i915_drm_clients *clients;
 };

@@ -52,4 +61,8 @@ struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients);

 void i915_drm_clients_fini(struct i915_drm_clients *clients);

+void i915_drm_client_update_owner(struct i915_drm_client *client,
+				  struct task_struct *owner);
+
 #endif /* !__I915_DRM_CLIENT_H__ */
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC priority management") introduced some scheduling related vfuncs which take integer request priority as argument.
Make them instead take struct i915_sched_attr, which is the type encapsulating this information, so it probably aligns with the design better. It definitely enables extending the set of scheduling attributes.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Cc: Matthew Brost matthew.brost@intel.com
Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
 drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
 drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
 4 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7147fe80919e..e91d803a6453 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
 	return engine->class != RENDER_CLASS;
 }

-static void kick_execlists(const struct i915_request *rq, int prio)
+static void kick_execlists(const struct i915_request *rq,
+			   const struct i915_sched_attr *attr)
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *sched_engine = engine->sched_engine;
 	const struct i915_request *inflight;
+	const int prio = attr->priority;

 	/*
 	 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..b5883a4365ca 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
 }

 static void guc_bump_inflight_request_prio(struct i915_request *rq,
-					   int prio)
+					   const struct i915_sched_attr *attr)
 {
 	struct intel_context *ce = rq->context;
+	const int prio = attr->priority;
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);

 	/* Short circuit function */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 762127dd56c5..534bab99fcdc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,

 		/* Must be called before changing the nodes priority */
 		if (sched_engine->bump_inflight_request_prio)
-			sched_engine->bump_inflight_request_prio(from, prio);
+			sched_engine->bump_inflight_request_prio(from, attr);

 		WRITE_ONCE(node->attr.priority, prio);

@@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,

 	/* Defer (tasklet) submission until after all of our updates. */
 	if (sched_engine->kick_backend)
-		sched_engine->kick_backend(node_to_request(node), prio);
+		sched_engine->kick_backend(node_to_request(node), attr);
 }

 	spin_unlock(&sched_engine->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index b0a1b58c7893..24b9ac1c2ce2 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -177,13 +177,13 @@ struct i915_sched_engine {
 	 * @kick_backend: kick backend after a request's priority has changed
 	 */
 	void (*kick_backend)(const struct i915_request *rq,
-			     int prio);
+			     const struct i915_sched_attr *attr);

 	/**
 	 * @bump_inflight_request_prio: update priority of an inflight request
 	 */
 	void (*bump_inflight_request_prio)(struct i915_request *rq,
-					   int prio);
+					   const struct i915_sched_attr *attr);

 	/**
 	 * @retire_inflight_request_prio: indicate request is retired to
On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC priority management") introduced some scheduling related vfuncs which take integer request priority as argument.
Make them instead take struct i915_sched_attr, which is the type encapsulating this information, so it probably aligns with the design better. It definitely enables extending the set of scheduling attributes.
Understand the motivation here but the i915_scheduler is going to disappear when we move to the DRM scheduler, or at least its functionality of priority inheritance will be pushed into the DRM scheduler. I'd be very careful making any changes here, as the priority in the DRM scheduler is defined as a single enum:
/* These are often used as an (initial) index
 * to an array, and as such should start at 0.
 */
enum drm_sched_priority {
	DRM_SCHED_PRIORITY_MIN,
	DRM_SCHED_PRIORITY_NORMAL,
	DRM_SCHED_PRIORITY_HIGH,
	DRM_SCHED_PRIORITY_KERNEL,

	DRM_SCHED_PRIORITY_COUNT,
	DRM_SCHED_PRIORITY_UNSET = -2
};
Adding a field to i915_sched_attr is fairly easy as we already have a structure, but changing the DRM scheduler might be a tougher sell. Anyway, can you make this work without adding the 'nice' field to i915_sched_attr? Might be worth exploring, so that when we move to the DRM scheduler this feature drops in a little cleaner.
Matt
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Cc: Matthew Brost matthew.brost@intel.com
Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c    | 3 ++-
 drivers/gpu/drm/i915/i915_scheduler.c                | 4 ++--
 drivers/gpu/drm/i915/i915_scheduler_types.h          | 4 ++--
 4 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7147fe80919e..e91d803a6453 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
 	return engine->class != RENDER_CLASS;
 }

-static void kick_execlists(const struct i915_request *rq, int prio)
+static void kick_execlists(const struct i915_request *rq,
+			   const struct i915_sched_attr *attr)
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *sched_engine = engine->sched_engine;
 	const struct i915_request *inflight;
+	const int prio = attr->priority;

 	/*
 	 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..b5883a4365ca 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
 }

 static void guc_bump_inflight_request_prio(struct i915_request *rq,
-					   int prio)
+					   const struct i915_sched_attr *attr)
 {
 	struct intel_context *ce = rq->context;
+	const int prio = attr->priority;
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);

 	/* Short circuit function */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 762127dd56c5..534bab99fcdc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,

 		/* Must be called before changing the nodes priority */
 		if (sched_engine->bump_inflight_request_prio)
-			sched_engine->bump_inflight_request_prio(from, prio);
+			sched_engine->bump_inflight_request_prio(from, attr);

 		WRITE_ONCE(node->attr.priority, prio);

@@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node *node,

 	/* Defer (tasklet) submission until after all of our updates. */
 	if (sched_engine->kick_backend)
-		sched_engine->kick_backend(node_to_request(node), prio);
+		sched_engine->kick_backend(node_to_request(node), attr);
 }

 	spin_unlock(&sched_engine->lock);
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index b0a1b58c7893..24b9ac1c2ce2 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -177,13 +177,13 @@ struct i915_sched_engine {
 	 * @kick_backend: kick backend after a request's priority has changed
 	 */
 	void (*kick_backend)(const struct i915_request *rq,
-			     int prio);
+			     const struct i915_sched_attr *attr);

 	/**
 	 * @bump_inflight_request_prio: update priority of an inflight request
 	 */
 	void (*bump_inflight_request_prio)(struct i915_request *rq,
-					   int prio);
+					   const struct i915_sched_attr *attr);

 	/**
 	 * @retire_inflight_request_prio: indicate request is retired to
--
2.30.2
On 06/10/2021 18:12, Matthew Brost wrote:
On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC priority management") introduced some scheduling related vfuncs which take integer request priority as argument.
Make them instead take struct i915_sched_attr, which is the type encapsulating this information, so it probably aligns with the design better. It definitely enables extending the set of scheduling attributes.
Understand the motivation here but the i915_scheduler is going to disappear when we move to the DRM scheduler, or at least its priority inheritance functionality will be pushed into the DRM scheduler. I'd be very careful making any changes here as the priority in the DRM scheduler is defined as a single enum:
/* These are often used as an (initial) index
 * to an array, and as such should start at 0.
 */
enum drm_sched_priority {
	DRM_SCHED_PRIORITY_MIN,
	DRM_SCHED_PRIORITY_NORMAL,
	DRM_SCHED_PRIORITY_HIGH,
	DRM_SCHED_PRIORITY_KERNEL,

	DRM_SCHED_PRIORITY_COUNT,
	DRM_SCHED_PRIORITY_UNSET = -2
};
Adding a field to the i915_sched_attr is fairly easy as we already have a structure, but changing the DRM scheduler might be a tougher sell. Anyway, can you make this work without adding the 'nice' field to i915_sched_attr? Might be worth exploring, so that when we move to the DRM scheduler this feature drops in a little cleaner.
/me rubs a crystal ball.. no idea how that would look. :)
The idea with a separate nice in the sched attr is that only default priority contexts are affected by nice. The thinking being: if i915-aware userspace has explicitly requested a different priority, then let it keep that. In other words, only if it did not bother and left the priority at default/normal do we inherit from the process nice.
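Concretely, that rule boils down to a few lines. A sketch of the shape the i915_sched_attr_priority() helper takes later in the series (the name here is illustrative rather than the final code):

static inline int effective_priority(const struct i915_sched_attr *attr)
{
	/* An explicit uapi priority choice is honoured as-is. */
	if (attr->priority != I915_CONTEXT_DEFAULT_PRIORITY)
		return attr->priority;

	/*
	 * The context was left at the default, so the effective priority
	 * is derived from the process nice (lower nice means higher
	 * priority, hence the negation).
	 */
	return -attr->nice;
}

With I915_CONTEXT_DEFAULT_PRIORITY being zero, this is the "if (!prio) prio = -attr->nice;" test visible in the later patch.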
I suppose keeping this idea in some future DRM scheduler world would require some work. But in general the concept is of course open to discussion.
Implementation wise, not having prio and nice separate opens some other problems in how this interacts with GEM context priority, but there are probably solutions to those problems as well.
I suppose I could define a "nice range" within the existing priorities, and nice adjustments would only apply within that range. That would allow the max and min user priorities to remain undisturbed, as requested via the i915 uapi.
(min) -1023 ----- (nice 20) ----- 0 ------ (nice -19) ------ +1023 (max)
And I say, if some userspace set a priority in the -20 to 19 range, then it is allowed to be adjusted via process nice. That would keep it a single int priority as today.
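A rough sketch of that single-int scheme, applied when the nice notifier fires (the helper name and exact band bounds are assumptions for illustration, not code from this series):

static void apply_nice_within_band(struct i915_sched_attr *sched, int nice)
{
	/*
	 * Priorities outside the "nice band" were an explicit choice
	 * made via the i915 uapi - leave them undisturbed.
	 */
	if (sched->priority < -20 || sched->priority > 20)
		return;

	/* Inside the band the process nice wins (negated, as above). */
	sched->priority = -nice;
}

The attraction is that the context keeps a single int priority and no separate nice field is needed; the cost is that the original userspace value is lost once a renice lands inside the band.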
Regards,
Tvrtko
On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC priority management") introduced some scheduling related vfuncs which take integer request priority as argument.
Make them instead take struct i915_sched_attr, which is the type encapsulating this information, so it probably aligns with the design better. It definitely enables extending the set of scheduling attributes.
Understand the motivation here but the i915_scheduler is going to disappear when we move to the DRM scheduler, or at least its priority inheritance functionality will be pushed into the DRM scheduler. I'd be very careful making any changes here as the priority in the DRM scheduler is defined as a single enum.
Yeah, I'm not sure it makes sense to build this and make the conversion to drm/sched even harder. We've already merged a lot of code with a "we'll totally convert to drm/sched right after" promise; there's not really room for more fun like this built on top of i915-scheduler. -Daniel
On 13/10/2021 13:01, Daniel Vetter wrote:
On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC priority management") introduced some scheduling related vfuncs which take integer request priority as argument.
Make them instead take struct i915_sched_attr, which is the type encapsulating this information, so it probably aligns with the design better. It definitely enables extending the set of scheduling attributes.
Understand the motivation here but the i915_scheduler is going to disappear when we move to the DRM scheduler, or at least its priority inheritance functionality will be pushed into the DRM scheduler. I'd be very careful making any changes here as the priority in the DRM scheduler is defined as a single enum.
Yeah, I'm not sure it makes sense to build this and make the conversion to drm/sched even harder. We've already merged a lot of code with a "we'll totally convert to drm/sched right after" promise; there's not really room for more fun like this built on top of i915-scheduler.
It is not really fun on top of i915-scheduler. It is fun on top of the concept of uapi GEM context priority. As long as there is GEM context priority, and requests inherit from it, the concept works. This is demonstrated by the fact that it ties in with the GuC backend, which reduces to three priorities already. It is limited granularity but it does something.

Implementation details aside, the key question is the proposal to tie process nice to GPU scheduling priority. There seems to be interest from other parties, so there probably is something here.
But I do plan to simplify this RFC to not add anything to i915_sched_attr and also drop the task sched attr change notifier.
Regards,
Tvrtko
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Introduce the concept of context nice value which matches the process nice.
We do this by extending struct i915_sched_attr and adding a helper (i915_sched_attr_priority) to be used to convert to the effective priority by backend code and for priority sorting.

Context nice is then inherited from the process which creates the GEM context and utilised secondary to context priority, only when the latter has been left at the default setting, in order to avoid disturbing any application-made choices of low and high (batch processing and maybe latency-sensitive compositing). In those cases the nice value adjusts the effective priority in the narrow band of -19 to +20 around I915_CONTEXT_DEFAULT_PRIORITY.

This means that in theory userspace using the context priority uapi directly has a wider range of possible adjustments (thought to be beneficial), but in practice that only applies to execlists platforms. With GuC there are only three priority buckets (less than zero is low priority, zero is normal and greater than zero is high), which therefore interact as expected with the nice adjustment. It makes the question of whether nice should be a sub-range of GEM priorities, or stretched across the whole range, a moot one.
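For reference, the three-bucket collapse referred to above happens in map_i915_prio_to_guc_prio() in the GuC backend, which is approximately the following (quoted from memory, so treat the exact constant names as an assumption):

static inline u8 map_i915_prio_to_guc_prio(int prio)
{
	if (prio == I915_PRIORITY_NORMAL)
		return GUC_CLIENT_PRIORITY_KMD_NORMAL;
	else if (prio < I915_PRIORITY_NORMAL)
		return GUC_CLIENT_PRIORITY_NORMAL;	/* the low user bucket */
	else if (prio < I915_PRIORITY_DISPLAY)
		return GUC_CLIENT_PRIORITY_HIGH;	/* the high user bucket */
	else
		return GUC_CLIENT_PRIORITY_KMD_HIGH;	/* reserved for the KMD */
}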
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       |  1 +
 .../gpu/drm/i915/gt/intel_execlists_submission.c  |  4 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 +-
 drivers/gpu/drm/i915/i915_request.c               |  2 +-
 drivers/gpu/drm/i915/i915_request.h               |  5 +++++
 drivers/gpu/drm/i915/i915_scheduler.c             | 12 ++++++++----
 drivers/gpu/drm/i915/i915_scheduler.h             | 14 ++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler_types.h       |  8 ++++++++
 8 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 8d4d687ab1d0..fed0733cb652 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -257,6 +257,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 	if (i915->params.enable_hangcheck)
 		pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
 	pc->sched.priority = I915_PRIORITY_NORMAL;
+	pc->sched.nice = task_nice(current);

 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
 		if (!HAS_EXECLISTS(i915)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index e91d803a6453..1a02c65823a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -250,7 +250,7 @@ static struct i915_priolist *to_priolist(struct rb_node *rb)

 static int rq_prio(const struct i915_request *rq)
 {
-	return READ_ONCE(rq->sched.attr.priority);
+	return i915_request_priority(rq);
 }

 static int effective_prio(const struct i915_request *rq)
@@ -3221,8 +3221,8 @@ static void kick_execlists(const struct i915_request *rq,
 {
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_sched_engine *sched_engine = engine->sched_engine;
+	const int prio = i915_sched_attr_priority(attr);
 	const struct i915_request *inflight;
-	const int prio = attr->priority;

 	/*
 	 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b5883a4365ca..f258607685a2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2417,7 +2417,7 @@ static void guc_bump_inflight_request_prio(struct i915_request *rq,
 					   const struct i915_sched_attr *attr)
 {
 	struct intel_context *ce = rq->context;
-	const int prio = attr->priority;
+	const int prio = i915_sched_attr_priority(attr);
 	u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);

 	/* Short circuit function */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 79da5eca60af..a8c6f3a64895 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1930,7 +1930,7 @@ static int print_sched_attr(const struct i915_sched_attr *attr,
 		return x;

 	x += snprintf(buf + x, len - x,
-		      " prio=%d", attr->priority);
+		      " prio=%d nice=%d", attr->priority, attr->nice);

 	return x;
 }
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 7bd9ed20623e..c2c4c344837e 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -399,6 +399,11 @@ long i915_request_wait(struct i915_request *rq,
 #define I915_WAIT_PRIORITY	BIT(1) /* small priority bump for the request */
 #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */

+static inline int i915_request_priority(const struct i915_request *rq)
+{
+	return i915_sched_attr_priority(&rq->sched.attr);
+}
+
 void i915_request_show(struct drm_printer *m,
 		       const struct i915_request *rq,
 		       const char *prefix,
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 534bab99fcdc..e75793e36454 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -155,7 +155,9 @@ lock_sched_engine(struct i915_sched_node *node,
 static void __i915_schedule(struct i915_sched_node *node,
 			    const struct i915_sched_attr *attr)
 {
-	const int prio = max(attr->priority, node->attr.priority);
+	const int prio =
+		max(i915_sched_attr_priority(attr),
+		    i915_sched_attr_priority(&node->attr));
 	struct i915_sched_engine *sched_engine;
 	struct i915_dependency *dep, *p;
 	struct i915_dependency stack;
@@ -209,7 +211,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 			if (node_signaled(p->signaler))
 				continue;

-			if (prio > READ_ONCE(p->signaler->attr.priority))
+			if (prio > i915_sched_attr_priority(&p->signaler->attr))
 				list_move_tail(&p->dfs_link, &dfs);
 		}
 	}
@@ -247,7 +249,8 @@ static void __i915_schedule(struct i915_sched_node *node,
 		lockdep_assert_held(&sched_engine->lock);

 		/* Recheck after acquiring the engine->timeline.lock */
-		if (prio <= node->attr.priority || node_signaled(node))
+		if (prio <= i915_sched_attr_priority(&node->attr) ||
+		    node_signaled(node))
 			continue;

 		GEM_BUG_ON(node_to_request(node)->engine->sched_engine !=
@@ -257,7 +260,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 		if (sched_engine->bump_inflight_request_prio)
 			sched_engine->bump_inflight_request_prio(from, attr);

-		WRITE_ONCE(node->attr.priority, prio);
+		WRITE_ONCE(node->attr, *attr);

 		/*
 		 * Once the request is ready, it will be placed into the
@@ -305,6 +308,7 @@ void i915_sched_node_init(struct i915_sched_node *node)
 void i915_sched_node_reinit(struct i915_sched_node *node)
 {
 	node->attr.priority = I915_PRIORITY_INVALID;
+	node->attr.nice = 0;
 	node->semaphores = 0;
 	node->flags = 0;

diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 0b9b86af6c7f..75ccc9f55d14 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -38,6 +38,20 @@ void i915_sched_node_fini(struct i915_sched_node *node);
 void i915_schedule(struct i915_request *request,
 		   const struct i915_sched_attr *attr);

+static inline int i915_sched_attr_priority(const struct i915_sched_attr *attr)
+{
+	int prio = attr->priority;
+
+	/*
+	 * Only allow I915_CONTEXT_DEFAULT_PRIORITY to be affected by the
+	 * nice setting.
+	 */
+	if (!prio)
+		prio = -attr->nice;
+
+	return prio;
+}
+
 struct list_head *
 i915_sched_lookup_priolist(struct i915_sched_engine *sched_engine, int prio);

diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 24b9ac1c2ce2..159237aa7609 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -29,6 +29,14 @@ struct i915_sched_attr {
 	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
 	 */
 	int priority;
+
+	/**
+	 * @nice: context nice level
+	 *
+	 * Nice level follows the CPU scheduler nice value as set for the
+	 * process owning the GPU context.
+	 */
+	int nice;
 };

 /*
On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Introduce the concept of context nice value which matches the process nice.
We do this by extending struct i915_sched_attr and adding a helper (i915_sched_attr_priority) to be used to convert to the effective priority by backend code and for priority sorting.

Context nice is then inherited from the process which creates the GEM context and utilised secondary to context priority, only when the latter has been left at the default setting, in order to avoid disturbing any application-made choices of low and high (batch processing and maybe latency-sensitive compositing). In those cases the nice value adjusts the effective priority in the narrow band of -19 to +20 around I915_CONTEXT_DEFAULT_PRIORITY.

This means that in theory userspace using the context priority uapi directly has a wider range of possible adjustments (thought to be beneficial), but in practice that only applies to execlists platforms. With GuC there are only three priority buckets (less than zero is low priority, zero is normal and greater than zero is high), which therefore interact as expected with the nice adjustment. It makes the question of whether nice should be a sub-range of GEM priorities, or stretched across the whole range, a moot one.
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index 24b9ac1c2ce2..159237aa7609 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -29,6 +29,14 @@ struct i915_sched_attr {
 	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
 	 */
 	int priority;
+
+	/**
+	 * @nice: context nice level
+	 *
+	 * Nice level follows the CPU scheduler nice value as set for the
+	 * process owning the GPU context.
+	 */
+	int nice;
Same comment as the previous patch: I'd avoid adding a field to this structure as I don't think it will play with the DRM scheduler very well. If anything, I think we should drop i915_sched_attr completely, store these values directly in the request, and pass around the request rather than i915_sched_attr.
Matt
On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Introduce the concept of context nice value which matches the process nice.
We do this by extending struct i915_sched_attr and adding a helper (i915_sched_attr_priority) to be used to convert to the effective priority by backend code and for priority sorting.

Context nice is then inherited from the process which creates the GEM context and utilised secondary to context priority, only when the latter has been left at the default setting, in order to avoid disturbing any application-made choices of low and high (batch processing and maybe latency-sensitive compositing). In those cases the nice value adjusts the effective priority in the narrow band of -19 to +20 around I915_CONTEXT_DEFAULT_PRIORITY.

This means that in theory userspace using the context priority uapi directly has a wider range of possible adjustments (thought to be beneficial), but in practice that only applies to execlists platforms. With GuC there are only three priority buckets (less than zero is low priority, zero is normal and greater than zero is high), which therefore interact as expected with the nice adjustment. It makes the question of whether nice should be a sub-range of GEM priorities, or stretched across the whole range, a moot one.
Oops, sorry for the double reply to this patch - for some reason I was thinking the below comment would be on the next patch.

The GuC + DRM scheduler actually has 4 priorities, with the highest reserved for the KMD, so 3 user levels, which is what I think you mean. So how would this work with only 3 levels?
The nice value can move a normal priority value to either low or high, right?
Normal to low seems fine but would moving to high be an issue?
I thought a high level was reserved for elevated user processes (e.g. a root user or a compositor)?
Would it be ok for a non-elevated user process to be the same priority as an elevated process?
Matt
On 06/10/2021 18:24, Matthew Brost wrote:
On Mon, Oct 04, 2021 at 03:36:49PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Introduce the concept of context nice value which matches the process nice.
We do this by extending struct i915_sched_attr and adding a helper (i915_sched_attr_priority) to be used to convert to the effective priority by backend code and for priority sorting.

Context nice is then inherited from the process which creates the GEM context and utilised secondary to context priority, only when the latter has been left at the default setting, in order to avoid disturbing any application-made choices of low and high (batch processing and maybe latency-sensitive compositing). In those cases the nice value adjusts the effective priority in the narrow band of -19 to +20 around I915_CONTEXT_DEFAULT_PRIORITY.

This means that in theory userspace using the context priority uapi directly has a wider range of possible adjustments (thought to be beneficial), but in practice that only applies to execlists platforms. With GuC there are only three priority buckets (less than zero is low priority, zero is normal and greater than zero is high), which therefore interact as expected with the nice adjustment. It makes the question of whether nice should be a sub-range of GEM priorities, or stretched across the whole range, a moot one.
Oops, sorry for the double reply to this patch - for some reason I was thinking the below comment would be on the next patch.

The GuC + DRM scheduler actually has 4 priorities, with the highest reserved for the KMD, so 3 user levels, which is what I think you mean. So how would this work with only 3 levels?
The nice value can move a normal priority value to either low or high, right?
Yes. I was thinking about three user-visible levels. The kernel one is reserved for internal stuff so I did not mention it.
Default, nice 0 = normal priority. Positive nice = low priority. Negative nice = high priority.
Same as the way the -1024 to 1023 range maps onto those three, AFAIU. So I think it works.
Normal to low seems fine but would moving to high be an issue?
I thought a high level was reserved for elevated user processes (e.g. a root user or a compositor)?
Yes, and lowering the nice (negative nice) also needs CAP_SYS_NICE, which is not available to user processes by default. So that matches the i915 rules for going above default/normal.
Would it be ok for a non-elevated user process to be the same priority as an elevated process?
Not sure exactly what you have in mind, but in principle yes. Same as it is okay for all current greater-than-zero priorities to map to a single priority in the case of GuC.
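Putting the two mappings together for a context left at the default priority, the nice sign alone picks the GuC bucket. A sketch of the combined effect (reusing names from the patches above):

	/*
	 * Effective priority of a default-priority context is -nice:
	 *   nice > 0  ->  prio < 0  ->  GuC low bucket
	 *   nice == 0 ->  prio == 0 ->  GuC normal bucket
	 *   nice < 0  ->  prio > 0  ->  GuC high bucket (lowering nice
	 *                               needed CAP_SYS_NICE to begin with)
	 */
	const int prio = -task_nice(current);
	const u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);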
Regards,
Tvrtko
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Connect i915 with the process nice change notifier so that our scheduling can react to runtime adjustments, on top of previously added nice value inheritance at context create time.
To achieve this we use the previously added map of clients per owning task in combination with the list of GEM contexts per client.

To avoid possibly unnecessary complications, the updated context nice value will only apply to future submissions against the context.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
---
 drivers/gpu/drm/i915/i915_drm_client.c | 31 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h |  3 +++
 2 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drm_client.c b/drivers/gpu/drm/i915/i915_drm_client.c
index 82b9636482ef..e34c1228f65b 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.c
+++ b/drivers/gpu/drm/i915/i915_drm_client.c
@@ -7,10 +7,35 @@
 #include <linux/slab.h>
 #include <linux/types.h>

+#include "gem/i915_gem_context.h"
 #include "i915_drm_client.h"
 #include "i915_gem.h"
 #include "i915_utils.h"

+static int
+clients_notify(struct notifier_block *nb, unsigned long val, void *ptr)
+{
+	struct i915_drm_clients *clients =
+		container_of(nb, typeof(*clients), prio_notifier);
+	struct i915_drm_client *client;
+
+	rcu_read_lock();
+	read_lock(&clients->lock);
+	hash_for_each_possible(clients->tasks, client, node, (uintptr_t)ptr) {
+		struct i915_gem_context *ctx;
+
+		if (client->owner != ptr)
+			continue;
+
+		list_for_each_entry_rcu(ctx, &client->ctx_list, client_link)
+			ctx->sched.nice = (int)val;
+	}
+	read_unlock(&clients->lock);
+	rcu_read_unlock();
+
+	return NOTIFY_DONE;
+}
+
 void i915_drm_clients_init(struct i915_drm_clients *clients,
 			   struct drm_i915_private *i915)
 {
@@ -21,6 +46,10 @@ void i915_drm_clients_init(struct i915_drm_clients *clients,

 	rwlock_init(&clients->lock);
 	hash_init(clients->tasks);
+
+	memset(&clients->prio_notifier, 0, sizeof(clients->prio_notifier));
+	clients->prio_notifier.notifier_call = clients_notify;
+	register_user_nice_notifier(&clients->prio_notifier);
 }

 struct i915_drm_client *i915_drm_client_add(struct i915_drm_clients *clients)
@@ -75,6 +104,8 @@ void __i915_drm_client_free(struct kref *kref)

 void i915_drm_clients_fini(struct i915_drm_clients *clients)
 {
+	unregister_user_nice_notifier(&clients->prio_notifier);
+
 	GEM_BUG_ON(!xa_empty(&clients->xarray));
 	xa_destroy(&clients->xarray);
 }
diff --git a/drivers/gpu/drm/i915/i915_drm_client.h b/drivers/gpu/drm/i915/i915_drm_client.h
index 42fd79f0558a..dda26aa42ac9 100644
--- a/drivers/gpu/drm/i915/i915_drm_client.h
+++ b/drivers/gpu/drm/i915/i915_drm_client.h
@@ -9,6 +9,7 @@
 #include <linux/hashtable.h>
 #include <linux/kref.h>
 #include <linux/list.h>
+#include <linux/notifier.h>
 #include <linux/rwlock.h>
 #include <linux/sched.h>
 #include <linux/spinlock.h>
@@ -24,6 +25,8 @@ struct i915_drm_clients {

 	rwlock_t lock;
 	DECLARE_HASHTABLE(tasks, 6);
+
+	struct notifier_block prio_notifier;
 };

 struct i915_drm_client {