On 06.08.2018 02:13, Dieter Nützel wrote:
On 04.08.2018 06:18, Dieter Nützel wrote:
On 04.08.2018 06:12, Dieter Nützel wrote:
On 04.08.2018 05:27, Dieter Nützel wrote:
On 03.08.2018 13:09, Christian König wrote:
On 03.08.2018 at 03:08, Dieter Nützel wrote:
Hello Christian, AMD guys,
this one, applied _together_ with the series "[PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs" (https://lists.freedesktop.org/archives/amd-gfx/2018-August/024802.html)
on top of amd-staging-drm-next 53d5f1e4a6d9,
freezes the whole system (Intel Xeon X3470, RX580) on the _first_ mouse move. The same happens at sddm login or on the first mouse move in KDE Plasma 5. NO logs so far. Expected?
Not even remotely, can you double check which patch from the "[PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs" series is causing the issue?
Oops,
_both_ 'series' on top of
bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem definitions into amdgpu_gem header
work without a hitch.
But I have the new (latest) µcode from openSUSE Tumbleweed: kernel-firmware-20180730-35.1.src.rpm
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
I take this back.
It lasted much longer this time, then the mouse froze. I could grab a dmesg via remote phone ;-)
See the attachment. Dieter
Argh, shi... wrong dmesg version.
Should be this one. (For sure...)
Phew,
this took some time... During the 'last' git bisect run (at the 'first bad commit is' step) I got the next freeze, but I could grab a new dmesg.log via remote phone (see attachment).
git bisect log shows this:
SOURCE/amd-staging-drm-next> git bisect log
git bisect start
# good: [adebfff9c806afe1143d69a0174d4580cd27b23d] drm/scheduler: fix setting the priorty for entities
git bisect good adebfff9c806afe1143d69a0174d4580cd27b23d
# bad: [43202e67a4e6fcb0e6b773e8eb1ed56e1721e882] drm/amdgpu: use entity instead of ring for CS
git bisect bad 43202e67a4e6fcb0e6b773e8eb1ed56e1721e882
# bad: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
git bisect bad 9867b3a6ddfb73ee3105871541053f8e49949478
# good: [5d097a4591aa2be16b21adbaa19a8abb76e47ea1] drm/amdgpu: use scheduler load balancing for SDMA CS
git bisect good 5d097a4591aa2be16b21adbaa19a8abb76e47ea1
# first bad commit: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
git log --oneline
5d097a4591aa (HEAD, refs/bisect/good-5d097a4591aa2be16b21adbaa19a8abb76e47ea1) drm/amdgpu: use scheduler load balancing for SDMA CS
d12ae5172f1f drm/amdgpu: use new scheduler load balancing for VMs
adebfff9c806 (refs/bisect/good-adebfff9c806afe1143d69a0174d4580cd27b23d) drm/scheduler: fix setting the priorty for entities
bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem definitions into amdgpu_gem header
5031ae5f9e5c drm/amdgpu: move psp macro into amdgpu_psp header
[-]
I'm not really sure that 'drm/amdgpu: use scheduler load balancing for compute CS' is the offender.
It could just as well be one step earlier: 'drm/amdgpu: use scheduler load balancing for SDMA CS'.
I'll try running with the SDMA CS patch for the next few days.
If you need more, ask!
Hello Christian,
running for the second day _without_ the second patch, [2/7] drm/amdgpu: use scheduler load balancing for SDMA CS, my system is stable again.
To be clear: I now have only #1 applied on top of amd-staging-drm-next, and 'this one' (the patch in this thread) is still in. So we should switch threads.
Dieter
Thanks, Christian.
Greetings, Dieter
On 01.08.2018 16:27, Christian König wrote:
> Since we now deal with multiple rq we need to update all of them, not
> just the current one.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   |  3 +--
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 36 ++++++++++++++++++++-----------
>  include/drm/gpu_scheduler.h               |  5 ++---
>  3 files changed, 26 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index df6965761046..9fcc14e2dfcf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -407,12 +407,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
>  	for (i = 0; i < adev->num_rings; i++) {
>  		ring = adev->rings[i];
>  		entity = &ctx->rings[i].entity;
> -		rq = &ring->sched.sched_rq[ctx_prio];
>
>  		if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>  			continue;
>
> -		drm_sched_entity_set_rq(entity, rq);
> +		drm_sched_entity_set_priority(entity, ctx_prio);
>  	}
>  }
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 05dc6ecd4003..85908c7f913e 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -419,29 +419,39 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>  }
>
>  /**
> - * drm_sched_entity_set_rq - Sets the run queue for an entity
> + * drm_sched_entity_set_rq_priority - helper for drm_sched_entity_set_priority
> + */
> +static void drm_sched_entity_set_rq_priority(struct drm_sched_rq **rq,
> +					     enum drm_sched_priority priority)
> +{
> +	*rq = &(*rq)->sched->sched_rq[priority];
> +}
> +
> +/**
> + * drm_sched_entity_set_priority - Sets priority of the entity
>   *
>   * @entity: scheduler entity
> - * @rq: scheduler run queue
> + * @priority: scheduler priority
>   *
> - * Sets the run queue for an entity and removes the entity from the previous
> - * run queue in which was present.
> + * Update the priority of run queues used for the entity.
>   */
> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> -			     struct drm_sched_rq *rq)
> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> +				   enum drm_sched_priority priority)
>  {
> -	if (entity->rq == rq)
> -		return;
> -
> -	BUG_ON(!rq);
> +	unsigned int i;
>
>  	spin_lock(&entity->rq_lock);
> +
> +	for (i = 0; i < entity->num_rq_list; ++i)
> +		drm_sched_entity_set_rq_priority(&entity->rq_list[i], priority);
> +
>  	drm_sched_rq_remove_entity(entity->rq, entity);
> -	entity->rq = rq;
> -	drm_sched_rq_add_entity(rq, entity);
> +	drm_sched_entity_set_rq_priority(&entity->rq, priority);
> +	drm_sched_rq_add_entity(entity->rq, entity);
> +
>  	spin_unlock(&entity->rq_lock);
>  }
> -EXPORT_SYMBOL(drm_sched_entity_set_rq);
> +EXPORT_SYMBOL(drm_sched_entity_set_priority);
>
>  /**
>   * drm_sched_dependency_optimized
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 0c4cfe689d4c..22c0f88f7d8f 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -298,9 +298,8 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
>  void drm_sched_entity_destroy(struct drm_sched_entity *entity);
>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
>  			       struct drm_sched_entity *entity);
> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> -			     struct drm_sched_rq *rq);
> -
> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> +				   enum drm_sched_priority priority);
>  struct drm_sched_fence *drm_sched_fence_create(
>  	struct drm_sched_entity *s_entity, void *owner);
>  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
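To make the behavioral change easier to follow, here is a minimal userspace sketch of what the patch does. All structs, array sizes and names below are toy stand-ins, not the kernel types, and the rq_lock handling plus run-queue membership (remove/add of the entity) are deliberately left out. The point it illustrates: since an entity now carries a whole rq_list for load balancing, a priority change has to re-point every entry in that list, plus the currently used rq, at the run queue of the new priority on the same scheduler, instead of moving the entity to one caller-chosen rq the way drm_sched_entity_set_rq() did.

/*
 * Toy model of the drm_sched_entity_set_priority() change.
 * All types here are illustrative stand-ins, not the kernel's;
 * locking and run-queue membership are omitted on purpose.
 * Build: cc -o set_prio_sketch set_prio_sketch.c
 */
#include <stdio.h>

enum prio { PRIO_LOW, PRIO_NORMAL, PRIO_HIGH, PRIO_COUNT };

struct sched;

struct rq {
	struct sched *sched;		/* back-pointer, like drm_sched_rq->sched */
	enum prio prio;
};

struct sched {
	struct rq sched_rq[PRIO_COUNT];	/* one run queue per priority level */
};

struct entity {
	struct rq *rq_list[2];		/* candidate rqs used for load balancing */
	unsigned int num_rq_list;
	struct rq *rq;			/* the rq the entity currently uses */
};

/* Mirrors the helper from the patch: re-point one rq pointer at the
 * queue of the requested priority on the *same* scheduler. */
static void set_rq_priority(struct rq **rq, enum prio priority)
{
	*rq = &(*rq)->sched->sched_rq[priority];
}

/* Mirrors drm_sched_entity_set_priority(): update every candidate rq,
 * not just the one currently in use. */
static void entity_set_priority(struct entity *e, enum prio priority)
{
	unsigned int i;

	for (i = 0; i < e->num_rq_list; ++i)
		set_rq_priority(&e->rq_list[i], priority);
	set_rq_priority(&e->rq, priority);
}

int main(void)
{
	struct sched s[2];
	struct entity e;
	unsigned int i, p;

	/* Two schedulers, each with one rq per priority level. */
	for (i = 0; i < 2; ++i) {
		for (p = 0; p < PRIO_COUNT; ++p) {
			s[i].sched_rq[p].sched = &s[i];
			s[i].sched_rq[p].prio = (enum prio)p;
		}
	}

	/* Entity load-balances across both schedulers at normal priority. */
	e.rq_list[0] = &s[0].sched_rq[PRIO_NORMAL];
	e.rq_list[1] = &s[1].sched_rq[PRIO_NORMAL];
	e.num_rq_list = 2;
	e.rq = e.rq_list[0];

	entity_set_priority(&e, PRIO_HIGH);

	for (i = 0; i < e.num_rq_list; ++i)
		printf("rq_list[%u] is now at priority %d\n", i, (int)e.rq_list[i]->prio);
	printf("current rq is now at priority %d\n", (int)e.rq->prio);
	return 0;
}

Run as written, this prints priority 2 (PRIO_HIGH) for both list entries and for the current rq; with the pre-patch logic only the single rq chosen by the caller would have been touched, leaving the other load-balancing candidates at their old priority.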
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel