On 06.08.2018 02:13, Dieter Nützel wrote:
On 04.08.2018 06:18, Dieter Nützel wrote:
On 04.08.2018 06:12, Dieter Nützel wrote:
On 04.08.2018 05:27, Dieter Nützel wrote:
On 03.08.2018 13:09, Christian König wrote:
On 03.08.2018 at 03:08, Dieter Nützel wrote:
Hello Christian, AMD guys,
this one, applied _together_ with the series "[PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs" (https://lists.freedesktop.org/archives/amd-gfx/2018-August/024802.html)
on top of amd-staging-drm-next 53d5f1e4a6d9,
freezes the whole system (Intel Xeon X3470, RX580) on the _first_ mouse move. The same happens at sddm login or on the first mouse move in KDE Plasma 5. NO logs so far. Expected?
Not even remotely, can you double check which patch from the "[PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs" series is causing the issue?
Oops,
_both_ 'series' on top of
bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem definitions into amdgpu_gem header
work without a hitch.
But I have the new (latest) µcode from openSUSE Tumbleweed: kernel-firmware-20180730-35.1.src.rpm
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
I take this back.
It lasted much longer this time, then the mouse froze. I could grab a dmesg via remote phone ;-)
See the attachment. Dieter
Argh, shi... wrong dmesg version.
Should be this one. (For sure...)
Phew,
this took some time... During the 'last' git bisect run (at the 'first bad commit is' step) I got the next freeze, but I could grab a new dmesg.log via remote phone (see attachment).
git bisect log shows this:
SOURCE/amd-staging-drm-next> git bisect log
git bisect start
# good: [adebfff9c806afe1143d69a0174d4580cd27b23d] drm/scheduler: fix setting the priorty for entities
git bisect good adebfff9c806afe1143d69a0174d4580cd27b23d
# bad: [43202e67a4e6fcb0e6b773e8eb1ed56e1721e882] drm/amdgpu: use entity instead of ring for CS
git bisect bad 43202e67a4e6fcb0e6b773e8eb1ed56e1721e882
# bad: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
git bisect bad 9867b3a6ddfb73ee3105871541053f8e49949478
# good: [5d097a4591aa2be16b21adbaa19a8abb76e47ea1] drm/amdgpu: use scheduler load balancing for SDMA CS
git bisect good 5d097a4591aa2be16b21adbaa19a8abb76e47ea1
# first bad commit: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
git log --oneline
5d097a4591aa (HEAD, refs/bisect/good-5d097a4591aa2be16b21adbaa19a8abb76e47ea1) drm/amdgpu: use scheduler load balancing for SDMA CS
d12ae5172f1f drm/amdgpu: use new scheduler load balancing for VMs
adebfff9c806 (refs/bisect/good-adebfff9c806afe1143d69a0174d4580cd27b23d) drm/scheduler: fix setting the priorty for entities
bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem definitions into amdgpu_gem header
5031ae5f9e5c drm/amdgpu: move psp macro into amdgpu_psp header
[-]
I'm not really sure that 'drm/amdgpu: use scheduler load balancing for compute CS' is the offender.
It could just as well be one step earlier: 'drm/amdgpu: use scheduler load balancing for SDMA CS'.
I'll try running with the SDMA CS patch for the next few days.
If you need more, ask!
Hello Christian,
running for the second day _without_ the second patch, [2/7] drm/amdgpu: use scheduler load balancing for SDMA CS, my system is stable again.
To be clear: I now have only #1 applied on top of amd-staging-drm-next, and 'this one' (the patch in this thread) is still in. So we should switch threads.
Dieter
Thanks, Christian.
Greetings, Dieter
On 01.08.2018 16:27, Christian König wrote:
> Since we now deal with multiple rq we need to update all of them, not
> just the current one.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   |  3 +--
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 36 ++++++++++++++++++++-----------
>  include/drm/gpu_scheduler.h               |  5 ++---
>  3 files changed, 26 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index df6965761046..9fcc14e2dfcf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -407,12 +407,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
>  	for (i = 0; i < adev->num_rings; i++) {
>  		ring = adev->rings[i];
>  		entity = &ctx->rings[i].entity;
> -		rq = &ring->sched.sched_rq[ctx_prio];
>
>  		if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>  			continue;
>
> -		drm_sched_entity_set_rq(entity, rq);
> +		drm_sched_entity_set_priority(entity, ctx_prio);
>  	}
>  }
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 05dc6ecd4003..85908c7f913e 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -419,29 +419,39 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>  }
>
>  /**
> - * drm_sched_entity_set_rq - Sets the run queue for an entity
> + * drm_sched_entity_set_rq_priority - helper for drm_sched_entity_set_priority
> + */
> +static void drm_sched_entity_set_rq_priority(struct drm_sched_rq **rq,
> +					     enum drm_sched_priority priority)
> +{
> +	*rq = &(*rq)->sched->sched_rq[priority];
> +}
> +
> +/**
> + * drm_sched_entity_set_priority - Sets priority of the entity
>   *
>   * @entity: scheduler entity
> - * @rq: scheduler run queue
> + * @priority: scheduler priority
>   *
> - * Sets the run queue for an entity and removes the entity from the previous
> - * run queue in which was present.
> + * Update the priority of run queues used for the entity.
>   */
> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> -			     struct drm_sched_rq *rq)
> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> +				   enum drm_sched_priority priority)
>  {
> -	if (entity->rq == rq)
> -		return;
> -
> -	BUG_ON(!rq);
> +	unsigned int i;
>
>  	spin_lock(&entity->rq_lock);
> +
> +	for (i = 0; i < entity->num_rq_list; ++i)
> +		drm_sched_entity_set_rq_priority(&entity->rq_list[i], priority);
> +
>  	drm_sched_rq_remove_entity(entity->rq, entity);
> -	entity->rq = rq;
> -	drm_sched_rq_add_entity(rq, entity);
> +	drm_sched_entity_set_rq_priority(&entity->rq, priority);
> +	drm_sched_rq_add_entity(entity->rq, entity);
> +
>  	spin_unlock(&entity->rq_lock);
>  }
> -EXPORT_SYMBOL(drm_sched_entity_set_rq);
> +EXPORT_SYMBOL(drm_sched_entity_set_priority);
>
>  /**
>   * drm_sched_dependency_optimized
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 0c4cfe689d4c..22c0f88f7d8f 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -298,9 +298,8 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
>  void drm_sched_entity_destroy(struct drm_sched_entity *entity);
>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
>  			       struct drm_sched_entity *entity);
> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> -			     struct drm_sched_rq *rq);
> -
> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> +				   enum drm_sched_priority priority);
>  struct drm_sched_fence *drm_sched_fence_create(
>  	struct drm_sched_entity *s_entity, void *owner);
>  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
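To make the behavioral change easier to follow, here is a minimal userspace sketch of what the patch does. All structs, array sizes and names below are toy stand-ins, not the kernel types, and the rq_lock handling plus run-queue membership (remove/add of the entity) are deliberately left out. The point it illustrates: since an entity now carries a whole rq_list for load balancing, a priority change has to re-point every entry in that list, plus the currently used rq, at the run queue of the new priority on the same scheduler, instead of moving the entity to one caller-chosen rq the way drm_sched_entity_set_rq() did.

/*
 * Toy model of the drm_sched_entity_set_priority() change.
 * All types here are illustrative stand-ins, not the kernel's;
 * locking and run-queue membership are omitted on purpose.
 * Build: cc -o set_prio_sketch set_prio_sketch.c
 */
#include <stdio.h>

enum prio { PRIO_LOW, PRIO_NORMAL, PRIO_HIGH, PRIO_COUNT };

struct sched;

struct rq {
	struct sched *sched;		/* back-pointer, like drm_sched_rq->sched */
	enum prio prio;
};

struct sched {
	struct rq sched_rq[PRIO_COUNT];	/* one run queue per priority level */
};

struct entity {
	struct rq *rq_list[2];		/* candidate rqs used for load balancing */
	unsigned int num_rq_list;
	struct rq *rq;			/* the rq the entity currently uses */
};

/* Mirrors the helper from the patch: re-point one rq pointer at the
 * queue of the requested priority on the *same* scheduler. */
static void set_rq_priority(struct rq **rq, enum prio priority)
{
	*rq = &(*rq)->sched->sched_rq[priority];
}

/* Mirrors drm_sched_entity_set_priority(): update every candidate rq,
 * not just the one currently in use. */
static void entity_set_priority(struct entity *e, enum prio priority)
{
	unsigned int i;

	for (i = 0; i < e->num_rq_list; ++i)
		set_rq_priority(&e->rq_list[i], priority);
	set_rq_priority(&e->rq, priority);
}

int main(void)
{
	struct sched s[2];
	struct entity e;
	unsigned int i, p;

	/* Two schedulers, each with one rq per priority level. */
	for (i = 0; i < 2; ++i) {
		for (p = 0; p < PRIO_COUNT; ++p) {
			s[i].sched_rq[p].sched = &s[i];
			s[i].sched_rq[p].prio = (enum prio)p;
		}
	}

	/* Entity load-balances across both schedulers at normal priority. */
	e.rq_list[0] = &s[0].sched_rq[PRIO_NORMAL];
	e.rq_list[1] = &s[1].sched_rq[PRIO_NORMAL];
	e.num_rq_list = 2;
	e.rq = e.rq_list[0];

	entity_set_priority(&e, PRIO_HIGH);

	for (i = 0; i < e.num_rq_list; ++i)
		printf("rq_list[%u] is now at priority %d\n", i, (int)e.rq_list[i]->prio);
	printf("current rq is now at priority %d\n", (int)e.rq->prio);
	return 0;
}

Run as written, this prints priority 2 (PRIO_HIGH) for both list entries and for the current rq; with the pre-patch logic only the single rq chosen by the caller would have been touched, leaving the other load-balancing candidates at their old priority.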
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel