On Tue, Sep 07, 2021 at 10:19:10AM -0700, Matt Roper wrote:
The reset domain is shared between render and all compute engines, so resetting one will affect the others.
Note: Before performing a reset on an RCS or CCS engine, the GuC will attempt to preempt-to-idle the other non-hung RCS/CCS engines to avoid impacting other clients (since some shared modules will be reset). If other engines are executing non-preemptable workloads, the impact is unavoidable and some work may be lost.
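As a rough illustration of why a shared reset domain spreads the impact, here is a standalone sketch. It is not the i915 code: the engine ids, the GRDOM_* bit values, and the helper engines_hit_by_reset() are all made up for the example. It only assumes what the description above states, namely that the compute engines map to the same reset domain as render.

```c
#include <stdint.h>

/* Hypothetical engine ids, for illustration only (not the i915 enum). */
enum engine_id { RCS0, VCS0, CCS0, CCS1, NUM_ENGINES };

/* Hypothetical reset-domain bits; the real values live in the hardware spec. */
#define GRDOM_RENDER (1u << 0)
#define GRDOM_MEDIA  (1u << 1)

/* Per-engine reset domain: compute engines share the render domain. */
static const uint32_t reset_domain[NUM_ENGINES] = {
	[RCS0] = GRDOM_RENDER,
	[VCS0] = GRDOM_MEDIA,
	[CCS0] = GRDOM_RENDER,
	[CCS1] = GRDOM_RENDER,
};

/* Return a bitmask (bit i = engine i) of engines reset along with 'target'. */
static uint32_t engines_hit_by_reset(enum engine_id target)
{
	uint32_t hit = 0;
	int i;

	for (i = 0; i < NUM_ENGINES; i++)
		if (reset_domain[i] & reset_domain[target])
			hit |= 1u << i;

	return hit;
}
```

With this table, engines_hit_by_reset(CCS0) reports RCS0, CCS0 and CCS1, while a reset of VCS0 touches only VCS0. That is the situation the preempt-to-idle step is trying to soften.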
Bspec: 52549
Original-patch-by: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Do we have igts validating this all properly?
Specifically, that the reset stats are incremented correctly for both guilty and victimized contexts.
If that coverage doesn't exist yet, it needs to be added.
Also, you need a patch set that fixes up the igts that make wrong assumptions about context isolation.
-Daniel
 drivers/gpu/drm/i915/gt/intel_reset.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 91200c43951f..30598c1d070c 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -507,6 +507,10 @@ static int gen11_reset_engines(struct intel_gt *gt,
 		[VECS1] = GEN11_GRDOM_VECS2,
 		[VECS2] = GEN11_GRDOM_VECS3,
 		[VECS3] = GEN11_GRDOM_VECS4,
+		[CCS0] = GEN11_GRDOM_RENDER,
+		[CCS1] = GEN11_GRDOM_RENDER,
+		[CCS2] = GEN11_GRDOM_RENDER,
+		[CCS3] = GEN11_GRDOM_RENDER,
 	};
 	struct intel_engine_cs *engine;
 	intel_engine_mask_t tmp;
--
2.25.4