From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Trying to capture uninitialised engines when we wedged on init ends in tears. Skip that together with uC capture, since failure to initialise the latter can actually be one of the reasons for wedging on init.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2a2d7643b551..aa2b3aad9643 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1866,10 +1866,14 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 	}
 
 	gt_record_info(error->gt);
-	gt_record_engines(error->gt, engine_mask, compress);
 
-	if (INTEL_INFO(i915)->has_gt_uc)
-		error->gt->uc = gt_record_uc(error->gt, compress);
+	if (!intel_gt_has_unrecoverable_error(gt)) {
+		gt_record_engines(error->gt, engine_mask, compress);
+
+		if (INTEL_INFO(i915)->has_gt_uc)
+			error->gt->uc = gt_record_uc(error->gt,
+						     compress);
+	}
 
 	i915_vma_capture_finish(error->gt, compress);
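For readers outside the driver, the guard in this patch can be sketched as a small userspace model. This is only an illustration under stated assumptions: `mock_gt`, `mock_dump` and `mock_coredump` are hypothetical names invented here, and the fields merely stand in for `intel_gt_has_unrecoverable_error()` and `INTEL_INFO(i915)->has_gt_uc`; none of it is the real i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real i915 types. */
struct mock_gt {
	bool wedged_on_init;  /* models intel_gt_has_unrecoverable_error() */
	bool has_uc;          /* models INTEL_INFO(i915)->has_gt_uc */
	const int *engines;   /* may be NULL if engines were never initialised */
	size_t num_engines;
};

struct mock_dump {
	bool info_recorded;
	size_t engines_recorded;
	bool uc_recorded;
};

struct mock_dump mock_coredump(const struct mock_gt *gt)
{
	struct mock_dump dump = { 0 };

	/* Basic GT info does not depend on engine or uC initialisation. */
	dump.info_recorded = true;

	/* Skip everything that would touch uninitialised engines or uC. */
	if (!gt->wedged_on_init) {
		for (size_t i = 0; i < gt->num_engines; i++)
			dump.engines_recorded += (gt->engines[i] != 0);

		if (gt->has_uc)
			dump.uc_recorded = true;
	}

	return dump;
}
```

In this model a GT that wedged on init still yields a dump with the basic info recorded, but the engine array (which may be NULL) is never dereferenced.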
On Tue, 9 Nov 2021 at 12:20, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Trying to capture uninitialised engines when we wedged on init ends in tears. Skip that together with uC capture, since failure to initialise the latter can actually be one of the reasons for wedging on init.
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
This fixes the issue with missing GuC wedging the GPU and then blowing up when trying to use the driver?
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
On 10/11/2021 10:48, Matthew Auld wrote:
> On Tue, 9 Nov 2021 at 12:20, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Trying to capture uninitialised engines when we wedged on init ends in tears. Skip that together with uC capture, since failure to initialise the latter can actually be one of the reasons for wedging on init.
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> This fixes the issue with missing GuC wedging the GPU and then blowing up when trying to use the driver?
It probably does not blow up when using the driver, but it definitely does when accessing the error state. Someone suggested it would be better to call i915_disable_error_state from the wedge on init/fini paths instead, and I think it would indeed, so I plan to send a v2 along those lines.
Regards,
Tvrtko
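As a rough illustration of the alternative mentioned above (disable error capture once at wedge time instead of guarding inside the capture path), here is a minimal userspace sketch. All names (`mock_gpu_error`, `mock_disable_error_state`, `mock_store_error_state`) are hypothetical and invented for this example; it does not reproduce how `i915_disable_error_state` is actually implemented, and the planned v2 may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an error-state slot; not the real i915 structures. */
struct mock_error_state {
	int dummy;
};

struct mock_gpu_error {
	struct mock_error_state *first_error; /* NULL: nothing captured yet */
	int disabled_err;                     /* non-zero: capture is disabled */
};

/* Disable capture, unless an error state has already been stored. */
void mock_disable_error_state(struct mock_gpu_error *e, int err)
{
	if (!e->first_error && !e->disabled_err)
		e->disabled_err = err;
}

/*
 * Store the first error state. Returns 0 on success, the disabling errno
 * if capture was disabled, or -EEXIST if a state is already stored.
 */
int mock_store_error_state(struct mock_gpu_error *e,
			   struct mock_error_state *state)
{
	if (e->disabled_err)
		return e->disabled_err;
	if (e->first_error)
		return -EEXIST;

	e->first_error = state;
	return 0;
}
```

With this shape, the wedge-on-init path would make a single call such as `mock_disable_error_state(&err, -ENODEV)`, and later capture attempts fail early without ever reaching code that touches uninitialised engines.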
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
dri-devel@lists.freedesktop.org