https://bugs.freedesktop.org/show_bug.cgi?id=99130
Bug ID: 99130 Summary: [IVB] CONFIG_PROVE_LOCKING=y causes igt/gem_exec/basic-wait-all (and possibly others) to fail Product: DRI Version: unspecified Hardware: x86-64 (AMD64) OS: All Status: NEW Severity: normal Priority: medium Component: IGT Assignee: dri-devel@lists.freedesktop.org Reporter: dorota.czaplejewicz@collabora.co.uk
Created attachment 128524 --> https://bugs.freedesktop.org/attachment.cgi?id=128524&action=edit basic-wait-all log
Kernels built with the default CI config (e.g. https://intel-gfx-ci.01.org/CI/CI_DRM_1968/kernel.config.bz2 ) will trigger failures in the test igt/gem_exec/basic-wait-all The failures will rarely appear immediately; for faithful reproduction, running the test several times consecutively is needed (at least 4 times). Afterwards, the failures will be constant.
Changing CONFIG_PROVE_LOCKING from "=y" to "=n" will stop the failures.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #1 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128525 --> https://bugs.freedesktop.org/attachment.cgi?id=128525&action=edit basic-wait-all dmesg log
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #2 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Software: Fedora 24 kernel revision: 5415795 (drm-tip: 2016y-12m-16d-21h-26m-45s UTC integration manifest) kernel config: https://intel-gfx-ci.01.org/CI/CI_DRM_1968/kernel.config.bz2
Hardware: Intel i7-3770
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #3 from Chris Wilson chris@chris-wilson.co.uk --- Looks like the timeout value isn't being computed properly (possibly just that too much time is unaccounted?). Can you apply:
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1b2f18d0ca2b..dfe751b350ad 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3019,6 +3019,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_gem_wait *args = data; struct drm_i915_gem_object *obj; + u64 prev_timeout = args->timeout_ns; ktime_t start; long ret;
@@ -3042,6 +3043,12 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file) args->timeout_ns = 0; }
+ pr_info("%s: timeout %d.%09d -> %d.%09d\n", __func__, + prev_timeout < 0 ? -1 : ns_to_timespec(prev_timeout).tv_sec, + prev_timeout < 0 ? 0 : ns_to_timespec(prev_timeout).tv_nsec, + args->timeout_ns < 0 ? -1 : ns_to_timespec(args->timeout_ns).tv_sec, + args->timeout_ns < 0 ? 0 : ns_to_timespec(args->timeout_ns).tv_nsec); + i915_gem_object_put(obj); return ret; }
and get the output for a failure?
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #4 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128544 --> https://bugs.freedesktop.org/attachment.cgi?id=128544&action=edit Dmesg log with applied patch from comment #3
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #5 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- The same config option triggers failures in a similar way in:
igt/gem_sync/basic-each igt/gem_sync/basic-store-each
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #6 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128545 --> https://bugs.freedesktop.org/attachment.cgi?id=128545&action=edit igt/gem_sync/basic-each log
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #7 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128546 --> https://bugs.freedesktop.org/attachment.cgi?id=128546&action=edit igt/gem_sync/basic-each - dmesg (truncated beginning)
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #8 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128547 --> https://bugs.freedesktop.org/attachment.cgi?id=128547&action=edit igt/gem_sync/basic-store-each - log
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #9 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128548 --> https://bugs.freedesktop.org/attachment.cgi?id=128548&action=edit igt/gem_sync/basic-store-each - dmesg (truncated beginning)
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #10 from Chris Wilson chris@chris-wilson.co.uk --- [ 154.074065] [drm:drm_ioctl] pid=2323, dev=0xe200, auth=1, I915_GEM_WAIT [ 154.074748] i915_gem_wait_ioctl: timeout 0.500000000 -> 0.499993351 [ 154.075390] [drm:drm_ioctl] ret = -512 ... [ 155.072528] [drm:drm_ioctl] pid=2323, dev=0xe200, auth=1, I915_GEM_WAIT [ 155.073234] i915_gem_wait_ioctl: timeout 0.488539596 -> 0.488535373 [ 155.073916] [drm:drm_ioctl] ret = -512 [ 155.074605] [drm:drm_ioctl] pid=2323, dev=0xe200, auth=1, I915_GEM_WAIT [ 155.075307] i915_gem_wait_ioctl: timeout 0.488535373 -> 0.488532529
Walltime of 1s, but actual wait time of ~12ms.
Each ioctl is taking ~1.4ms (inside drmIoctl) with a further interval of ~0.7ms between ioctls. And we account for 3us.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #11 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Created attachment 128606 --> https://bugs.freedesktop.org/attachment.cgi?id=128606&action=edit ftrace call graph log for i915_gem_wait_ioctl
The call log capturing i915_gem_wait_ioctl invokation and overhead. The time of ioctl itself (20ns) is dwarfed by a slowdown somewhere in console_unlock() related to printk() calls (650ns)
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #12 from Chris Wilson chris@chris-wilson.co.uk --- cat /proc/consoles?
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #13 from Chris Wilson chris@chris-wilson.co.uk --- Also trying with CONFIG_DRM_FBDEV_EMULATION unset.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #14 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- # cat /proc/consoles tty0 -WU (EC p ) 4:1
Upon further digging, the slow stack *seems* like this (may be wrong, the tracing tool is less useful with huge numbers of calls):
console_unlock() call_console_drivers.isra.0.constprop.15() vt_console_print() lf() scrup() fbcon_scroll() fbcon_cursor() fbcon_redraw.isra.17()
Inside of this, the delay is split up among many shorter calls, most time occupied by fbcon_putcs -> bit_putcs -> cfb_imageblit.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #15 from Chris Wilson chris@chris-wilson.co.uk --- That's us! Woohoo \o/
What does cat /sys/kernel/debug/dri/0/i915_gem_framebuffer report? Mainly interested in knowing if you have a tiled + fenced fbcon.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #16 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Here it is:
$ sudo cat /sys/kernel/debug/dri/0/i915_gem_framebuffer
fbcon size: 1024 x 768, depth 24, 32 bpp, modifier 0x0, refcount 1, obj ffff880221116200: p 3072KiB 41 00 uncached (pinned x 1) (display) (ggtt offset: 00080000, size: 00300000, type: 0)
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #17 from Chris Wilson chris@chris-wilson.co.uk --- Try:
diff --git a/lib/igt_core.c b/lib/igt_core.c index a18a728c..403b9423 100644 --- a/lib/igt_core.c +++ b/lib/igt_core.c @@ -67,6 +67,7 @@
#include "igt_core.h" #include "igt_aux.h" +#include "igt_sysfs.h"
#ifdef HAVE_LIBGEN_H #include <libgen.h> /* for basename() on Solaris */ @@ -512,6 +513,7 @@ static void common_exit_handler(int sig) { if (!igt_only_list_subtests()) { low_mem_killer_disable(false); + kick_fbcon(true); }
/* When not killed by a signal check that igt_exit() has been properly @@ -731,6 +733,7 @@ out: exit(ret == -1 ? 0 : IGT_EXIT_INVALID);
if (!list_subtests) { + kick_fbcon(false); kmsg(KERN_INFO "[IGT] %s: executing\n", command_str); print_version();
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #18 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- After adding the kick_fbcon() calls, the problem goes away, even without rebooting.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #19 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- Should this be turned into a patch for igt or should I try to identify the underlying issue causing the slowness?
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #20 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- I just realized the patch was merged. Should this bug be considered fixed now?
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #21 from Chris Wilson chris@chris-wilson.co.uk --- I kept the bug around in case we wanted to dig deeper into what the slowdown actually was. If it's below the level of care, lets close and move on.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
--- Comment #22 from Dorota Czaplejewicz dorota.czaplejewicz@collabora.co.uk --- I haven't seen the underlying slowness causing problems anywhere else yet. Perhaps we can close this and reopen if it reappears.
https://bugs.freedesktop.org/show_bug.cgi?id=99130
Chris Wilson chris@chris-wilson.co.uk changed:
What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |WORKSFORME Status|NEW |RESOLVED
https://bugs.freedesktop.org/show_bug.cgi?id=99130
Elizabeth elizabethx.de.la.torre.mena@intel.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #23 from Elizabeth elizabethx.de.la.torre.mena@intel.com --- Closing. Please reopen if needed. Thanks.
dri-devel@lists.freedesktop.org