On Fri, Dec 12, 2014 at 5:57 AM, Jani Nikula jani.nikula@linux.intel.com wrote:
On Fri, 12 Dec 2014, Chris Wilson chris@chris-wilson.co.uk wrote:
On Fri, Dec 12, 2014 at 08:17:23AM +0100, Daniel Vetter wrote:
On Thu, Dec 11, 2014 at 06:18:12PM -0500, Rob Clark wrote:
Many distro's have mechanism in place to collect and automatically file bugs for failed WARN()s. And since every third line in i915 is a WARN()
Come on Rob, that's not necessary.
sorry, I didn't intend that to be mean, so much as a bit tongue-in-cheek :-P
i915 currently leads the rhel7 and fedora abrt leaderboard, but I realized if you divide the # of abrts by the # of WARN's in the driver code, i915 looks a lot better :-P
it generates quite a lot of noise which is somewhat disconcerting to the end user.
Separate out the internal hw-is-in-the-state-I-expected checks into I915_WARN()s and allow configuration via i915.verbose_checks module param about whether this will generate a full blown stacktrace or just DRM_ERROR().
Signed-off-by: Rob Clark robdclark@gmail.com
Yeah I guess makes sense, although I still claim that these are as much "we've lost track of shit" bugs as when a refcount underflows or a pointer is NULL when it shouldn't. But I also agree that we've done a stellar job this year at not locking at these bugs, so meh.
How about making the state checker WARNs conditional on drm.debug&2? The premise is that they are often tantalising unhelpful without the debug log, so only show them when we have both?
As I suggested in an internal mail just days ago, make the checks emit a DRM_ERROR normally (but do log something!), and a WARN if drm.debug & DRM_UT_KMS (or DRM_UT_DRIVER like Chris suggests) holds.
I object to adding new kernel parameters for this. We have enough, and our bug fixing efforts really don't need another round of "oh please *also* enable this new parameter we added".
I did kinda want to keep it as a separate param (or at least a separate drm.debug bit, but I preferred a param as we could more easily keep the default to 'true'), since for rawhide and people running their own kernels I do still want to get this "things-are-not-quite-in-the-state-I-expected" feedback to you. But at any rate, as long as we separate out the hw-state-check WARNs from the actually something-is-about-to-catch-fire WARNs, it is easy enough to patch the macro definitions in a distro kernel.
I had some half-baked idea that in the DRM_ERROR case, the message dumped out should contain some string we can search for, so that we could eventually have some "enable sending anonymous driver feedback" type option in the distro, which would still scan the kernel logs for this string and upload feedback via some non-abrt mechanism. (And so, in case the user is actually seeing a problem, when we ask them to attach logs, we still easily see this information. I think in most cases the full callstack doesn't add too much value, so much as knowing what assumptions were broken.) No real idea how that would work from the infrastructure side, so I didn't try to add anything yet, but seems like it could be useful.
BR, -R
BR, Jani.
-Chris
-- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
-- Jani Nikula, Intel Open Source Technology Center