On Wed, Jun 16, 2021 at 01:28:33PM +0100, Matthew Auld wrote:
The first tracepoint for a request is trace_dma_fence_init which is called in the ctor before we have properly setup the request->engine. So if it's a non-recycled request the rq->engine might be NULL, or some garbage value, which leads to a crash.
I'd hammer it more in here that we cannot move the dma_fence_init out of the slab constructor (aka ctor), because that breaks the magic request recycling through SLAB_TYPESAFE_BY_RCU. You do explain it all, but in a rather dense fashion. More verbosity here would be good.
Since we are not permitted to use kmem_cache_zalloc() here with
This part I'm not really clear on ... what would zalloc break? Ideally references to the commits/bugs/functions in the code that would blow up would be good here.
SLAB_TYPESAFE_BY_RCU, one approach is simply to return DRIVER_NAME. We can then revisit this later if we decide to get rid of SLAB_TYPESAFE_BY_RCU.
Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno") Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Michael Mason michael.w.mason@intel.com Cc: Daniel Vetter daniel@ffwll.ch
As a stop-gab the code change looks reasonable, but I think the commit message needs work and much more of the history/background we're hitting here.
Thanks, Daniel
drivers/gpu/drm/i915/i915_request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 1014c71cf7f5..55fa94bde22e 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -61,7 +61,7 @@ static struct i915_global_request {
static const char *i915_fence_get_driver_name(struct dma_fence *fence) {
- return dev_name(to_request(fence)->engine->i915->drm.dev);
- return DRIVER_NAME;
}
static const char *i915_fence_get_timeline_name(struct dma_fence *fence)
2.26.3