Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine
/** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout)
thanks /Monk
sorry, I have some mistake in previous thread, correct it as followings.
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace() it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence;
old_fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(obj));
if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(&obj->seq); preempt_enable();
dma_fence_put(old_fence); return; }
if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } }
see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put.
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian Subject: reservation questions
Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine
/** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout)
thanks /Monk
Am 06.03.2018 um 10:56 schrieb Liu, Monk:
sorry, I have some mistake in previous thread, correct it as followings.
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, *and put all shared fence from obj->fence without waiting signaling.* (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
The exclusive fence replaces all shared fences, e.g. the exclusive operation needs to wait for all shared fences before it can access the protected object. Because of this we can drop all shared fences when a new exclusive fence is set.
call reservation_object_reserve_shared, *this time obj->staged isn't NULL, and it is freed* (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through *reservation_object_add_shared_inplace*, since old->shared_count is 1 now, during *reservation_object_add_shared_inplace*()
Why would old->shared_count be 1 now? As far as I can see it should be zero.
Christian.
it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; * old_fence = rcu_dereference_protected(fobj->shared[i],* *reservation_object_held(obj));* if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(&obj->seq); preempt_enable(); dma_fence_put(old_fence); return; } if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } }
see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put.
/Monk
*From:* Liu, Monk *Sent:* Tuesday, March 6, 2018 5:45:19 PM *To:* dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian *Subject:* reservation questions Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, *and put all shared fence from obj->fence without waiting signaling.* (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, *this time obj->staged isn't NULL, and it is freed* (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, *But BUG_ON(old->shared_count >= old->shared_max) will hit !*
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine
/**
- reservation_object_wait_timeout_rcu - Wait on reservation's objects
- shared and/or exclusive fences.
- @obj: the reservation object
- @wait_all: if true, wait on all fences, else wait on just exclusive
fence
- @intr: if true, do interruptible wait
- @timeout: timeout value in jiffies or zero to return immediately
- RETURNS
- Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
- greater than zer on success.
*/ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout)
thanks /Monk
Yeah should be 0, typo sorry
I use 3dmark test and successfully triggered a case in reserve_shared:
if (obj->staged!=NULL) {
BUG();
}
kfree(obj->staged)
Previously I cannot figure out why the hell this BUG() could be hit, finally the scenario comes up,
you only need one add_excl_fence() after reserved_shared(), and next time the reserve_shared will
go hit that BUG(), but that's okay, since it only frees the obj->staged that allocated right before this calling
in the previous reserve_shared(), and no one refers to it now.
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:05:27 PM To: Liu, Monk; dri-devel@lists.freedesktop.org; Chris Wilson Subject: Re: reservation questions
Am 06.03.2018 um 10:56 schrieb Liu, Monk:
sorry, I have some mistake in previous thread, correct it as followings.
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
The exclusive fence replaces all shared fences, e.g. the exclusive operation needs to wait for all shared fences before it can access the protected object. Because of this we can drop all shared fences when a new exclusive fence is set.
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace()
Why would old->shared_count be 1 now? As far as I can see it should be zero.
Christian.
it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; old_fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(obj)); if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(&obj->seq); preempt_enable(); dma_fence_put(old_fence); return; } if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } }
see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put.
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian Subject: reservation questions
Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine
/** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout)
thanks /Monk
Oh, I see for (i = 0; i < fobj->shared_count; ++i) {
struct dma_fence *old_fence;
Since it use I < fobj->shared_count, then the wild pointer access won’t hit, please ignore question 1
The only valuable is question 2: how we treat excl fence and shared fence? e.g. when we add an excl fence to resv, how to deal with shared ? current logic is just put the all do we need to wait on their signaling before putting them ?
thanks
/Monk
From: Liu, Monk Sent: 2018年3月6日 17:57 To: dri-devel@lists.freedesktop.org; Chris Wilson chris@chris-wilson.co.uk; Koenig, Christian Christian.Koenig@amd.com Subject: Re: reservation questions
sorry, I have some mistake in previous thread, correct it as followings.
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace() it would go through the shared list, but the fence in shared list is now wild pointer:
for (i = 0; i < fobj->shared_count; ++i) {
struct dma_fence *old_fence;
old_fence = rcu_dereference_protected(fobj->shared[i],
reservation_object_held(obj));
if (old_fence->context == fence->context &&
dma_fence_is_later(fence, old_fence)) {
/* memory barrier is added by write_seqcount_begin */
RCU_INIT_POINTER(fobj->shared[i], fence);
write_seqcount_end(&obj->seq);
preempt_enable();
dma_fence_put(old_fence);
return;
}
if (!signaled && dma_fence_is_signaled(old_fence)) {
signaled = old_fence;
signaled_idx = i;
}
}
see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put.
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian Subject: reservation questions
Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4,
call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far.
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine
/**
* reservation_object_wait_timeout_rcu - Wait on reservation's objects
* shared and/or exclusive fences.
* @obj: the reservation object
* @wait_all: if true, wait on all fences, else wait on just exclusive fence
* @intr: if true, do interruptible wait
* @timeout: timeout value in jiffies or zero to return immediately
*
* RETURNS
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
* greater than zer on success.
*/
long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
bool wait_all, bool intr,
unsigned long timeout)
thanks /Monk
When adding fences to a reservation object all newly added fences must be created after all existing fences.
In other words when adding a shared fence the caller must guarantee that all existing fences of the same context must be completed before this new one. Because of this we can safely replace the existing fence with the new one.
When adding an exclusive fence the caller must guarantee that all existing fences of that reservation object must complete before this new one. Because of this we can safely throw away all existing fences as well as the old exclusive fence.
Regards, Christian.
Am 06.03.2018 um 11:10 schrieb Liu, Monk:
Oh, I see
for(i =
0; i < fobj->shared_count; ++i) {
structdma_fence *old_fence;
Since it use I < fobj->shared_count, then the wild pointer access won’t hit, please ignore question 1
The only valuable is question 2: how we treat excl fence and shared fence?
e.g. when we add an excl fence to resv, how to deal with shared ? current logic is just put the all
do we need to wait on their signaling before putting them ?
thanks
/Monk
*From:*Liu, Monk *Sent:* 2018年3月6日17:57 *To:* dri-devel@lists.freedesktop.org; Chris Wilson chris@chris-wilson.co.uk; Koenig, Christian Christian.Koenig@amd.com *Subject:* Re: reservation questions
sorry, I have some mistake in previous thread, correct it as followings.
1) considering below sequence:
call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,
call reservation_object_reserve_shared,
now obj->staged is new allocated, and its shared_max = 8, but not
used by far.
call reservation_object_add_excl_fence,
it set obj->fence->shared_count to 0, *and put all shared fence from obj->fence without waiting signaling.*
(this action looks inappropriate, I think at least before put all those shared fences
we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared,
*this time obj->staged isn't NULL, and it is freed* (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,
call reservation_object_add_shared_fence,
this time should going through *reservation_object_add_shared_inplace*,
since old->shared_count is 1 now, during *reservation_object_add_shared_inplace*()
it would go through the shared list, but the fence in shared list is now wild pointer:
for(i =
0; i < fobj->shared_count; ++i) {
structdma_fence *old_fence;
* old_fence = **rcu_dereference_protected**(fobj->**shared**[i],*
***reservation_object_held**(obj));*
if(old_fence->context== fence->context&&
dma_fence_is_later(fence, old_fence)) {
/* memory barrier is added by write_seqcount_begin */
RCU_INIT_POINTER(fobj->shared[i], fence);
write_seqcount_end(&obj->seq);
preempt_enable();
dma_fence_put(old_fence);
return;
}
if(!signaled &&
dma_fence_is_signaled(old_fence)) {
signaled = old_fence;
signaled_idx = i;
}
}
see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj)
in outside, which is obj->fence, and in add_excl_fence, all dma_fence in
obj->fence is already put.
/Monk
*From:*Liu, Monk *Sent:* Tuesday, March 6, 2018 5:45:19 PM *To:* dri-devel@lists.freedesktop.org mailto:dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian *Subject:* reservation questions
Hi Christian & Chris
two question regarding resv:
1) considering below sequence:
call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,
call reservation_object_reserve_shared,
now obj->staged is new allocated, and its shared_max = 8, but not
used by far.
call reservation_object_add_excl_fence,
it set obj->fence->shared_count to 0, *and put all shared fence from obj->fence without waiting signaling.*
(this action looks inappropriate, I think at least before put all those shared fences
we should dma_wait_fence() on them to make sure they are signaled)
call reservation_object_reserve_shared,
*this time obj->staged isn't NULL, and it is freed*(nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,
call reservation_object_add_shared_fence,
this time should going through reservation_object_add_shared_inplace,
*But BUG_ON(old->shared_count >= old->shared_max) will hit !*
This looks a design flaw in reservation object, shouldn't we fix it ?
2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old
is that correct? if excl fence is really exclusively used, why we still consider both shared fence and
excl fence on wait_timeout_rcu() routine, see blew description of this routine
/**
reservation_object_wait_timeout_rcu - Wait on reservation's objects
shared and/or exclusive fences.
@obj: the reservation object
@wait_all: if true, wait on all fences, else wait on just exclusive
fence
@intr: if true, do interruptible wait
@timeout: timeout value in jiffies or zero to return immediately
RETURNS
Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
greater than zer on success.
*/
longreservation_object_wait_timeout_rcu(structreservation_object *obj,
boolwait_all,
boolintr,
unsigned
longtimeout)
thanks
/Monk
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged;
old = reservation_object_get_list(obj); obj->staged = NULL;
if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence);
________________________________ From: Chris Wilson chris@chris-wilson.co.uk Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence);
*From:* Chris Wilson chris@chris-wilson.co.uk *Sent:* Tuesday, March 6, 2018 6:10:21 PM *To:* Liu, Monk; dri-devel@lists.freedesktop.org; Koenig, Christian *Subject:* Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from
obj->fence
without waiting signaling. (this action looks inappropriate, I think at least before put all
those shared
fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
ok, that's good point ...
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions
Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); ________________________________ From: Chris Wilson chris@chris-wilson.co.ukmailto:chris@chris-wilson.co.uk Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
Hi Christian
I use blow patch to capture the incorrect case :
@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(&obj->seq); preempt_enable();
- /* inplace update, no shared fences */ - while (i--) - dma_fence_put(rcu_dereference_protected(old->shared[i], - reservation_object_held(obj))); + /* inplace update, no shared fences continue after all shared signaled */ + while (i--) { + struct dma_fence *f = rcu_dereference_protected(old->shared[i], + reservation_object_held(obj)); + if (!dma_fence_is_signaled(f)) + BUG(); + + dma_fence_put(f); + /* better assign shared[i] with NULL for sure */ + rcu_assign_pointer(old->shared[i], NULL); + }
dma_fence_put(old_fence); + + } EXPORT_SYMBOL(reservation_object_add_excl_fence);
and I hit this BUG() during test:
[ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 0000:00:08.0 on minor 0 [ 105.623332] ------------[ cut here ]------------ [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: 0000 [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: ffff8f8a6bfa2f50 [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: ffff8f8a6635f628 [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 0000000000000001 [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 0000000000000000 [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: ffff8f8a64beee60 [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) knlGS:0000000000000000 [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 00000000001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [ 105.684574] do_syscall_64+0x6e/0x150 [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25 [ 105.687354] RIP: 0033:0x7fd13b25ff47 [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 00007fd13b25ff47 [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 0000000000000007 [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 000000000000000e [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 000000000188a430 [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 0000000000000000 [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff ff eb [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: ffffb1204159f9f0
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
Going to take a deep look ...
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 6:47 PM To: Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions
ok, that's good point ...
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions
Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); ________________________________ From: Chris Wilson chris@chris-wilson.co.ukmailto:chris@chris-wilson.co.uk Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
Hi Monk,
your check isn't correct because you still haven't understood the semantics here.
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
The semantic is NOT that all shared fences are signaled when the exclusive fence is added.
Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here.
I honestly don't know how else to explain it.
Regards, Christian.
Am 06.03.2018 um 12:03 schrieb Liu, Monk:
Hi Christian
I use blow patch to capture the incorrect case :
@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(&obj->seq); preempt_enable();
- /* inplace update, no shared fences */
- while (i--)
- dma_fence_put(rcu_dereference_protected(old->shared[i],
- reservation_object_held(obj)));
- /* inplace update, no shared fences continue after all shared
signaled */
- while (i--) {
- struct dma_fence *f =
rcu_dereference_protected(old->shared[i],
- reservation_object_held(obj));
- if (!dma_fence_is_signaled(f))
- BUG();
- dma_fence_put(f);
- /* better assign shared[i] with NULL for sure */
- rcu_assign_pointer(old->shared[i], NULL);
- }
dma_fence_put(old_fence);
} EXPORT_SYMBOL(reservation_object_add_excl_fence);
and I hit this BUG() during test:
[ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 0000:00:08.0 on minor 0 [ 105.623332] ------------[ cut here ]------------ [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: 0000 [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: ffff8f8a6bfa2f50 [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: ffff8f8a6635f628 [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 0000000000000001 [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 0000000000000000 [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: ffff8f8a64beee60 [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) knlGS:0000000000000000 [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 00000000001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [ 105.684574] do_syscall_64+0x6e/0x150 [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25 [ 105.687354] RIP: 0033:0x7fd13b25ff47 [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 00007fd13b25ff47 [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 0000000000000007 [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 000000000000000e [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 000000000188a430 [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 0000000000000000 [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff ff eb [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: ffffb1204159f9f0
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
Going to take a deep look ...
/Monk
*From:* Liu, Monk *Sent:* Tuesday, March 6, 2018 6:47 PM *To:* Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.org *Subject:* Re: reservation questions
ok, that's good point ...
*From:* Koenig, Christian *Sent:* Tuesday, March 6, 2018 6:42:44 PM *To:* Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org *Subject:* Re: reservation questions Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence);
*From:* Chris Wilson chris@chris-wilson.co.uk mailto:chris@chris-wilson.co.uk *Sent:* Tuesday, March 6, 2018 6:10:21 PM *To:* Liu, Monk; dri-devel@lists.freedesktop.org mailto:dri-devel@lists.freedesktop.org; Koenig, Christian *Subject:* Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from
obj->fence
without waiting signaling. (this action looks inappropriate, I think at least before put all
those shared
fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
You mean the caller must guarantees the excl fence will only signal till all shared fence signaled, so you can just
ignores all shared fence if add_excl_fence() is invoked.
e.g. the excl fence is with the same ctx (but later) of the one in shared list.
thanks for the explanation
/Monk
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 7:11:50 PM To: Liu, Monk Cc: dri-devel@lists.freedesktop.org; Chris Wilson Subject: Re: reservation questions
Hi Monk,
your check isn't correct because you still haven't understood the semantics here.
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG, The semantic is NOT that all shared fences are signaled when the exclusive fence is added.
Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here.
I honestly don't know how else to explain it.
Regards, Christian.
Am 06.03.2018 um 12:03 schrieb Liu, Monk:
Hi Christian
I use blow patch to capture the incorrect case :
@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(&obj->seq); preempt_enable();
- /* inplace update, no shared fences */ - while (i--) - dma_fence_put(rcu_dereference_protected(old->shared[i], - reservation_object_held(obj))); + /* inplace update, no shared fences continue after all shared signaled */ + while (i--) { + struct dma_fence *f = rcu_dereference_protected(old->shared[i], + reservation_object_held(obj)); + if (!dma_fence_is_signaled(f)) + BUG(); + + dma_fence_put(f); + /* better assign shared[i] with NULL for sure */ + rcu_assign_pointer(old->shared[i], NULL); + }
dma_fence_put(old_fence); + + } EXPORT_SYMBOL(reservation_object_add_excl_fence);
and I hit this BUG() during test:
[ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 0000:00:08.0 on minor 0 [ 105.623332] ------------[ cut here ]------------ [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: 0000 [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: ffff8f8a6bfa2f50 [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: ffff8f8a6635f628 [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 0000000000000001 [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 0000000000000000 [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: ffff8f8a64beee60 [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) knlGS:0000000000000000 [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 00000000001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [ 105.684574] do_syscall_64+0x6e/0x150 [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25 [ 105.687354] RIP: 0033:0x7fd13b25ff47 [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 00007fd13b25ff47 [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 0000000000000007 [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 000000000000000e [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 000000000188a430 [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 0000000000000000 [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff ff eb [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: ffffb1204159f9f0
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
Going to take a deep look ...
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 6:47 PM To: Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org Subject: Re: reservation questions
ok, that's good point ...
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org Subject: Re: reservation questions
Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); ________________________________ From: Chris Wilson chris@chris-wilson.co.ukmailto:chris@chris-wilson.co.uk Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
e.g. the excl fence is with the same ctx (but later) of the one in shared list.
Well it's not on the same context, but it is guaranteed to not complete before all shared fences.
See for example how it is used in amdgpu_copy_buffer(). We first sync to all fences in the reservation object:
r = amdgpu_sync_resv(adev, &job->sync, resv, AMDGPU_FENCE_OWNER_UNDEFINED, false); if (r) { DRM_ERROR("sync failed (%d).\n", r); goto error_free; }
This way the resulting fence will never signal before anything else and so can safely be used as exclusive fence in the reservation object.
Regards, Christian.
Am 06.03.2018 um 13:32 schrieb Liu, Monk:
You mean the caller must guarantees the excl fence will only signal till all shared fence signaled, so you can just
ignores all shared fence if add_excl_fence() is invoked.
e.g. the excl fence is with the same ctx (but later) of the one in shared list.
thanks for the explanation
/Monk
*From:* Koenig, Christian *Sent:* Tuesday, March 6, 2018 7:11:50 PM *To:* Liu, Monk *Cc:* dri-devel@lists.freedesktop.org; Chris Wilson *Subject:* Re: reservation questions Hi Monk,
your check isn't correct because you still haven't understood the semantics here.
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
The semantic is NOT that all shared fences are signaled when the exclusive fence is added.
Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here.
I honestly don't know how else to explain it.
Regards, Christian.
Am 06.03.2018 um 12:03 schrieb Liu, Monk:
Hi Christian
I use blow patch to capture the incorrect case :
@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(&obj->seq); preempt_enable();
- /* inplace update, no shared fences */
- while (i--)
- dma_fence_put(rcu_dereference_protected(old->shared[i],
- reservation_object_held(obj)));
- /* inplace update, no shared fences continue after all shared
signaled */
- while (i--) {
- struct dma_fence *f =
rcu_dereference_protected(old->shared[i],
- reservation_object_held(obj));
- if (!dma_fence_is_signaled(f))
- BUG();
- dma_fence_put(f);
- /* better assign shared[i] with NULL for sure */
- rcu_assign_pointer(old->shared[i], NULL);
- }
dma_fence_put(old_fence);
} EXPORT_SYMBOL(reservation_object_add_excl_fence);
and I hit this BUG() during test:
[ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 0000:00:08.0 on minor 0 [ 105.623332] ------------[ cut here ]------------ [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: 0000 [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: ffff8f8a6bfa2f50 [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: ffff8f8a6635f628 [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 0000000000000001 [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 0000000000000000 [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: ffff8f8a64beee60 [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) knlGS:0000000000000000 [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 00000000001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [ 105.684574] do_syscall_64+0x6e/0x150 [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25 [ 105.687354] RIP: 0033:0x7fd13b25ff47 [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 00007fd13b25ff47 [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 0000000000000007 [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 000000000000000e [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 000000000188a430 [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 0000000000000000 [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff ff eb [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: ffffb1204159f9f0
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
Going to take a deep look ...
/Monk
*From:* Liu, Monk *Sent:* Tuesday, March 6, 2018 6:47 PM *To:* Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.org mailto:dri-devel@lists.freedesktop.org *Subject:* Re: reservation questions
ok, that's good point ...
*From:* Koenig, Christian *Sent:* Tuesday, March 6, 2018 6:42:44 PM *To:* Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org mailto:dri-devel@lists.freedesktop.org *Subject:* Re: reservation questions Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence);
*From:* Chris Wilson chris@chris-wilson.co.uk mailto:chris@chris-wilson.co.uk *Sent:* Tuesday, March 6, 2018 6:10:21 PM *To:* Liu, Monk; dri-devel@lists.freedesktop.org mailto:dri-devel@lists.freedesktop.org; Koenig, Christian *Subject:* Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence
from obj->fence
without waiting signaling. (this action looks inappropriate, I think at least before put all
those shared
fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Yeah, I got the point, using dependency/scheduler to make sure the resulting fence always not signal before the ones hooked in dependencies
thanks
/Monk
________________________________ From: Christian K?nig ckoenig.leichtzumerken@gmail.com Sent: Tuesday, March 6, 2018 8:46:57 PM To: Liu, Monk; Koenig, Christian Cc: dri-devel@lists.freedesktop.org Subject: Re: reservation questions
e.g. the excl fence is with the same ctx (but later) of the one in shared list.
Well it's not on the same context, but it is guaranteed to not complete before all shared fences.
See for example how it is used in amdgpu_copy_buffer(). We first sync to all fences in the reservation object: r = amdgpu_sync_resv(adev, &job->sync, resv, AMDGPU_FENCE_OWNER_UNDEFINED, false); if (r) { DRM_ERROR("sync failed (%d).\n", r); goto error_free; } This way the resulting fence will never signal before anything else and so can safely be used as exclusive fence in the reservation object.
Regards, Christian.
Am 06.03.2018 um 13:32 schrieb Liu, Monk:
You mean the caller must guarantees the excl fence will only signal till all shared fence signaled, so you can just
ignores all shared fence if add_excl_fence() is invoked.
e.g. the excl fence is with the same ctx (but later) of the one in shared list.
thanks for the explanation
/Monk
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 7:11:50 PM To: Liu, Monk Cc: dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Chris Wilson Subject: Re: reservation questions
Hi Monk,
your check isn't correct because you still haven't understood the semantics here.
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG, The semantic is NOT that all shared fences are signaled when the exclusive fence is added.
Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here.
I honestly don't know how else to explain it.
Regards, Christian.
Am 06.03.2018 um 12:03 schrieb Liu, Monk:
Hi Christian
I use blow patch to capture the incorrect case :
@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(&obj->seq); preempt_enable();
- /* inplace update, no shared fences */ - while (i--) - dma_fence_put(rcu_dereference_protected(old->shared[i], - reservation_object_held(obj))); + /* inplace update, no shared fences continue after all shared signaled */ + while (i--) { + struct dma_fence *f = rcu_dereference_protected(old->shared[i], + reservation_object_held(obj)); + if (!dma_fence_is_signaled(f)) + BUG(); + + dma_fence_put(f); + /* better assign shared[i] with NULL for sure */ + rcu_assign_pointer(old->shared[i], NULL); + }
dma_fence_put(old_fence); + + } EXPORT_SYMBOL(reservation_object_add_excl_fence);
and I hit this BUG() during test:
[ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 0000:00:08.0 on minor 0 [ 105.623332] ------------[ cut here ]------------ [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: 0000 [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: ffff8f8a6bfa2f50 [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: ffff8f8a6635f628 [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 0000000000000001 [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 0000000000000000 [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: ffff8f8a64beee60 [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) knlGS:0000000000000000 [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 00000000001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [ 105.684574] do_syscall_64+0x6e/0x150 [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25 [ 105.687354] RIP: 0033:0x7fd13b25ff47 [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 00007fd13b25ff47 [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 0000000000000007 [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 000000000000000e [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 000000000188a430 [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 0000000000000000 [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff ff eb [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: ffffb1204159f9f0
the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG,
Going to take a deep look ...
/Monk
________________________________ From: Liu, Monk Sent: Tuesday, March 6, 2018 6:47 PM To: Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org Subject: Re: reservation questions
ok, that's good point ...
________________________________ From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org Subject: Re: reservation questions
Hi Monk,
that is to remove the problem that allocating memory could fail.
E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management.
So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware.
Regards, Christian.
Am 06.03.2018 um 11:19 schrieb Liu, Monk:
Hi Chris
another question is why we not just call "reservation_object_reserve_shared"
during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need
to worry when and how much time it should call reserve_shared() ?
thanks !
void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); ________________________________ From: Chris Wilson chris@chris-wilson.co.ukmailto:chris@chris-wilson.co.uk Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions
Quoting Liu, Monk (2018-03-06 09:45:19)
call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled)
No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.)
call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL,
call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit !
How? You only free staged iff shared_count < shared_max.
You've reminded me that we should cover all this with a bunch of selftests. -Chris
_______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.orgmailto:dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
dri-devel@lists.freedesktop.org