On Thu, Jul 18, 2013 at 4:54 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
Hi
On Thu, Jul 18, 2013 at 1:24 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
On 07/18/2013 01:07 PM, David Herrmann wrote:
Hi
On Thu, Jul 18, 2013 at 10:53 AM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
A quick look, but not a full review:
Looks mostly good, but it seems the TTM vm lock isn't needed at all anymore (provided the vma offset manager is properly protected), since kref_get_unless_zero() is used when a reference is taken after lookup (please see the kref_get_unless_zero() documentation examples). By the time drm_vma_offset_remove() is called, the kref value is unconditionally zero, and that should block any successful lookup.
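[For illustration, the lockless lookup Thomas is describing might look roughly like this; the field names (vma_manager in ttm_bo_device, vma_node and kref in ttm_buffer_object) follow the patchset, but the function itself is a hypothetical sketch, not code from the series:]

static struct ttm_buffer_object *
ttm_bo_lookup_lockless(struct ttm_bo_device *bdev,
		       unsigned long offset, unsigned long pages)
{
	struct drm_vma_offset_node *node;
	struct ttm_buffer_object *bo = NULL;

	node = drm_vma_offset_lookup(&bdev->vma_manager, offset, pages);
	if (node) {
		bo = container_of(node, struct ttm_buffer_object, vma_node);
		/* fails iff the last reference is already gone */
		if (!kref_get_unless_zero(&bo->kref))
			bo = NULL;
	}
	return bo;
}

[As David explains below, this is only safe if something keeps the bo's memory valid between the lookup and the kref_get_unless_zero() call.]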
If we drop vm_lock without modifying TTM, this will not work. Even kref_get_unless_zero() needs some guarantee that the object is still valid. Assume this scenario:
drm_vma_offset_lookup() returns a valid node; however, before we call kref_get_*(), another thread destroys the last reference to the bo, so it gets kfree()d. kref_get_unless_zero() will thus reference memory which can be _anything_ and is not guaranteed to stay 0. (Documentation/kref.txt mentions this, too, and I guess it was you who wrote that.)
I cannot see any locking around the mmap call that could guarantee this except vm_lock.
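[For comparison, the pattern the kref_get_unless_zero() documentation recommends keeps a lock held across both steps; with TTM's existing rwlock-based vm_lock that would look roughly like this (again an illustrative sketch, not the actual patch):]

static struct ttm_buffer_object *
ttm_bo_lookup_ref(struct ttm_bo_device *bdev,
		  unsigned long offset, unsigned long pages)
{
	struct drm_vma_offset_node *node;
	struct ttm_buffer_object *bo = NULL;

	read_lock(&bdev->vm_lock);
	node = drm_vma_offset_lookup(&bdev->vma_manager, offset, pages);
	if (node) {
		bo = container_of(node, struct ttm_buffer_object, vma_node);
		/* vm_lock keeps the bo's memory valid until we hold a ref */
		if (!kref_get_unless_zero(&bo->kref))
			bo = NULL;
	}
	read_unlock(&bdev->vm_lock);
	return bo;
}

[The destruction side then takes write_lock(&bdev->vm_lock) around drm_vma_offset_remove() before freeing the bo, so a concurrent lookup either fails kref_get_unless_zero() or does not find the node at all.]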
You're right. My apologies. Parental leave has taken its toll.
Otherwise, if the vma offset manager always needs external locking to make lookup + referencing atomic, then perhaps locking should be completely left to the caller?
I would actually prefer to keep it in the VMA manager. It allows some fast paths which would otherwise need special semantics for the caller (for instance, see the access-management patches based on this patchset, or the reduced locking during setup/release). We'd also require the caller to use rwsems to get good performance, which is not the case for GEM.
So how about Daniel's proposal to add an _unlocked() version and provide _lock_lookup() and _unlock_lookup() helpers which just wrap read_lock() and read_unlock() (see the sketch below)?
Thanks! David
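[A sketch of what Daniel's proposal could look like; the helper names come from the mail above, while the rwlock inside the manager and the exact _unlocked() signature are assumptions:]

static inline void
drm_vma_offset_lock_lookup(struct drm_vma_offset_manager *mgr)
{
	read_lock(&mgr->vm_lock);
}

static inline void
drm_vma_offset_unlock_lookup(struct drm_vma_offset_manager *mgr)
{
	read_unlock(&mgr->vm_lock);
}

/*
 * Caller must hold the lookup lock and keep holding it until it has
 * taken its own reference on the object embedding the returned node.
 */
struct drm_vma_offset_node *
drm_vma_offset_lookup_unlocked(struct drm_vma_offset_manager *mgr,
			       unsigned long start, unsigned long pages);

[A caller that needs lookup + referencing to be atomic would then bracket the two itself:

	drm_vma_offset_lock_lookup(mgr);
	node = drm_vma_offset_lookup_unlocked(mgr, offset, pages);
	/* container_of() to the object, kref_get_unless_zero(), ... */
	drm_vma_offset_unlock_lookup(mgr);

while the plain drm_vma_offset_lookup() keeps taking the lock internally.]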
I think that if there are good reasons to keep locking internal, I'm fine with that (and also, of course, with Daniel's proposal). Currently the add/remove/lookup paths are mostly used by TTM during object creation and destruction.
However, if the lookup path is ever used by pread / pwrite, that situation might change and we would then like to minimize the locking.
I tried to keep the change as minimal as I could. Follow-up patches are welcome. I just thought pushing the lock into drm_vma_* would simplify things. If there are benchmarks that prove me wrong, I'll gladly spend some time optimizing that.
Apart from this, I'd like to see someone ack the ttm_buffer_object_transfer() changes. I am not very confident about that part. Everything else should be trivial.
Thanks David
Looks good to me. The transfer object must have an empty vm_node/vma_node, since we are only interested in tying the system RAM or VRAM to the ghost object so that delayed delete frees the VRAM or system RAM while the original buffer stays valid; the original's vm_node/vma_node must therefore not be freed (see the sketch below).
Cheers, Jerome
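[To make Jerome's point concrete, a sketch of the relevant part of ttm_buffer_object_transfer() under the patchset: the ghost object is a shallow copy of the original bo, so the copied vma_node has to be reset rather than shared. drm_vma_node_reset() and the exact surroundings are assumptions here, with the unrelated re-initialization elided:]

static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
				      struct ttm_buffer_object **new_obj)
{
	struct ttm_buffer_object *fbo;

	fbo = kmalloc(sizeof(*fbo), GFP_KERNEL);
	if (!fbo)
		return -ENOMEM;

	*fbo = *bo;	/* shallow copy: ghost shares the backing storage */

	/*
	 * The ghost only exists so delayed delete can free the VRAM or
	 * system RAM; it must not inherit the original's mmap offset
	 * node, or the offset would be released while the original bo
	 * is still alive.
	 */
	drm_vma_node_reset(&fbo->vma_node);

	/* ... re-init kref, list heads, sync object, etc. ... */

	*new_obj = fbo;
	return 0;
}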