Op 20-11-12 02:03, 김승우 schreef:
Hi Maarten,
On 2012년 11월 19일 19:27, Maarten Lankhorst wrote:
Op 15-11-12 04:52, Seung-Woo Kim schreef:
Increasing ref counts of both dma-buf and gem for imported dma-buf come from gem makes memory leak. release function of dma-buf cannot be called because f_count of dma-buf increased by importing gem and gem ref count cannot be decrease because of exported dma-buf.
So I add dma_buf_put() for imported gem come from its own gem into each drivers having prime_import and prime_export capabilities. With this, only gem ref count is increased if importing gem exported from gem of same driver.
Signed-off-by: Seung-Woo Kim sw0312.kim@samsung.com Signed-off-by: Kyungmin.park kyungmin.park@samsung.com Cc: Inki Dae inki.dae@samsung.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Rob Clark rob.clark@linaro.org Cc: Alex Deucher alexander.deucher@amd.com
Mmap failiure in i915 was reported by Jani, and I think it was fixed with Dave's commit "drm/prime: add exported buffers to current fprivs imported buffer list (v2)", so I resend this patch.
I can send exynos only, but this issue is common in all drm driver having both prime inport and export, IMHO, it's better to send for all drivers.
I fear that this won't solve the issue completely and keeps open a small race.
I don't believe my patch can fix all issue related with gem prime completely. But it seems that it can solve unrecoverable memory leak caused by re-importing GEM. Anyway, can you give me an example of other race issue?
When the dma_buf is unreffed and refcount drops to 0, work is queued to free it.
Until then export_dma_buf member points to something, but refcount is 0 on it.
Importing to itself, then closing fd then re-export from the new place would probably trigger it.
I don't like the current destruction path either. It really looks like export_dma_buf should be unset when the exported buffer is closed in the original file. Anything else is racy. Especially setting export_dma_buf to NULL won't work with the current delayed fput semantics.
I cannot figure out all aspect of delayed fput, but until delayed work is called, dma_buf is not released so export_dma_buf is valid. Considering this, I can't find possible issue of delayed fput.
I'm fairly sure that when refcount drops to 0, even though the memory may not be freed yet, you no longer have a guarantee that the memory is still valid.
And of course after importing into a different process with its own drm namespace, how does file_priv->prime.lock still protect serialization?
I don't see any hierarchy either. export_dma_buf needs to be set to NULL before the dma_buf_put is called that is used for the reference inside export_dma_buf.
The release function should only release a reference to whatever backing memory is used.
So to me it seems instead we should do something like this:
dmabuf_release callback is a noop, no ref is held by the dma-buf.
attach and detach ops increase reference by 1*.
when the buffer is exported, export_dma_buf is set by core code and kept around alive until the gem object is closed.
dma_buf_put is not called by import function. This reference removed as part of gem object close instead.
Considering re-import, where drm file priv is different from exporter's file priv, attach and detach are not called because gem object is reused.
How do you think remove checking whether dma_buf is came from driver own exporter? Because without this checking, gem object is newly created, and then re-import issue will disappear.
The whole refcounting is a circular mess, sadly with no easy solution. :/
It seems to me that using gem reference counts is solving this problem at the wrong layer. A secondary type of reference count is needed for the underlying memory backing.
For those who use ttm, I would recommend that the dma_buf increases refcount on ttm_bo by one, so that the gem object can be destroyed and release its reference on the dma_buf.
WIthout a secondary refcount you end up in a position where you can't reliably and race free unref the gem object and dma_buf at the same time, since they're both independent interfaces with possibly different lifetimes.
It would really help if there were hard rules on lifetime of the export_dma_buf member, until then we're just patching one broken thing with another.
~Maarten