On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
On the other hand, the following shows how we could enhance the conventional way with my approach (just an example):
    CPU -> DMA,
        ioctl(qbuf command)              ioctl(streamon)
              |                                |
              |                                |
        qbuf <- dma_buf_sync_get         start streaming <- syncpoint
dma_buf_sync_get just registers a sync buffer (dmabuf) with a sync object. The syncpoint is performed by calling dma_buf_sync_lock(), after which the DMA accesses the sync buffer.
And DMA -> CPU,
        ioctl(dqbuf command)
              |
              |
        dqbuf <- nothing to do
The actual syncpoint is when the DMA operation completes (in the interrupt handler): it is performed by calling dma_buf_sync_unlock(). Hence, my approach is to move the syncpoints to just before the DMA access as far as possible.
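For readers following the thread, the ordering described above can be modelled in plain userspace C. This is only a sketch: the toy_* names and struct toy_sync are hypothetical stand-ins invented for illustration, and they are not the signatures of dma_buf_sync_get(), dma_buf_sync_lock() or dma_buf_sync_unlock() from the patch set. It shows nothing more than the sequence: register at qbuf, lock at streamon, unlock on DMA completion, nothing at dqbuf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a sync object; not the kernel structure. */
struct toy_sync {
    const void *buf;   /* registered buffer, or NULL if none */
    bool locked;       /* true while the device owns the buffer */
};

/* qbuf: only record the buffer; no synchronization happens yet. */
static void toy_sync_get(struct toy_sync *s, const void *buf)
{
    s->buf = buf;
}

/* streamon: the syncpoint, taken right before the device touches
 * the buffer. Fails if nothing is registered or it is already held. */
static bool toy_sync_lock(struct toy_sync *s)
{
    if (s->buf == NULL || s->locked)
        return false;
    s->locked = true;
    return true;
}

/* DMA-complete interrupt: hand the buffer back to the CPU side. */
static void toy_sync_unlock(struct toy_sync *s)
{
    assert(s->locked);
    s->locked = false;
}

/* dqbuf: nothing to do; the CPU may use the buffer once unlocked. */
static bool toy_cpu_may_access(const struct toy_sync *s)
{
    return !s->locked;
}
```

Note that this model only tracks ownership. It says nothing about cache maintenance, which is the subject of the reply below.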
What you've just described does *not* work on architectures such as ARMv7 which do speculative cache fetches from memory at any time that that memory is mapped with a cacheable status, and will lead to data corruption.
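One failure mode behind this objection can be shown with a toy model of a single cache line. This is a deliberately simplified sketch, assuming a non-coherent device that writes straight to RAM; struct toy_line and the toy_* helpers are hypothetical and do not correspond to any kernel API. It illustrates only the stale-read case: a speculative fill during the DMA window leaves old data in the cache, so the CPU reads stale contents unless the line is invalidated after the DMA completes.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_LINE_SIZE 8

/* Hypothetical model: one cache line and the RAM behind it. */
struct toy_line {
    unsigned char ram[TOY_LINE_SIZE];    /* what the device sees */
    unsigned char cache[TOY_LINE_SIZE];  /* the CPU's cached copy */
    bool valid;                          /* is the cached copy live? */
};

/* Fill the line from RAM if it is not already cached. With a
 * cacheable mapping the hardware may do this speculatively at any
 * time, not only when the program reads the buffer. */
static void toy_fill_line(struct toy_line *l)
{
    if (!l->valid) {
        memcpy(l->cache, l->ram, TOY_LINE_SIZE);
        l->valid = true;
    }
}

/* Device write: goes to RAM and bypasses the CPU cache. */
static void toy_dma_write(struct toy_line *l, unsigned char v)
{
    memset(l->ram, v, TOY_LINE_SIZE);
}

/* Cache invalidate: drop the cached copy. */
static void toy_invalidate(struct toy_line *l)
{
    l->valid = false;
}

/* CPU read: served from the cache, filling on a miss. */
static unsigned char toy_cpu_read(struct toy_line *l)
{
    toy_fill_line(l);
    return l->cache[0];
}
```

In this model, calling toy_invalidate() before the DMA, then toy_fill_line() while the DMA is in flight, then toy_dma_write(), makes toy_cpu_read() return the old byte. Only a second toy_invalidate() after the DMA has finished makes the new data visible.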