Hi Brian,
On Mon, Mar 18, 2019 at 04:59:21PM +0000, Brian Starkey wrote:
Hi Laurent,
Sorry for the delay, I was travelling last week and didn't find a chance to digest your diagrams (thanks for all the detail!)
No worries, and thank you for the time you're spending on this. In the meantime I've found a solution that solves the race, using an entirely different mechanism. It's all explained in v6 of the patch series.
On Fri, Mar 08, 2019 at 02:24:40PM +0200, Laurent Pinchart wrote:
On Thu, Mar 07, 2019 at 12:28:08PM +0000, Brian Starkey wrote:
On Wed, Mar 06, 2019 at 08:22:44PM +0200, Laurent Pinchart wrote:
[snip]
I can always queue a new one, but I have no way of telling if the newly queued list raced with the frame end interrupt.
There's another register I'm not using that contains a shadow copy of the registers list DMA address. It mirrors the address programmed by the driver when there is no DMA of the registers list in progress, and contains the address of the registers list being DMA'ed when a DMA is in progress. I don't think I can do much with it, as reading it either before or after reprogramming (to check whether I can reprogram, or whether the reprogramming has been taken into account) is racy.
When you say it mirrors the address programmed, is that latched when the update is accepted, or it will update the moment you program the address register?
It is latched when the update is processed, at the next vblank following the address programming. The timing diagram below shows vblank, the UPD bit I use for synchronization, the registers list address register exposed to the CPU, and the internal address used to DMA the registers list. The address register is written four times to show how the hardware reacts, but in practice there will be a single write per frame, right after the beginning of vblank.
                               DMA starts   DMA ends
                                    |           |
                                    V           V
                                     ___________
VBLANK        ______________________|           |________
                    ________________     ________________
UPD           _____|                |___|
               _____ ______ _____________ ________ _______
ADDR register __A__X__B___X______C______X___D____X___E___
              ______________________ ____________________
ADDR internal ___________A__________X__________C_________
I can reprogram the address any number of times I want before the vblank, but the update bit mechanism only lets me protect the race related to the first write. For any subsequent write I won't be able to tell whether it was complete before the hardware started the frame, or arrived too late.
I assume the latter or you would have thought of this yourself (that seems like a really strange/not-useful behaviour!). But if it is the former you could:
- In writeback start-of-frame, create a copy of the register list, disabling writeback.
- Write the address of this copy to the register
If/when an atomic commit comes in before you service the next end-of-frame:
- Write the address of the new register list
- Check the status register. If the "pending" bit is zero, you know you had a potential race.
- Check the DMA address register. If it corresponds to the new scene, the new commit won the race, otherwise it's been delayed by a frame.
The issue here is that there's a potential race if the pending (UPD) bit reads one too. If the commit arrives just before vblank, but the address is written just after vblank (after the UPD bit has been reset to 0 by the hardware, but before the vblank interrupt is processed), then the commit won't be applied yet, and I will have no way to know. Compare the two following scenarios:
                  [1]       [2]            [3]     [4]       [5]
                   |         |              |       |         |
                   V         V              V       V         V
                              _________                         _________
VBLANK        _______________|         |______________________|         |__
                    _________                _________________
UPD           _____|         |______________|                 |____________
               _____ ________________________ _______ ______________________
ADDR register __A__X___________B____________X___C___X__________D___________
              _______________ ________________________________ ____________
ADDR internal _______A_______X_______________B________________X_____D______

[1] Atomic commit, registers list address write, with writeback enabled
[2] DMA starts for first atomic commit
[3] Writeback disable registers list address write
[4] Next atomic commit, with writeback disabled
[5] DMA starts for second atomic commit
                  [1]       [2]            [3]               [4] [5]
                   |         |              |                 |   |
                   V         V              V                 V   V
                              _________                         _________
VBLANK        _______________|         |______________________|         |__
                    _________                _________________     ________
UPD           _____|         |______________|                 |___|
               _____ ________________________ _____________________ ________
ADDR register __A__X___________B____________X__________C__________X___D____
              _______________ ________________________________ ____________
ADDR internal _______A_______X_______________B________________X_____C______

[1] Atomic commit, registers list address write, with writeback enabled
[2] DMA starts for first atomic commit
[3] Writeback disable registers list address write
[4] DMA starts for writeback disable registers list (3)
[5] Next atomic commit, with writeback disabled, performed right after
    vblank but before the vblank interrupt is serviced
The UPD bit is 1 after writing the ADDR register the second time in both cases. Furthermore, if [4] and [5] are very close in the second case, the UPD bit may read 1 just before [5] if the read comes before [4]:
    read UPD bit;
    /* VBLANK [4] */
    write ADDR register;
I thus can't rely on UPD = 1 before the write meaning that the write was performed before vblank, and I can't rely either on the UPD bit after write, as it's 1 in both cases.
My mistake, I got the UPD bit the wrong way around. I'm still not entirely sure why you can't use "ADDR internal" to determine which side won the race. It shows 'B' in the first case, and 'C' in the second.
Because ADDR internal isn't available to the CPU :-(
When a new commit comes, unconditionally:
- Write new address
- Read status
if status.UPD == 0 --> You know for sure your new commit was just latched.
if status.UPD == 1 --> You need to check ADDR internal to see which of
                       these three happened:

1) Your first case happened. We're somewhere in the middle of the frame.
   ADDR internal will show 'B', and you know commit 'D' is going on-screen
   at the next vblank.
2) Your second case happened. The new commit raced with the latching of
   writeback-disable and "lost". ADDR internal will show 'C', and the new
   commit is delayed by a frame.
3) (Teeny tiny small window) In-between reading status and ADDR internal,
   the new commit was latched. ADDR internal will show 'D'. You know the
   new commit "won" so treat it the same as if UPD == 0 (which it will
   be, now).
Anyway, it's all moot now that you've found the chained lists thing - that sounds ideal. I'll take a look at the new series shortly.
It's a neat hardware feature, yes. We were already using it for a different purpose, I should have thought about it for writeback too from the very beginning.
Please note I've sent a pull request for v7 as it has been fully reviewed. Nonetheless, if you find issues, I can fix them on top.
I initially thought I could protect against the race using the following procedure.
Single atomic commit, no IRQ delay
                  [1]       [2]            [3]               [4]
                   |         |              |                 |
                   V         V              V                 V
                              _________                         _________
VBLANK        _______________|         |______________________|         |__
                    _________
UPD           _____|         |_____________________________________________
               _____ _________________________________________________________
ADDR register __A__X___________________________B___________________________
              _______________ _______________________________________________
ADDR internal _______A_______X______________________B______________________

[1] Atomic commit, registers list address write, with writeback enabled
[2] DMA starts for first atomic commit
[3] Frame start, disable writeback in-place in registers list B
[4] DMA starts for "patched" registers list, disables writeback
Two atomic commits, no IRQ delay
                  [1]       [2]            [3]      [4]      [5]
                   |         |              |        |        |
                   V         V              V        V        V
                              _________                         _________
VBLANK        _______________|         |______________________|         |__
                    _________                _________________
UPD           _____|         |______________|                 |____________
               _____ ________________________ ______________________________
ADDR register __A__X___________B____________X______________C_______________
              _______________ ________________________________ ____________
ADDR internal _______A_______X_______________B________________X_____C______

[1] Atomic commit, registers list address write, with writeback enabled
[2] DMA starts for first atomic commit
[3] Next atomic commit, registers list address write, with writeback disabled
[4] Frame start, disable writeback in-place in registers list B
[5] DMA starts for second atomic commit, disables writeback
[3] and [4] can happen in any order, as long as they both come before [5]. If [3] comes after [5], we're back to the previous case (Single atomic commit, no IRQ delay).
Single atomic commit, IRQ delay
                  [1]       [2]            [3]               [4]  [5]
                   |         |              |                 |    |
                   V         V              V                 V    V
                              _________                         _________
VBLANK        _______________|         |______________________|         |__
                    _________
UPD           _____|         |_____________________________________________
               _____ _________________________________________________________
ADDR register __A__X___________________________B___________________________
              _______________ _______________________________________________
ADDR internal _______A_______X______________________B______________________

[1] Atomic commit, registers list address write, with writeback enabled
[2] DMA starts for first atomic commit
[3] Frame start, IRQ is delayed to [5]
[4] DMA starts for unmodified registers list, writeback still enabled
[5] Disable writeback in-place in registers list B, too late
Here I need to detect that [5] was delayed until after [4], and thus delay the completion of the writeback job by one frame. This could be done by checking the vblank interrupt status bit: if it is set, then vblank occurred and raced with [5]. However, the same condition can also arise when no race occurred, if processing of the vblank interrupt for [2] is delayed until [3]. Both the vblank interrupt and the frame start interrupt status bits will then be set, indicating a potential race.
The time between [2] and [3] is very short compared to the time between [3] and [4] and to interrupt latency in general, so we would have lots of false positives.
You don't happen to have a DMA engine trigger or something you could use to do the register list modification at a guaranteed time do you?
Not that I know of, unfortunately.
Are you always going to be protected by an IOMMU, preventing the writeback from trashing physical memory? If that's not the case, then the race can have pretty dire consequences.
If the IOMMU is enabled in DT, yes. It's a system-level decision.
Well, it's your driver at the end of the day. But for me, a known race-condition which would cause random memory corruption sounds like a really Bad Thing. Halving frame-rate on systems with no IOMMU seems preferable to me.
It is a really bad thing. I think the decision should be taken by Renesas though.