KMS timings (Re: [PATCH 6/8] drm/bochs: phase 3: provide a custom ->atomic_commit implementation)

20 Jul 2015

On Mon, 20 Jul 2015 01:58:33 -0700
Stéphane Marchesin stephane.marchesin@gmail.com wrote:
...
On Mon, Jul 20, 2015 at 12:46 AM, Pekka Paalanen ppaalanen@gmail.com wrote:
...
On Sun, 19 Jul 2015 17:20:32 -0700
Stéphane Marchesin stephane.marchesin@gmail.com wrote:
...
On Thu, Jul 16, 2015 at 11:08 PM, Pekka Paalanen ppaalanen@gmail.com wrote:
...
On Thu, 16 Jul 2015 20:20:39 +0800
John Hunter zhjwpku@gmail.com wrote:
...
From: Zhao Junwang zhjwpku@gmail.com
This supports the asynchronous commits, required for page-flipping.
Since it's virtual hw it's ok to commit async stuff right away, we
never have to wait for vblank.
Hi,
in theory, yes. This is what a patch to bochs implemented not too long
ago, so AFAIK you are only replicating the existing behaviour.
However, if userspace doing an async commit (or sync, I suppose) does
not incur any waits in the kernel, e.g. in sending the page flip event,
then flip-driven programs (e.g. a Wayland compositor such as Weston)
will run their rendering loops as busy-loops, because the kernel
does not throttle them to the (virtual) display refresh rate.
This will cause maximal CPU usage and poor user experience as
everything else needs to fight for CPU time and event dispatch to get
through, like input.
I would hope someone could do a follow-up to implement a refresh cycle
emulation based on a clock. Userspace expects page flips to happen at
most at refresh rate when asking for vblank-synced flips. It's only
natural for userspace to drive its rendering loop based on the vblank
cycle.
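A minimal sketch of what such clock-based refresh emulation could compute, assuming a fixed nominal refresh rate and a monotonic nanosecond clock. The function name, the millihertz unit and the epoch parameter are illustrative assumptions and do not come from any driver discussed in this thread; a real driver would arm a kernel timer for the returned time and send the flip event from there.

```python
NSEC_PER_SEC = 1_000_000_000


def next_emulated_vblank_ns(now_ns: int, epoch_ns: int, refresh_mhz: int) -> int:
    """Return the first emulated vblank time strictly after now_ns.

    epoch_ns is an arbitrary reference point at which an emulated vblank
    is defined to have happened (hypothetical, e.g. the time of the modeset).
    refresh_mhz is the nominal refresh rate in millihertz (60000 == 60 Hz).
    """
    if refresh_mhz <= 0:
        raise ValueError("refresh rate must be positive")
    # Length of one emulated refresh cycle, rounded down to whole nanoseconds.
    period_ns = NSEC_PER_SEC * 1000 // refresh_mhz
    # Number of whole cycles already completed since the epoch.
    completed = (now_ns - epoch_ns) // period_ns
    # The flip completes at the next cycle boundary, never at now_ns itself.
    return epoch_ns + (completed + 1) * period_ns
```

Delaying the page flip event until this time would throttle a flip-driven rendering loop to the nominal rate instead of letting it spin.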
I've been asking myself the same question (for the UDL driver) and I'm
not sure if this policy should go in the kernel. After all, there
could be legitimate reasons for user space to render lots of frames
per second. It seems to me that if user space doesn't want too many
fps, it should just throttle itself.
If userspace wants to render lots of frames per second, IMO it should
not be using vblank-synced operations in a way that may throttle it.
The lots-of-frames use case is already non-working for the majority of
the drivers without DRM_MODE_PAGE_FLIP_ASYNC, right?
The problem I see here is that one DRM driver decides to work differently
from other DRM drivers. All real-hardware DRM drivers, when asked to do
a vblank-synced update, actually do throttle to the vblank AFAIK.
udl is an exception here. It is (arguably) real hardware but doesn't throttle.
...
Is it too much to assume that the video mode set in a driver (refresh
rate) corresponds to the vblank rate, which implicitly delays the
completion of vblank-sync'd operations to at least the next vblank
boundary?
I think it's wrong to make user space think that a vsynced display
always matches the refresh rate in a world where:

- some displays have variable refresh rates (not just the fancy new
  stuff like g-sync, look for lvds_downclock in the intel driver for
  example, also consider DSI displays)

- some displays have no refresh rate (the ones we are talking about
  here: udl, bochs...)

That means that refresh rate in a video mode is bogus. Can userspace
know when the refresh rate is meaningless? I suppose there are two
different cases of meaningless, too: when the driver ignores it as
input argument, and when it is used but has no guarantees for timings.
Assuming it's always meaningless wrt. timings is pretty harsh. E.g. the
Wayland Presentation extension's implementation in Weston uses the
refresh rate to predict the next flip time and hands it out to clients
for scheduling/interpolation purposes.
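The kind of prediction described here can be sketched as follows. This is not Weston's code: the function name and parameters are hypothetical, and the sketch rests on exactly the assumption under debate in this thread, namely that flips complete on a fixed grid spaced by the nominal refresh period.

```python
NSEC_PER_SEC = 1_000_000_000


def predict_flip_ns(last_flip_ns: int, refresh_mhz: int, ready_ns: int) -> int:
    """Predict when a frame that is ready at ready_ns would be flipped.

    last_flip_ns is the timestamp of the most recent page flip event,
    refresh_mhz the nominal refresh rate in millihertz (an assumed unit).
    The prediction is only valid if flips really happen once per period.
    """
    if refresh_mhz <= 0:
        raise ValueError("refresh rate must be positive")
    period_ns = NSEC_PER_SEC * 1000 // refresh_mhz
    # Whole periods needed to reach ready_ns, rounded up (ceiling division).
    periods = -(-(ready_ns - last_flip_ns) // period_ns)
    # The next flip is always at least one period after the last one.
    periods = max(periods, 1)
    return last_flip_ns + periods * period_ns
```

If the driver completes flips immediately, or at a variable rate, the value returned here has no relation to the actual flip time, which is the problem raised above.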
...

- you can do partial vsynced updates by just waiting for a specific
  scanline range, which again breaks the assumption that "vsynced" ==
  "refreshes at the monitor rate". In this case there is no visible
  tearing (in that sense it is vsynced) but the flip time is not
  predictable using the refresh rate.

Okay. That also invalidates the design (well, at least the
implementation, and it sounds like DRM does not give any tools to allow
implementing it) of the Wayland Presentation extension even on "good"
hardware, so nice to realize. I was already suggesting we should
stabilize it since it looks good, but this puts it all back to the
drawing board.
I think it also mostly invalidates the whole scheduling implementation
in Weston.
...
So I don't think we should perpetuate that problem. And I would like
user space to "see" the actual flip times to enable some kind of
scheduling where possible.
...
I think, if the driver cannot implement proper semantics (which IMO
includes the throttling) for vblank-sync'd operations and it does not
want to fake them with a clock, it should just refuse vblank-synced
operations.
Yes, refusing vsynced flips for these drivers sounds reasonable. But
please let's not bake in another assumption in the API (or rather,
let's try to un-bake it).
Could you be more specific on everything, please?
What should drivers do in different situations, what guarantees do we
have, and how does userspace predict the earliest possible flip time?
How do you define flip time to begin with, if it's not tied to the
scanout cycle (vblank)?
How should a compositor schedule everything, and what can it tell the
clients about the timings in the immediate future?
You gave me the feeling that everything I thought I knew and relied on
is wrong.
...
...
That would push the problem to userspace, and it would be
obvious what's going wrong. Naturally, it would break some userspace
programs that expect vblank-synced operations to work, but is that
so different from the current unfixed situation?
Thanks,
pq
