People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org --- drivers/gpu/drm/drm_stub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c index c236fd2..a1cbc0e 100644 --- a/drivers/gpu/drm/drm_stub.c +++ b/drivers/gpu/drm/drm_stub.c @@ -40,7 +40,7 @@ unsigned int drm_debug = 0; /* 1 to enable debug output */ EXPORT_SYMBOL(drm_debug);
-unsigned int drm_vblank_offdelay = 5000; /* Default to 5000 msecs. */ +unsigned int drm_vblank_offdelay = 50; /* Default to 50 msecs. */ EXPORT_SYMBOL(drm_vblank_offdelay);
unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
Nacked-by: Daniel Vetter daniel@ffwll.ch
Also discussed with Jesse on irc, we've had fun ;-)
On Tue, 30 Oct 2012 20:20:44 +0100 Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
Nacked-by: Daniel Vetter daniel@ffwll.ch
Also discussed with Jesse on irc, we've had fun ;-)
That's ridiculous. Just because we have a race we can't fix wrt reading hw regs, doesn't mean we can't reduce the timeout.
I nack your nack.
On 30.10.12 20:28, Jesse Barnes wrote:
On Tue, 30 Oct 2012 20:20:44 +0100 Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
Nacked-by: Daniel Vetter daniel@ffwll.ch
Also discussed with Jesse on irc, we've had fun ;-)
That's ridiculous. Just because we have a race we can't fix wrt reading hw regs, doesn't mean we can't reduce the timeout.
I nack your nack.
Jesse, thanks for cc'ing me, much appreciated :)
Psychtoolbox should be fine with a 50 msecs vblank off delay. I think i tested with values somewhere between 50 - 100 msecs at the time the drm patches were made. The way i schedule swaps for a specific target time 'twhen' is to:
1. glXGetSyncValuesOML(msc,ust) as a baseline vblank count and time. 2. Some basic math to calculate targetMSC based on 'twhen' and step 1. 3. glXSwapBuffersMscOML(targetMSC);
So it should tolerate wrong vblank counts due to races, as long as vblank doesn't get disabled between 1. and 3. A default vblank off delay lower than maybe 1 refresh duration would make me nervous, given that most kms drivers are not race-free and it is possible for a userspace app or the x-server to get stalled for a couple of msecs.
E.g., radeon wouldn't allow this race-free, because the drm code assumes that the hw vblank counter increments at leading edge of vblank, whereas radeon hw does it at the traling edge, iirc. nouveau doesn't have any hw vblank counter support, so it is "race free" until somebody fixes the driver ;-) - ie., it would work with apps that use steps 1/2/3 above, but broken for any other app.
More trusting apps could get into more trouble with 50 msecs vs. 5000 msecs ...
What kind of race in your code do you mean? What does your commit message "...now that we've dealt with the hw races in Mario's updated code..." refer to?
I remember that Matthew Garret sent out a patch set a couple of months ago which aimed to make this configurable per kms driver, so each driver could opt into low vblank off delay if it was ready to do it properly. That would make a lot of sense.
That said, for the current intel-ddx it wouldn't even matter for X11/GLX applications if the kernel returned random numbers instead of vblank counts, because the OML_sync_control swap scheduling in the ddx is totally broken, and even glXSwapBuffers is a wreck for any swapinterval setting except 0 and 1 since triple-buffering was introduced a year ago and introduced some bug. Psychtoolbox now detects this since a few weeks, prints out critical warnings to the user and tries to work around with glXSwapBuffers trickery.
I submitted a patch to fix the rather trivial but nasty bug. Jesse, if you find some time, could you review it and convince Chris to merge it? Should be on the intel-gfx list and in your inbox, a simple . Subject line "[PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.", sent at 7 October 2012.
I'll forward it again to you. Chris Wilson already gave me an off-list review for that patch series after i pinged him. Unfortunately a rather terse one, cited here:
"On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
Hi Chris,
can you please check & merge at least the first two patches of the series into the intel ddx?
The first is along the right path, but I think you want to base the split on divisor==0. The second is broken in the DRI2 layer, as the SwapLimit mechanism exposes the multi-client race to a single client (i.e. there is no serialisation between the InvalidateEvent and the GetBuffers, and the InvalidateEvent is always broadcast too early.) The third is just plain wrong, as pageflip may be a blocking ioctl (and will impose client blocking) so anybody who wants to override SwapBuffersWait will also want to prevent the inherent wait due to the flip. In any event that option is to only paper over the loose DRI2 specification and the lack of client control... -Chris "
If i understood the comment wrt. to patch [2/3] "The first is along the right path, but I think you want to base the split on divisor==0." correctly, then that would be just as broken as the current implementation, with some code obfuscation and confusion added, but i don't know, maybe i just misunderstood his comment? I asked for clarification, but he didn't respond to my followup e-mails. Maybe he was just too busy to reply, maybe i'm just too pushy and annoying.
I will try to make some friendship with piglit and see if i can write some tests when i find some time to catch such bugs earlier in the future.
Also, i take from Chris comments about patch [1/3] above that properly implemented triple-buffering with correct timestamping and synchronisation is impossible with the current X-Servers, because the DRI2 invalidate event mechanism is racy and broken to the point of not being fixable? Basically we would have to wait for DRI3?
Sorry for getting a bit off-topic here, but i think it's important to keep all layers working in a meaningful way if you want to support some rather serious applications of computer graphics.
thanks, -mario
On Fri, 02 Nov 2012 06:56:57 +0100 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
Jesse, thanks for cc'ing me, much appreciated :)
Psychtoolbox should be fine with a 50 msecs vblank off delay. I think i tested with values somewhere between 50 - 100 msecs at the time the drm patches were made. The way i schedule swaps for a specific target time 'twhen' is to:
- glXGetSyncValuesOML(msc,ust) as a baseline vblank count and time.
- Some basic math to calculate targetMSC based on 'twhen' and step 1.
- glXSwapBuffersMscOML(targetMSC);
So it should tolerate wrong vblank counts due to races, as long as vblank doesn't get disabled between 1. and 3. A default vblank off delay lower than maybe 1 refresh duration would make me nervous, given that most kms drivers are not race-free and it is possible for a userspace app or the x-server to get stalled for a couple of msecs.
E.g., radeon wouldn't allow this race-free, because the drm code assumes that the hw vblank counter increments at leading edge of vblank, whereas radeon hw does it at the traling edge, iirc. nouveau doesn't have any hw vblank counter support, so it is "race free" until somebody fixes the driver ;-) - ie., it would work with apps that use steps 1/2/3 above, but broken for any other app.
Right, ok that clarifies things for me. I knew you had some kind of race between reading the hw value and scheduling things. Is the above in a comment somewhere in the vblank code?
More trusting apps could get into more trouble with 50 msecs vs. 5000 msecs ...
What kind of race in your code do you mean? What does your commit message "...now that we've dealt with the hw races in Mario's updated code..." refer to?
I was vaguely referring to all the fixes from a couple of years ago, plus a foggy recollection of the above (which I thought might have been magically fixed somehow).
I remember that Matthew Garret sent out a patch set a couple of months ago which aimed to make this configurable per kms driver, so each driver could opt into low vblank off delay if it was ready to do it properly. That would make a lot of sense.
Yeah, that would also be fine with me. Unfortunately, these simple patches seem to rot on the vine and never get applied, so we're stuck with our current 5s timeout (which is huge).
That said, for the current intel-ddx it wouldn't even matter for X11/GLX applications if the kernel returned random numbers instead of vblank counts, because the OML_sync_control swap scheduling in the ddx is totally broken, and even glXSwapBuffers is a wreck for any swapinterval setting except 0 and 1 since triple-buffering was introduced a year ago and introduced some bug. Psychtoolbox now detects this since a few weeks, prints out critical warnings to the user and tries to work around with glXSwapBuffers trickery.
I submitted a patch to fix the rather trivial but nasty bug. Jesse, if you find some time, could you review it and convince Chris to merge it? Should be on the intel-gfx list and in your inbox, a simple . Subject line "[PATCH 2/3] ddx/dri2: Repair broken pageflip swap scheduling.", sent at 7 October 2012.
Yeah, I'll check it out. This code has changed a lot since I last worked on it...
I'll forward it again to you. Chris Wilson already gave me an off-list review for that patch series after i pinged him. Unfortunately a rather terse one, cited here:
"On Mon, 15 Oct 2012 16:46:52 +0200, Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
Hi Chris,
can you please check & merge at least the first two patches of the series into the intel ddx?
The first is along the right path, but I think you want to base the split on divisor==0. The second is broken in the DRI2 layer, as the SwapLimit mechanism exposes the multi-client race to a single client (i.e. there is no serialisation between the InvalidateEvent and the GetBuffers, and the InvalidateEvent is always broadcast too early.) The third is just plain wrong, as pageflip may be a blocking ioctl (and will impose client blocking) so anybody who wants to override SwapBuffersWait will also want to prevent the inherent wait due to the flip. In any event that option is to only paper over the loose DRI2 specification and the lack of client control... -Chris "
If i understood the comment wrt. to patch [2/3] "The first is along the right path, but I think you want to base the split on divisor==0." correctly, then that would be just as broken as the current implementation, with some code obfuscation and confusion added, but i don't know, maybe i just misunderstood his comment? I asked for clarification, but he didn't respond to my followup e-mails. Maybe he was just too busy to reply, maybe i'm just too pushy and annoying.
I will try to make some friendship with piglit and see if i can write some tests when i find some time to catch such bugs earlier in the future.
Also, i take from Chris comments about patch [1/3] above that properly implemented triple-buffering with correct timestamping and synchronisation is impossible with the current X-Servers, because the DRI2 invalidate event mechanism is racy and broken to the point of not being fixable? Basically we would have to wait for DRI3?
Sorry for getting a bit off-topic here, but i think it's important to keep all layers working in a meaningful way if you want to support some rather serious applications of computer graphics.
Agreed, and sorry for the breakage! piglit tests would definitely help if you find time.
Thanks,
On 02.11.12 19:37, Jesse Barnes wrote:
On Fri, 02 Nov 2012 06:56:57 +0100 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
Jesse, thanks for cc'ing me, much appreciated :)
Psychtoolbox should be fine with a 50 msecs vblank off delay. I think i tested with values somewhere between 50 - 100 msecs at the time the drm patches were made. The way i schedule swaps for a specific target time 'twhen' is to:
- glXGetSyncValuesOML(msc,ust) as a baseline vblank count and time.
- Some basic math to calculate targetMSC based on 'twhen' and step 1.
- glXSwapBuffersMscOML(targetMSC);
So it should tolerate wrong vblank counts due to races, as long as vblank doesn't get disabled between 1. and 3. A default vblank off delay lower than maybe 1 refresh duration would make me nervous, given that most kms drivers are not race-free and it is possible for a userspace app or the x-server to get stalled for a couple of msecs.
E.g., radeon wouldn't allow this race-free, because the drm code assumes that the hw vblank counter increments at leading edge of vblank, whereas radeon hw does it at the traling edge, iirc. nouveau doesn't have any hw vblank counter support, so it is "race free" until somebody fixes the driver ;-) - ie., it would work with apps that use steps 1/2/3 above, but broken for any other app.
Right, ok that clarifies things for me. I knew you had some kind of race between reading the hw value and scheduling things. Is the above in a comment somewhere in the vblank code?
There's some explanation in drm_irq.c - vblank_disable_and_save().
Fixing it would mean to make sure that the driver->get_vblank_counter() function always returns hw counts "as if" the hardware increments at leading edge. If the hw does that, great. If not, one can use the drivers scanout position query to find out where the scanout was while reading the hw vblank counter register. Then, based on knowledge when the gpu actually increments, one can either return the read hw value as is, or hw + 1 or whatever to fudge the "increment at leading edge" behaviour.
What kind of race in your code do you mean? What does your commit message "...now that we've dealt with the hw races in Mario's updated code..." refer to?
I was vaguely referring to all the fixes from a couple of years ago, plus a foggy recollection of the above (which I thought might have been magically fixed somehow).
Those fixed races between the irq vblank handler and the vblank on/off code. The race above between the on/off code and hw counter remains.
-mario
On Tue, Oct 30, 2012 at 12:20 PM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
[digging an old thread]
So I'm looking at this machine where we can't get good PSR residency because the vblank_offdelay is so long. Therefore, I'm suddenly very interested in solving this issue :) Of course I can't seem to find logs of the fun IRC discussion you guys had, can you describe what the race is, and also what are the registers you're talking about?
Stéphane
Nacked-by: Daniel Vetter daniel@ffwll.ch
Also discussed with Jesse on irc, we've had fun ;-)
Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, 2 Jul 2014 13:35:19 -0700 Stéphane Marchesin stephane.marchesin@gmail.com wrote:
On Tue, Oct 30, 2012 at 12:20 PM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
[digging an old thread]
So I'm looking at this machine where we can't get good PSR residency because the vblank_offdelay is so long. Therefore, I'm suddenly very interested in solving this issue :) Of course I can't seem to find logs of the fun IRC discussion you guys had, can you describe what the race is, and also what are the registers you're talking about?
Beyond that I don't see why this obvious and simple improvement should be blocked on some other work. Maybe it's a bit late now since Ville may already have patches for what Daniel mentions above, but I still find the nack to be totally misguided.
Dave, please just pick this up so everyone can benefit while we thrash through an i915 fix (doubtless introducing some bugs) that lets us disable immediately.
On Wed, Jul 02, 2014 at 01:42:38PM -0700, Jesse Barnes wrote:
On Wed, 2 Jul 2014 13:35:19 -0700 Stéphane Marchesin stephane.marchesin@gmail.com wrote:
On Tue, Oct 30, 2012 at 12:20 PM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
[digging an old thread]
So I'm looking at this machine where we can't get good PSR residency because the vblank_offdelay is so long. Therefore, I'm suddenly very interested in solving this issue :) Of course I can't seem to find logs of the fun IRC discussion you guys had, can you describe what the race is, and also what are the registers you're talking about?
Beyond that I don't see why this obvious and simple improvement should be blocked on some other work. Maybe it's a bit late now since Ville may already have patches for what Daniel mentions above, but I still find the nack to be totally misguided.
Dave, please just pick this up so everyone can benefit while we thrash through an i915 fix (doubtless introducing some bugs) that lets us disable immediately.
This needs an ack from Mario.
And I really don't see why we _now_ need to suddenly rush then when we have patches from Ville to address this properly. The blocker is only that it's not yet reviewed but meh.
And people with product ship dates looming over their head can always just apply this themselves.
Us sucking at reviewing is imo no reason at all to rush patches in. -Daniel
On Tue, 8 Jul 2014 15:56:04 +0200 Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Jul 02, 2014 at 01:42:38PM -0700, Jesse Barnes wrote:
On Wed, 2 Jul 2014 13:35:19 -0700 Stéphane Marchesin stephane.marchesin@gmail.com wrote:
On Tue, Oct 30, 2012 at 12:20 PM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Oct 30, 2012 at 8:09 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
Afaik the fundamental race of enabling the vblank is still there, so this is just duct-tape. And our hw has the required registers (on gen5+ at least) to close this race for real and abolish all "disable vblank irq later to paper over races and smooth things out). Hence I think we should dtrt and so
[digging an old thread]
So I'm looking at this machine where we can't get good PSR residency because the vblank_offdelay is so long. Therefore, I'm suddenly very interested in solving this issue :) Of course I can't seem to find logs of the fun IRC discussion you guys had, can you describe what the race is, and also what are the registers you're talking about?
Beyond that I don't see why this obvious and simple improvement should be blocked on some other work. Maybe it's a bit late now since Ville may already have patches for what Daniel mentions above, but I still find the nack to be totally misguided.
Dave, please just pick this up so everyone can benefit while we thrash through an i915 fix (doubtless introducing some bugs) that lets us disable immediately.
This needs an ack from Mario.
And I really don't see why we _now_ need to suddenly rush then when we have patches from Ville to address this properly. The blocker is only that it's not yet reviewed but meh.
And people with product ship dates looming over their head can always just apply this themselves.
Us sucking at reviewing is imo no reason at all to rush patches in.
This is just the most recent version: http://lists.freedesktop.org/archives/dri-devel/2012-October/029648.html
IIRC there was one posted back in 2010 too. So hardly rushing.
Cc'ing Mario in case he wants a different value than 50ms.
On Tue, 30 Oct 2012 14:09:12 -0500 Jesse Barnes jbarnes@virtuousgeek.org wrote:
People keep whining about this, but no one seems to send a patch. This *ought* to be safe now that we've dealt with the hw races in Mario's updated code, and fixed the bugs we know about in VT switch, DPMS, and multi-head configuraions.
Signed-off-by: Jesse Barnes jbarnes@virtuousgeek.org
drivers/gpu/drm/drm_stub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c index c236fd2..a1cbc0e 100644 --- a/drivers/gpu/drm/drm_stub.c +++ b/drivers/gpu/drm/drm_stub.c @@ -40,7 +40,7 @@ unsigned int drm_debug = 0; /* 1 to enable debug output */ EXPORT_SYMBOL(drm_debug);
-unsigned int drm_vblank_offdelay = 5000; /* Default to 5000 msecs. */ +unsigned int drm_vblank_offdelay = 50; /* Default to 50 msecs. */ EXPORT_SYMBOL(drm_vblank_offdelay);
unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */
dri-devel@lists.freedesktop.org