On Wed, May 21, 2014 at 10:35:59AM +0200, Daniel Vetter wrote:
On Wed, May 21, 2014 at 11:26:55AM +0300, Ville Syrjälä wrote:
For everything but the reset_vblank_counter() thing: Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com
And another caveat: I only glanced at the crtc_helper stuff. It looks sane but I didn't go reading through the drivers to figure out if they would fall over or work.
Oh, the rfc was really just that. I don't have any plans to burn my hands trying to merge those patches ;-) Especially since interest from non-i915 hackers seems to be low.
About the reset_vblank_counter(), I think we might still need it to keep the counter sane when the power well goes off. But I think we have other problems on that front esp. with suspend to disk. There the counter should definitely get reset on all platforms, so we migth apply any kind of diff to the user visible value. The fix would likely be to skip the diff adjustment when resuming.
I tried to take a quick look at that yesterday but there was something really fishy happening as the code didn't seem to observe the wraparound at all, even though I confirmed w/ intel_reg_read that it definitely did appear to wrap. I'll take another look at it today.
Another idea might be to rip out the diff adjustment altogether. That should just mean that the user visible counter wouldn't advance at all between drm_vblank_off() and drm_vblank_on(). But that might actually be the sane thing to do. Maybe we should just do a +1 there to make sure we don't report the same value before and after modeset. It should fix both the suspend problems and the power well problem.
Hm, like I've mentioned yesterday on irc the tests I have actually pass, at least if I throw your sanitize_crtc patch on top. vblank frame counter values monotonically increase across suspend/resume, runtime pm and anything else I manged to throw at it. And the limit in the test is 100 frames later, but I've only observed a few tens at most.
So I think the code as-is actually works. Whether intentional or not is still under dispute though ;-)
The real problem I have with the hsw counter reset is that the same issue should affect _any_ platform where we support runtime pm. Like snb or byt. But the code isn't there. Also if we have such a bug then it will also affect hibernate and suspend to disk. Which means that this should be done in drm_crtc_vblank_off/on, not in the guts of some random platforms runtime pm code.
So in my opinion the hsw vblank_count reset code needs to go anyway because:
- Either it isn't need any more (and we have the tests for this now) and it's just cargo-culted duct-tape.
- Or we need, but then it's in the wrong spot.
Given that can you reconsider that patch please?
Yeah. So as discussed on irc I think the right fix would be to sample the current counter in drm_vblank_on(), stick it into dev->vblank[crtc].last, but skip the diff adjustment to the user visible counter (maybe just +1 to make sure we never report the same value on both sides of a modeset). That should cover both the suspend case and the power well case. I can go and experiment with this a bit...
So I agree that the current code isn't the way things should be done. And since I now have an idea how it should be done, I'm fine with ripping the current thing out. So you can add: Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com