https://bugs.freedesktop.org/show_bug.cgi?id=82889
Priority: medium
Bug ID: 82889
Assignee: dri-devel@lists.freedesktop.org
Summary: [drm:si_dpm_set_power_state] *ERROR* si_disable_ulv failed
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: mmstickman@gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Drivers/Gallium/radeonsi
Product: Mesa
Yet another bug I've encountered on my Radeon HD 7950 with kernel 3.16.
Michel Dänzer michel@daenzer.net changed:
What      |Removed                   |Added
----------------------------------------------------------------------------
Product   |Mesa                      |DRI
Version   |git                       |unspecified
Component |Drivers/Gallium/radeonsi  |DRM/Radeon
--- Comment #1 from Michel Dänzer michel@daenzer.net --- Please attach the dmesg output.
--- Comment #2 from mmstickman@gmail.com --- Created attachment 105003 --> https://bugs.freedesktop.org/attachment.cgi?id=105003&action=edit dmesg
--- Comment #3 from Alex Deucher agd5f@yahoo.com --- Is this a regression? If so, can you bisect?
--- Comment #4 from mmstickman@gmail.com --- I'm not sure if it is a regression or simply a new feature that doesn't work. I don't recall seeing the message in kernel 3.14 or prior. I don't have any experience in bisecting.
--- Comment #5 from Samir Ibradžić sibradzic@gmail.com --- Created attachment 105455 --> https://bugs.freedesktop.org/attachment.cgi?id=105455&action=edit Radeon dpm hang
--- Comment #6 from Samir Ibradžić sibradzic@gmail.com --- Created attachment 105456 --> https://bugs.freedesktop.org/attachment.cgi?id=105456&action=edit Radeon dpm success
--- Comment #7 from Samir Ibradžić sibradzic@gmail.com --- I see this happening on a Radeon HD 7950 with kernel 3.13, which has radeon dpm enabled by default. It is intermittent, happens on ~30% of boots, and causes a hang followed by a reboot (no panic or oops messages). Unfortunately, I could only catch the dmesg output via serial: when my machine hangs, the kernel logs are not even saved to disk.
Now, I see "[drm:si_dpm_set_power_state] *ERROR* si_disable_ulv failed" on serial only when the ignore_loglevel kernel parameter is unset. The machine hangs and reboots each time I spot it. I have attached the dmesg output with the ignore_loglevel and drm.debug=1 parameters, for both the failing and the successful case, for comparison. When failing, the machine hangs briefly and reboots right after the "[drm] pitch is 7680" message.
With the radeon.dpm=0 parameter, this problem NEVER happens!
--- Comment #8 from lorenz.bona@gmail.com --- Same warning here with 3.17-rc5, built from this repository:
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-3.17
--- Comment #9 from lorenz.bona@gmail.com --- Created attachment 106873 --> https://bugs.freedesktop.org/attachment.cgi?id=106873&action=edit dmesg | grep drm
--- Comment #10 from Alexandre Demers alexandre.f.demers@gmail.com --- Just moved from a 6950 (r600g) to a 7950 (radeonsi) and hit the same error on kernel 3.17-rc6.
--- Comment #11 from Alexandre Demers alexandre.f.demers@gmail.com --- Does ULV stand for ultra-low voltage? If so, isn't this option meant to be applied to APUs only?
--- Comment #12 from Alexandre Demers alexandre.f.demers@gmail.com --- Created attachment 106884 --> https://bugs.freedesktop.org/attachment.cgi?id=106884&action=edit journalctl log
--- Comment #13 from Alexandre Demers alexandre.f.demers@gmail.com --- I commented out every "return ret;" of si_dpm_set_power_state() in si_dpm.c. After booting this modified kernel, I can confirm this is the only error reported in si_dpm_set_power_state(): every other verification passes OK and it goes down to the very end.
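The experiment described in comment #13 can be pictured with a minimal, self-contained C sketch. It is not the code in si_dpm.c: the step table, the step names other than si_disable_ulv, the position of that step in the table, and the helper run_all_steps() are all hypothetical, and the failure is simulated. It only illustrates the idea of continuing past a failing step so that every failure is reported, instead of returning at the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical step type: returns 0 on success, nonzero on failure. */
typedef int (*step_fn)(void);

struct step {
    const char *name;
    step_fn fn;
};

/* Simulated outcomes; nothing here talks to real hardware. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/*
 * Illustrative table. Only the name "si_disable_ulv" comes from the bug
 * report; the other names and the ordering are made up for this sketch.
 */
static const struct step example_steps[] = {
    { "si_disable_ulv",      step_fail },
    { "hypothetical_step_a", step_ok   },
    { "hypothetical_step_b", step_ok   },
};

/*
 * Run every step even if an earlier one fails (the equivalent of
 * commenting out each early "return ret;"). Returns the number of failed
 * steps and stores the index of the first failure in *first_failed,
 * or -1 if no step failed.
 */
static int run_all_steps(const struct step *steps, size_t n, int *first_failed)
{
    int failures = 0;

    assert(steps != NULL && first_failed != NULL);
    *first_failed = -1;

    for (size_t i = 0; i < n; i++) {
        if (steps[i].fn() != 0) {
            fprintf(stderr, "*ERROR* %s failed\n", steps[i].name);
            if (*first_failed < 0)
                *first_failed = (int)i;
            failures++;
            /* No early return: keep going to see what else fails. */
        }
    }
    return failures;
}
```

Calling run_all_steps(example_steps, 3, &first) on this table reports a single failure, which mirrors the observation that si_disable_ulv was the only check that failed. Whether it is safe to keep going after such a failure on real hardware is a separate question that the sketch does not answer.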
--- Comment #14 from Lorenzo Bona lorenz.bona@gmail.com --- (In reply to comment #11)
> Does ULV stand for ultra-low voltage? If so, isn't this option meant to be applied to APUs only?
Mmm, I don't know. I haven't dug any deeper, but my GPU (R7-265) is running hotter than before, always around 39°-40° (even without any running application). With the kernel in Debian sid, 3.16.x, I see low temperatures at idle.
I'm unable to trigger this warning when booting with radeon.dpm=0, so I think it is something related to power management.
--- Comment #15 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to comment #14)
> (In reply to comment #11)
> > Does ULV stand for ultra-low voltage? If so, isn't this option meant to be applied to APUs only?
> Mmm, I don't know. I haven't dug any deeper, but my GPU (R7-265) is running hotter than before, always around 39°-40° (even without any running application). With the kernel in Debian sid, 3.16.x, I see low temperatures at idle.
> I'm unable to trigger this warning when booting with radeon.dpm=0, so I think it is something related to power management.
Well, according to my journalctl log, it always seems to be triggered after a power state switch. It doesn't happen on the switch from boot (power state 0) to performance (power state 1), but later. Strangely, I see:
Sep 25 20:28:28 Xander kernel: switching from power state:
Sep 25 20:28:28 Xander kernel: ui class: performance
...
Sep 25 20:28:28 Xander kernel: switching to power state:
Sep 25 20:28:28 Xander kernel: ui class: performance
Why would it try to switch from power state 1 to power state 1 (the same power state)? And why does the problem arise at that moment? I'll have to do more tests to see whether this behaviour happens each time.
--- Comment #16 from Alexandre Demers alexandre.f.demers@gmail.com --- Alex, I think this "ERROR" should be at most a warning: I've been commenting out the "return ret" when we hit the error, and everything else goes as smoothly as possible.
Also, do you have any clue where we should dig to understand why we are hitting this error? As Samir said, this appeared with dpm.
--- Comment #17 from Alex Deucher agd5f@yahoo.com --- Created attachment 107784 --> https://bugs.freedesktop.org/attachment.cgi?id=107784&action=edit disable ulv state on SI
(In reply to Alexandre Demers from comment #16)
> Alex, I think this "ERROR" should be at most a warning: I've been commenting out the "return ret" when we hit the error, and everything else goes as smoothly as possible.
> Also, do you have any clue where we should dig to understand why we are hitting this error? As Samir said, this appeared with dpm.
It's part of dpm, so it only happens when dpm is enabled. ULV is a special low-power state the card can enter in certain idle cases.
Does the attached patch help?
--- Comment #18 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to Alex Deucher from comment #17)
> Created attachment 107784
> disable ulv state on SI
> (In reply to Alexandre Demers from comment #16)
> > Alex, I think this "ERROR" should be at most a warning: I've been commenting out the "return ret" when we hit the error, and everything else goes as smoothly as possible.
> > Also, do you have any clue where we should dig to understand why we are hitting this error? As Samir said, this appeared with dpm.
> It's part of dpm, so it only happens when dpm is enabled. ULV is a special low-power state the card can enter in certain idle cases.
> Does the attached patch help?
I changed my card yet again and I'm now running an R9 280X. I'll put the old card back in tomorrow to have a look at it.
So ULV is a feature available on both APUs and the 7950 (and some other GPUs). Nice to know.
But is ULV support truly supposed to be available on Tahiti? Prior to your patch, why is there already a comment "/* XXX disable for A0 tahiti */" in drivers/gpu/drm/radeon/si_dpm.c while ulv.supported is set to true anyway on the very next line (the one you propose to change in your patch)? To me, that is like saying one thing and doing exactly the opposite at the same time, isn't it? Or does the comment identify a special case (Tahiti) that we should be handling but are not?
--- Comment #19 from Alex Deucher agd5f@yahoo.com --- (In reply to Alexandre Demers from comment #18)
> But is ULV support truly supposed to be available on Tahiti? Prior to your patch, why is there already a comment "/* XXX disable for A0 tahiti */" in drivers/gpu/drm/radeon/si_dpm.c while ulv.supported is set to true anyway on the very next line (the one you propose to change in your patch)? To me, that is like saying one thing and doing exactly the opposite at the same time, isn't it? Or does the comment identify a special case (Tahiti) that we should be handling but are not?
A0 is first silicon (basically the initial silicon samples we get back from the fab during bring up). The issue was fixed in later silicon revisions. There usually aren't any A0 boards in the wild.
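To make the role of the ulv.supported flag concrete, here is a minimal C sketch of a feature flag gating the "disable ULV" step. It is an illustration under stated assumptions, not the radeon driver's code: struct ulv_param, disable_ulv() and send_disable_ulv_request() are hypothetical names, the firmware request is simulated to always fail (standing in for the failure reported in this bug), and it is assumed that the disable step is simply skipped when the flag is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's ULV bookkeeping. */
struct ulv_param {
    bool supported; /* the flag the proposed patch would set to false */
};

/*
 * Hypothetical stand-in for asking the firmware to leave the ULV state.
 * It always fails here, to simulate the failure reported in this bug.
 */
static int send_disable_ulv_request(void)
{
    return -1;
}

/*
 * Assumption of this sketch: when ULV is marked unsupported there is
 * nothing to disable, so the step succeeds without touching the hardware.
 */
static int disable_ulv(const struct ulv_param *ulv)
{
    assert(ulv != NULL);

    if (!ulv->supported)
        return 0;

    return send_disable_ulv_request();
}
```

Under these assumptions, disable_ulv() returns an error while supported is true and returns 0 once the flag is cleared, so the error message would no longer be printed. The cost is that the card would presumably never enter the low-power ULV state at all.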
--- Comment #20 from Alexandre Demers alexandre.f.demers@gmail.com --- Sadly, I won't be able to test this patch: I had an opportunity to sell my HD 7950. We can keep the bug open in case someone else can test it.
--- Comment #21 from Lorenzo Bona lorenz.bona@gmail.com --- Today I rebuilt the whole stack (mesa, ddx, drm, xorg, and kernel) from the latest commits.
Looks like the problem is now solved. Dmesg attached.
--- Comment #22 from Lorenzo Bona lorenz.bona@gmail.com --- Created attachment 108125 --> https://bugs.freedesktop.org/attachment.cgi?id=108125&action=edit dmesg | grep drm
--- Comment #23 from sean darcy seandarcy2@gmail.com --- I have a Kaveri A8-7100. After this patch, is there a way to re-enable ULV without rebuilding drm?
--- Comment #24 from Alex Deucher agd5f@yahoo.com --- (In reply to sean darcy from comment #23)
> I have a Kaveri A8-7100. After this patch, is there a way to re-enable ULV without rebuilding drm?
This patch and bug have nothing to do with Kaveri. They are specifically related to Southern Islands GPUs.
Martin Peres martin.peres@free.fr changed:
What       |Removed |Added
----------------------------------------------------------------------------
Resolution |---     |MOVED
Status     |NEW     |RESOLVED
--- Comment #25 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed to further activity.
You can subscribe to the new bug and participate further via this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/518.