https://bugzilla.kernel.org/show_bug.cgi?id=76321
Bug ID: 76321 Summary: Incorrect hwmon temperature when radeon card is turned off Product: Drivers Version: 2.5 Kernel Version: 3.15-rc3 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: high Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-dri@kernel-bugs.osdl.org Reporter: pali.rohar@gmail.com Regression: No
When radeon card is turned off (automatically via runpm) sensors program show totally incorrect temperature:
radeon-pci-0100 Adapter: PCI adapter temp1: +511.0°C (crit = +120.0°C, hyst = +90.0°C)
When some opengl program is running with DRI_PRIME=1 then sensors reports:
radeon-pci-0100 Adapter: PCI adapter temp1: +46.0°C (crit = +120.0°C, hyst = +90.0°C)
So kernel radeon driver should not report temperature (e.g. returns -EINVAL in hwmon) when radeon card is turned off (via vgaswitcheroo or runpm).
https://bugzilla.kernel.org/show_bug.cgi?id=76321
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |alexdeucher@gmail.com
--- Comment #1 from Alex Deucher alexdeucher@gmail.com --- Created attachment 136331 --> https://bugzilla.kernel.org/attachment.cgi?id=136331&action=edit possible fix
Does this patch help?
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #2 from Pali Rohár pali.rohar@gmail.com --- Here is output from sensors with your patch (when card is turned off):
radeon-pci-0100 Adapter: PCI adapter ERROR: Can't get value of subfeature temp1_crit: Can't read ERROR: Can't get value of subfeature temp1_crit_hyst: Can't read temp1: N/A (crit = +0.0°C, hyst = +0.0°C)
Values crit & hyst worked fine (as it is not runtime value, but static).
And also your patch disabling showing/changing dpm/pm profile. I do not know if there is some problem to use that sysfs files when card is turned off, but if not I think there is no reason to disable it.
So in my opinion only this part of patch is needed:
@@ -538,8 +570,13 @@ static ssize_t radeon_hwmon_show_temp(struct device *dev, char *buf) { struct radeon_device *rdev = dev_get_drvdata(dev); + struct drm_device *ddev = rdev->ddev; int temp;
+ if ((rdev->flags & RADEON_IS_PX) && + (ddev->switch_power_state != DRM_SWITCH_POWER_ON)) + return -EINVAL; + if (rdev->asic->pm.get_temperature) temp = radeon_get_temperature(rdev); else
Or is really needed other hunks too?
https://bugzilla.kernel.org/show_bug.cgi?id=76321
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #136331|0 |1 is obsolete| |
--- Comment #3 from Alex Deucher alexdeucher@gmail.com --- Created attachment 136431 --> https://bugzilla.kernel.org/attachment.cgi?id=136431&action=edit possible fix v2
You can't change any hardware state when the card is powered down. This patch is a little less strict however.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #4 from Pali Rohár pali.rohar@gmail.com --- With v2 patch now sensors does not report any error (when card is turned off):
$ sensors radeon-pci-0100 Adapter: PCI adapter temp1: N/A (crit = +120.0°C, hyst = +90.0°C)
This looks ok.
And for power_dpm_state & power_dpm_force_performance_level: I understand that it cannot be changed when card is turned off (I see that it also disappear from PCI bus space), but I would like to see option to set "initial" dpm state/level which will be used when card is turned om again. This can be usefull for scripts which setting powersave options depending on hostplugging AC adapter. What do you think about using same sysfs entries for it?
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #5 from Alex Deucher alexdeucher@gmail.com --- (In reply to Pali Rohár from comment #4)
And for power_dpm_state & power_dpm_force_performance_level: I understand that it cannot be changed when card is turned off (I see that it also disappear from PCI bus space), but I would like to see option to set "initial" dpm state/level which will be used when card is turned om again. This can be usefull for scripts which setting powersave options depending on hostplugging AC adapter. What do you think about using same sysfs entries for it?
That's fine if someone wants to implement it. It's not a high priority for me at the moment.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #6 from Pali Rohár pali.rohar@gmail.com --- Ok, I will look at it and will try to implemenent it. So can you commit radeon_hwmon_show_temp() part of patch?
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #7 from Alex Deucher alexdeucher@gmail.com --- Yes, I already sent attachment 136431 upstream.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #8 from Pali Rohár pali.rohar@gmail.com --- Created attachment 146011 --> https://bugzilla.kernel.org/attachment.cgi?id=146011&action=edit patch for get/set dpm state
(In reply to Alex Deucher from comment #5)
(In reply to Pali Rohár from comment #4)
And for power_dpm_state & power_dpm_force_performance_level: I understand that it cannot be changed when card is turned off (I see that it also disappear from PCI bus space), but I would like to see option to set "initial" dpm state/level which will be used when card is turned om again. This can be usefull for scripts which setting powersave options depending on hostplugging AC adapter. What do you think about using same sysfs entries for it?
That's fine if someone wants to implement it. It's not a high priority for me at the moment.
It is implemented in attached patch. I tested it and it working fine on my system. My scripts automacitally change dpm state to battery when power adapter is unplugged. And when radeon card is powered on it initialize in battery state (even if last state was performance before turning card off).
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #9 from Alex Deucher alexdeucher@gmail.com --- (In reply to Pali Rohár from comment #8)
Created attachment 146011 [details] patch for get/set dpm state
It is implemented in attached patch. I tested it and it working fine on my system. My scripts automacitally change dpm state to battery when power adapter is unplugged. And when radeon card is powered on it initialize in battery state (even if last state was performance before turning card off).
Patch looks fine wth respect to the power state change. You can drop the changes to radeon_[get|set]_dpm_forced_performance_level() however. That won't work without changes to the current force_performance_level interfaces. If you can clean up the patch as requested and generate a proper git patch, I'll apply it.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
Pali Rohár pali.rohar@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #146011|0 |1 is obsolete| |
--- Comment #10 from Pali Rohár pali.rohar@gmail.com --- Created attachment 146241 --> https://bugzilla.kernel.org/attachment.cgi?id=146241&action=edit patch v2 for get/set dpm state
New patch v2 without changes to radeon_[get|set]_dpm_forced_performance_level().
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #11 from Alex Deucher alexdeucher@gmail.com --- Applied to my tree. thanks.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #12 from Pali Rohár pali.rohar@gmail.com --- Alex, can you send that patch to stable trees like that in attachment 136431? Because before these patches sysfs reported battery/performance dpm state when card was turned off and some script used that information...
https://bugzilla.kernel.org/show_bug.cgi?id=76321
--- Comment #13 from Alex Deucher alexdeucher@gmail.com --- Yes, I've cc'ed stable.
https://bugzilla.kernel.org/show_bug.cgi?id=76321
Pali Rohár pali.rohar@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |CODE_FIX
dri-devel@lists.freedesktop.org