https://bugzilla.kernel.org/show_bug.cgi?id=119211
Bug ID: 119211
Summary: amdgpu disables fan by default
Product: Drivers
Version: 2.5
Kernel Version: 4.5.5
Hardware: All
OS: Linux
Tree: Fedora
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: stsp@list.ru
Regression: No
I have a Radeon R9 380. After the KMS driver activates, the fans on the GPU stop. They can be activated again by properly setting up fancontrol, but this wasn't configured on my PC. As a result, over the last few years I have replaced many motherboards, all of which developed bad capacitors around the video card. Only now have I noticed that the fans are not rotating... :(
The driver should either activate the fans by default or leave the initial settings untouched (the fans are rotating before Linux starts); it should not stop them by default.
--- Comment #1 from Vedran Miletić vedran@miletic.net --- You need to put some load on the GPU before the fan activates.
--- Comment #2 from Stas Sergeev stsp@list.ru --- Even though the GPU was so hot I couldn't even touch it?
--- Comment #3 from Stas Sergeev stsp@list.ru --- Essentially, when the driver initializes, it writes 0 to /sys/class/hwmon/hwmon0/pwm1. Unless you set up fancontrol (which is a major pita), this 0 remains there, no matter how much you load your GPU. The driver should write some other value there, like 50 or more. On my system 50 is the minimum value needed to get the GPU fan rotating.
--- Comment #4 from Jimi JimiJames.Bove@gmail.com --- Is this bug still happening? With my R9 Fury on amdgpu, cat /sys/class/hwmon/hwmon0/pwm1 (well, in my case, it's hwmon2 because I have another card), returns 35 on idle, not 0, but the fans are not running. Even when I'm running a AAA game, my card doesn't even reach 40°C, so my cooling system is too good for me to be able to actually see if the fans turn on when they should. When I first started my computer up, pwm1 was giving me 56, but it went down to 35 before I could finish opening my case and has stayed there no matter what I do. When the card is bound to vfio-pci instead of amdgpu (for a virtual machine), the fan is on all the time, even though the card's low idle temperatures must be similar.
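Since the hwmon index differs between machines (hwmon0 on one system, hwmon2 on another), a small helper can look the directory up by its name attribute instead of hard-coding it. This is only a sketch: the function name find_hwmon is made up for illustration, and it assumes the driver reports "amdgpu" in the hwmon name file.

```shell
# Print every hwmon directory under $1 (default: /sys/class/hwmon) whose
# "name" attribute equals $2 (default: amdgpu, an assumption about the
# string the driver reports).
find_hwmon() {
    base=${1:-/sys/class/hwmon}
    want=${2:-amdgpu}
    for dir in "$base"/hwmon*; do
        # Skip entries without a readable name file (also covers an unmatched glob).
        [ -r "$dir/name" ] || continue
        if [ "$(cat "$dir/name")" = "$want" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Calling find_hwmon with no arguments searches the real sysfs tree; the printed directory is the one whose pwm1 should be read.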
--- Comment #5 from Jimi JimiJames.Bove@gmail.com --- I managed to check the fans while pwm1 was giving values like 68, 61, and 56, and they were not turned on. I don't know if that means anything, because the card was still <40 degrees and definitely not too hot to touch.
--- Comment #6 from Stas Sergeev stsp@list.ru ---
> Is this bug still happening?
For me it is happening like hell. And because the fancontrol service also doesn't work on my PC (I've filed other reports about it), the problems are very real.
> I managed to check the fans while pwm1 was giving values like
You can write the values there, too. In fact, I wonder who changes them for you. Do you have fancontrol set up and running?
$ systemctl status fancontrol
--- Comment #7 from Jimi JimiJames.Bove@gmail.com --- I do not have fancontrol set up or running (it's inactive on my system). I don't know anything about fancontrol at all. I'm running Arch Linux, so I'm pretty much only running services that I know about.
I tried writing values myself with echo, like 'echo 50 > /sys/class/hwmon/hwmon0/pwm1', but that didn't affect it at all. Is that not how you're supposed to change it?
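A manual write along these lines can be wrapped in a helper that first requests manual mode and then reads the value back, so it is visible whether the write stuck. This is a sketch under assumptions: set_manual_pwm is a hypothetical name, the hwmon directory is passed in by the caller, and on real sysfs the function has to run in a root shell (with "sudo echo 50 > file" the redirection is still performed by the unprivileged shell). It is not known whether permissions were the problem here.

```shell
# Request manual fan control in hwmon directory $1 and write PWM value $2.
# Prints the value read back from pwm1 afterwards.
set_manual_pwm() {
    dir=$1
    value=$2
    # 1 = manual mode, per the pwm1_enable convention discussed in this bug.
    echo 1 > "$dir/pwm1_enable" || return 1
    echo "$value" > "$dir/pwm1" || return 1
    # Read back so a silently ignored write is noticed.
    cat "$dir/pwm1"
}
```

For example, set_manual_pwm /sys/class/hwmon/hwmon0 50 prints 50 if the driver accepted the value.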
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |alexdeucher@gmail.com
--- Comment #8 from Alex Deucher alexdeucher@gmail.com --- By default the hw controls the fan based on temperature, etc.
Not all cards have a fan control. If you do, then the following standard HWMON pwm attributes should be available:
* pwm1_enable: Current fan management mode (MANUAL or AUTO)
* pwm1: Current PWM value (power percentage)
* pwm1_min: The minimum PWM speed allowed
* pwm1_max: The maximum PWM speed allowed (bypassed when hitting Fan_boost)
The fan can be driven in different modes:
* 1: The fan can be driven in manual (use pwm1 to change the speed);
* 2: The fan is driven automatically depending on the temperature.
See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/...
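The attributes from comment #8 can be read together to get a one-line summary of the fan state. The sketch below assumes pwm1 is a raw value on the scale given by pwm1_max (255 is reported later in this bug) and converts it to a percentage; describe_fan is a hypothetical name, and any mode other than 1 or 2 is reported as unknown rather than guessed.

```shell
# Summarize fan mode and PWM level for the hwmon directory given as $1.
describe_fan() {
    dir=$1
    mode=$(cat "$dir/pwm1_enable")
    pwm=$(cat "$dir/pwm1")
    max=$(cat "$dir/pwm1_max")
    case "$mode" in
        1) label=manual ;;
        2) label=automatic ;;
        *) label="unknown ($mode)" ;;
    esac
    # Integer percentage of the maximum; avoid dividing by zero.
    if [ "$max" -gt 0 ]; then
        pct=$((pwm * 100 / max))
    else
        pct='?'
    fi
    printf 'mode=%s pwm=%s/%s (%s%%)\n' "$label" "$pwm" "$max" "$pct"
}
```

With pwm1_enable at 1, pwm1 at 35 and pwm1_max at 255, this prints mode=manual pwm=35/255 (13%).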
--- Comment #9 from Stas Sergeev stsp@list.ru --- (In reply to Alex Deucher from comment #8)
> By default the hw controls the fan based on temperature, etc.
Not for me.
> Not all cards have a fan control. If you do, then the following standard HWMON pwm attributes should be available:
> - pwm1_enable: Current fan management mode (MANUAL or AUTO)
I always have '1' there. Trying to write 0 or 2 still leaves 1. It simply doesn't change.
> - pwm1: Current PWM value (power percentage)
Always 0, unless manually written.
> The fan can be driven in different modes:
> - 1: The fan can be driven in manual (use pwm1 to change the speed);
> - 2: The fan is driven automatically depending on the temperature.
What should I write to pwm1_enable? '2'? It doesn't change.
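The "write a mode, then see that it did not change" check described here can be scripted so the result is unambiguous. This is a sketch only: try_mode is a hypothetical helper name, the hwmon directory is supplied by the caller, and on real sysfs it needs a root shell.

```shell
# Try to set pwm1_enable in hwmon directory $1 to mode $2 and report
# whether the value read back matches what was written.
try_mode() {
    dir=$1
    want=$2
    # A rejected write prints the shell's own error message; carry on and
    # read the attribute back either way.
    echo "$want" > "$dir/pwm1_enable"
    got=$(cat "$dir/pwm1_enable")
    if [ "$got" = "$want" ]; then
        echo "pwm1_enable is now $got"
    else
        echo "pwm1_enable stayed at $got (wanted $want)"
        return 1
    fi
}
```

Running try_mode /sys/class/hwmon/hwmon0 2 on an affected system would be expected to report that the value stayed at 1, which is the behaviour described in this comment.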
--- Comment #10 from Alex Deucher alexdeucher@gmail.com --- Please attach your dmesg output and xorg log (if running X).
--- Comment #11 from Stas Sergeev stsp@list.ru --- Created attachment 228111 --> https://bugzilla.kernel.org/attachment.cgi?id=228111&action=edit dmesg
--- Comment #12 from Stas Sergeev stsp@list.ru --- Created attachment 228121 --> https://bugzilla.kernel.org/attachment.cgi?id=228121&action=edit Xorg log
--- Comment #13 from Jimi JimiJames.Bove@gmail.com --- That's interesting. My pwm1_enable returns 1, and trying to change it to 0 or 2 does nothing, but my pwm1 value does indeed change on its own, and I've never seen it be 0. It sounds like I don't have this bug but do have some other less major one? If pwm1 is changing on its own, can I trust that the fan will turn on if my card ever gets too hot?
--- Comment #14 from Jimi JimiJames.Bove@gmail.com --- I should mention, my pwm1_min is 0 and pwm1_max is 255.
--- Comment #15 from Stas Sergeev stsp@list.ru ---
> I should mention, my pwm1_min is 0 and pwm1_max is 255.
Same here. IMHO pwm1_min should contain the value that keeps the fan rotating at a minimal safe speed. Putting 0 there makes it entirely useless.
--- Comment #16 from Jimi JimiJames.Bove@gmail.com --- Not necessarily. Less fans means less power usage means money saved, and as we can see with my computer, you can keep the card cool without its own fans. I have 4 case fans that are plugged directly into power and so are less competent at knowing when to turn off. Something needs to be done about your fan not activating when it should, though.
--- Comment #17 from Stas Sergeev stsp@list.ru ---
> Not necessarily. Less fans means less power usage means money saved
You can set up fancontrol or put 0 into pwm1 manually to stop the fan. But putting 0 into pwm1_min is IMHO quite useless; it might as well not exist at all. If it contained the minimum _safe_ value, it could actually be used. Currently fancontrol has to "evaluate" the minimal safe value itself: it lowers the pwm1 value and detects when the fan has stopped by checking the value of fan1_input, if that exists. It doesn't exist for amdgpu, so you need to do such a probe by hand.
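Probing by hand, as described above, amounts to stepping pwm1 down and watching the fan at each value. The sketch below does only that; it is not how fancontrol itself is implemented. sweep_pwm and its arguments are hypothetical, the fan is assumed to be in manual mode already, and the RPM line is printed only if fan1_input happens to exist.

```shell
# Step pwm1 in hwmon directory $1 down from $2 to $3 in decrements of $4,
# pausing $5 seconds (default 5) at each value so the fan can be observed.
sweep_pwm() {
    dir=$1
    pwm=$2
    stop=$3
    step=$4
    pause=${5:-5}
    # A non-positive step would never terminate.
    [ "$step" -gt 0 ] || return 1
    while [ "$pwm" -ge "$stop" ]; do
        echo "$pwm" > "$dir/pwm1" || return 1
        sleep "$pause"
        if [ -r "$dir/fan1_input" ]; then
            echo "pwm=$pwm rpm=$(cat "$dir/fan1_input")"
        else
            echo "pwm=$pwm (no fan1_input; check the fan by eye)"
        fi
        pwm=$((pwm - step))
    done
}
```

For example, sweep_pwm /sys/class/hwmon/hwmon0 80 30 10 walks through 80, 70, ... 30. The last value written stays in pwm1, so a known-good value should be written back afterwards.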