https://bugs.freedesktop.org/show_bug.cgi?id=76490
Priority: medium
Bug ID: 76490
Assignee: dri-devel@lists.freedesktop.org
Summary: No output after radeon module is loaded (R9 270X)
Severity: normal
Classification: Unclassified
OS: All
Reporter: mail@geleia.net
Hardware: Other
Status: NEW
Version: unspecified
Component: DRM/Radeon
Product: DRI
Created attachment 96217 --> https://bugs.freedesktop.org/attachment.cgi?id=96217&action=edit kernel log
The screen goes black after the radeon module is loaded. The only way I can get any output is to blacklist the radeon module, load it via modprobe and then change the resolution with xrandr from another computer via ssh.
I seem to get some sort of lockup if I don't blacklist the module, because then I get a black screen at startup and I cannot even ssh into the machine. I tried to enable netconsole from the kernel command line but I can't get it to work (do I have to compile it statically?).
I tried this with 3.14-rc6. For reference, I'm including the log output I get after I load the radeon module.
This card is an MSI R9 270X Gaming 4G.
--- Comment #1 from Gustavo Lopes mail@geleia.net --- When it hangs (which always happens if I don't have X already started), I sometimes get a bunch of vertical white and black stripes. dmesg doesn't show anything interesting, just
[ 278.575937] fbcon: radeondrmfb (fb0) is primary device
[ 278.590490] Console: switching to colour frame buffer device 240x67
[ 278.602467] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[ 278.602531] radeon 0000:01:00.0: registered panic notifier
[ 278.606539] [drm] Initialized radeon 2.37.0 20080528 for 0000:01:00.0 on minor 0
--- Comment #2 from Gustavo Lopes mail@geleia.net --- Things seem to work fine with radeon.dpm=0.
--- Comment #3 from Alex Deucher agd5f@yahoo.com --- Does booting with radeon.runpm=0 on the kernel command line in grub also help?
--- Comment #4 from Gustavo Lopes mail@geleia.net --- No, it still stalls.
--- Comment #5 from Gustavo Lopes mail@geleia.net --- Same problem in 3.15-rc5.
--- Comment #6 from Alex Deucher agd5f@yahoo.com --- Have you installed the latest mc ucode for Pitcairn? http://people.freedesktop.org/~agd5f/radeon_ucode/PITCAIRN_mc2.bin Make sure that it is installed and available in your initrd if you are using one.
--- Comment #7 from Gustavo Lopes mail@geleia.net --- I installed it now, but still no luck.
glopes ~ $ ls -l /lib/firmware/radeon/PITCAIRN_mc*
-rw-r--r-- 1 root root 31100 Mai 13 18:58 /lib/firmware/radeon/PITCAIRN_mc2.bin
-rw-r--r-- 1 root root 31076 Mar 23 02:02 /lib/firmware/radeon/PITCAIRN_mc.bin
When I run with radeon.dpm=0, it seems to load the correct file:
[ 0.630585] [drm] radeon: 4096M of VRAM memory ready
[ 0.630586] [drm] radeon: 1024M of GTT memory ready.
[ 0.630593] [drm] Loading PITCAIRN Microcode
[ 0.630632] [drm] radeon/PITCAIRN_mc2.bin: 31100 bytes
[ 0.630644] [drm] Internal thermal controller with fan control
[ 0.630673] [drm] radeon: power management initialized
I'm attaching the full log as well.
--- Comment #8 from Gustavo Lopes mail@geleia.net --- Created attachment 98989 --> https://bugs.freedesktop.org/attachment.cgi?id=98989&action=edit kernel log with dpm=0 on 3.15-rc5
--- Comment #9 from Gustavo Lopes mail@geleia.net --- Oh and I made sure the initramfs had the module and the firmware. For reference the xz cpio image is here: https://s3-eu-west-1.amazonaws.com/artefacto-test/initramfs-linux-mainline.i...
--- Comment #10 from Alex Deucher agd5f@yahoo.com --- Created attachment 98997 --> https://bugs.freedesktop.org/attachment.cgi?id=98997&action=edit disable some dpm features
Does this patch help? If so, can you narrow down which setting(s) are the problematic one(s)?
--- Comment #11 from Gustavo Lopes mail@geleia.net --- It doesn't seem to help, no.
--- Comment #12 from Gustavo Lopes mail@geleia.net --- I tried sprinkling si_dpm_init() and si_dpm_enable() with printk and msleep statements, but while I can tell they're being executed (it takes much longer for the screen to go black because of the sleeps), I cannot see any log messages. The last lines I see are:
[drm] radeon kernel modesetting enabled.
fb: switching to radeondrmfb from EFI VGA
--- Comment #13 from Gustavo Lopes mail@geleia.net --- Created attachment 99000 --> https://bugs.freedesktop.org/attachment.cgi?id=99000&action=edit kernel log dpm on plus disabling patch
By statically compiling netconsole and the NIC driver and having radeon as a module, I was able to get a kernel log. This is the full log; I get nothing after this.
Gustavo Lopes mail@geleia.net changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary |No output after radeon module is loaded (R9 270X) |Hang during boot when DPM is on (R9 270X)
--- Comment #14 from Gustavo Lopes mail@geleia.net --- Still present in 3.16-rc2: https://gist.github.com/cataphract/29a7c132ef4c240e9330 (last message varies; in my other try it got a little further but the log started later as well)
Alex Deucher agd5f@yahoo.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC | |dex+fdobugzilla@dragonslave.de
--- Comment #15 from Alex Deucher agd5f@yahoo.com --- *** Bug 79773 has been marked as a duplicate of this bug. ***
--- Comment #16 from Alex Deucher agd5f@yahoo.com --- Created attachment 101819 --> https://bugs.freedesktop.org/attachment.cgi?id=101819&action=edit disable cg
Does this patch help? You might also try it in conjunction with attachment 98997.
--- Comment #17 from Gustavo Lopes mail@geleia.net --- Still nothing, both with 101819 and 101819 + 98997. Same behavior.
--- Comment #18 from Alex Deucher agd5f@yahoo.com --- Can you try my drm-next-3.17-wip branch: http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-wip along with the updated ucode here: http://people.freedesktop.org/~agd5f/radeon_ucode/ucode.tar.gz
--- Comment #19 from Gustavo Lopes mail@geleia.net --- Nope. The only difference is that it took an extra 60 seconds when it couldn't find radeon/TAHITI_uvd.bin (which was not in your tarball). After I copied it from my distro's linux-firmware, I had quicker hangs. Console output for both situations: https://gist.github.com/cataphract/4dac266bba4f9be44ea7
--- Comment #20 from dex+fdobugzilla@dragonslave.de --- I can confirm: drm-next-3.17-wip + new ucode doesn't make any difference.
Test scenario:
* Built/installed new kernel, copied new ucode into /lib/firmware
* Built new initrd
* Rebooted with "nomodeset" and gfxpayload=text into multi-user runlevel
* modprobe radeon drm=1 modeset=1
Monitor went black (but shows connected DVI), system is unresponsive.
--- Comment #21 from dex+fdobugzilla@dragonslave.de --- I compared the output of the failing module load with dpm=1:
[ 4.823925] caps:
[ 4.823927] uvd vclk: 0 dclk: 0
[ 4.823929] power level 0 sclk: 15000 mclk: 15000 vddc: 950 vddci: 950 pcie gen: 3
[ 4.823930] status: c r b
[ 4.823934] == power state 1 ==
[ 4.823935] ui class: performance
[ 4.823937] internal class: none
[ 4.823940] caps:
[ 4.823941] uvd vclk: 0 dclk: 0
[ 4.823943] power level 0 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
[ 4.823945] power level 1 sclk: 45000 mclk: 140000 vddc: 950 vddci: 1025 pcie gen: 3
[ 4.823947] power level 2 sclk: 103000 mclk: 140000 vddc: 1163 vddci: 1025 pcie gen: 3
[ 4.823949] power level 3 sclk: 108000 mclk: 140000 vddc: 1206 vddci: 1025 pcie gen: 3
[ 4.823950] status:
[ 4.823952] == power state 2 ==
[ 4.823953] ui class: none
[ 4.823955] internal class: uvd
[ 4.823957] caps: video
[ 4.823959] uvd vclk: 72000 dclk: 56000
[ 4.823960] power level 0 sclk: 45000 mclk: 140000 vddc: 950 vddci: 1025 pcie gen: 3
[ 4.823975] power level 1 sclk: 45000 mclk: 140000 vddc: 950 vddci: 1025 pcie gen: 3
[ 4.823977] power level 2 sclk: 103000 mclk: 140000 vddc: 1163 vddci: 1025 pcie gen: 3
[ 4.823979] status:
[ 4.823980] == power state 3 ==
[ 4.823981] ui class: none
[ 4.823982] internal class: none
[ 4.823984] caps:
[ 4.823986] uvd vclk: 0 dclk: 0
[ 4.823988] power level 0 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
[ 4.823990] power level 1 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
[ 4.823991] power level 2 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
[ 4.823993] status:
With the VGA Bios someone uploaded here:
http://www.techpowerup.com/vgabios/150430/sapphire-r9270x-4096-131103.html
CCC Overdrive Limits
  GPU Clock: 1400.00 MHz
  Memory Clock: 1625.00 MHz
Clock State 0
  Core Clk: 1070.00 MHz
  Memory Clk: 1400.00 MHz
  Flags: Boot
Clock State 1
  Core Clk: 1070.00 MHz
  Memory Clk: 1400.00 MHz
  Flags: Optimal Perf
Clock State 2
  Core Clk: 1020.00 MHz
  Memory Clk: 1400.00 MHz
  Flags: UVD
Clock State 3
  Core Clk: 300.00 MHz
  Memory Clk: 150.00 MHz
  Flags:
For power state 3, sclk and mclk correspond to Core Clk and Memory Clk.
In power state 2 sclk is 10 MHz lower, in power state 1 it's 10 MHz higher and in the boot state it's 100 MHz higher.
I don't know how the radeon DPM code figures the power state levels but something is wrong here.
Can I force dpm into a power level at module load time? I suspect forcing into state 3 should work.
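The comparison in this comment can be repeated mechanically on a saved kernel log. The following is only a sketch: it assumes the "power level" lines keep the format shown in the dump above and that the clock fields are in units of 10 kHz (so 108000 would be 1080 MHz). The function name `parse_power_levels` and the dictionary keys are made up for illustration and are not part of any radeon tooling.

```python
import re

# Matches lines such as:
#   "power level 3 sclk: 108000 mclk: 140000 vddc: 1206 vddci: 1025 pcie gen: 3"
LEVEL_RE = re.compile(
    r"power level (\d+)\s+sclk:\s*(\d+)\s+mclk:\s*(\d+)"
    r"\s+vddc:\s*(\d+)\s+vddci:\s*(\d+)"
)


def parse_power_levels(log_text):
    """Return one dict per 'power level' line, with clocks converted to MHz."""
    levels = []
    for match in LEVEL_RE.finditer(log_text):
        level, sclk, mclk, vddc, vddci = (int(g) for g in match.groups())
        levels.append({
            "level": level,
            "sclk_mhz": sclk / 100,  # assumption: raw value is in 10 kHz units
            "mclk_mhz": mclk / 100,  # assumption: raw value is in 10 kHz units
            "vddc_mv": vddc,
            "vddci_mv": vddci,
        })
    return levels


# Two lines copied from the dump above.
sample = (
    "[ 4.823943] power level 0 sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3\n"
    "[ 4.823949] power level 3 sclk: 108000 mclk: 140000 vddc: 1206 vddci: 1025 pcie gen: 3\n"
)

for entry in parse_power_levels(sample):
    print(entry)
```

Feeding the whole dmesg output through the same function gives a flat list of levels that can be compared by hand against the clock states listed for the VBIOS.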
--- Comment #22 from dex+fdobugzilla@dragonslave.de --- I just tried 3.16.0-rc4-gd8dacc8 from drm-next-3.17-wip.
Still no DPM.
--- Comment #23 from dex+fdobugzilla@dragonslave.de --- I did some checks using the old profile-based approach for PM and switched between the states.
Following are the data from /sys/kernel/debug/dri/0/radeon_pm_info when switching via echo X > /sys/class/drm/card0/device/power_profile
Default:
=================================
default engine clock: 1080000 kHz
current engine clock: 149990 kHz
default memory clock: 1400000 kHz
current memory clock: 149990 kHz
voltage: 1206 mV
PCIE lanes: 8
Low:
=================================
default engine clock: 1080000 kHz
current engine clock: 299990 kHz
default memory clock: 1400000 kHz
current memory clock: 149990 kHz
voltage: 875 mV
PCIE lanes: 8
Mid:
=================================
default engine clock: 1080000 kHz
current engine clock: 299990 kHz
default memory clock: 1400000 kHz
current memory clock: 149990 kHz
voltage: 875 mV
PCIE lanes: 8
High:
=================================
default engine clock: 1080000 kHz
current engine clock: 1080000 kHz
default memory clock: 1400000 kHz
current memory clock: 1399990 kHz
voltage: 1206 mV
PCIE lanes: 8
The last state (high) results in immediate freeze.
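For anyone repeating this check, the radeon_pm_info text is easy to turn into numbers. This is a minimal sketch, assuming the file keeps the "name: value unit" layout quoted above; `parse_pm_info` is a hypothetical helper name, and the sample text is the "High" dump from this comment. Note that these values are in kHz, unlike the 10 kHz units that the dpm power-state dump appears to use.

```python
import re

# One "<name>: <number> [unit]" field per line, as in the dumps above.
FIELD_RE = re.compile(r"^\s*([^:=]+?):\s*(\d+)\s*(kHz|mV)?\s*$")


def parse_pm_info(text):
    """Map each field name to a (value, unit) tuple; unit is None if absent."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            name, value, unit = match.groups()
            fields[name] = (int(value), unit)
    return fields


HIGH = """\
default engine clock: 1080000 kHz
current engine clock: 1080000 kHz
default memory clock: 1400000 kHz
current memory clock: 1399990 kHz
voltage: 1206 mV
PCIE lanes: 8
"""

info = parse_pm_info(HIGH)
mclk_khz, _unit = info["current memory clock"]
print(f"current memory clock: {mclk_khz / 1000:.2f} MHz")
```

On a real system the text would come from /sys/kernel/debug/dri/0/radeon_pm_info, which normally needs root and a mounted debugfs.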
--- Comment #24 from dex+fdobugzilla@dragonslave.de --- Kernel 3.18.0-rc4 with git://people.freedesktop.org/~agd5f/linux drm-next-3.19 branch atop.
Same as before.
And I noticed I compared with the wrong link. The right one is this:
http://www.techpowerup.com/vgabios/152427/msi-r9270x-4096-131205-1.html
I'd love to see if there is perhaps a firmware update for this card, but MSI only provides a tool named "Live Update" that only works on $evilOS.
--- Comment #25 from dex+fdobugzilla@dragonslave.de --- Created attachment 112040 --> https://bugs.freedesktop.org/attachment.cgi?id=112040&action=edit Patch for force lower mclk
I did some clock bisecting and came to the conclusion that (at least on my card) a memory clock of 1200 MHz is the highest stable one.
With the attached patch DPM is stable for me.
Could this have something to do with the card having 4Gb of VRAM?
--- Comment #26 from Alex Deucher alexdeucher@gmail.com --- (In reply to dex+fdobugzilla from comment #25)
> Could this have something to do with the card having 4Gb of VRAM?
Doubtful. More likely the card requires some special voltage tweaks for the higher mclks.
Can you attach a copy of your vbios?
(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
--- Comment #27 from dex+fdobugzilla@dragonslave.de --- Created attachment 112051 --> https://bugs.freedesktop.org/attachment.cgi?id=112051&action=edit Video BIOS MSI R270X 4G Gaming
Here you are. Hope you can disassemble it
--- Comment #28 from Alex Deucher alexdeucher@gmail.com --- Created attachment 112144 --> https://bugs.freedesktop.org/attachment.cgi?id=112144&action=edit temporary workaround
The attached patch adds a temporary workaround until I sort out what's wrong with the higher mclk.
--- Comment #29 from dex+fdobugzilla@dragonslave.de --- I can confirm the patch works.
Will this be part of 3.19?
--- Comment #30 from Alex Deucher alexdeucher@gmail.com --- (In reply to dex+fdobugzilla from comment #29)
> I can confirm the patch works.
> Will this be part of 3.19?
Yes, and stable kernels.
--- Comment #31 from Gustavo Lopes mail@geleia.net --- I'm using 4.0-rc1 and the radeon module now works, but it hangs once or twice a day, something I did not experience with catalyst. It seems to be more frequent under load.
--- Comment #32 from Alex Deucher alexdeucher@gmail.com --- (In reply to Gustavo Lopes from comment #31)
> I'm using 4.0-rc1 and the radeon module now works, but it hangs once or twice a day, something I did not experience with catalyst. It seems to be more frequent under load.
Does it help if you limit the clock to something lower than 1200 MHz?
--- Comment #33 from Gustavo Lopes mail@geleia.net --- It doesn't help.
I patched 4.0-rc2 to set the maximum to 1100 MHz (down from 1200). The computer still hung after roughly one day running xscreensaver. Another time X seems to have crashed first, because I was left seeing two kernel error messages quickly alternating (the same one, but about two different rings).
Fabrice Bellet fabrice@bellet.info changed:
What |Removed |Added
----------------------------------------------------------------------------
CC | |fabrice@bellet.info
--- Comment #34 from Fabrice Bellet fabrice@bellet.info --- Created attachment 115340 --> https://bugs.freedesktop.org/attachment.cgi?id=115340&action=edit Video BIOS Sapphire Radeon R9 270 Dual-X 2G GDDR5
I have the same problem with this card, and the workaround also works:
{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000 },
--- Comment #35 from Tobias Droste tdroste@gmx.de --- I have to do this:
{ PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 },
This is with a Sapphire Radeon R9 270X 2GB GDDR5.
A higher value for either sclk or mclk results in an instant freeze as soon as the radeon kernel module gets loaded. I'm running linux 4.1 from airlied drm-fixes branch.
I'm quite annoyed by this, because of 3 reasons:
1) I bought this card, because my old card had this PM bug and this didn't look like it would be fixed any time soon: https://bugzilla.kernel.org/show_bug.cgi?id=60523
2) With the settings above the performance of the card is actually *worse* than the old card (+ additional graphical glitches...)
3) This card works fine with any sclk/mclk combination with the same vddc (1238mV) in windows and I can overclock there!
I'm also wondering why I get a different VBIOS size if I get the bios in windows (gpu-z) and linux. Is it because different firmware gets loaded? The (working) vbios under windows is twice as large as the linux one (see attachments).
Tobias Droste tdroste@gmx.de changed:
What |Removed |Added
----------------------------------------------------------------------------
CC | |tdroste@gmx.de
--- Comment #36 from Tobias Droste tdroste@gmx.de --- Created attachment 116921 --> https://bugs.freedesktop.org/attachment.cgi?id=116921&action=edit VBIOS Sapphire Radeon R9 270X 2GB (linux)
--- Comment #37 from Tobias Droste tdroste@gmx.de --- Created attachment 116922 --> https://bugs.freedesktop.org/attachment.cgi?id=116922&action=edit VBIOS Sapphire Radeon R9 270X 2GB (linux)
Tobias Droste tdroste@gmx.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #116922 is obsolete |0 |1
--- Comment #38 from Tobias Droste tdroste@gmx.de --- Created attachment 116923 --> https://bugs.freedesktop.org/attachment.cgi?id=116923&action=edit VBIOS Sapphire Radeon R9 270X 2GB (windows)
--- Comment #39 from Alex Deucher alexdeucher@gmail.com --- (In reply to Tobias Droste from comment #35)
> 3) This card works fine with any sclk/mclk combination with the same vddc (1238mV) in windows and I can overclock there!
There is apparently some aspect of the setup that we are not programming correctly that manifests with higher clocks on certain boards.
> I'm also wondering why I get a different VBIOS size if I get the bios in windows (gpu-z) and linux. Is it because different firmware gets loaded? The (working) vbios under windows is twice as large as the linux one (see attachments).
The vbios is loaded from ROM on the card. The firmware for the various microcontrollers on the GPU is loaded by the driver and is not part of the vbios. I'm not sure offhand why they differ. Perhaps GPU-Z always returns a 128K image regardless of what size the actual bios is? Or maybe it asks the Windows driver for a copy and the Windows driver always stores 128K images regardless of the actual image size. I took a quick look at the tables and I only see one small difference in the overdrive table:
-OD max sclk: 140000, max mclk: 162500 (win)
+OD max sclk: 107000, max mclk: 140000 (linux)
Everything else appears to be the same. I'm guessing the Windows driver patched that and GPU-Z fetches the copy from the driver.
--- Comment #40 from Tobias Droste tdroste@gmx.de --- Ah sorry, the difference in the bios versions was me. I fiddled with it to try to get it to boot in linux without the workaround in the kernel. You are correct: in linux and windows they are the same, but GPU-Z seems to add some padding to the end.
Here's another one with a pitcairn where DPM is not working: http://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-sourc...
Do you think it's a problem with the kernel code or with the firmware? Does windows use the same firmware for DPM?
--- Comment #41 from Alex Deucher alexdeucher@gmail.com --- (In reply to Tobias Droste from comment #40)
> Do you think it's a problem with the kernel code or with the firmware? Does windows use the same firmware for DPM?
I think it's probably a driver bug. Windows and Linux use the same ucode.
--- Comment #42 from Daniel Exner dex+fdobugzilla@dragonslave.de --- My best guess is that the clocks are probably ok, but the voltage is too low, perhaps confused by the fact that all of those cards are "factory overclocked".
--- Comment #43 from Tobias Droste tdroste@gmx.de --- I don't think the voltage is a problem as the voltage used by the linux driver seems to be the same as by the windows driver. For my card it's 1238mV for high(er) clocks in windows and linux. I even tried to set 1238mV for all power profiles in the bios and it was still not working as expected.
All these cards seem to use GDDR5 VRAM. Maybe the driver has to do something different for this type of RAM?
Tobias Droste tdroste@gmx.de changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also | |https://bugs.freedesktop.org/show_bug.cgi?id=91294
--- Comment #44 from Daniel Exner dex+fdobugzilla@dragonslave.de --- Just to rule this out, I did a bios upgrade and tried reverting the blacklisting of my card: black screen on X start, so it was of no use.
Should I attach the new bios?
--- Comment #45 from Tobias Droste tdroste@gmx.de --- Where did you get a new bios from? MSI?
--- Comment #46 from Daniel Exner dex+fdobugzilla@dragonslave.de --- I was lucky as someone had exactly the same card (S/N prefix identical) and requested a new Bios in the MSI forums.
The old bios uploaded there was identical to mine.
--- Comment #47 from christoffer.appe@gmail.com --- Created attachment 118004 --> https://bugs.freedesktop.org/attachment.cgi?id=118004&action=edit MSI R9 390 MB bios
I recently got an MSI R9 390; it also has problems with DPM enabled.
Would really appreciate if someone could help me (and other linux users with MSI R9 390) out with values for the si_dpm_quirk_list line.
Attaching a copy of my vbios, also a link to the card at techpowerup, where the bios also can be found: http://www.techpowerup.com/vgabios/173058/msi-r9390-8192-150521.html
--- Comment #48 from Tobias Droste tdroste@gmx.de --- There is only one way to find out the values: Trial and error.
Start with what works with other cards:
{ PCI_VENDOR_ID_ATI, <PCI_DEVICE_ID>, <PCI_SUBSYSTEM_VENDOR_ID>, <PCI_SUBSYSTEM_DEVICE_ID>, 0, 120000 },
The last value is mclk (memory) and the one before it is sclk (GPU). 0 = use the bios default. Values are in units of 10 kHz (not sure why), so 85000 = 850 MHz, 120000 = 1.2 GHz, and so on.
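The unit conversion is easy to get wrong by a factor of ten, so here is a small sketch of it. It relies only on the statement above that the quirk values are in units of 10 kHz (so 1 MHz is 100 units); the two helper names are invented for this example.

```python
def mhz_to_quirk_units(mhz):
    """Convert a clock in MHz to the 10 kHz units used in the quirk entries."""
    return round(mhz * 100)  # 1 MHz = 100 x 10 kHz


def quirk_units_to_mhz(value):
    """Convert a quirk-list value (10 kHz units) back to MHz."""
    return value / 100


print(mhz_to_quirk_units(850))     # 85000
print(quirk_units_to_mhz(120000))  # 1200.0
```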
I found it easier to first get a memory value that works. With that I could boot up to a certain point (sometimes even to login!) and then it crashed. If a memory limit is enough, then you're good after that. After I found a memory value that somewhat worked, I tried different sclk values to get a system that actually boots and can run for a few hours.
You don't have to fear anything because it will only limit the clocks if the bios clocks are actually higher, so there's nothing that can break. Not sure if there's a problem with too low values, but I don't think so.
So it comes down to change values -> recompile kernel module -> reboot -> if it's still not working, start again.
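To make the behaviour described in this comment concrete, here is a user-space model of it in Python. It is not the kernel code: it only mirrors what the comment says (match on the four PCI IDs, treat 0 as "use the bios default", and only ever lower a clock). The `Quirk` field names and the helper functions are made up for illustration; the two entries are the ones posted in comments #34 and #35, and the example clock levels are taken from the dpm dump in comment #21.

```python
from collections import namedtuple

PCI_VENDOR_ID_ATI = 0x1002  # PCI vendor ID for ATI/AMD

# Illustrative field names; the order follows the entries quoted in the thread.
Quirk = namedtuple(
    "Quirk", "vendor device subsys_vendor subsys_device max_sclk max_mclk"
)

QUIRKS = [
    Quirk(PCI_VENDOR_ID_ATI, 0x6811, 0x174B, 0xE271, 0, 120000),      # comment #34
    Quirk(PCI_VENDOR_ID_ATI, 0x6810, 0x174B, 0xE271, 85000, 90000),   # comment #35
]


def find_quirk(vendor, device, subsys_vendor, subsys_device, quirks=QUIRKS):
    """Return the first entry whose four IDs all match, or None."""
    wanted = (vendor, device, subsys_vendor, subsys_device)
    for quirk in quirks:
        if (quirk.vendor, quirk.device,
                quirk.subsys_vendor, quirk.subsys_device) == wanted:
            return quirk
    return None


def clamp(bios_clock, limit):
    """A limit of 0 means 'use the bios default'; otherwise never raise a clock."""
    return bios_clock if limit == 0 else min(bios_clock, limit)


def apply_quirk(levels, quirk):
    """levels: list of (sclk, mclk) tuples in 10 kHz units."""
    if quirk is None:
        return list(levels)
    return [(clamp(sclk, quirk.max_sclk), clamp(mclk, quirk.max_mclk))
            for sclk, mclk in levels]


bios_levels = [(30000, 15000), (45000, 140000), (108000, 140000)]
quirk = find_quirk(PCI_VENDOR_ID_ATI, 0x6810, 0x174B, 0xE271)
print(apply_quirk(bios_levels, quirk))
```

In this model the lowest level is left alone because it is already below both limits, which is the "nothing can break" property the comment describes.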
--- Comment #49 from Kevin wittyman37@yahoo.com --- I think our issues are related, if not the same. I bisected and that brought me to this bug report. It seems like a "fix" for this bug caused my issues.
https://bugzilla.kernel.org/show_bug.cgi?id=103271
--- Comment #50 from Tobias Droste tdroste@gmx.de --- Hm nice... Could you upload your bios? It would be interesting to see if it's different from my bios. I can't even boot without this workaround.
--- Comment #51 from Kevin McCormack wittyman37@yahoo.com --- Tobias, I don't know how to do that. If you can explain or point me in the right direction I'd be happy to upload the bios.
--- Comment #52 from Tobias Droste tdroste@gmx.de ---
From comment #26:
(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
then upload /tmp/vbios.rom
--- Comment #53 from Kevin McCormack wittyman37@yahoo.com --- Created attachment 118292 --> https://bugs.freedesktop.org/attachment.cgi?id=118292&action=edit Sapphire Dual-X R9 270X 2GB OC Edition vbios
OK, Tobias, I did as you guided me.
--- Comment #54 from Tobias Droste tdroste@gmx.de --- Ok they _are_ different. Alex, can you have a look at this and tell us what's different between the bioses?
Compare VBIOS Sapphire Radeon R9 270X 2GB (windows) with Sapphire Dual-X R9 270X 2GB OC Edition vbios
--- Comment #55 from Tobias Droste tdroste@gmx.de --- What I can see:
Your bios: AMD VER015.0400.001
My bios: AMD VER015.0400.032
Your bios: 12/09/13 00:31
My bios: 12/25/14 22:33
--- Comment #56 from Kevin McCormack wittyman37@yahoo.com --- Hey guys, I am just wondering if there is any news about this? I noticed a new commit for an MSI R7 370 here https://github.com/torvalds/linux/commit/e78654799135a788a941bacad3452fbd708... that makes my patch no longer work. So it looks like this may be a gpu bios issue. Should I update my bios? If so, how do I do this? Thanks!
--- Comment #57 from Alex Deucher alexdeucher@gmail.com --- I don't think this has anything to do with the vbios. I suspect the same pci ids are just used in multiple board configurations (e.g., different clocks or vram chip vendors or voltage configurations) so the pci ids are not enough as is to differentiate. I need to take a closer look at the vbioses.
--- Comment #58 from Kevin McCormack wittyman37@yahoo.com --- Alex, have you had a chance to look at the vbios files? I think that Michael Larabel of Phoronix is also having difficulties with his R9 270X card.
--- Comment #59 from m.gabrielboehme@googlemail.com --- I switched to a PowerColor R7 370 PCS+ and have the same problems as reported already. Starting with radeon.dpm=0 or nomodeset helps to boot up. I'm on Fedora 23 at the moment with a 4.2 kernel version. The quirk_list fix (in my case: { PCI_VENDOR_ID_ATI, 0x6811, 0x148c, 0x2356, <CORE_CLOCK>, <VRAM_CLOCK>} ) seems not to work, but I'll try some more values. I'll also add the vbios of my card.
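The four IDs for such an entry can be read from sysfs rather than copied by hand from lspci output. The sketch below assumes the standard PCI sysfs attribute files `vendor`, `device`, `subsystem_vendor` and `subsystem_device`, each holding a hexadecimal string. The helper names are hypothetical, and the clock limits still have to be found by trial and error as described in comment #48; the values used here are only the starting point suggested there.

```python
from pathlib import Path

PCI_VENDOR_ID_ATI = 0x1002  # PCI vendor ID for ATI/AMD


def read_pci_ids(device_dir):
    """Read (vendor, device, subsystem vendor, subsystem device) as integers.

    device_dir is a directory such as /sys/bus/pci/devices/<pci bus id>.
    """
    dev = Path(device_dir)
    names = ("vendor", "device", "subsystem_vendor", "subsystem_device")
    return tuple(int((dev / name).read_text().strip(), 16) for name in names)


def format_quirk_entry(ids, max_sclk, max_mclk):
    """Format the IDs and clock limits like the entries quoted in this thread."""
    vendor, device, sub_vendor, sub_device = ids
    if vendor == PCI_VENDOR_ID_ATI:
        vendor_text = "PCI_VENDOR_ID_ATI"
    else:
        vendor_text = f"0x{vendor:04x}"
    return (f"{{ {vendor_text}, 0x{device:04x}, 0x{sub_vendor:04x}, "
            f"0x{sub_device:04x}, {max_sclk}, {max_mclk} }},")


# IDs from the comment above; on a real system they would come from
# read_pci_ids("/sys/bus/pci/devices/<pci bus id>").
ids = (PCI_VENDOR_ID_ATI, 0x6811, 0x148C, 0x2356)
print(format_quirk_entry(ids, 0, 120000))
```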
--- Comment #60 from m.gabrielboehme@googlemail.com --- Created attachment 119434 --> https://bugs.freedesktop.org/attachment.cgi?id=119434&action=edit PowerColor R7 370 PCS+ VBIOS
--- Comment #61 from Maxim Sheviakov mrader3940@gmail.com --- Heh, I've got pretty much the same issue. Although my patch for the MSI R7 370 Armor 2X was proposed and is present in 4.3, I've got some weird issues with dpm, like a complete system hang + black screen after some time using the PC (dpm enabled), so I have to put radeon.dpm=0 in the kernel parameters to boot and use the system somehow. However, it looks like an ID conflict in si_dpm.c because of a newer patch to that file (check github), because my GPU works flawlessly with a 4.2.X kernel + my patch applied. Here's my bug, if someone's interested: https://bugs.freedesktop.org/show_bug.cgi?id=92865
almos aaalmosss@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC | |aaalmosss@gmail.com
--- Comment #62 from almos aaalmosss@gmail.com --- I also have problems with dpm on my ASUS R9 270X. Under no load and high load it seems stable, but with low load (e.g. playing an old game, or watching a video with mpv -vo opengl) it is very unstable. It suddenly switches to white screen, and the machine is hardlocked. I couldn't reach more than 2-3 days of uptime.
Since I activated the profile method and I switch manually between low and high states, it hasn't crashed. It also seems stable in windows.
My guess is that it can't properly handle frequent switching between power level 0 and 1, where all clocks and voltages change at once (or maybe it's just the memory reclocking?).
--- Comment #63 from Stefan Ott stefan@ott.net --- I appear to have the same issue on an ASUS STRIX R7 370. It's also a factory-overclocked card and radeon.dpm=0 seems to work.
--- Comment #64 from Maxim Sheviakov mrader3940@gmail.com --- So what does it mean? It boots with high (not low) clocks without dpm?
Benjamin Bellec b.bellec@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC | |b.bellec@gmail.com
--- Comment #65 from Benjamin Bellec b.bellec@gmail.com --- I just bought a Gigabyte "GV-R737WF2OC-2GD" (R7 370).
Same problem: unable to boot Linux (Fedora 23 GNOME Workstation)
Same fix: radeon.dpm=0
It was provided with a VBIOS "015.048.000.061" (F2 release) which I updated to "015.048.000.069" (F3 release) without improvement. http://www.gigabyte.fr/products/product-page.aspx?pid=5469#bios
The card works on Windows 10.
--- Comment #66 from Benjamin Bellec b.bellec@gmail.com --- Created attachment 120154 --> https://bugs.freedesktop.org/attachment.cgi?id=120154&action=edit Gigabyte GV-R737WF2OC-2GD BIOS (F3 version)
--- Comment #67 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Benjamin Bellec from comment #65)
> I just bought a Gigabyte "GV-R737WF2OC-2GD" (R7 370). Same problem: unable to boot Linux (Fedora 23 GNOME Workstation) Same fix: radeon.dpm=0
> It was provided with a VBIOS "015.048.000.061" (F2 release) which I updated to "015.048.000.069" (F3 release) without improvement. http://www.gigabyte.fr/products/product-page.aspx?pid=5469#bios
> The card works on Windows 10.
You gotta read your VBios and insert values into the quirk list in the kernel source's drivers/gpu/drm/radeon/si_dpm.c. Google how to do that. Basically, you'll have to get such software (techpowerup provides one, as far as I remember) and then, for a working system, put the needed values in that file and recompile your kernel. Then you may even send a commit :)
--- Comment #68 from Benjamin Bellec b.bellec@gmail.com --- If I have to read the VBIOS and add a quirk in the kernel, why can't the kernel do this by itself?
Moreover, I saw the previous quirks in the kernel; the max memory clock is often set to "120000". I guess it stands for 1.2GHz QDR, which is equivalent to 4.8GHz. My card, like all the R7 370s, is supposed to work at 5.6GHz, so this is a serious loss of performance.
At the moment I will just return my card.
--- Comment #69 from Maxim Sheviakov mrader3940@gmail.com --- I can't see no logic. Nothing sensible. Remember: NOTHING does XXX automatically, it first has to be implemented. And, hell, the kernel actually reads the vbios and looks for the same IDs, but it's not able to find them - they are absent.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #70 from Tobias Droste tdroste@gmx.de --- (In reply to Benjamin Bellec from comment #68)
If I have to read the VBIOS and add a quirk to the kernel, why can't the kernel do this by itself?
Moreover, I looked at the existing quirks in the kernel, and the max memory clock is often set to "120000". I guess it stands for 1.2GHz QDR, which is equivalent to 4.8GHz. My card, like all R7 370s, is supposed to run at 5.6GHz, so this is a serious loss of performance.
At the moment I will just return my card.
For the record:
This stuff *can't* be read from the VBIOS and has to be found by trial and error. You also don't have to google the steps, they are described in comment #48.
But otherwise you are right, It will limit your card and replacing it with another one seems like the only option you have right now. At least that's what I did too, because I don't see this bug fixed in the near future.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #71 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Tobias Droste from comment #70)
For the record:
This stuff *can't* be read from the VBIOS and has to be found by trial and error. You also don't have to google the steps, they are described in comment #48.
But otherwise you are right, It will limit your card and replacing it with another one seems like the only option you have right now. At least that's what I did too, because I don't see this bug fixed in the near future.
Hmm, nice point. By the way, is the MCLK kinda divided by four? So, if Memory Clock is 5600MHz, I'll have to do 5600/4*10000 to get the correct value? It's like, 120000 value = 1.2GHz*4 = Original frequency = 4800MHz, right? If that's it, I'll fix my quirk (AGAIN, LOL) and try using 970MHz and 5600MHz written as needed, 'cause my GPU is MSI R7 370 Armor 2X, and looks like values in quirk are *kinda* low for it.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #72 from Alex Deucher alexdeucher@gmail.com --- (In reply to Maxim Sheviakov from comment #71)
Hmm, nice point. By the way, is the MCLK kinda divided by four? So, if Memory Clock is 5600MHz, I'll have to do 5600/4*10000 to get the correct value? It's like, 120000 value = 1.2GHz*4 = Original frequency = 4800MHz, right? If that's it, I'll fix my quirk (AGAIN, LOL) and try using 970MHz and 5600MHz written as needed, 'cause my GPU is MSI R7 370 Armor 2X, and looks like values in quirk are *kinda* low for it.
The mclk values are the actual mclk values. GDDR5 is quad pumped so you get 4x effective data rate per clock. That might be what you are thinking of.
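A rough sketch of the arithmetic in this exchange, in Python. It assumes the quirk-table clock fields are in units of 10 kHz (consistent with the thread's reading of "120000" as 1.2GHz) and that "quad pumped" means four transfers per memory clock. The function names are made up for this illustration and are not kernel code.

```python
def quirk_to_mhz(value):
    """Convert a quirk-table clock value (assumed to be in 10 kHz units) to MHz."""
    return value / 100.0


def mhz_to_quirk(mhz):
    """Convert a real clock in MHz to the assumed 10 kHz quirk-table units."""
    return int(round(mhz * 100))


def effective_rate_mhz(mclk_mhz, transfers_per_clock=4):
    """Effective GDDR5 data rate for a given actual memory clock."""
    return mclk_mhz * transfers_per_clock


print(quirk_to_mhz(120000))        # 1200.0 (actual mclk in MHz)
print(effective_rate_mhz(1200.0))  # 4800.0 (the "4.8GHz" marketing figure)
print(mhz_to_quirk(1400))          # 140000 (a 5600 MHz effective card)
```

So a card advertised at 5600MHz would be entered as its actual 1400MHz clock, not as 5600/4*10000.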
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #73 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Alex Deucher from comment #72)
The mclk values are the actual mclk values. GDDR5 is quad pumped so you get 4x effective data rate per clock. That might be what you are thinking of.
Looks like I get it now. Today I'll try to play with those values and experiment with MCLK values, maybe with GPU clock too; if it's good, I will let everyone know.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #74 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Maxim Sheviakov from comment #73)
Looks like I get it now. Today I'll try to play with those values and experiment with MCLK values, maybe with GPU clock too; if it's good, I will let everyone know.
So right now I'm building a test kernel based on Linux Zen 4.3. Changed values in my line: from "{... 0, 120000}," to "{... 97000, 140000}", so that GPU clock is 970MHz and Memory clock is 1.4GHz aka 5.6GHz. Will let you all know if I succeed in that.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #75 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Maxim Sheviakov from comment #74)
So right now I'm building a test kernel based on Linux Zen 4.3. Changed values in my line: from "{... 0, 120000}," to "{... 97000, 140000}", so that GPU clock is 970MHz and Memory clock is 1.4GHz aka 5.6GHz. Will let you all know if I succeed in that.
Nope, the system is unusable after Plymouth tries to start. Even with 1.3GHz. Looks like it's a dpm error, as on Windows the card is really stable with those values, even if it's a bit overclocked.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #76 from Alex Deucher alexdeucher@gmail.com --- Can you try the code in this branch: http://cgit.freedesktop.org/~agd5f/linux/log/?h=new_smc and the new ucode from here: http://people.freedesktop.org/~agd5f/radeon_ucode/k/
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #77 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Alex Deucher from comment #76)
Can you try the code in this branch: http://cgit.freedesktop.org/~agd5f/linux/log/?h=new_smc and the new ucode from here: http://people.freedesktop.org/~agd5f/radeon_ucode/k/
How do I do it? For the first link: Is it enough to copy http://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon?h=new_s... to 4.3 source tree? Or should I use the git ver of kernel?
Second link: what should I do with it?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #78 from Alex Deucher alexdeucher@gmail.com --- (In reply to Maxim Sheviakov from comment #77)
How do I do it? For the first link: Is it enough to copy http://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon?h=new_smc to 4.3 source tree? Or should I use the git ver of kernel?
Either fetch the git tree and build it directly or apply the top 4 patches to another kernel.
Second link: what should I do with it?
Add the files to /lib/firmware/radeon and update your initrd if you are using one.
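As a small sanity check for this step, the Python sketch below reports which firmware files are not yet present in the firmware directory. The helper name and the example file name are hypothetical; the real file names are whatever the ucode directory linked above provides. It only looks at the filesystem and does not verify that the initrd was regenerated.

```python
import os


def missing_firmware(names, fw_dir="/lib/firmware/radeon"):
    """Return the firmware file names from `names` that are absent in fw_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(fw_dir, n))]


# Hypothetical file name, for illustration only.
wanted = ["example_smc.bin"]
for name in missing_firmware(wanted):
    print("not installed:", name)
```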
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #79 from Maxim Sheviakov mrader3940@gmail.com --- Oh, thanks. When I get my new PSU (maybe tomorrow) I'll rebuild my 4.3-zen and build 4.4 from git, both with these changes and normal GPU (higher than present in 4.3/4.4) clocks - will report.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #80 from Daniel Exner dex+fdobugzilla@dragonslave.de --- Is this a revision of the previous override? Read: should this previous patch be reverted before testing?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #81 from Alex Deucher alexdeucher@gmail.com --- (In reply to Daniel Exner from comment #80)
Is this a revision of the previous override? Read: should this previous patch be reverted before testing?
If you have a quirk in place for your board, remove it.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #82 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Alex Deucher from comment #81)
If you have a quirk in place for your board, remove it.
So, those workaround lines in si_dpm.c have to be removed in order to use thise new patches?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #83 from Maxim Sheviakov mrader3940@gmail.com --- These*
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #84 from Alex Deucher alexdeucher@gmail.com --- (In reply to Maxim Sheviakov from comment #82)
So, those workaround lines in si_dpm.c have to be removed in order to use thise new patches?
You can try the patches either way. You need to remove the quirk for your card if there is one to see if they eliminate the need for the quirk.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #85 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Alex Deucher from comment #84)
You can try the patches either way. You need to remove the quirk for your card if there is one to see if they eliminate the need for the quirk.
Roger that! Will try ASAP (still haven't got my PSU).
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #86 from Daniel Exner dex+fdobugzilla@dragonslave.de --- Tried kernel 4.4.0-rc4 with
"drm/radeon: load different smc firmware on some SI variants"
and
"drm/radeon: print pci revision id as well as pci ids"
applied.
The good news: this kernel boots just fine:
[ 3.120205] [drm] radeon kernel modesetting enabled.
[ 3.135919] [drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6810 0x1462:0x3036 0x00).
But if I remove line 2927 from drivers/gpu/drm/radeon/si_dpm.c the initial problems return: boot fails.
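For anyone collecting IDs from logs like the one above, here is a minimal Python sketch that pulls them out of the "initializing kernel modesetting" line. It assumes the fields are vendor:device, subsystem vendor:subsystem device, then the PCI revision, which matches the quirk entries discussed in this bug; the function and field names are illustrative.

```python
import re

# Matches e.g. "(PITCAIRN 0x1002:0x6810 0x1462:0x3036 0x00)"
MODESET_RE = re.compile(
    r"\((?P<family>\w+) "
    r"0x(?P<vendor>[0-9A-Fa-f]{4}):0x(?P<device>[0-9A-Fa-f]{4}) "
    r"0x(?P<subvendor>[0-9A-Fa-f]{4}):0x(?P<subdevice>[0-9A-Fa-f]{4}) "
    r"0x(?P<revision>[0-9A-Fa-f]{2})\)"
)


def parse_modeset_line(line):
    """Return a dict of chip family and numeric PCI IDs, or None if no match."""
    m = MODESET_RE.search(line)
    if m is None:
        return None
    ids = {k: int(v, 16) for k, v in m.groupdict().items() if k != "family"}
    ids["family"] = m.group("family")
    return ids


line = ("[ 3.135919] [drm] initializing kernel modesetting "
        "(PITCAIRN 0x1002:0x6810 0x1462:0x3036 0x00).")
ids = parse_modeset_line(line)
print(ids["family"], hex(ids["device"]), hex(ids["subvendor"]), hex(ids["subdevice"]))
```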
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #87 from Stefan Ott stefan@ott.net --- Nice, this seems to fix the issue on my ASUS card.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #88 from Tobias Droste tdroste@gmx.de --- Doesn't fix it for me, it still locks up at boot with dpm enabled and the quirk removed.
[drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6810 0x174B:0xE271 0x00)
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #89 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Tobias Droste from comment #88)
Doesn't fix it for me, it still locks up at boot with dpm enabled and the quirk removed.
[drm] initializing kernel modesetting (PITCAIRN 0x1002:0x6810 0x174B:0xE271 0x00)
Have you put the new firmware files to your initramfs/initrd? Check replies above.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #90 from Tobias Droste tdroste@gmx.de --- Yes I did.
And right now it's only working for Stefan (R7 370).
It's not working for me (R9 270X) and Daniel (R9 270X).
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #91 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Tobias Droste from comment #90)
Yes I did.
And right now it's only working for Stefan (R7 370).
It's not working for me (R9 270X) and Daniel (R9 270X).
Hmm... Seems like the code is useful for 3XX GPUs. Anyway, still no PSU with me, and I will test the changes with my R7 370 from MSI when I get the thingie. We gotta find somebody else with R7 370 and ask to try those patches & firmware.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #92 from Maxim Sheviakov mrader3940@gmail.com --- So, got my PSU yesterday. Compiled 4.3.3-zen with -Ofast + those patches, quirk removed and firmware added to initrd. Modesetting works, I'm able to see Plymouth finishing its animation. However, at the X start stage I get a complete hang, though the monitor stays active. It's likely a PM error, as otherwise there would be a hang at the modesetting stage. It's similar to an issue I had when I compiled the kernel with a quirk containing my card's normal MEM and CORE clock values: a hang due to a PM error.
Should I do something else? And is there a way to make the card work at its normal frequencies?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #93 from Maxim Sheviakov mrader3940@gmail.com --- Can anyone give me values from si_dpm.c for MSI R7 370 2GB Gaming 2G (Red)? I think I have an idea on how to implement higher/normal clocks on Armor 2X. Also, a copy of fresh VBios would be welcome.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #94 from Maxim Sheviakov mrader3940@gmail.com --- How can I acquire GPU and MEM clocks being used? Just tried flashing R7 370 Gaming 2G VBIOS from EvilOS-10 and it boots and even works on my Archlinux installation. Is there a way to get the values?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #95 from Tobias Droste tdroste@gmx.de --- $ cat /sys/kernel/debug/dri/0/radeon_pm_info
If you have debugfs mounted on /sys/kernel/debug
Are you suggesting that Microsoft Windows 10 is delivering a different VBIOS for your card than what was originally installed on the graphics card?
If so, who installs this? Windows itself? As far as I know, the driver only loads some binaries inside the VBIOS, but does not replace it. Or is this a new feature of the Windows driver?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #96 from Maxim Sheviakov mrader3940@gmail.com --- 1) Thanks.
2) Nope. There's a tool - "ATIFlash" - from TechPowerUp. It allows you to
   A) Save your current VBios
   B) Flash another VBios
I think we have to modify vendor/model IDs, or fix clocks to their normal values. Yup, no powersaving, but who cares?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #97 from Maxim Sheviakov mrader3940@gmail.com --- He-hey! Succeeded in booting and making the card work with 1050MHz core clocks! So, I added the firmware, applied the patches from Alex, modified the quirk's values so that it's 1020MHz core + 1200MHz mem, compiled the -zen kernel, and got the X server working. Couldn't test any further, but more info will follow at about 17:00 Moscow time.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #98 from Alex Deucher alexdeucher@gmail.com --- (In reply to Tobias Droste from comment #95)
Are you suggesting that Microsoft Windows 10 is delivering a different VBIOS for your card than what was originally installed on the graphics card?
If so, who installs this? Windows itself? As far as I know, the driver only loads some binaries inside the VBIOS, but does not replace it. Or is this a new feature of the Windows driver?
Neither Windows nor the Windows driver flashes a new vbios. Flashing an arbitrary vbios is not recommended, may render your card useless, and may void your warranty.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #99 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Alex Deucher from comment #98)
Neither Windows nor the Windows driver flashes a new vbios. Flashing an arbitrary vbios is not recommended, may render your card useless, and may void your warranty.
Interesting, but it got flashed 0_0 Also, the problem is not in VBios or IDs. It's all about memory clock - setting a value higher than 1.2GHz (in a quirk) makes the system hang after Plymouth/before display server start. So, to my mind we have to do something with DPM/PowerPlay code or make some userspace overclock support, as there's no other way right now. By the way, is there such a tool that allows to overclock memory of the card?
And yep, with its standard VBios card works with 1050MHz/1.2GHz (core/memory) clocks. I'm using those SMC patches + new firmware. Maybe they should be sent upstream, even to add that new firmware files and code to use them?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #100 from Daniel Exner dex+fdobugzilla@dragonslave.de --- (In reply to Maxim Sheviakov from comment #99)
Interesting, but it got flashed 0_0 Also, the problem is not in VBios or IDs. It's all about memory clock - setting a value higher than 1.2GHz (in a quirk) makes the system hang after Plymouth/before display server start. So, to my mind we have to do something with DPM/PowerPlay code or make some userspace overclock support, as there's no other way right now. By the way, is there such a tool that allows to overclock memory of the card?
If that worked for you you are lucky, but at least I won't flash a different BIOS just to _downgrade_ my card, possibly breaking it completely. Alas the already in place quirk results in exactly the same.
And yep, with its standard VBios card works with 1050MHz/1.2GHz (core/memory) clocks. I'm using those SMC patches + new firmware. Maybe they should be sent upstream, even to add that new firmware files and code to use them?
The new firmware files are fine for 370X it seems but still need work for 270X. I guess most 270X users CC in this ticket will happily test possible reworked ones as soon as they are available and we patiently wait for Alex.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #101 from Maxim Sheviakov mrader3940@gmail.com --- (In reply to Daniel Exner from comment #100)
If that worked for you you are lucky, but at least I won't flash a different BIOS just to _downgrade_ my card, possibly breaking it completely. Alas the already in place quirk results in exactly the same.
That's not a _downgrade_, that's a way to change an ID.
And yep, with its standard VBios card works with 1050MHz/1.2GHz (core/memory) clocks. I'm using those SMC patches + new firmware. Maybe they should be sent upstream, even to add that new firmware files and code to use them?
The new firmware files are fine for 370X it seems but still need work for 270X. I guess most 270X users CC in this ticket will happily test possible reworked ones as soon as they are available and we patiently wait for Alex.
1) No 370X :D 2)I guess everyone in this CC will happily test anything that is *supposed* to fix the issues =)
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Nicholas Vaughan nchlsvaughan@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |https://bugs.freedesktop.org/show_bug.cgi?id=94692
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #102 from samdenies@zhentarim.net --- I want to add another data point for a card not yet mentioned in this bug. I have had this issue for quite some time, awaiting a fix. I run a fully-updated Debian testing, and my card is described below.
XFX R9 270X Vendor ID: 1002 Device ID: 6810 Subsystem Vendor ID: 1682 Subsystem Device ID: 9275
I don't believe this matches the existing quirk, and I haven't created a custom kernel to add one. Running with radeon.dpm=0 allows it to boot and basically function, but with very poor 3D performance.
I'd be more than happy to provide any additional diagnostic information within my abilities to collect, and test any potential fixes.
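For testers who want to try a quirk for a card like this one, the Python sketch below formats the reported IDs as a line in the style of the si_dpm.c quirk list. The helper is hypothetical, and the default clock caps (0 for the engine clock, 120000 for the memory clock) are simply borrowed from other entries mentioned in this bug; they are an assumption and not a value confirmed for this board.

```python
def format_quirk_entry(device, subsys_vendor, subsys_device,
                       max_sclk=0, max_mclk=120000):
    """Format PCI IDs as a si_dpm_quirk_list-style C initializer line.

    The clock caps default to values used by other entries in this bug
    (an assumption, not a verified setting for any particular card).
    """
    return "{ PCI_VENDOR_ID_ATI, 0x%04x, 0x%04x, 0x%04x, %d, %d }," % (
        device, subsys_vendor, subsys_device, max_sclk, max_mclk)


# IDs as reported in this comment: device 6810, subsystem 1682:9275.
print(format_quirk_entry(0x6810, 0x1682, 0x9275))
```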
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #103 from Alex Deucher alexdeucher@gmail.com --- (In reply to samdenies from comment #102)
I want to add another data point for a card not yet mentioned in this bug. I have had this issue for quite some time, awaiting a fix. I run a fully-updated Debian testing, and my card is described below.
XFX R9 270X Vendor ID: 1002 Device ID: 6810 Subsystem Vendor ID: 1682 Subsystem Device ID: 9275
I don't believe this matches the existing quirk, and I haven't created a custom kernel to add one. Running with radeon.dpm=0 allows it to boot and basically function, but with very poor 3D performance.
I'd be more than happy to provide any additional diagnostic information within my abilities to collect, and test any potential fixes.
Please attach the output of lspci -vnn
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #104 from samdenies@zhentarim.net --- Created attachment 122942 --> https://bugs.freedesktop.org/attachment.cgi?id=122942&action=edit XFX R9 270X lspci -xnn results
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #105 from Alex Deucher alexdeucher@gmail.com --- Created attachment 122946 --> https://bugs.freedesktop.org/attachment.cgi?id=122946&action=edit possible fix
(In reply to samdenies from comment #102)
I want to add another data point for a card not yet mentioned in this bug. I have had this issue for quite some time, awaiting a fix. I run a fully-updated Debian testing, and my card is described below.
XFX R9 270X Vendor ID: 1002 Device ID: 6810 Subsystem Vendor ID: 1682 Subsystem Device ID: 9275
I don't believe this matches the existing quirk, and I haven't created a custom kernel to add one. Running with radeon.dpm=0 allows it to boot and basically function, but with very poor 3D performance.
I'd be more than happy to provide any additional diagnostic information within my abilities to collect, and test any potential fixes.
Does this attached patch help?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #106 from samdenies@zhentarim.net --- (In reply to Alex Deucher from comment #105)
Does this attached patch help?
I was not able to apply the patch itself as it didn't match the source for 4.5.1 that I downloaded. However, adding the line manually did fix my problem. I am able to boot without radeon.dpm=0 and have good 3d performance. Thanks!
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Michael Rosile mike@rosile.com changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |RESOLVED
Resolution|---     |FIXED
--- Comment #107 from Michael Rosile mike@rosile.com --- Thank you Alex Deucher! I have the same graphics card as samdenies (XFX R9 270X), and was looking through various mailing lists to find an answer (I wasn't expecting to find an answer at bugs.freedesktop.org). I knew the issue was related to the memory clock speed, but didn't know how to change it in Linux, until now.
I manually added the required 'quirk' line to a custom 4.5.2 kernel, and it's working great!
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Benjamin Bellec b.bellec@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Resolution|FIXED    |---
Status    |RESOLVED |REOPENED
--- Comment #108 from Benjamin Bellec b.bellec@gmail.com --- (In reply to Michael Rosile from comment #107)
Thank you Alex Deucher! I have the same graphics card as samdenies (XFX R9 270X), and was looking through various mailing lists to find an answer (I wasn't expecting to find an answer at bugs.freedesktop.org). I knew the issue was related to the memory clock speed, but didn't know how to change it in Linux, until now.
I manually added the required 'quirk' line to a custom 4.5.2 kernel, and it's working great!
This is not fixed at all:
- there are probably several other video cards from other vendors which don't work (the Gigabyte "GV-R737WF2OC-2GD" for instance)
- the added quirk underclocks the memory clock from 5600 MHz to 4800 MHz, so you don't get the full performance you expect
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #109 from Gustavo Lopes mail@geleia.net --- Not to mention that even with the quirk I would get (last time I tried) a hang every 1-2 days. Catalyst has been quite stable for me.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
thirdloop@teknik.io changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |thirdloop@teknik.io
--- Comment #110 from thirdloop@teknik.io --- Created attachment 123371 --> https://bugs.freedesktop.org/attachment.cgi?id=123371&action=edit sapphire nitro r7 370 4gb lspci -vnn output
I'm on ubuntu 16.04 (can't use the fglrx driver anymore) and I have been trying the most recent kernels, but I think the SAPPHIRE NITRO R7 370 4GB still suffers from this bug. Product link just in case... http://www.newegg.com/Product/Product.aspx?Item=N82E16814202152&cm_re=sa... Can anyone help me out please? Attaching lspci -vnn output.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #111 from Alex Deucher alexdeucher@gmail.com --- (In reply to thirdloop from comment #110)
Created attachment 123371 [details] sapphire nitro r7 370 4gb lspci -vnn output
I'm on ubuntu 16.04 (can't use the fglrx driver anymore) and I have been trying the most recent kernels, but I think the SAPPHIRE NITRO R7 370 4GB still suffers from this bug. Product link just in case... http://www.newegg.com/Product/Product.aspx?Item=N82E16814202152&cm_re=sapphire_nitro_r7_370-_-14-202-152-_-Product Can anyone help me out please? Attaching lspci -vnn output.
Already fixed in this patch: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0e...
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #112 from Daniel Exner dex+fdobugzilla@dragonslave.de --- If I read that correctly, the R9 270X is a GCN 1.0 card and thus should be supported by the experimental drm-next-4.8-wip-si branch.
Is it worth trying? AMDGPU is using yet another PM system (PowerPlay), so perhaps it works better, without having to blacklist?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #113 from Alex Deucher alexdeucher@gmail.com --- (In reply to Daniel Exner from comment #112)
If I read that correctly, the R9 270X is a GCN 1.0 card and thus should be supported by the experimental drm-next-4.8-wip-si branch.
Is it worth trying? AMDGPU is using yet another PM system (PowerPlay), so perhaps it works better, without having to blacklist?
That tree is using the same power management code as radeon, just ported to amdgpu.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #114 from Daniel Exner dex+fdobugzilla@dragonslave.de --- (In reply to Alex Deucher from comment #113)
That tree is using the same power management code as radeon, just ported to amdgpu.
Ok, thx for the clarification. Then I'll patiently wait for a proper fix.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
AmarildoJr amarildosjr@riseup.net changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |amarildosjr@riseup.net
--- Comment #115 from AmarildoJr amarildosjr@riseup.net --- Created attachment 125334 --> https://bugs.freedesktop.org/attachment.cgi?id=125334&action=edit Patch that I use myself
Would this patch help? I also have DPM problems with my R9 270X and this patch fixes it for me.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #116 from Alex Deucher alexdeucher@gmail.com --- Created attachment 126814 --> https://bugs.freedesktop.org/attachment.cgi?id=126814&action=edit possible fix
Does this patch help?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #117 from Daniel Exner dex+fdobugzilla@dragonslave.de --- (In reply to Alex Deucher from comment #116)
Created attachment 126814 [details] [review] possible fix
Does this patch help?
I applied the patch on Kernel 4.8.0-rc8-00771-g8ab293e: result is a stable system as before, so at least it didn't introduce a regression.
Then I disabled the override for my card below:
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index e6abc09..bcaa675 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2924,7 +2924,6 @@ struct si_dpm_quirk {
 /* cards with dpm stability problems */
 static struct si_dpm_quirk si_dpm_quirk_list[] = {
 	/* PITCAIRN - https://bugs.freedesktop.org/show_bug.cgi?id=76490 */
-	{ PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0x2015, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 },
Result is the same as without your patch: black screen and non responsive system.
Should I also revert "drm/radeon: load different smc firmware on some SI variants"?
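To make the effect of the table in the diff above easier to follow, here is a simplified Python model of a quirk lookup. It is a sketch, not the kernel's implementation: it assumes an entry matches on vendor, device, and subsystem IDs, that the last two fields are engine and memory clock caps in 10 kHz units, and that a cap of 0 means "no cap". All Python names are invented for the example.

```python
from collections import namedtuple

Quirk = namedtuple(
    "Quirk", "vendor device subsys_vendor subsys_device max_sclk max_mclk")

PCI_VENDOR_ID_ATI = 0x1002

# Entries copied from the diff in this comment.
QUIRKS = [
    Quirk(PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000),
    Quirk(PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000),
    Quirk(PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0x2015, 0, 120000),
    Quirk(PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000),
]


def find_quirk(vendor, device, subsys_vendor, subsys_device, quirks=QUIRKS):
    """Return the first quirk whose four IDs all match, else None."""
    for q in quirks:
        if (q.vendor, q.device, q.subsys_vendor, q.subsys_device) == (
                vendor, device, subsys_vendor, subsys_device):
            return q
    return None


def apply_caps(sclk, mclk, quirk):
    """Clamp clocks to the quirk's caps; a cap of 0 is treated as 'no cap'."""
    if quirk is None:
        return sclk, mclk
    if quirk.max_sclk:
        sclk = min(sclk, quirk.max_sclk)
    if quirk.max_mclk:
        mclk = min(mclk, quirk.max_mclk)
    return sclk, mclk


q = find_quirk(0x1002, 0x6810, 0x1462, 0x3036)
print(apply_caps(105000, 140000, q))  # (105000, 120000)
```

Under these assumptions, removing the first entry means the lookup finds nothing for that board and its clocks are left uncapped, which is the configuration being tested here.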
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #118 from Alex Deucher alexdeucher@gmail.com --- (In reply to Daniel Exner from comment #117)
(In reply to Alex Deucher from comment #116)
Created attachment 126814 [details] [review] [review] possible fix
Does this patch help?
I applied the patch on Kernel 4.8.0-rc8-00771-g8ab293e: result is a stable system as before, so at least it didn't introduce a regression.
Then I disabled the override for my card below:
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index e6abc09..bcaa675 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2924,7 +2924,6 @@ struct si_dpm_quirk {
 /* cards with dpm stability problems */
 static struct si_dpm_quirk si_dpm_quirk_list[] = {
 	/* PITCAIRN - https://bugs.freedesktop.org/show_bug.cgi?id=76490 */
-	{ PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0x2015, 0, 120000 },
 	{ PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 },
Result is the same as without your patch: black screen and non responsive system.
Ok.
Should I also revert "drm/radeon: load different smc firmware on some SI variants"?
No.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #119 from Daniel Exner dex+fdobugzilla@dragonslave.de --- Good news!
With kernel 4.10.0-rc5-00071-ga4685d2f58e2 that includes:
drm/radeon/si: load special ucode for certain MC configs
from drm-fixes-4.10 branch and the si58_mc.bin file from
https://people.freedesktop.org/~agd5f/radeon_ucode/
I could boot fine.
This small change I made indeed showed it is using the file for my card:
+ {
+ DRM_INFO("Loading special si58_mc Microcode\n");
  snprintf(fw_name, sizeof(fw_name), "radeon/si58_mc.bin");
+ }
Then I could remove the quirk I needed!
- { PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000 },
I guess 3h portal 2 are enough to verify everything works now as it should.
Perhaps others can test their quirk lines, too?
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #120 from Elia Argentieri elia.argentieri@openmailbox.org --- Yes! My graphics card can finally unleash all its potential! Following your suggestion, I downloaded linux 4.10 master, removed this from quirks (R7 370):
{ PCI_VENDOR_ID_ATI, 0x6811, 0x1462, 0x2015, 0, 120000 },
then I compiled and downloaded si58_mc.bin to /lib/firmware.
After reboot, I couldn't believe it! Performance improved a LOT, it feels like I have a brand new gpu. Also another commit fixed VM faults, so it is also more stable.
While I was at it, I compiled support for amdgpu too, and it works fine on Wayland for me, but if I start X, my monitor reports frequency not supported.
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #121 from Franc[e]sco lolisamurai@tfwno.gf --- I removed the quirks for my r9 270x and I have no stability issues whatsoever, it's a really nice performance boost.
this is the line I commented out for my card: { PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 },
and here's full info on my system on this forum post: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/amd-linux...
let me know if you need any more testing on this, but I'm pretty sure it's stable
https://bugs.freedesktop.org/show_bug.cgi?id=76490
--- Comment #122 from Franc[e]sco lolisamurai@tfwno.gf --- I also edited this piece of code (still in si_dpm.c) to let my memory clock hit 1400 MHz which is stock speed for this card, and I'm still running rock solid:
/* limit all SI kickers */
if (rdev->family == CHIP_PITCAIRN) {
	if ((rdev->pdev->revision == 0x81) ||
	    (rdev->pdev->device == 0x6810) ||
	    (rdev->pdev->device == 0x6811) ||
	    (rdev->pdev->device == 0x6816) ||
	    (rdev->pdev->device == 0x6817) ||
	    (rdev->pdev->device == 0x6806))
		max_mclk = 145000;
} else if (rdev->family == CHIP_VERDE) {
...
Not sure why it doesn't hit my 1450MHz overclock (which is flashed to the card's bios), but I'm very pleased compared to the previous 1200MHz.
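The condition in the snippet above can be modelled in a few lines of Python to see which boards it catches. This is only a restatement of the posted PITCAIRN branch: the 145000 default is the value from this comment's edit (1450MHz if the field is in 10 kHz units, which is an assumption), and the function name is made up.

```python
# Device IDs listed in the PITCAIRN branch of the posted snippet.
PITCAIRN_KICKER_DEVICES = {0x6810, 0x6811, 0x6816, 0x6817, 0x6806}


def pitcairn_mclk_limit(revision, device, limit=145000):
    """Return the memory clock cap the snippet would apply, or None if none.

    Mirrors the posted condition: the cap applies when the PCI revision is
    0x81 OR the device ID is one of the listed ones.
    """
    if revision == 0x81 or device in PITCAIRN_KICKER_DEVICES:
        return limit
    return None


print(pitcairn_mclk_limit(0x00, 0x6810))  # 145000
```

Because the test is an OR, a device ID of 0x6810 is capped even at revision 0x00.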
https://bugs.freedesktop.org/show_bug.cgi?id=76490
Alex Deucher alexdeucher@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Status    |REOPENED |RESOLVED
Resolution|---      |FIXED