https://bugs.freedesktop.org/show_bug.cgi?id=92974
Bug ID: 92974
Summary: Fiji Nano long boot up and long X startup with amdgpu-powerplay enabled
Product: DRI
Version: DRI git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: bug0xa3d2@hushmail.com
Created attachment 119727 --> https://bugs.freedesktop.org/attachment.cgi?id=119727&action=edit Output of dmesg after startup.
With AMD Powerplay enabled in the kernel (cloned from http://cgit.freedesktop.org/~agd5f/linux/?h=amdgpu-powerplay), the computer prints the usual boot text until modesetting tries to switch to the native resolution of the monitor. At that point the monitor backlight turns off for about 5 seconds, then comes back on and displays nothing for approximately 2 minutes, as if the computer had locked up, with no other boot activity occurring. The Alt+PrintScreen+E/I/U/R key combinations still work during this time. If left alone, the system finishes booting normally after about 2 minutes. dmesg shows these messages, which I have not noticed when AMD Powerplay is disabled: "Failed to send Previous Message." and "[drm] ib test on ring 12 succeeded".
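One way to locate a stall like this in a saved dmesg log is to look for the largest jump between consecutive timestamps. The following is a minimal sketch, assuming the default `[seconds.micros]` dmesg prefix; the helper name `largest_gap` and the sample excerpt are hypothetical, and the timestamps and two of the four lines are invented for illustration rather than taken from the attached log.

```python
import re

# Matches the "[  123.456789] message" prefix that dmesg prints by default.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gap(dmesg_text):
    """Return (gap_seconds, message_before, message_after) for the longest pause."""
    previous = None
    best = (0.0, None, None)
    for line in dmesg_text.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp prefix
        stamp, message = float(match.group(1)), match.group(2)
        if previous is not None and stamp - previous[0] > best[0]:
            best = (stamp - previous[0], previous[1], message)
        previous = (stamp, message)
    return best


# Hypothetical excerpt: all timestamps are invented for illustration only.
SAMPLE = """\
[    4.100000] [drm] Initialized drm
[    9.250000] Failed to send Previous Message.
[  129.250000] [drm] ib test on ring 12 succeeded
[  129.900000] fbcon: frame buffer device
"""

gap, before, after = largest_gap(SAMPLE)
print(f"{gap:.1f} s between {before!r} and {after!r}")
```

Run against a real log (for example the text of `dmesg > boot.log`), the reported pair of messages brackets the point where the boot stopped making progress.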
Running "startx" at the console also appears to freeze the system for about 2 minutes, after which X starts normally.
To work around the long boot and long X startup, the kernel option "Device Drivers/Graphics support/Direct Rendering Manager/Enable legacy fbdev support for your modesetting driver" must be set to "n". This leaves a black screen with no console after the machine boots, so the login has to be done blind. X can then be started blind as well; it displays normally, with no startup delay. Another option, once legacy fbdev has been disabled in the kernel options, is to autologin with KDM or similar and boot directly into X.
With AMD Powerplay disabled in the kernel there are no startup issues.
Toggling, including, or excluding these kernel boot options in "lilo.conf" has no effect on the bug: "amdgpu.enable_scheduler=0 ; radeon.modeset=1 ; radeon.hw_i2c=1 ; radeon.disp_priority=2 ; radeon.fastfb=1 ; radeon.backlight=1 ; radeon.pcie_gen2=-1 ; radeon.hard_reset=0"
These patches (applied and not applied) have no effect on the bug: http://people.freedesktop.org/~agd5f/0001-radeonsi-fix-fiji-raster-config.pa... http://people.freedesktop.org/~agd5f/0001-drm-amdgpu-update-Fiji-s-tiling-mo...
Hardware: Fury Nano git: http://cgit.freedesktop.org/~agd5f/linux/commit/?h=amdgpu-powerplay&id=2...
Sidenote: hang check timer is disabled in the kernel.
--- Comment #1 from Michel Dänzer michel@daenzer.net --- (In reply to charlie from comment #0)
> With AMD Powerplay disabled in the kernel there are no startup issues.
Do you mean CONFIG_DRM_AMD_POWERPLAY=n in .config or amdgpu.powerplay=0 on the kernel command line?
radeon.* parameters don't have any effect with the amdgpu driver.
--- Comment #2 from charlie bug0xa3d2@hushmail.com --- Sorry for the delay in response. My computer was down.
I mean that with CONFIG_DRM_AMD_POWERPLAY=n in the kernel ".config" file the computer starts up normally and X starts normally too, without a 2 minute delay in either case.
With CONFIG_DRM_AMD_POWERPLAY=y in the kernel ".config" file there are two minute delays at boot and also after typing "startx".
However, with CONFIG_DRM_AMD_POWERPLAY=y in the kernel, this delay can be bypassed if "Device Drivers/Graphics support/Direct Rendering Manager/Enable legacy fbdev support for your modesetting driver" is set to "n". I don't know the ".config" symbol name for that option, which I set using "make menuconfig". Bypassing the bug this way causes a blank screen; however, I can still log in blind and start X to a normal display. Or I can use KDM or SDDM to auto-login for me.
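For anyone comparing configurations, the two settings discussed here can be read straight out of a kernel ".config" file. This is a minimal sketch with a hypothetical helper, `read_kconfig`, and a hypothetical sample fragment. It assumes the menuconfig entry "Enable legacy fbdev support for your modesetting driver" corresponds to the symbol CONFIG_DRM_FBDEV_EMULATION; that mapping is not stated in this thread and should be confirmed against the kernel tree being built.

```python
def read_kconfig(text):
    """Parse kernel .config text into {symbol: value}; unset symbols map to 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" is how Kconfig records a disabled option.
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            symbol, value = line.split("=", 1)
            options[symbol] = value
    return options


# Hypothetical fragment mirroring the workaround described above.
# CONFIG_DRM_FBDEV_EMULATION is an assumed name for the legacy fbdev option.
SAMPLE_CONFIG = """\
CONFIG_DRM_AMDGPU=y
CONFIG_DRM_AMD_POWERPLAY=y
# CONFIG_DRM_FBDEV_EMULATION is not set
"""

config = read_kconfig(SAMPLE_CONFIG)
for symbol in ("CONFIG_DRM_AMD_POWERPLAY", "CONFIG_DRM_FBDEV_EMULATION"):
    print(symbol, "=", config.get(symbol, "absent"))
```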
--- Comment #3 from Michel Dänzer michel@daenzer.net --- What if you enable building the PowerPlay code with CONFIG_DRM_AMD_POWERPLAY=y but disable it at runtime with amdgpu.powerplay=0 on the kernel command line? Does the problem occur then or not?
--- Comment #4 from charlie bug0xa3d2@hushmail.com --- With these options there is no ~2 minute kernel boot or startx delay bug:
CONFIG_DRM_AMD_POWERPLAY=y in kernel ".config"
"Device Drivers/Graphics support/Direct Rendering Manager/Enable legacy fbdev support for your modesetting driver" set to "y" in kernel "make menuconfig".
"amdgpu.powerplay=0" on lilo append line.
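To confirm that a runtime option such as "amdgpu.powerplay=0" actually reached the kernel, the booted command line can be inspected. Below is a minimal sketch with a hypothetical helper, `module_params`; the sample command line is invented for illustration, and on a live system the string would come from /proc/cmdline instead.

```python
def module_params(cmdline, module):
    """Collect 'module.param=value' tokens from a kernel command line."""
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


# Hypothetical command line; on a live system read /proc/cmdline instead.
SAMPLE_CMDLINE = "BOOT_IMAGE=linux ro root=/dev/sda1 amdgpu.powerplay=0 radeon.modeset=1"

amdgpu = module_params(SAMPLE_CMDLINE, "amdgpu")
print(amdgpu)
```

Filtering by module prefix also makes it easy to spot radeon.* options, which comment #1 notes have no effect with the amdgpu driver.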
--- Comment #5 from charlie bug0xa3d2@hushmail.com --- I recently tried out kernel code up to this version: http://cgit.freedesktop.org/~agd5f/linux/commit/?h=amdgpu-powerplay&id=a...
There has been no change; the bug remains.
--- Comment #6 from charlie bug0xa3d2@hushmail.com --- Created attachment 120362 --> https://bugs.freedesktop.org/attachment.cgi?id=120362&action=edit dmesg Friday Dec. 4, 2015
--- Comment #7 from charlie bug0xa3d2@hushmail.com --- I recently compiled this kernel version: http://cgit.freedesktop.org/~agd5f/linux/commit/?h=amdgpu-powerplay&id=a...
There was no change in the bug. I'll submit my kernel ".config" as well.
--- Comment #8 from charlie bug0xa3d2@hushmail.com --- Created attachment 120363 --> https://bugs.freedesktop.org/attachment.cgi?id=120363&action=edit Kernel config Dec. 4, 2015
--- Comment #9 from charlie bug0xa3d2@hushmail.com --- This commit:
"drm/amdgpu/powerplay/fiji: query supported pcie info from cgs (v2)"
http://cgit.freedesktop.org/~agd5f/linux/commit/?h=amdgpu-powerplay&id=6...
and everything after the above commit has the long boot bug.
______________________
This commit:
"drm/amdgpu/powerplay/tonga: query supported pcie info from cgs (v2)"
http://cgit.freedesktop.org/~agd5f/linux/commit/?h=amdgpu-powerplay&id=2...
does not have the long boot bug on Fiji Nano.
______________________
I did a "git reset --hard HEAD~49" (among others...) from the current git at "amd/powerplay: don't enable ucode fan control if vbios has no fan table" to eventually get to a normal boot.
--- Comment #10 from Alex Deucher alexdeucher@gmail.com --- Created attachment 120642 --> https://bugs.freedesktop.org/attachment.cgi?id=120642&action=edit disable pcie dpm
Does this patch help?
--- Comment #11 from Alex Deucher alexdeucher@gmail.com --- Created attachment 120643 --> https://bugs.freedesktop.org/attachment.cgi?id=120643&action=edit disable pcie gen3 switching
Please try this patch independent of the previous one.
--- Comment #12 from charlie bug0xa3d2@hushmail.com --- Both patches ("disable_gen3.diff" and "fiji_disable_pcie_dpm.diff") applied independently of each other work. The kernel boots normally and X starts normally.
These commands were issued before each *.diff was applied and compiled: "git clean -dxf ; git fetch --all ; git reset --hard origin/amdgpu-powerplay"
--- Comment #13 from charlie bug0xa3d2@hushmail.com --- I'm now using "drm-next-4.8-wip" (from https://cgit.freedesktop.org/~agd5f/linux/). I still require "fiji_disable_pcie_dpm.diff" to overcome the bug. I can't remember if "disable_gen3.diff" no longer patches cleanly or does not work once the kernel is compiled. In any case, "disable_gen3.diff" is no longer effective.
Is this a bios issue? If so, then I can upgrade to the latest bios for my motherboard to see if the bug persists without patching the kernel.
--- Comment #14 from Alex Deucher alexdeucher@gmail.com --- (In reply to charlie from comment #13)
It could be. Does a new bios help? We've seen similar issues with certain boards internally. What CPU/motherboard is this? If your board has options for configuring the default pcie gen or generic pcie performance options does changing any of them help?
--- Comment #15 from Alex Deucher alexdeucher@gmail.com --- (In reply to charlie from comment #13)
On newer kernels you can configure the supported pcie gen modes via module option. E.g., append: amdgpu.pcie_gen_cap=0x00030003 to the kernel command line in grub to limit the bus and the card to pcie gen2.
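The mask value can be read as two 16-bit halves. The sketch below assumes a layout that is not spelled out in this thread: bit 0 for gen1, bit 1 for gen2, bit 2 for gen3, with the low half describing the card and the high half describing the bus. Under that assumption 0x00030003 enables gen1 and gen2 on both sides, which matches the description above; the helper name `decode_gen_cap` is hypothetical, and the driver source for the kernel in use remains the authority on the bit definitions.

```python
# Assumed layout (not stated in this thread): bit 0 = gen1, bit 1 = gen2,
# bit 2 = gen3; low 16 bits for the card, high 16 bits for the bus.
GEN_BITS = {1: 0x1, 2: 0x2, 3: 0x4}


def decode_gen_cap(mask):
    """Split a pcie_gen_cap mask into the PCIe generations allowed per side."""
    card = mask & 0xFFFF
    bus = (mask >> 16) & 0xFFFF
    return {
        "card": [gen for gen, bit in GEN_BITS.items() if card & bit],
        "bus": [gen for gen, bit in GEN_BITS.items() if bus & bit],
    }


print(decode_gen_cap(0x00030003))
```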
--- Comment #16 from charlie bug0xa3d2@hushmail.com --- Motherboard: Asus A88X-PRO APU: AMD A10-7850K (monitor only receiving R9 Nano output)
I will update the bios and see if there are any "new pcie gen or generic pcie performance options".
I will report back on the use of "amdgpu.pcie_gen_cap=0x00030003" in lilo, although "PCIe 3.0" is printed on the motherboard.
--- Comment #17 from charlie bug0xa3d2@hushmail.com --- The BIOS was updated from 0801 to 2603. The machine booted normally with an unpatched kernel. I restored my previous overclocking settings and the long boot bug returned. While testing a few BIOS parameters I found that adjusting "NB Configuration--PCIEX16_1" (for example forcing auto(PCIEX16_2), X16 or X8) has no effect on the bug.
"APU Frequency" (the base clock) is the only setting I found that affects the long boot/startx bug. Any base clock value greater than 100 causes the bug. Underclocking the CPU, NB and RAM has no effect when the base clock is set to 101; the bug remains.
This behavior did not occur before the commit mentioned earlier in this bug report thread, even though the base clock has stayed the same from before I first submitted the bug until my testing today.
With "amdgpu.pcie_gen_cap=0x00030003" applied through LILO, this machine runs stable at a base clock of 104 with system RAM near 2500 MHz, without the long boot/startx bug.
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #18 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/59.