Hi,

We are working with new laptops that have the AMD Raven Ridge chipset. Their `/proc/cpuinfo` is here: https://gist.github.com/mschiu77/b06dba574e89b9a30cf4c450eaec49bc
With the latest kernel, 4.15, there are many different panics/oopses during boot, so there is no chance to get into X. They also happen during shutdown. I then built a kernel from git://people.freedesktop.org/~agd5f/linux on branch amd-staging-drm-next, with head at commit "drm: Fix trailing semicolon", and updated linux-firmware. Things seem better, with only 1 oops observed. Here's the oops: https://gist.github.com/mschiu77/1a68f27272b24775b2040acdb474cdd3

However, I still very often get stuck at the following messages during boot:

[ 4.998241] endless kernel: [drm] amdgpu kernel modesetting enabled.
[ 4.998288] endless kernel: checking generic (e0000000 7f0000) vs hw (e0000000 10000000)
[ 4.998289] endless kernel: fb: switching to amdgpudrmfb from EFI VGA

I turned on drm.debug=0xe while booting, but got no more information at this point. Is there anything I can do?
One more piece of information that may be helpful: sometimes the system boots OK but with a blank screen. I can't switch to a virtual console, but it does respond to the magic SysRq key. The dmesg with drm.debug=0xe is here: https://gist.github.com/mschiu77/291e47b1f07dc52be9461c55c820464c
I'm pretty sure it's due to the amdgpu driver, because when I boot my own kernel with the amdgpu driver disabled, all these symptoms go away. Please suggest anything I can do about this. Thanks,
Chris
On 2018-01-31 09:31 AM, Chris Chiu wrote:
> [...] Things seem to get better, only 1 oops observed. Here's the oops https://gist.github.com/mschiu77/1a68f27272b24775b2040acdb474cdd3.
Hi Chris,
what are the steps to reproduce this oops?
Does it reproduce all the time or is it intermittent?
Can you send a dmesg with amdgpu.dc_log=1, in addition to drm.debug=0xe?
Thanks, Harry
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
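For readers following along, the `drm.debug=0xe` value requested in this exchange is a bitmask of DRM debug categories. The sketch below decodes such a mask and builds the kernel command-line fragment Harry asked for. The bit-to-name table is an assumption based on the `DRM_UT_*` categories in kernels of roughly this era and should be checked against your own tree; the function names are hypothetical helpers, not part of any kernel tooling.

```python
# Assumed bit assignments, modelled on the kernel's DRM_UT_* debug
# categories around 4.15. Verify against your own kernel source.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


def kernel_cmdline_args(drm_debug=0xE, dc_log=1):
    """Build the boot parameters requested in the thread (hypothetical helper)."""
    return f"drm.debug={drm_debug:#x} amdgpu.dc_log={dc_log}"


if __name__ == "__main__":
    print(decode_drm_debug(0xE))
    print(kernel_cmdline_args())
```

Under these assumptions, `0xe` enables the DRIVER, KMS, and PRIME categories but not ATOMIC, which may be worth adding (`0x1e`) when chasing page-flip problems.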
On Thu, Feb 1, 2018 at 12:08 AM, Harry Wentland harry.wentland@amd.com wrote:
> what are the steps to reproduce this oops?
> Does it reproduce all the time or is it intermittent?
> Can you send a dmesg with amdgpu.dc_log=1, in addition to drm.debug=0xe?
I did nothing special to reproduce the oops. I just boot, and sometimes it shows a blank screen but still responds to magic SysRq. So I reboot and take the journal log.

It's intermittent: I ran into it 2 times in 13 reboots. The logs are here:
https://gist.github.com/mschiu77/9307d1ca0acd046cc6817f8cad63d79c
https://gist.github.com/mschiu77/fa81110f93428721f017cb9fbfd06fbe

One more log here. It entered X OK, but after a few minutes the display went black with only a mouse cursor left, and the cursor couldn't move. So I did a SysRq reboot again. The last errors are:

[ 636.312759] endless kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out
[ 646.552344] endless kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out

Full log here: https://gist.github.com/mschiu77/c8696e5fefb17bb1c53598214fb4e382

I could log in to X only 4 times; the rest were blank screens or hangs that did not respond to magic SysRq. I took a picture of the only panic, although I think it's not related to amdgpu. It's here: https://pasteboard.co/H5CUvxk.jpg

Hope these are helpful.
Chris
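Since the failure is intermittent, it can help to tally the DRM error messages across several saved journal logs rather than reading each by eye. The following is a minimal sketch, assuming log lines shaped like the ones quoted in this message (a bracketed timestamp followed by `*ERROR*` and a message); the function name and the regular expression are illustrative, not an established tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 636.312759] endless kernel: [drm:func [module]] *ERROR* message
# Group 1 is the timestamp, group 2 is the text after *ERROR*.
ERROR_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s.*?\*ERROR\*\s*(.*)$")


def summarize_drm_errors(log_text):
    """Count each distinct *ERROR* message found in a kernel log dump."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            counts[match.group(2).strip()] += 1
    return counts


# Sample input copied from the log excerpts quoted in this thread.
SAMPLE = """\
[ 636.312759] endless kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out
[ 646.552344] endless kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:41:crtc-0] flip_done timed out
[ 4.998241] endless kernel: [drm] amdgpu kernel modesetting enabled.
"""

if __name__ == "__main__":
    for message, count in summarize_drm_errors(SAMPLE).items():
        print(f"{count:3d}  {message}")
```

Running the same function over the log from each reboot gives a rough per-boot failure signature that can be attached to a bug report.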
On Thu, Feb 1, 2018 at 9:13 PM, Chris Chiu chiu@endlessm.com wrote:
> [...]
Gentle ping, cheers.
Chris
It seems that we are not alone in seeing amdgpu-induced stability problems on multiple Raven Ridge platforms. https://www.phoronix.com/scan.php?page=news_item&px=AMD-Raven-Ridge-Mobo...
AMD, what can we do to help?
Thanks! Daniel
On Tue, Feb 20, 2018 at 2:26 AM, Daniel Drake drake@endlessm.com wrote:
> AMD, what can we do to help?
Please file bugs: https://bugs.freedesktop.org
Thanks,
Alex
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi Alex,
On Tue, Feb 20, 2018 at 10:18 PM, Alex Deucher alexdeucher@gmail.com wrote:
> Please file bugs: https://bugs.freedesktop.org
Sorry for the delayed response. We're still seeing serious instability here even on the latest kernel. Filed https://bugs.freedesktop.org/show_bug.cgi?id=105684
Thanks, Daniel
On Thu, Mar 22, 2018 at 3:09 AM, Daniel Drake drake@endlessm.com wrote:
> Sorry for the delayed response. We're still seeing serious instability here even on the latest kernel. Filed https://bugs.freedesktop.org/show_bug.cgi?id=105684
No progress made on that bug report so far. What can we do to help this advance?
Thanks, Daniel
On Tue, Apr 3, 2018 at 11:31 AM, Daniel Drake drake@endlessm.com wrote:
> No progress made on that bug report so far. What can we do to help this advance?
Ping, any news here? How can we help advance on this bug?
Thanks Daniel
On Thu, Apr 19, 2018 at 6:08 PM, drake drake@endlessm.com wrote:
> Ping, any news here? How can we help advance on this bug?
Can you try one of these branches?
https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.18-wip
Do they work any better?
Hi Alex,
On Thu, Apr 19, 2018 at 4:13 PM, Alex Deucher alexdeucher@gmail.com wrote:
> Can you try one of these branches?
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.18-wip
> do they work any better?
It's been over 3 months since we reported this bug by email, over 6 weeks since we reported it on bugzilla, and still there has been no meaningful diagnostics help from AMD. This follows a similar pattern to what we have seen with other issues prior to this one.
What can we do so that this bug gets some attention from your team?
Secondarily, https://bugs.freedesktop.org/show_bug.cgi?id=106228 is another bug that needs attention. We have a growing number of consumer platforms affected by this. After boot, the amdgpu screen brightness value is incorrectly read back as 0, and systemd then stores that value on shutdown. On the next boot, it restores the very low brightness level. This reproduces out of the box on Fedora, Ubuntu, etc.
Thanks, Daniel
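To illustrate the brightness symptom described above, here is a minimal sketch that reads the standard sysfs backlight attributes (`brightness` and `max_brightness` under `/sys/class/backlight/<device>/`) and flags a device whose brightness reads back as 0. The function name and the "suspicious" heuristic are assumptions made for illustration; a zero reading is not necessarily a bug on every platform, and this is not how systemd itself performs the check.

```python
from pathlib import Path


def read_backlights(base="/sys/class/backlight"):
    """Return brightness readings for each backlight device under base.

    A device is marked suspicious (an illustrative heuristic only) when
    brightness reads 0 while max_brightness is positive.
    """
    results = []
    base_path = Path(base)
    if not base_path.is_dir():
        return results
    for dev in sorted(base_path.iterdir()):
        try:
            brightness = int((dev / "brightness").read_text().strip())
            max_brightness = int((dev / "max_brightness").read_text().strip())
        except (OSError, ValueError):
            # Skip entries that lack the attributes or hold non-numeric data.
            continue
        results.append({
            "device": dev.name,
            "brightness": brightness,
            "max_brightness": max_brightness,
            "suspicious": brightness == 0 and max_brightness > 0,
        })
    return results


if __name__ == "__main__":
    for entry in read_backlights():
        print(entry)
```

Running this shortly after boot and again just before shutdown would show whether the value that gets saved is the bogus 0 described in the bug report.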