Hi,
Adding more information.
We occasionally get a "GPU lockup" after resuming from suspend (on a mipsel platform with a MIPS64-compatible CPU and an RS780E; the kernel is 3.1.0-rc8, 64-bit). Related kernel messages:
/* return from STR */
[ 156.152343] radeon 0000:01:05.0: WB enabled
[ 156.187500] [drm] ring test succeeded in 0 usecs
[ 156.187500] [drm] ib test succeeded in 0 usecs
[ 156.398437] ata2: SATA link down (SStatus 0 SControl 300)
[ 156.398437] ata3: SATA link down (SStatus 0 SControl 300)
[ 156.398437] ata4: SATA link down (SStatus 0 SControl 300)
[ 156.578125] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 156.597656] ata1.00: configured for UDMA/133
[ 156.613281] usb 1-5: reset high speed USB device number 4 using ehci_hcd
[ 157.027343] usb 3-2: reset low speed USB device number 2 using ohci_hcd
[ 157.609375] usb 3-3: reset low speed USB device number 3 using ohci_hcd
[ 157.683593] r8169 0000:02:00.0: eth0: link up
[ 165.621093] PM: resume of devices complete after 9679.556 msecs
[ 165.628906] Restarting tasks ... done.
[ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than 10019msec
[ 177.089843] ------------[ cut here ]------------
[ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x25c/0x33c()
[ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id 0x000013AD)
[ 177.113281] Modules linked in: psmouse serio_raw
[ 177.117187] Call Trace:
[ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
[ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
[ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
[ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
[ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
[ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl+0x80/0x114
[ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
[ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
[ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
[ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
[ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
[ 177.187500] radeon 0000:01:05.0: GPU softreset
[ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
[ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
[ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
[ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.367187] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
[ 177.390625] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
[ 177.414062] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
[ 177.417968] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 177.425781] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x2002B040
[ 177.433593] radeon 0000:01:05.0: GPU reset succeed
[ 177.605468] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.761718] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 177.804687] radeon 0000:01:05.0: WB enabled
[ 178.000000] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
[ 178.007812] [drm:r600_resume] *ERROR* r600 startup failed on resume
[ 178.988281] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).
[ 178.996093] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
[ 179.003906] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
...
What may cause a "GPU lockup"? Why didn't the reset work? Any ideas?
BTW, one question: I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes need_dma32 to be set. Is that correct? (drivers/char/agp is not available on mips; could that be the reason?)
On 28 September 2011 at 3:23 PM, chenhc@lemote.com wrote:
Hi Alex,
When we do STR (S3) with an RS780E Radeon card on a MIPS platform, a "GPU reset" may happen after resume (the probability is about 5%). After that, X is unusable.
We know there is a "ring test" at system resume time and at GPU reset time. Whether or not a GPU reset happens, the "ring test" at system resume time always succeeds, but the "ring test" at GPU reset time usually fails.
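For readers who have not looked at the ring test: below is a rough user-space model of what it checks, not the driver code. As far as I understand r600_ring_test(), the CPU seeds a scratch register with 0xCAFEDEAD, queues a packet asking the CP to overwrite it with 0xDEADBEEF, and then polls the register. The struct fake_gpu, the cp_alive flag and the function names are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCRATCH_POISON 0xCAFEDEADu /* seed value written by the CPU */
#define SCRATCH_DONE   0xDEADBEEFu /* value the CP is asked to write */

/* Hypothetical stand-in for the GPU: one scratch register plus a flag
 * saying whether the command processor (CP) still executes the ring. */
struct fake_gpu {
	uint32_t scratch;
	bool cp_alive;
};

/* Model of the CP consuming one "write register" packet from the ring. */
static void fake_cp_run(struct fake_gpu *gpu, uint32_t value)
{
	if (gpu->cp_alive)
		gpu->scratch = value;
}

/* Returns 0 if the CP updated the scratch register, -1 on timeout. */
static int fake_ring_test(struct fake_gpu *gpu, unsigned int timeout_polls)
{
	unsigned int i;

	gpu->scratch = SCRATCH_POISON;   /* direct register write by the CPU */
	fake_cp_run(gpu, SCRATCH_DONE);  /* request that goes through the ring */

	for (i = 0; i < timeout_polls; i++) {
		if (gpu->scratch == SCRATCH_DONE)
			return 0;
	}
	return -1; /* register still holds the seed value */
}
```

In this model a failing test leaves the seed value in the register, so a message such as "scratch(0x8504)=0xCAFEDEAD" would mean the CP never executed the packet (or its write never landed), not that the register itself is unreadable.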
We use the latest kernel (3.1.0-rc8 from git), and X.org is 7.6.
Any ideas?
Best regards, Huacai Chen
Regards, - Chen Jie
On Thu, 2011-09-29 at 17:17 +0800, Chen Jie wrote:
We occasionally get a "GPU lockup" after resuming from suspend (on a mipsel platform with a MIPS64-compatible CPU and an RS780E; the kernel is 3.1.0-rc8, 64-bit). Related kernel messages:
[...]
[ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than 10019msec
[ 177.089843] ------------[ cut here ]------------
[ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x25c/0x33c()
[ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id 0x000013AD)
[ 177.113281] Modules linked in: psmouse serio_raw
[ 177.117187] Call Trace:
[ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
[ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
[ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
[ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
[ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
[ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl+0x80/0x114
[ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
[ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
[ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
[ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
[ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
[ 177.187500] radeon 0000:01:05.0: GPU softreset
[ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
[ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
[ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
[ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
[...]
What may cause a "GPU lockup"?
Lots of things... The most common cause is an incorrect command stream sent to the GPU by userspace or the kernel.
Why didn't the reset work?
Might be related to 'Wait for MC idle timedout !', but I don't know offhand what could be up with that.
BTW, one question: I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes need_dma32 to be set. Is that correct? (drivers/char/agp is not available on mips; could that be the reason?)
Not sure, Alex?
2011/10/5 Michel Dänzer michel@daenzer.net:
[...]
BTW, one question: I got 'RADEON_IS_PCI | RADEON_IS_IGP' in rdev->flags, which causes need_dma32 to be set. Is that correct? (drivers/char/agp is not available on mips; could that be the reason?)
Not sure, Alex?
You don't need AGP for newer IGP cards (rs4xx+). need_dma32 gets set by default if the card is not AGP or PCIE. That should be changed, as only the legacy r1xx PCI GART block has that limitation. I'll send a patch out shortly.
Alex
From: Alex Deucher alexander.deucher@amd.com
If a card wasn't PCIE, we always set the DMA mask to 32 bits. This only applies to the old rage128/r1xx gart block on early radeon asics (~r1xx-r4xx). Newer PCI and IGP cards can handle 40 bits just fine.
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Cc: Chen Jie chenj@lemote.com
---
 drivers/gpu/drm/radeon/radeon_device.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index b51e157..2c3429d 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -750,14 +750,15 @@ int radeon_device_init(struct radeon_device *rdev,
 
 	/* set DMA mask + need_dma32 flags.
 	 * PCIE - can handle 40-bits.
-	 * IGP - can handle 40-bits (in theory)
+	 * IGP - can handle 40-bits
 	 * AGP - generally dma32 is safest
-	 * PCI - only dma32
+	 * PCI - dma32 for legacy pci gart, 40 bits on newer asics
 	 */
 	rdev->need_dma32 = false;
 	if (rdev->flags & RADEON_IS_AGP)
 		rdev->need_dma32 = true;
-	if (rdev->flags & RADEON_IS_PCI)
+	if ((rdev->flags & RADEON_IS_PCI) &&
+	    (rdev->family < CHIP_RS400))
 		rdev->need_dma32 = true;
 
 	dma_bits = rdev->need_dma32 ? 32 : 40;
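
To make the effect of the change concrete, here is a small stand-alone model of the decision before and after the patch. It is a sketch, not the driver code: the SK_ flag bits are made-up values, and the shortened sk_family enum is hypothetical and only preserves the ordering the patch relies on (legacy parts sort before CHIP_RS400, RS780 sorts after it).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the real values live in the radeon headers. */
#define SK_IS_IGP (1u << 0)
#define SK_IS_AGP (1u << 1)
#define SK_IS_PCI (1u << 2)

/* Reduced, hypothetical family list: only the relative order matters. */
enum sk_family {
	SK_CHIP_R100,
	SK_CHIP_R420,
	SK_CHIP_RS400,
	SK_CHIP_RS780
};

/* Old rule: any card flagged as PCI is limited to 32-bit DMA. */
static bool sk_need_dma32_old(uint32_t flags)
{
	bool need_dma32 = false;

	if (flags & SK_IS_AGP)
		need_dma32 = true;
	if (flags & SK_IS_PCI)
		need_dma32 = true;
	return need_dma32;
}

/* New rule: only PCI parts older than RS400 are limited to 32 bits. */
static bool sk_need_dma32_new(uint32_t flags, enum sk_family family)
{
	bool need_dma32 = false;

	if (flags & SK_IS_AGP)
		need_dma32 = true;
	if ((flags & SK_IS_PCI) && (family < SK_CHIP_RS400))
		need_dma32 = true;
	return need_dma32;
}

static int sk_dma_bits(bool need_dma32)
{
	return need_dma32 ? 32 : 40;
}
```

With flags SK_IS_PCI | SK_IS_IGP and family SK_CHIP_RS780 (the combination reported in this thread), the old rule yields a 32-bit mask and the new rule a 40-bit mask.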
Hi Alex,
Sorry for the late reply. I tried the patch on our mipsel platform, but got the following:
[ 1.335937] [drm] Loading RS780 Microcode
[ 1.910156] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
[ 1.917968] radeon 0000:01:05.0: disabling GPU acceleration
The platform is equipped with 1G of memory, and the physical address layout is:
[0 - 256M] physical memory
[256M - 4352M] hole
[4352M - ] physical memory
After applying the patch, the ring buffer BO is allocated at a physical address (which is equal to the bus address) near 5G.
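A quick way to see why this layout matters for the DMA mask is sketched below. It is only an arithmetic model: the 5120M upper bound of the high RAM region is an assumption derived from "1G of memory" (256M low plus 768M above the hole), and all sk_ names are hypothetical. The mask helper follows the same idea as the kernel's DMA_BIT_MASK().

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_MIB (1024ull * 1024ull)

/* Largest address reachable with an n-bit DMA mask. */
static uint64_t sk_dma_bit_mask(unsigned int bits)
{
	return (bits >= 64) ? ~0ull : ((1ull << bits) - 1);
}

/* True if the whole buffer [bus_addr, bus_addr + size) is below the mask. */
static bool sk_addr_fits(uint64_t bus_addr, uint64_t size, unsigned int bits)
{
	return bus_addr + size - 1 <= sk_dma_bit_mask(bits);
}

/* RAM regions from the layout above; the 5120M end is an assumption. */
static bool sk_is_ram(uint64_t addr)
{
	if (addr < 256 * SK_MIB)
		return true;
	return addr >= 4352 * SK_MIB && addr < 5120 * SK_MIB;
}
```

Under these assumptions every page above the hole lies beyond the 4G limit of a 32-bit mask but well within a 40-bit one, so with the patch applied the device is handed bus addresses it was never given before.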
I suspect the RS780 fails to access such a high bus address. (I can't verify this on x86 + RS780E, since I don't have more than 4G of memory at hand; could somebody please verify it?)
BTW, I found that radeon_gart_bind() calls pci_map_page(), which hooks to swiotlb_map_page() on our platform. If the page does not meet the dma_mask, that function seems to allocate a new page from its pool and return the dma_addr_t of that page. This looks like a bug, since the BO is backed by one set of pages while a different set of pages is mapped into the GART?
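The concern can be illustrated with a toy model of a bounce-buffer mapping. This is a sketch under the assumption that the mapping call behaves as described above; sk_bounce_pool and sk_map_page() are hypothetical names and do not correspond to the swiotlb implementation.

```c
#include <stdint.h>

/* Hypothetical pool of low-memory bounce pages. */
struct sk_bounce_pool {
	uint64_t next_free; /* bus address of the next unused bounce page */
	uint64_t page_size;
};

/* Returns the bus address the device will be told to use for the page. */
static uint64_t sk_map_page(struct sk_bounce_pool *pool, uint64_t page_addr,
			    uint64_t dma_mask)
{
	uint64_t bounce;

	if (page_addr + pool->page_size - 1 <= dma_mask)
		return page_addr; /* device can reach the page directly */

	/* Page is out of reach: hand out a bounce page instead. */
	bounce = pool->next_free;
	pool->next_free += pool->page_size;
	return bounce;
}
```

If the GART entry is programmed with the returned address, a page above the mask ends up with the CPU writing to the original page and the GPU reading the bounce page, unless something copies between the two.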
Regards, -- cee1
2011/10/5 alexdeucher@gmail.com
[...]
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi,
Some status updates.
On 29 September 2011 at 5:17 PM, Chen Jie chenj@lemote.com wrote:
[...]
[ 177.804687] radeon 0000:01:05.0: WB enabled
[ 178.000000] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
After pinning the ring in VRAM, we got an IB test failure warning. It seems something is wrong with accesses through the GTT.
We dumped the GART table just after stopping the CP and compared it with the one dumped just after r600_pcie_gart_enable, and found no difference.
Any ideas?
Regards, -- Chen Jie
On Tue, Nov 08, 2011 at 03:33:03PM +0800, Chen Jie wrote:
[...]
Do you have any kind of IOMMU? Is the GART table programmed with the proper physical address for each page? Is the GPU a PCI bus master (IIRC a PCI device needs to be a bus master to be able to initiate requests to memory)? Beyond that, there could be a lot of other PCI things getting in the way.
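For the bus master question, the relevant state is bit 2 of the 16-bit PCI command register at config-space offset 0x04. A minimal sketch of the check follows, assuming the register value has already been read by whatever means the platform offers; the SK_ names are local to this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PCI_COMMAND_MEMORY 0x0002u /* respond to memory space accesses */
#define SK_PCI_COMMAND_MASTER 0x0004u /* may initiate bus transactions */

/* True if the device is allowed to start DMA on its own. */
static bool sk_pci_is_bus_master(uint16_t command)
{
	return (command & SK_PCI_COMMAND_MASTER) != 0;
}

/* True if both register (MMIO) access and DMA are enabled. */
static bool sk_pci_mmio_and_dma_enabled(uint16_t command)
{
	uint16_t need = SK_PCI_COMMAND_MEMORY | SK_PCI_COMMAND_MASTER;

	return (command & need) == need;
}
```

A device with memory decoding enabled but bus mastering cleared would still answer register reads and writes, which would fit a scratch register that can be seeded by the CPU but is never updated by the CP fetching from system memory. That is only one possibility among those listed above.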
Cheers, Jerome