Hi,
If you submit a lot of graphics and DMA IBs interleaved, the graphics CS checker sometimes fails with this message:
[ 3846.435661] Forbidden register 0x0014 in cs at 9
[ 3846.435664] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
This error is only raised for type-0 packets, but we don't use these packets on R700 at all. Somehow, the graphics CS checker received either the DMA IB or random garbage. My guess is that memory corruption is happening during IB uploading and/or IB checking in the kernel. Also, if you are unlucky, the GPU hangs instead.
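To illustrate why stray data can trip this particular check, here is a minimal sketch of PM4 packet header decoding. The field positions (packet type in bits 31:30, count in bits 29:16, type-0 register index in bits 15:0 scaled by 4, type-3 opcode in bits 15:8) are an assumption based on the macros in the radeon kernel headers and should be checked against the source. `decode_pm4_header` is a hypothetical helper, not kernel code. Under that assumed layout, any dword whose top two bits are zero parses as a type-0 packet, so a small integer such as 0x00000005 would be reported as a write to register 0x0014.

```python
def decode_pm4_header(header: int) -> dict:
    """Decode a 32-bit PM4 packet header (bit layout is an assumption, see above)."""
    pkt_type = (header >> 30) & 0x3        # bits 31:30: packet type
    count = (header >> 16) & 0x3FFF        # bits 29:16: count field
    info = {"type": pkt_type, "count": count}
    if pkt_type == 0:
        # Type-0: low 16 bits are a dword register index; shift to a byte offset.
        info["reg"] = (header & 0xFFFF) << 2
    elif pkt_type == 3:
        # Type-3: bits 15:8 carry the opcode.
        info["opcode"] = (header >> 8) & 0xFF
    return info


if __name__ == "__main__":
    # A stray small integer has its top two bits clear, so it decodes as type-0.
    print(decode_pm4_header(0x00000005))
    # A header with the top two bits set decodes as type-3.
    print(decode_pm4_header(0xC0001000))
```

If the layout assumption holds, this is consistent with the guess that the checker was handed a DMA IB or garbage: data that was never meant to be a graphics packet can still look like a type-0 register write.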
The CS thread offloading was disabled in Mesa, so the user space was single-threaded.
There are 2 ways to fix this:
- disable async DMA in Mesa
- call usleep(1) after the RADEON_CS ioctl returns
This is just a heads-up. In the worst case, we can disable async DMA for R700 in Mesa.
Marek
Hi Marek,
I've noticed this before as well, and I agree that it looks like memory corruption. I'm not sure whether it is the async DMA on the GPU or the CPU that is overwriting something, possibly because of a race condition.
Anyway, can you come up with a simple test case to reproduce the issue? For me it occurred only randomly while working on UVD support for R7xx. If you have something more reliable I could dig into it with my RV710.
Christian.
On 19.04.2014 01:48, Marek Olšák wrote:
> If you submit a lot of graphics and DMA IBs interleaved, the graphics CS checker sometimes fails with this message:
> [...]
On Sat, Apr 19, 2014 at 5:54 AM, Christian König deathsimple@vodafone.de wrote:
> [...] If you have something more reliable I could dig into it with my RV710.
Double check the code in the kernel that copies the IB from the user copy to the kernel copy. We had a bug when we first merged DMA support where we weren't properly copying the IBs across. There may still be a case where this is an issue: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=de...
Alex
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
This test always reproduces the issue for me:
piglit/bin/arb_vertex_buffer_object-vbo-subdata-many drawarrays -fbo -auto
There are rejected IBs and it hangs sometimes.
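When running the test repeatedly, it can help to count how many IBs the checker rejected. The following is a minimal sketch that scans kernel log text for the two messages quoted at the start of this thread. `summarize_cs_errors` is a hypothetical helper, and the patterns are assumptions taken only from those two log lines; other kernel versions may word the messages differently.

```python
import re

# Patterns assumed from the two log lines quoted earlier in the thread.
FORBIDDEN_RE = re.compile(r"Forbidden register (0x[0-9A-Fa-f]+) in cs at (\d+)")
INVALID_CS = "Invalid command stream"


def summarize_cs_errors(log_text: str) -> dict:
    """Return the forbidden-register hits and the number of rejected IBs in log_text."""
    forbidden = [
        (int(reg, 16), int(idx)) for reg, idx in FORBIDDEN_RE.findall(log_text)
    ]
    rejected = log_text.count(INVALID_CS)
    return {"forbidden": forbidden, "rejected": rejected}


sample = (
    "[ 3846.435661] Forbidden register 0x0014 in cs at 9\n"
    "[ 3846.435664] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !\n"
)

if __name__ == "__main__":
    print(summarize_cs_errors(sample))
```

One way to use it is to capture the output of `dmesg` after each run of the piglit test and pass the text to the function. A hang will not necessarily show up in these counts.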
It started to fail with this commit: http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d434252e239bc872549e59c64e...
which is probably unrelated to the issue, but it makes the graphics IB a little bit bigger.
Also, I think R700 is generally in bad shape. I haven't been able to run piglit with concurrency and without hangs, and I have already disabled async DMA, geometry shaders, and pipelined buffer uploads.
Marek
On Sat, Apr 19, 2014 at 11:07 AM, Marek Olšák maraeo@gmail.com wrote:
> [...] I haven't been able to run piglit with concurrency and without hangs, and I have already disabled async DMA, geometry shaders, and pipelined buffer uploads.
See if disabling dpm helps.
Alex
Disabling DPM makes no difference.
Marek
On Sat, Apr 19, 2014 at 5:19 PM, Alex Deucher alexdeucher@gmail.com wrote:
> See if disabling dpm helps.