Hi Alexander,
in the ring test we write the values 0xDEADBEEF and 0xCAFEDEAD into registers, not VRAM.
And the register BAR shouldn't be accessed write-combined, because that could lead to a couple of ordering problems. Why do you think the access is done write-combined?
For VRAM it is true that we have a couple of different caches between the CPU and the actual memory, which need to be flushed explicitly if you want to see a value written by the GPU.
Regards, Christian.
On 09.10.2014 at 13:39, Alexander Fyodorov wrote:
Hi David,
I'm using the 3.10.53-rt56 kernel and have encountered a problem in r600_dma_ring_test() when VRAM is mapped as write-combining: no matter how long the polling runs, the old value (0xCAFEDEAD) is read.
Looking with a hardware analyzer at what actually happens on the PCI-E bus, the memory is accessed with 32-byte loads (8 words at a time). That is, when the memory is mapped as write-combining, the processor converts every readl() into a 32-byte load transaction.
After doing some more experiments, it seems that the Radeon has some kind of cache that keeps the old value (0xCAFEDEAD), and this cache is invalidated when:
- Some other VRAM address is accessed, or
- Processor issues a 4-byte load transaction.
The problem is that as long as the memory is write-combining, all loads will be converted to 32-byte loads by the CPU, so the test fails with a timeout. But if I comment out this particular ring test, everything seems to be working fine (tested with Doom 3).
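To make the failure mode concrete, here is a minimal user-space sketch of the seed-then-poll pattern described above. It is not the actual r600_dma_ring_test() code: fake_device, fake_read() and ring_test_poll() are hypothetical names, and the engine's write is only simulated with a counter. A negative counter stands in for the case where the new value never becomes visible to the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEED_VALUE     0xCAFEDEADu  /* written by the CPU before the test */
#define EXPECTED_VALUE 0xDEADBEEFu  /* written by the engine under test   */

/* Hypothetical stand-in for the device. The engine's write becomes
 * visible after `stale_reads` reads have returned the old value;
 * a negative count means the new value is never observed. */
struct fake_device {
    volatile uint32_t slot;
    int stale_reads;
};

/* Simulated read of the test location (assumption: one call per poll). */
static uint32_t fake_read(struct fake_device *dev)
{
    if (dev->stale_reads > 0)
        dev->stale_reads--;             /* still returning the old value */
    else if (dev->stale_reads == 0)
        dev->slot = EXPECTED_VALUE;     /* the write finally lands */
    return dev->slot;
}

/* Seed the location, then poll until the expected value shows up.
 * Returns the index of the poll that saw it, or -1 on timeout. */
static int ring_test_poll(struct fake_device *dev, int max_polls)
{
    assert(dev != NULL);
    dev->slot = SEED_VALUE;
    for (int i = 0; i < max_polls; i++) {
        if (fake_read(dev) == EXPECTED_VALUE)
            return i;
    }
    return -1;  /* only the seed value was ever read */
}
```

With stale_reads set to 3 and a budget of 10 polls, ring_test_poll() succeeds on the fourth read; with a negative stale_reads it times out regardless of the budget, which corresponds to the behaviour reported here.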
Is it possible that the situation r600_dma_ring_test() checks for does not happen in real life, and I should be OK commenting it out?
Or maybe the test is broken and some cache-flushing command must be written into the ring buffer?
BTW this is an out-of-tree architecture, so bisecting is not possible.