Quoting Matthew Auld (2020-11-30 17:17:16)
On 27/11/2020 13:55, Chris Wilson wrote:
Quoting Matthew Auld (2020-11-27 12:06:40)
From: Michel Thierry <michel.thierry@intel.com>
Rationale goes here.
Is this wise? HWSP is very frequently read by the CPU, and expected to be cached on the CPU.
What do the performance profiles indicate?
Do you have a recommendation for an existing selftest or IGT to help measure this?
Also, are you suggesting moving this to system memory, or just using a different mapping type if it's placed in local memory? Or maybe try both? Although I'm pretty sceptical about !wc for local memory.
A lot of worries go out of the window if this can be in system memory and snooped.
For measuring, I suspect there is a lot of chaff that needs to be removed before individual microbenchmarks like perf/request can discern any difference, although that would be a starting point. We do a lot of completion checking during execlists interrupt processing, and there we are (going by the CPU profiles at least) sensitive to uncached reads.
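To make the hot path concrete, here is a rough sketch of what a completion check boils down to: one load from the status page per request examined, followed by a wrap-safe seqno comparison in the style of i915_seqno_passed(). The names seqno_passed(), request_completed() and count_completed() are illustrative stand-ins, not the driver's code, and the single-word hwsp pointer is an assumption made for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe test: has seq1 reached (or passed) seq2 on a 32-bit counter? */
static inline bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Hypothetical completion check: a single load of the breadcrumb from
 * the status page. If hwsp points at a WC/uncached mapping, every call
 * pays for an uncached read.
 */
static inline bool request_completed(const volatile uint32_t *hwsp,
				     uint32_t seqno)
{
	return seqno_passed(*hwsp, seqno);
}

/*
 * Illustrative stand-in for interrupt-time processing: walk the pending
 * requests and count those that have completed. Note that the status
 * page is re-read once per request.
 */
static unsigned int count_completed(const volatile uint32_t *hwsp,
				    const uint32_t *pending,
				    unsigned int count)
{
	unsigned int done = 0;
	unsigned int i;

	for (i = 0; i < count; i++)
		done += request_completed(hwsp, pending[i]);

	return done;
}
```

With a cached, snooped page the repeated loads in the loop are close to free; with a WC mapping each one goes out to memory, which is where the interrupt-time cost would come from.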
We can trivially construct a benchmark that only shows the impact of the WC reads; but the point where I think we would first notice from userspace is client wakeup latency scaling: benchmarks/gem_latency, which was once a point of major concern. Nowadays, we can couple that with a second concern about inducing system latency from interrupt processing time.
-Chris