On Tue, 2022-03-22 at 11:26 +0100, Thomas Hellström wrote:
On Tue, 2022-03-22 at 10:13 +0000, Tvrtko Ursulin wrote:
On 21/03/2022 15:15, Thomas Hellström wrote:
On Mon, 2022-03-21 at 14:43 +0000, Tvrtko Ursulin wrote:
On 21/03/2022 13:40, Thomas Hellström wrote:
Hi,
On Mon, 2022-03-21 at 13:12 +0000, Tvrtko Ursulin wrote:
On 21/03/2022 12:33, Thomas Hellström wrote: > On Mon, 2022-03-21 at 12:22 +0000, Tvrtko Ursulin wrote: > > > > On 21/03/2022 11:03, Thomas Hellström wrote: > > > Hi, Tvrtko. > > > > > > On 3/21/22 11:27, Tvrtko Ursulin wrote: > > > > > > > > On 19/03/2022 19:42, Michael Cheng wrote: > > > > > To align with the discussion in [1][2], this > > > > > patch > > > > > series > > > > > drops > > > > > all > > > > > usage of > > > > > wbvind_on_all_cpus within i915 by either > > > > > replacing > > > > > the > > > > > call > > > > > with certain > > > > > drm clflush helpers, or reverting to a previous > > > > > logic. > > > > > > > > AFAIU, complaint from [1] was that it is wrong to > > > > provide > > > > non > > > > x86 > > > > implementations under the wbinvd_on_all_cpus name. > > > > Instead an > > > > arch > > > > agnostic helper which achieves the same effect > > > > could > > > > be > > > > created. > > > > Does > > > > Arm have such concept? > > > > > > I also understand Linus' email like we shouldn't leak > > > incoherent > > > IO > > > to > > > other architectures, meaning any remaining wbinvd()s > > > should > > > be > > > X86 > > > only. > > > > The last part is completely obvious since it is a x86 > > instruction > > name. > > Yeah, I meant the function implementing wbinvd() > semantics. > > > > > But I think we can't pick a solution until we know how > > the > > concept > > maps > > to Arm and that will also include seeing how the > > drm_clflush_sg for > > Arm > > would look. Is there a range based solution, or just a > > big > > hammer > > there. > > If the latter, then it is no good to churn all these > > reverts > > but > > instead > > an arch agnostic wrapper, with a generic name, would be > > the > > way to > > go. > > But my impression was that ARM would not need the range- > based > interface > either, because ARM is only for discrete and with > discrete > we're > always > coherent.
Not sure what you mean here - what about flushing system memory objects on discrete? Those still need flushing on paths like suspend which this series touches. Am I missing something?
System bos on discrete should always have
I915_BO_CACHE_COHERENT_FOR_READ | I915_BO_CACHE_COHERENT_FOR_WRITE
either by the gpu being fully cache coherent (or us mapping system write-combined). Hence no need for cache clflushes or wbinvd() for incoherent IO.
Hmm so you are talking about the shmem ttm backend. It ends up depending on the result of i915_ttm_cache_level, yes? It cannot end up with I915_CACHE_NONE from that function?
If the object is allocated with allowable placement in either LMEM or SYSTEM, and it ends in system, it gets allocated with I915_CACHE_NONE, but then the shmem ttm backend isn't used but TTM's wc pools, and the object should *always* be mapped wc. Even in system.
I am not familiar with neither TTM backend or wc pools so maybe a missed question - if obj->cache_level can be set to none, and obj->cache_coherency to zero, then during object lifetime helpers which consult those fields (like i915_gem_cpu_write_needs_clflush, __start_cpu_write, etc) are giving out incorrect answers? That is, it is irrelevant that they would say flushes are required, since in actuality those objects can never ever and from anywhere be mapped other than WC so flushes aren't actually required?
If we map other than WC somewhere in these situations, that should be a bug needing a fix. It might be that some of these helpers that you mention might still flag that a clflush is needed, and in that case that's an oversight that also needs fixing.
Actually, it seems like most of these has a IS_DGFX() in them, in particular i915_gem_clflush_object(), but it looks like some sort of cleanup might be needed here. In particular we might want to introduce an IS_COHERENT() in case we change the api at some point also for integrated.
/Thomas