New subject: [PATCH v3] drm/virtio: trace total gem bo for virtio

21 Jan 2021


      On Wed, Jan 20, 2021 at 10:52:11AM -0800, Yiwei Zhang wrote:
...
On Wed, Jan 20, 2021 at 5:33 AM Gerd Hoffmann kraxel@redhat.com wrote:
...
Hi,
...
...
...
...
> +       select TRACE_GPU_MEM
...
...
...
...
> +#ifdef CONFIG_TRACE_GPU_MEM
That doesn't make sense btw.
Do you recommend we just select it or leave it an option?
The patch selects it (which makes sense given the small size).
The #ifdef is pointless then ...
...
...
...
...
...
...
> +#ifdef CONFIG_TRACE_GPU_MEM
> +static inline void virtio_gpu_trace_total_mem(struct virtio_gpu_device *vgdev,
> +                                             s64 delta)
> +{
> +       u64 total_mem = atomic64_add_return(delta, &vgdev->total_mem);
> +
> +       trace_gpu_mem_total(0, 0, total_mem);
Hmm, so no per process tracking (pid arg hard-coded to zero)?
Any plans for that?
The cgroups patches mentioned by Daniel should address that btw.
Android GPU vendors do report the totals for each process as well. For
Cuttlefish virtual platform, we haven't yet required that, and want to
get the global total in place first.
That means no plans yet?
...
...
...
...
...
Android relies on this tracepoint + eBPF to make the GPU memory totals
available at runtime on production devices, which has been enforced
already. Not only game developers can have a reliable kernel total GPU
memory to look at, but also Android leverages this to take GPU memory
usage out from the system lost ram.
Sounds like you define "gpu memory" as "system memory used to store gpu
data".  Is that correct?  What about device memory?
The total definition does include all device memory being used as well
for numa devices.(If my understanding of your question is correct.)
device memory == gpu-owned memory, typically exposed to as pci memory bar.
qemu stdvga for example stores gem objects in device memory (unless it
runs out of vram, then ttm allocates from / moves into main memory).
...
...
...
...
...
I'm not sure whether the other DRM drivers would like to integrate
this tracepoint(maybe upstream drivers will move away from debugfs
later as well?), but at least we hope virtio-gpu can take this.
Well, it is basically the same for all drivers using the gem shmem
helpers.  So I see little reason why we should do that at virtio-gpu
level.
This can be a starting point. Another reason would be I'm fearing that
this tracepoint approach might be more difficult to get upstreamed at
drm layer level, since later we may want to get to per-process total
tracking, which would be making more sense at device driver level.
Tracking in __drm_gem_shmem_create + drm_gem_shmem_free_object should
give you pretty much the same results, with the major difference being
that it works for all shmem-based drivers.
Of course just moving the trace points doesn't solve the other issues
discussed.
...
...
...
Android GPU vendors have integrated this tracepoint to track gpu
memory usage total(mapped into the gpu address space), which consists
of below:
(1) directly allocated via physical page allocator
(2) imported external memory backed by dma-bufs
(3) allocated exportable memory backed by dma-bufs
Hmm, the tracepoint doesn't track which of the three groups the memory
belongs to.  Which I think is important, specifically group (2) because
that might already be accounted for by the exporting driver ...
The tracepoint only cares about a total number, but I'm not against
the idea to extend the tracepoint with categorization. However, I
believe the dma-bufs core can track which dma-buf gets attached/mapped
to some devices. So that those overlap between dma-buf heaps and the
gpu memory total we are tracking here can be canceled out.
Yep, maybe.  Which is *exactly* why Daniel keeps asking for the big
picture and how this integrates/interacts with the dma-buf accounting
which seems to be in the works too.
Note that dma-bufs are not only used for cross-device sharing.  They are
also used to pass handles from one application to another (application
to wayland compositor or x server for example).  Which doesn't matter
much for the totals, but for per-process accounting you need a plan how
to account these shared buffers.
...
...
I suspect once you figured that you'll notice that this little hack is
rather incomplete.
Despite the dma-buf side effort, we still wish to have this tracepoint
integrated in virtio-gpu just for a global total at this moment.
I don't feel like merging patches with obvious shortcomings which have
a high chance to end up as technical dept.
The question how this interacts with dma-buf accounting must be
clarified.
I'd also suggest to join forces with the cgroups people.  The problem
space has alot of overlap.  Even if we end up with multiple ways to
export the accounting data the spots you have to hook into to actually
do the accounting should be largely identical.
take care,
  Gerd

Re: [PATCH v2] drm/virtio: Track total GPU memory for virtio driver

Signed-off-by: Yiwei Zhang zzyiwei@android.com

Signed-off-by: Yiwei Zhang zzyiwei@android.com