Well, in my opinion exposing it through fdinfo turned out to be a really clean approach. It describes exactly the per-file-descriptor information we need, and making it driver independent is potentially useful as well.
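Roughly, a driver only has to hook .show_fdinfo in its file_operations and print per-fd key:value pairs. A minimal sketch of that idea (not the actual amdgpu code; the driver name, field names and per-file state below are made up for illustration):

#include <linux/fs.h>
#include <linux/module.h>
#include <linux/seq_file.h>
#include <drm/drm_file.h>

/* Hypothetical per-open-file driver state, one counter per engine class. */
struct mydrv_fpriv {
	u64 busy_ns[4];
};

static void mydrv_show_fdinfo(struct seq_file *m, struct file *f)
{
	struct drm_file *file_priv = f->private_data;
	struct mydrv_fpriv *fpriv = file_priv->driver_priv;
	int i;

	/* One key:value pair per line so generic tools can parse it. */
	seq_printf(m, "drm-driver:\tmydrv\n");
	for (i = 0; i < 4; i++)
		seq_printf(m, "mydrv-engine-%d-ns:\t%llu\n", i,
			   (unsigned long long)fpriv->busy_ns[i]);
}

static const struct file_operations mydrv_fops = {
	.owner		= THIS_MODULE,
	/* ... open/release/unlocked_ioctl/mmap as usual ... */
	.show_fdinfo	= mydrv_show_fdinfo,
};

Since the hook is part of the generic VFS file_operations, anything that can read /proc/<pid>/fdinfo/<fd> gets the data, regardless of driver.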
Regards, Christian.
On 14.05.21 at 09:22, Nieto, David M wrote:
We had entertained the idea of exposing the processes as sysfs nodes as you proposed, but we had concerns about exposing process info in there, especially since /proc already exists for that purpose.
I think if you were to follow that approach, we could have tools like top that can show GPU engine usage.
*From:* Alex Deucher alexdeucher@gmail.com
*Sent:* Thursday, May 13, 2021 10:58 PM
*To:* Tvrtko Ursulin tvrtko.ursulin@linux.intel.com; Nieto, David M David.Nieto@amd.com; Koenig, Christian Christian.Koenig@amd.com
*Cc:* Intel Graphics Development Intel-gfx@lists.freedesktop.org; Maling list - DRI developers dri-devel@lists.freedesktop.org; Daniel Vetter daniel@ffwll.ch
*Subject:* Re: [PATCH 0/7] Per client engine busyness
+ David, Christian
On Thu, May 13, 2021 at 12:41 PM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
Hi,
On 13/05/2021 16:48, Alex Deucher wrote:
On Thu, May 13, 2021 at 7:00 AM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Resurrection of the previously merged per-client engine busyness patches. In a nutshell, they make intel_gpu_top more top(1)-like, showing not only physical GPU engine usage but a per-process view as well.
Example screen capture:
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                    |      0%      0%
         Video/0    0.00% |                                    |      0%      0%
  VideoEnhance/0    0.00% |                                    |      0%      0%

  PID            NAME   Render/3D      Blitter        Video      VideoEnhance
 2733       neverball  |██████▌     ||             ||           ||             |
 2047            Xorg  |███▊        ||             ||           ||             |
 2737        glxgears  |█▍          ||             ||           ||             |
 2128           xfwm4  |            ||             ||           ||             |
 2047            Xorg  |            ||             ||           ||             |
Internally we track time spent on engines for each struct intel_context, both for current and past contexts belonging to each open DRM file.

This can serve as a building block for several features from the wanted list: smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality wanted by some customers, setrlimit(2)-like controls, cgroups controller, dynamic SSEU tuning, ...
To enable userspace access to the tracked data, we expose time spent on GPU per client and per engine class in sysfs with a hierarchy like the below:

# cd /sys/class/drm/card0/clients/
# tree
.
├── 7
│   ├── busy
│   │   ├── 0
│   │   ├── 1
│   │   ├── 2
│   │   └── 3
│   ├── name
│   └── pid
├── 8
│   ├── busy
│   │   ├── 0
│   │   ├── 1
│   │   ├── 2
│   │   └── 3
│   ├── name
│   └── pid
└── 9
    ├── busy
    │   ├── 0
    │   ├── 1
    │   ├── 2
    │   └── 3
    ├── name
    └── pid
Files in 'busy' directories are numbered using the engine class ABI values and they contain accumulated nanoseconds each client spent on engines of a respective class.
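A top(1)-like tool can then sample these counters at a fixed interval and turn the deltas into a busyness percentage. A rough userspace sketch against the layout above (the client id '7' and engine class '0' are just examples):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read one accumulated-nanoseconds counter, e.g. client 7, engine class 0. */
static uint64_t read_busy_ns(const char *path)
{
	uint64_t ns = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%" SCNu64, &ns) != 1)
			ns = 0;
		fclose(f);
	}

	return ns;
}

int main(void)
{
	const char *path = "/sys/class/drm/card0/clients/7/busy/0";
	const unsigned int period_us = 1000000;
	uint64_t t0, t1;

	t0 = read_busy_ns(path);
	usleep(period_us);
	t1 = read_busy_ns(path);

	/* Counter is in nanoseconds, so delta over the sampling period is busyness. */
	printf("client 7, class 0: %.2f%% busy\n",
	       (double)(t1 - t0) / (period_us * 1000.0) * 100.0);

	return 0;
}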
We did something similar in amdgpu using the GPU scheduler, and we expose the data via fdinfo. See:
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=1774baa64f9395fa884ea9ed494bcb043f3b83f5
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=874442541133f78c78b6880b8cc495bab5c61704
Interesting!
Is yours wall time or actual GPU time, taking preemption and such into account? Do you have some userspace tools parsing this data, and how do you do client discovery? Presumably there has to be a better way than going through all open file descriptors?
Wall time. It uses the fences in the scheduler to calculate engine time. We have some Python scripts to make it look pretty, but mainly we just read the files directly. If you know the process, you can look it up in procfs.
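For what it's worth, the procfs lookup really is just walking a process's fdinfo entries and picking out the GPU keys. A rough sketch of one way a tool could do that (the "drm-" key prefix is only an example here, not necessarily what amdgpu prints):

#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Scan /proc/<pid>/fdinfo/<fd> and print any lines whose key starts with
 * the (example) "drm-" prefix. */
static void dump_gpu_fdinfo(int pid)
{
	char dir[64], path[384], line[256];
	struct dirent *de;
	DIR *d;

	snprintf(dir, sizeof(dir), "/proc/%d/fdinfo", pid);
	d = opendir(dir);
	if (!d)
		return;

	while ((de = readdir(d))) {
		FILE *f;

		if (de->d_name[0] == '.')
			continue;

		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;

		while (fgets(line, sizeof(line), f)) {
			if (!strncmp(line, "drm-", 4))
				printf("pid %d fd %s: %s", pid, de->d_name, line);
		}

		fclose(f);
	}

	closedir(d);
}

int main(int argc, char **argv)
{
	if (argc > 1)
		dump_gpu_fdinfo(atoi(argv[1]));

	return 0;
}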
Our implementation was merged in January, but Daniel took it out recently because he wanted to have a discussion on dri-devel about a common vendor framework for this whole story. I think. +Daniel to comment.
I couldn't find the patch you pasted on the mailing list to see if there was any such discussion around your version.
It was on the amd-gfx mailing list.
Alex
Regards,
Tvrtko
Alex
Tvrtko Ursulin (7):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track runtime spent in closed and unreachable GEM contexts
  drm/i915: Track all user contexts per client
  drm/i915: Track context current active time
  drm/i915: Expose per-engine client busyness
 drivers/gpu/drm/i915/Makefile                 |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  61 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  27 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |  15 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  24 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  23 +-
 .../gpu/drm/i915/gt/intel_gt_clock_utils.c    |   4 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  27 +-
 drivers/gpu/drm/i915/gt/intel_lrc.h           |  24 ++
 drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 365 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        | 123 ++++++
 drivers/gpu/drm/i915/i915_drv.c               |   6 +
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  21 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  31 +-
 drivers/gpu/drm/i915/i915_gpu_error.h         |   2 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 19 files changed, 716 insertions(+), 81 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h
--
2.30.2