On Wed, May 23, 2018 at 11:44 PM, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 23, 2018 at 3:52 PM, Qiang Yu yuq825@gmail.com wrote:
On Wed, May 23, 2018 at 5:29 PM, Christian König ckoenig.leichtzumerken@gmail.com wrote:
On 18.05.2018 at 11:27, Qiang Yu wrote:
Kernel DRM driver for ARM Mali 400/450 GPUs.
This implementation mainly takes the amdgpu DRM driver as a reference.
- Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for OpenGL vertex shader processing and PP is for fragment shader processing. Each processor has its own MMU, so the processors work in a virtual address space.
- There's only one GP but multiple PPs (max 4 for Mali 400 and 8 for Mali 450) in the same Mali 4xx GPU. All PPs are grouped together to handle a single fragment shader task, divided by the tiled pixels of the FB output. The Mali 400 user space driver is responsible for assigning target tiled pixels to each PP, but Mali 450 has a HW module called DLBU to dynamically balance each PP's load.
- The user space driver allocates buffer objects and maps them into the GPU virtual address space, uploads the command stream and draw data through a CPU mmap of the buffer object, then submits a task to GP/PP with a register frame indicating where the command stream is, plus misc settings.
- There's no command stream validation/relocation because each user process has its own GPU virtual address space. The GP/PP MMU switches virtual address space between two tasks from different user processes. Erroneous or malicious user space code just gets an MMU fault or a GP/PP error IRQ, after which the HW/SW is recovered.
- Use TTM as MM. TTM_PL_TT type memory is used as the content of a lima buffer object, which is allocated from the TTM page pool. All lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at allocation, so there's no buffer eviction and swap for now. We need reverse engineering to see if and how GP/PP support MMU fault recovery (continue execution). Otherwise we have to pin/unpin each involved buffer at task creation/deletion.
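As an illustration of the per-process address space point above, here is a minimal sketch in plain C of switching the address space only when consecutive tasks come from different processes. It is a toy model, not the lima code: struct toy_vm, struct toy_mmu, struct toy_task and toy_mmu_prepare() are all hypothetical names, and the actual register programming is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one GPU virtual address space per user process. */
struct toy_vm {
    int id;
};

/* Hypothetical model of a GP/PP MMU that remembers the loaded address space. */
struct toy_mmu {
    const struct toy_vm *current; /* NULL until the first task runs */
    unsigned int switches;        /* number of address-space switches so far */
};

/* A task carries the address space of the process that submitted it. */
struct toy_task {
    const struct toy_vm *vm;
};

/* Load the task's address space if it differs from the one already loaded. */
static void toy_mmu_prepare(struct toy_mmu *mmu, const struct toy_task *task)
{
    assert(task->vm != NULL);
    if (mmu->current != task->vm) {
        /* Assumption: a real driver would reprogram the MMU at this point. */
        mmu->current = task->vm;
        mmu->switches++;
    }
}
```

Calling toy_mmu_prepare() for two tasks of the same process followed by one task of another process switches twice: once for the initial load and once at the process boundary.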
Well, pinning all memory is usually a no-go for upstreaming. But since you are already using drm_sched for GPU task scheduling, why do you actually need this?
The scheduler should take care of signaling all fences when the hardware is done with its magic, and that is enough for TTM to note that a buffer object is movable again (e.g. to unpin it).
Please correct me if I'm wrong.
One way to implement eviction/swap is like this: call validation on each buffer involved in a task. But this won't prevent a buffer from being evicted/swapped while the task executes, so a GPU MMU fault may happen, and in the handler we need to restore the evicted/swapped buffer.
Another way is to pin/unpin the buffers involved at task create/free.
The first way is better when memory load is low and the second way is better when memory load is high. The first way also needs less memory.
So I'd prefer the first way, but because the GPU MMU fault HW op needs reverse engineering, I have to pin all buffers for now. Once the HW op is clear, I can choose which way to implement.
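To make the second option concrete, here is a minimal sketch in plain C of pinning buffers at task creation and unpinning them at task free, using a per-buffer pin count. It is a toy model under my own assumptions, not TTM or lima code: struct toy_bo, toy_task_create(), toy_task_free() and toy_bo_evictable() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer object: evictable only while no task has it pinned. */
struct toy_bo {
    unsigned int pin_count;
};

static int toy_bo_evictable(const struct toy_bo *bo)
{
    return bo->pin_count == 0;
}

/* Task creation: pin every buffer the task references. */
static void toy_task_create(struct toy_bo **bos, size_t n)
{
    for (size_t i = 0; i < n; i++)
        bos[i]->pin_count++;
}

/* Task free: drop the pins taken at creation. */
static void toy_task_free(struct toy_bo **bos, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(bos[i]->pin_count > 0); /* unbalanced unpin would be a bug */
        bos[i]->pin_count--;
    }
}
```

With a count rather than a flag, a buffer shared by two tasks stays pinned until the last of them is freed, while buffers not referenced by any pending task remain evictable.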
All the drivers using ttm have something that looks like vram, or a requirement to move buffers around. Afaiui that includes virtio drm driver.
Does the virtio drm driver need to move buffers around? amdgpu also has no vram on an APU.
From your description you don't have such a requirement, and then doing what etnaviv has done would be a lot simpler. Everything that's not related to buffer movement handling is also available outside of ttm already.
Yeah, I could do it like etnaviv, but that's not simpler than using ttm directly, especially if I want some optimizations (like the ttm page pool, ttm_eu_reserve_buffers, ttm_bo_mmap). If I have/want to implement them anyway, why not just use TTM directly with all those helper functions?
Regards, Qiang
-Daniel
Regards, Qiang
Christian.
- Use drm_sched for GPU task scheduling. Each OpenGL context should have a lima context object in the kernel to distinguish tasks from different users. drm_sched gets tasks from each lima context in a fair way.
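As a rough picture of what "in a fair way" can mean, here is a minimal sketch in plain C of round-robin selection across per-context task queues. It is a toy model and not the drm_sched implementation: struct toy_ctx, toy_ctx_push() and toy_sched_pick() are hypothetical names, and tasks are represented by plain integer ids.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TASKS 8

/* Hypothetical per-context FIFO of pending task ids. */
struct toy_ctx {
    int tasks[TOY_MAX_TASKS];
    size_t head; /* index of the next task to run */
    size_t tail; /* index of the next free slot */
};

/* Queue a task on a context; returns -1 if the queue is full. */
static int toy_ctx_push(struct toy_ctx *ctx, int task_id)
{
    if (ctx->tail == TOY_MAX_TASKS)
        return -1;
    ctx->tasks[ctx->tail++] = task_id;
    return 0;
}

/*
 * Pick the next task round-robin: start at *cursor, take one task from the
 * first non-empty context, then move the cursor past that context.
 * Returns -1 when every queue is empty.
 */
static int toy_sched_pick(struct toy_ctx *ctxs, size_t nctx, size_t *cursor)
{
    assert(nctx > 0);
    for (size_t i = 0; i < nctx; i++) {
        size_t idx = (*cursor + i) % nctx;
        struct toy_ctx *ctx = &ctxs[idx];

        if (ctx->head < ctx->tail) {
            *cursor = (idx + 1) % nctx;
            return ctx->tasks[ctx->head++];
        }
    }
    return -1;
}
```

With one context holding three tasks and another holding one, the second context's task runs right after the first task of the busy context instead of waiting behind all three.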
Not implemented:
- Dump buffer support
- Power management
- Performance counter
This patch series just packs a pair of .c/.h files in each patch. For the whole history of this driver's development, see: https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
The Mesa driver is still in development and not ready for daily usage, but can run some simple tests like kmscube and glmark2, see: https://github.com/yuq/mesa-lima
Andrei Paulau (1):
  arm64/dts: add switch-delay for meson mali

Lima Project Developers (10):
  drm/lima: add mali 4xx GPU hardware regs
  drm/lima: add lima core driver
  drm/lima: add GPU device functions
  drm/lima: add PMU related functions
  drm/lima: add PP related functions
  drm/lima: add MMU related functions
  drm/lima: add GPU virtual memory space handing
  drm/lima: add GEM related functions
  drm/lima: add GEM Prime related functions
  drm/lima: add makefile and kconfig

Qiang Yu (12):
  dt-bindings: add switch-delay property for mali-utgard
  arm64/dts: add switch-delay for meson mali
  Revert "drm: Nerf the preclose callback for modern drivers"
  drm/lima: add lima uapi header
  drm/lima: add L2 cache functions
  drm/lima: add GP related functions
  drm/lima: add BCAST related function
  drm/lima: add DLBU related functions
  drm/lima: add TTM subsystem functions
  drm/lima: add buffer object functions
  drm/lima: add GPU schedule using DRM_SCHED
  drm/lima: add context related functions

Simon Shields (1):
  ARM: dts: add gpu node to exynos4
 .../bindings/gpu/arm,mali-utgard.txt | 4 +
 arch/arm/boot/dts/exynos4.dtsi | 33 ++
 arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi | 1 +
 .../boot/dts/amlogic/meson-gxl-mali.dtsi | 1 +
 drivers/gpu/drm/Kconfig | 2 +
 drivers/gpu/drm/Makefile | 1 +
 drivers/gpu/drm/drm_file.c | 8 +-
 drivers/gpu/drm/lima/Kconfig | 9 +
 drivers/gpu/drm/lima/Makefile | 19 +
 drivers/gpu/drm/lima/lima_bcast.c | 65 +++
 drivers/gpu/drm/lima/lima_bcast.h | 34 ++
 drivers/gpu/drm/lima/lima_ctx.c | 143 +++++
 drivers/gpu/drm/lima/lima_ctx.h | 51 ++
 drivers/gpu/drm/lima/lima_device.c | 407 ++++++++++++++
 drivers/gpu/drm/lima/lima_device.h | 136 +++++
 drivers/gpu/drm/lima/lima_dlbu.c | 75 +++
 drivers/gpu/drm/lima/lima_dlbu.h | 37 ++
 drivers/gpu/drm/lima/lima_drv.c | 466 ++++++++++++++++
 drivers/gpu/drm/lima/lima_drv.h | 77 +++
 drivers/gpu/drm/lima/lima_gem.c | 459 ++++++++++++++++
 drivers/gpu/drm/lima/lima_gem.h | 41 ++
 drivers/gpu/drm/lima/lima_gem_prime.c | 66 +++
 drivers/gpu/drm/lima/lima_gem_prime.h | 31 ++
 drivers/gpu/drm/lima/lima_gp.c | 293 +++++++++++
 drivers/gpu/drm/lima/lima_gp.h | 34 ++
 drivers/gpu/drm/lima/lima_l2_cache.c | 98 ++++
 drivers/gpu/drm/lima/lima_l2_cache.h | 32 ++
 drivers/gpu/drm/lima/lima_mmu.c | 154 ++++++
 drivers/gpu/drm/lima/lima_mmu.h | 34 ++
 drivers/gpu/drm/lima/lima_object.c | 120 +++++
 drivers/gpu/drm/lima/lima_object.h | 87 +++
 drivers/gpu/drm/lima/lima_pmu.c | 85 +++
 drivers/gpu/drm/lima/lima_pmu.h | 30 ++
 drivers/gpu/drm/lima/lima_pp.c | 418 +++++++++++++++
 drivers/gpu/drm/lima/lima_pp.h | 37 ++
 drivers/gpu/drm/lima/lima_regs.h | 304 +++++++++++
 drivers/gpu/drm/lima/lima_sched.c | 497 ++++++++++++++++++
 drivers/gpu/drm/lima/lima_sched.h | 126 +++++
 drivers/gpu/drm/lima/lima_ttm.c | 409 ++++++++++++++
 drivers/gpu/drm/lima/lima_ttm.h | 44 ++
 drivers/gpu/drm/lima/lima_vm.c | 312 +++++++++++
 drivers/gpu/drm/lima/lima_vm.h | 73 +++
 include/drm/drm_drv.h | 23 +-
 include/uapi/drm/lima_drm.h | 195 +++++++
 44 files changed, 5565 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/lima/Kconfig
 create mode 100644 drivers/gpu/drm/lima/Makefile
 create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
 create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
 create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
 create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
 create mode 100644 drivers/gpu/drm/lima/lima_device.c
 create mode 100644 drivers/gpu/drm/lima/lima_device.h
 create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
 create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
 create mode 100644 drivers/gpu/drm/lima/lima_drv.c
 create mode 100644 drivers/gpu/drm/lima/lima_drv.h
 create mode 100644 drivers/gpu/drm/lima/lima_gem.c
 create mode 100644 drivers/gpu/drm/lima/lima_gem.h
 create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
 create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
 create mode 100644 drivers/gpu/drm/lima/lima_gp.c
 create mode 100644 drivers/gpu/drm/lima/lima_gp.h
 create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
 create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
 create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
 create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
 create mode 100644 drivers/gpu/drm/lima/lima_object.c
 create mode 100644 drivers/gpu/drm/lima/lima_object.h
 create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
 create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
 create mode 100644 drivers/gpu/drm/lima/lima_pp.c
 create mode 100644 drivers/gpu/drm/lima/lima_pp.h
 create mode 100644 drivers/gpu/drm/lima/lima_regs.h
 create mode 100644 drivers/gpu/drm/lima/lima_sched.c
 create mode 100644 drivers/gpu/drm/lima/lima_sched.h
 create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
 create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
 create mode 100644 drivers/gpu/drm/lima/lima_vm.c
 create mode 100644 drivers/gpu/drm/lima/lima_vm.h
 create mode 100644 include/uapi/drm/lima_drm.h
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch