After an unfortunately long pause (covid work-schedule burnout), I wanted to revive and resubmit this series.
As before, the point of this series is trying to add both a page pool as well as deferred-freeingto the DMA-BUF system heap to improve allocation performance (so that it can match or beat the old ION system heaps performance).
The combination of the page pool along with deferred freeing allows us to offload page-zeroing out of the allocation hot path. This was done originally with ION and this patch series allows the DMA-BUF system heap to match ION's system heap allocation performance in a simple microbenchmark [1] (ION re-added to the kernel for comparision, running on an x86 vm image):
./dmabuf-heap-bench -i 0 1 system Testing dmabuf system vs ion heaptype 0 (flags: 0x1) --------------------------------------------- dmabuf heap: alloc 4096 bytes 5000 times in 79314244 ns 15862 ns/call ion heap: alloc 4096 bytes 5000 times in 107390769 ns 21478 ns/call dmabuf heap: alloc 1048576 bytes 5000 times in 259083419 ns 51816 ns/call ion heap: alloc 1048576 bytes 5000 times in 340497344 ns 68099 ns/call dmabuf heap: alloc 8388608 bytes 5000 times in 2603105563 ns 520621 ns/call ion heap: alloc 8388608 bytes 5000 times in 3613592860 ns 722718 ns/call dmabuf heap: alloc 33554432 bytes 5000 times in 12212492979 ns 2442498 ns/call ion heap: alloc 33554432 bytes 5000 times in 14584157792 ns 2916831 ns/call
Daniel didn't like earlier attempts to re-use the network page-pool code to achieve this, and suggested the ttm_pool be used instead, so this series pulls the page pool functionality out of the ttm_pool logic and creates a generic page pool that can be shared.
New in v9: * Tried to address Christian König's feedback on the page pool changes (Kerneldoc, static functions, locking issues, duplicative order tracking) * Fix up Kconfig dependency issue as Reported-by: kernel test robot lkp@intel.com * Fix compiler warning Reported-by: kernel test robot lkp@intel.com
I know Christian had some less specific feedback on the deferred free work that I'd like to revisit, but I wanted to restart the discussion with this new series, rather then trying to dregdge up and reply to a ~4mo old thread.
Input would be greatly appreciated. Testing as well, as I don't have any development hardware that utilizes the ttm pool.
Thanks -john
[1] https://android.googlesource.com/platform/system/memory/libdmabufheap/+/refs...
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org
John Stultz (5): drm: Add a sharable drm page-pool implementation drm: ttm_pool: Rework ttm_pool to use drm_page_pool dma-buf: system_heap: Add drm pagepool support to system heap dma-buf: heaps: Add deferred-free-helper library code dma-buf: system_heap: Add deferred freeing to the system heap
drivers/dma-buf/heaps/Kconfig | 5 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++++++++ drivers/dma-buf/heaps/deferred-free-helper.h | 55 ++++ drivers/dma-buf/heaps/system_heap.c | 46 ++- drivers/gpu/drm/Kconfig | 4 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/page_pool.c | 297 +++++++++++++++++++ drivers/gpu/drm/ttm/ttm_pool.c | 167 ++--------- include/drm/page_pool.h | 68 +++++ include/drm/ttm/ttm_pool.h | 14 +- 11 files changed, 643 insertions(+), 154 deletions(-) create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h create mode 100644 drivers/gpu/drm/page_pool.c create mode 100644 include/drm/page_pool.h
This adds a shrinker controlled page pool, extracted out of the ttm_pool logic, and abstracted out a bit so it can be used by other non-ttm drivers.
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org --- v8: * Completely rewritten from scratch, using only the ttm_pool logic so it can be dual licensed. v9: * Add Kerneldoc comments similar to tmm implementation as suggested by ChristianK * Mark some functions static as suggested by ChristianK * Fix locking issue ChristianK pointed out * Add methods to block the shrinker so users can make atomic calculations across multiple pools, as suggested by ChristianK * Fix up Kconfig dependency issue as Reported-by: kernel test robot lkp@intel.com --- drivers/gpu/drm/Kconfig | 3 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/page_pool.c | 297 ++++++++++++++++++++++++++++++++++++ include/drm/page_pool.h | 68 +++++++++ 4 files changed, 370 insertions(+) create mode 100644 drivers/gpu/drm/page_pool.c create mode 100644 include/drm/page_pool.h
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 3c16bd1afd87..52d9ba92b35e 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -177,6 +177,9 @@ config DRM_DP_CEC Note: not all adapters support this feature, and even for those that do support this they often do not hook up the CEC pin.
+config DRM_PAGE_POOL + bool + config DRM_TTM tristate depends on DRM && MMU diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 5279db4392df..affa4ca3a08e 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -39,6 +39,8 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o drm_ttm_helper-y := drm_gem_ttm_helper.o obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
+drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o + drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \ drm_dsc.o drm_probe_helper.o \ drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \ diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c new file mode 100644 index 000000000000..c07bbe3afc32 --- /dev/null +++ b/drivers/gpu/drm/page_pool.c @@ -0,0 +1,297 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Sharable page pool implementation + * + * Extracted from drivers/gpu/drm/ttm/ttm_pool.c + * Copyright 2020 Advanced Micro Devices, Inc. + * Copyright 2021 Linaro Ltd. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: Christian König, John Stultz + */ + +#include <linux/module.h> +#include <linux/highmem.h> +#include <linux/shrinker.h> +#include <drm/page_pool.h> + +static unsigned long page_pool_size; /* max size of the pool */ + +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644); + +static atomic_long_t nr_managed_pages; + +static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker; + +/** + * drm_page_pool_set_max - Sets maximum size of all pools + * + * Sets the maximum number of pages allows in all pools. + * This can only be set once, and the first caller wins. + */ +void drm_page_pool_set_max(unsigned long max) +{ + if (!page_pool_size) + page_pool_size = max; +} + +/** + * drm_page_pool_get_max - Maximum size of all pools + * + * Return the maximum number of pages allows in all pools + */ +unsigned long drm_page_pool_get_max(void) +{ + return page_pool_size; +} + +/** + * drm_page_pool_get_total - Current size of all pools + * + * Return the number of pages in all managed pools + */ +unsigned long drm_page_pool_get_total(void) +{ + return atomic_long_read(&nr_managed_pages); +} + +/** + * drm_page_pool_get_size - Get the number of pages in the pool + * + * @pool: Pool to calculate the size of + * + * Return the number of pages in the specified pool + */ +unsigned long drm_page_pool_get_size(struct drm_page_pool *pool) +{ + unsigned long size; + + spin_lock(&pool->lock); + size = pool->page_count; + spin_unlock(&pool->lock); + return size; +} + +/** + * drm_page_pool_add - Add a page to a pool + * + * @pool: Pool to add page to + * @page: Page to be added to the pool + * + * Gives specified page into a specific pool + */ +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p) +{ + unsigned int i, num_pages = 1 << pool->order; + + /* Make sure we won't grow larger then the max pool size */ + if (page_pool_size && + ((drm_page_pool_get_total()) + num_pages > page_pool_size)) { + pool->free(pool, p); + return; + } + + /* Be sure to zero pages before adding them to the pool */ + for (i = 0; i < num_pages; ++i) { + if (PageHighMem(p)) + clear_highpage(p + i); + else + clear_page(page_address(p + i)); + } + + spin_lock(&pool->lock); + list_add(&p->lru, &pool->pages); + pool->page_count += 1 << pool->order; + spin_unlock(&pool->lock); + atomic_long_add(1 << pool->order, &nr_managed_pages); + +} + +/** + * drm_page_pool_remove - Remove page from pool + * + * @pool: Pool to pull the page from + * + * Take pages from a specific pool, return NULL when nothing available + */ +struct page *drm_page_pool_remove(struct drm_page_pool *pool) +{ + struct page *p; + + spin_lock(&pool->lock); + p = list_first_entry_or_null(&pool->pages, typeof(*p), lru); + if (p) { + atomic_long_sub(1 << pool->order, &nr_managed_pages); + pool->page_count -= 1 << pool->order; + list_del(&p->lru); + } + spin_unlock(&pool->lock); + + return p; +} + +/** + * drm_page_pool_init - Initialize a pool + * + * @pool: the pool to initialize + * @order: page allocation order + * @free_page: function pointer to free the pool's pages + * + * Initialize and add a pool type to the global shrinker list + */ +void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order, + void (*free_page)(struct drm_page_pool *pool, struct page *p)) +{ + pool->order = order; + spin_lock_init(&pool->lock); + INIT_LIST_HEAD(&pool->pages); + pool->free = free_page; + pool->page_count = 0; + + mutex_lock(&shrinker_lock); + list_add_tail(&pool->shrinker_list, &shrinker_list); + mutex_unlock(&shrinker_lock); +} + +/** + * drm_page_pool_fini - Cleanup a pool + * + * @pool: the pool to clean up + * + * Remove a pool_type from the global shrinker list and free all pages + */ +void drm_page_pool_fini(struct drm_page_pool *pool) +{ + struct page *p; + + mutex_lock(&shrinker_lock); + list_del(&pool->shrinker_list); + mutex_unlock(&shrinker_lock); + + while ((p = drm_page_pool_remove(pool))) + pool->free(pool, p); +} + +/** + * drm_page_pool_shrink - Shrink the drm page pool + * + * Free pages using the global shrinker list. Returns + * the number of pages free + */ +unsigned int drm_page_pool_shrink(void) +{ + struct drm_page_pool *pool; + unsigned int num_freed; + struct page *p; + + mutex_lock(&shrinker_lock); + pool = list_first_entry(&shrinker_list, typeof(*pool), shrinker_list); + + p = drm_page_pool_remove(pool); + if (p) { + pool->free(pool, p); + num_freed = 1 << pool->order; + } else { + num_freed = 0; + } + + list_move_tail(&pool->shrinker_list, &shrinker_list); + mutex_unlock(&shrinker_lock); + + return num_freed; +} + +/* As long as pages are available make sure to release at least one */ +static unsigned long drm_page_pool_shrinker_scan(struct shrinker *shrink, + struct shrink_control *sc) +{ + unsigned long num_freed = 0; + + do + num_freed += drm_page_pool_shrink(); + while (!num_freed && atomic_long_read(&nr_managed_pages)); + + return num_freed; +} + +/* Return the number of pages available or SHRINK_EMPTY if we have none */ +static unsigned long drm_page_pool_shrinker_count(struct shrinker *shrink, + struct shrink_control *sc) +{ + unsigned long num_pages = atomic_long_read(&nr_managed_pages); + + return num_pages ? num_pages : SHRINK_EMPTY; +} + +/** + * dma_page_pool_lock_shrinker - Take the shrinker lock + * + * Takes the shrinker lock, preventing the shrinker from making + * changes to the pools + */ +void dma_page_pool_lock_shrinker(void) +{ + mutex_lock(&shrinker_lock); +} + +/** + * dma_page_pool_unlock_shrinker - Release the shrinker lock + * + * Releases the shrinker lock, allowing the shrinker to free + * pages + */ +void dma_page_pool_unlock_shrinker(void) +{ + mutex_unlock(&shrinker_lock); +} + +/** + * drm_page_pool_shrinker_init - Initialize globals + * + * Initialize the global locks and lists for the shrinker. + */ +static int drm_page_pool_shrinker_init(void) +{ + mutex_init(&shrinker_lock); + INIT_LIST_HEAD(&shrinker_list); + + mm_shrinker.count_objects = drm_page_pool_shrinker_count; + mm_shrinker.scan_objects = drm_page_pool_shrinker_scan; + mm_shrinker.seeks = 1; + return register_shrinker(&mm_shrinker); +} + +/** + * drm_page_pool_shrinker_fini - Finalize globals + * + * Unregister the shrinker. + */ +static void drm_page_pool_shrinker_fini(void) +{ + unregister_shrinker(&mm_shrinker); + WARN_ON(!list_empty(&shrinker_list)); +} + +module_init(drm_page_pool_shrinker_init); +module_exit(drm_page_pool_shrinker_fini); +MODULE_LICENSE("Dual MIT/GPL"); diff --git a/include/drm/page_pool.h b/include/drm/page_pool.h new file mode 100644 index 000000000000..860cf6c92aab --- /dev/null +++ b/include/drm/page_pool.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Extracted from include/drm/ttm/ttm_pool.h + * Copyright 2020 Advanced Micro Devices, Inc. + * Copyright 2021 Linaro Ltd + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: Christian König, John Stultz + */ + +#ifndef _DRM_PAGE_POOL_H_ +#define _DRM_PAGE_POOL_H_ + +#include <linux/mmzone.h> +#include <linux/llist.h> +#include <linux/spinlock.h> + +/** + * drm_page_pool - Page Pool for a certain memory type + * + * @order: the allocation order our pages have + * @pages: the list of pages in the pool + * @shrinker_list: our place on the global shrinker list + * @lock: protection of the page list + * @page_count: number of pages currently in the pool + * @free: Function pointer to free the page + */ +struct drm_page_pool { + unsigned int order; + struct list_head pages; + struct list_head shrinker_list; + spinlock_t lock; + + unsigned long page_count; + void (*free)(struct drm_page_pool *pool, struct page *p); +}; + +void drm_page_pool_set_max(unsigned long max); +unsigned long drm_page_pool_get_max(void); +unsigned long drm_page_pool_get_total(void); +unsigned int drm_page_pool_shrink(void); +unsigned long drm_page_pool_get_size(struct drm_page_pool *pool); +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p); +struct page *drm_page_pool_remove(struct drm_page_pool *pool); +void dma_page_pool_lock_shrinker(void); +void dma_page_pool_unlock_shrinker(void); +void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order, + void (*free_page)(struct drm_page_pool *pool, struct page *p)); +void drm_page_pool_fini(struct drm_page_pool *pool); + +#endif
Am 30.06.21 um 03:34 schrieb John Stultz:
This adds a shrinker controlled page pool, extracted out of the ttm_pool logic, and abstracted out a bit so it can be used by other non-ttm drivers.
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org
v8:
- Completely rewritten from scratch, using only the ttm_pool logic so it can be dual licensed.
v9:
- Add Kerneldoc comments similar to tmm implementation as suggested by ChristianK
- Mark some functions static as suggested by ChristianK
- Fix locking issue ChristianK pointed out
- Add methods to block the shrinker so users can make atomic calculations across multiple pools, as suggested by ChristianK
- Fix up Kconfig dependency issue as Reported-by: kernel test robot lkp@intel.com
drivers/gpu/drm/Kconfig | 3 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/page_pool.c | 297 ++++++++++++++++++++++++++++++++++++ include/drm/page_pool.h | 68 +++++++++ 4 files changed, 370 insertions(+) create mode 100644 drivers/gpu/drm/page_pool.c create mode 100644 include/drm/page_pool.h
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 3c16bd1afd87..52d9ba92b35e 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -177,6 +177,9 @@ config DRM_DP_CEC Note: not all adapters support this feature, and even for those that do support this they often do not hook up the CEC pin.
+config DRM_PAGE_POOL
- bool
- config DRM_TTM tristate depends on DRM && MMU
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 5279db4392df..affa4ca3a08e 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -39,6 +39,8 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o drm_ttm_helper-y := drm_gem_ttm_helper.o obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
+drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
- drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \ drm_dsc.o drm_probe_helper.o \ drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c new file mode 100644 index 000000000000..c07bbe3afc32 --- /dev/null +++ b/drivers/gpu/drm/page_pool.c @@ -0,0 +1,297 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/*
- Sharable page pool implementation
- Extracted from drivers/gpu/drm/ttm/ttm_pool.c
- Copyright 2020 Advanced Micro Devices, Inc.
- Copyright 2021 Linaro Ltd.
- Permission is hereby granted, free of charge, to any person obtaining a
- copy of this software and associated documentation files (the "Software"),
- to deal in the Software without restriction, including without limitation
- the rights to use, copy, modify, merge, publish, distribute, sublicense,
- and/or sell copies of the Software, and to permit persons to whom the
- Software is furnished to do so, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in
- all copies or substantial portions of the Software.
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- OTHER DEALINGS IN THE SOFTWARE.
- Authors: Christian König, John Stultz
- */
+#include <linux/module.h> +#include <linux/highmem.h> +#include <linux/shrinker.h> +#include <drm/page_pool.h>
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
- if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
- return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
+/**
- drm_page_pool_get_total - Current size of all pools
- Return the number of pages in all managed pools
- */
+unsigned long drm_page_pool_get_total(void) +{
- return atomic_long_read(&nr_managed_pages);
+}
+/**
- drm_page_pool_get_size - Get the number of pages in the pool
- @pool: Pool to calculate the size of
- Return the number of pages in the specified pool
- */
+unsigned long drm_page_pool_get_size(struct drm_page_pool *pool) +{
- unsigned long size;
- spin_lock(&pool->lock);
- size = pool->page_count;
- spin_unlock(&pool->lock);
- return size;
+}
+/**
- drm_page_pool_add - Add a page to a pool
- @pool: Pool to add page to
- @page: Page to be added to the pool
- Gives specified page into a specific pool
- */
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *p) +{
- unsigned int i, num_pages = 1 << pool->order;
- /* Make sure we won't grow larger then the max pool size */
- if (page_pool_size &&
((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
pool->free(pool, p);
return;
- }
That is not a good idea. See how ttm_pool_free() does this.
First the page is given back to the pool, then all pools are shrinked if they are above the global limit.
Regards, Christian.
- /* Be sure to zero pages before adding them to the pool */
- for (i = 0; i < num_pages; ++i) {
if (PageHighMem(p))
clear_highpage(p + i);
else
clear_page(page_address(p + i));
- }
- spin_lock(&pool->lock);
- list_add(&p->lru, &pool->pages);
- pool->page_count += 1 << pool->order;
- spin_unlock(&pool->lock);
- atomic_long_add(1 << pool->order, &nr_managed_pages);
+}
+/**
- drm_page_pool_remove - Remove page from pool
- @pool: Pool to pull the page from
- Take pages from a specific pool, return NULL when nothing available
- */
+struct page *drm_page_pool_remove(struct drm_page_pool *pool) +{
- struct page *p;
- spin_lock(&pool->lock);
- p = list_first_entry_or_null(&pool->pages, typeof(*p), lru);
- if (p) {
atomic_long_sub(1 << pool->order, &nr_managed_pages);
pool->page_count -= 1 << pool->order;
list_del(&p->lru);
- }
- spin_unlock(&pool->lock);
- return p;
+}
+/**
- drm_page_pool_init - Initialize a pool
- @pool: the pool to initialize
- @order: page allocation order
- @free_page: function pointer to free the pool's pages
- Initialize and add a pool type to the global shrinker list
- */
+void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order,
void (*free_page)(struct drm_page_pool *pool, struct page *p))
+{
- pool->order = order;
- spin_lock_init(&pool->lock);
- INIT_LIST_HEAD(&pool->pages);
- pool->free = free_page;
- pool->page_count = 0;
- mutex_lock(&shrinker_lock);
- list_add_tail(&pool->shrinker_list, &shrinker_list);
- mutex_unlock(&shrinker_lock);
+}
+/**
- drm_page_pool_fini - Cleanup a pool
- @pool: the pool to clean up
- Remove a pool_type from the global shrinker list and free all pages
- */
+void drm_page_pool_fini(struct drm_page_pool *pool) +{
- struct page *p;
- mutex_lock(&shrinker_lock);
- list_del(&pool->shrinker_list);
- mutex_unlock(&shrinker_lock);
- while ((p = drm_page_pool_remove(pool)))
pool->free(pool, p);
+}
+/**
- drm_page_pool_shrink - Shrink the drm page pool
- Free pages using the global shrinker list. Returns
- the number of pages free
- */
+unsigned int drm_page_pool_shrink(void) +{
- struct drm_page_pool *pool;
- unsigned int num_freed;
- struct page *p;
- mutex_lock(&shrinker_lock);
- pool = list_first_entry(&shrinker_list, typeof(*pool), shrinker_list);
- p = drm_page_pool_remove(pool);
- if (p) {
pool->free(pool, p);
num_freed = 1 << pool->order;
- } else {
num_freed = 0;
- }
- list_move_tail(&pool->shrinker_list, &shrinker_list);
- mutex_unlock(&shrinker_lock);
- return num_freed;
+}
+/* As long as pages are available make sure to release at least one */ +static unsigned long drm_page_pool_shrinker_scan(struct shrinker *shrink,
struct shrink_control *sc)
+{
- unsigned long num_freed = 0;
- do
num_freed += drm_page_pool_shrink();
- while (!num_freed && atomic_long_read(&nr_managed_pages));
- return num_freed;
+}
+/* Return the number of pages available or SHRINK_EMPTY if we have none */ +static unsigned long drm_page_pool_shrinker_count(struct shrinker *shrink,
struct shrink_control *sc)
+{
- unsigned long num_pages = atomic_long_read(&nr_managed_pages);
- return num_pages ? num_pages : SHRINK_EMPTY;
+}
+/**
- dma_page_pool_lock_shrinker - Take the shrinker lock
- Takes the shrinker lock, preventing the shrinker from making
- changes to the pools
- */
+void dma_page_pool_lock_shrinker(void) +{
- mutex_lock(&shrinker_lock);
+}
+/**
- dma_page_pool_unlock_shrinker - Release the shrinker lock
- Releases the shrinker lock, allowing the shrinker to free
- pages
- */
+void dma_page_pool_unlock_shrinker(void) +{
- mutex_unlock(&shrinker_lock);
+}
+/**
- drm_page_pool_shrinker_init - Initialize globals
- Initialize the global locks and lists for the shrinker.
- */
+static int drm_page_pool_shrinker_init(void) +{
- mutex_init(&shrinker_lock);
- INIT_LIST_HEAD(&shrinker_list);
- mm_shrinker.count_objects = drm_page_pool_shrinker_count;
- mm_shrinker.scan_objects = drm_page_pool_shrinker_scan;
- mm_shrinker.seeks = 1;
- return register_shrinker(&mm_shrinker);
+}
+/**
- drm_page_pool_shrinker_fini - Finalize globals
- Unregister the shrinker.
- */
+static void drm_page_pool_shrinker_fini(void) +{
- unregister_shrinker(&mm_shrinker);
- WARN_ON(!list_empty(&shrinker_list));
+}
+module_init(drm_page_pool_shrinker_init); +module_exit(drm_page_pool_shrinker_fini); +MODULE_LICENSE("Dual MIT/GPL"); diff --git a/include/drm/page_pool.h b/include/drm/page_pool.h new file mode 100644 index 000000000000..860cf6c92aab --- /dev/null +++ b/include/drm/page_pool.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/*
- Extracted from include/drm/ttm/ttm_pool.h
- Copyright 2020 Advanced Micro Devices, Inc.
- Copyright 2021 Linaro Ltd
- Permission is hereby granted, free of charge, to any person obtaining a
- copy of this software and associated documentation files (the "Software"),
- to deal in the Software without restriction, including without limitation
- the rights to use, copy, modify, merge, publish, distribute, sublicense,
- and/or sell copies of the Software, and to permit persons to whom the
- Software is furnished to do so, subject to the following conditions:
- The above copyright notice and this permission notice shall be included in
- all copies or substantial portions of the Software.
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- OTHER DEALINGS IN THE SOFTWARE.
- Authors: Christian König, John Stultz
- */
+#ifndef _DRM_PAGE_POOL_H_ +#define _DRM_PAGE_POOL_H_
+#include <linux/mmzone.h> +#include <linux/llist.h> +#include <linux/spinlock.h>
+/**
- drm_page_pool - Page Pool for a certain memory type
- @order: the allocation order our pages have
- @pages: the list of pages in the pool
- @shrinker_list: our place on the global shrinker list
- @lock: protection of the page list
- @page_count: number of pages currently in the pool
- @free: Function pointer to free the page
- */
+struct drm_page_pool {
- unsigned int order;
- struct list_head pages;
- struct list_head shrinker_list;
- spinlock_t lock;
- unsigned long page_count;
- void (*free)(struct drm_page_pool *pool, struct page *p);
+};
+void drm_page_pool_set_max(unsigned long max); +unsigned long drm_page_pool_get_max(void); +unsigned long drm_page_pool_get_total(void); +unsigned int drm_page_pool_shrink(void); +unsigned long drm_page_pool_get_size(struct drm_page_pool *pool); +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p); +struct page *drm_page_pool_remove(struct drm_page_pool *pool); +void dma_page_pool_lock_shrinker(void); +void dma_page_pool_unlock_shrinker(void); +void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order,
void (*free_page)(struct drm_page_pool *pool, struct page *p));
+void drm_page_pool_fini(struct drm_page_pool *pool);
+#endif
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz:
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Your earlier suggestion to have it as an argument to the drm_page_pool_shrinker_init() didn't make much sense to me, as then we may have multiple subsystems trying to initialize the pool shrinker, which doesn't seem ideal. So I'm not sure what would be preferred.
I guess we could have subsystems allocate their own pool managers with their own shrinkers and per-manager-size-limits? But that also feels like a fair increase in complexity, for I'm not sure how much benefit.
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *p) +{
unsigned int i, num_pages = 1 << pool->order;
/* Make sure we won't grow larger then the max pool size */
if (page_pool_size &&
((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
pool->free(pool, p);
return;
}
That is not a good idea. See how ttm_pool_free() does this.
First the page is given back to the pool, then all pools are shrinked if they are above the global limit.
Ok, initially my thought was it seemed like wasteful overhead to add the page to the pool and then shrink the pool, so just freeing the given page directly if the pool was full seemed more straightforward. But on consideration here I do realize having most-recently-used hot pages in the pool would be good for performance, so I'll switch that back. Thanks for pointing this out!
Thanks again so much for the review and feedback! -john
Am 01.07.21 um 00:24 schrieb John Stultz:
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz:
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Yeah, exactly that's what I'm trying to prevent.
See if we have the same functionality used by different use cases we should not have different behavior depending on what drivers are loaded.
Is it a problem if we restrict the ION pool to 50% of system memory as well? If yes than I would rather drop the limit from TTM and only rely on the shrinker there as well.
Your earlier suggestion to have it as an argument to the drm_page_pool_shrinker_init() didn't make much sense to me, as then we may have multiple subsystems trying to initialize the pool shrinker, which doesn't seem ideal. So I'm not sure what would be preferred.
I guess we could have subsystems allocate their own pool managers with their own shrinkers and per-manager-size-limits? But that also feels like a fair increase in complexity, for I'm not sure how much benefit.
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *p) +{
unsigned int i, num_pages = 1 << pool->order;
/* Make sure we won't grow larger then the max pool size */
if (page_pool_size &&
((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
pool->free(pool, p);
return;
}
That is not a good idea. See how ttm_pool_free() does this.
First the page is given back to the pool, then all pools are shrinked if they are above the global limit.
Ok, initially my thought was it seemed like wasteful overhead to add the page to the pool and then shrink the pool, so just freeing the given page directly if the pool was full seemed more straightforward. But on consideration here I do realize having most-recently-used hot pages in the pool would be good for performance, so I'll switch that back. Thanks for pointing this out!
The even bigger problem is that you then always drop pages from the active pools.
E.g. a pool which just allocated and then freed 2MiB during driver load for some firmware upload will never see pressure if you do it this way.
So those 2MiB would never be recycled while they could be well used in one of the active pools.
Regards, Christian.
Thanks again so much for the review and feedback! -john
On Wed, Jun 30, 2021 at 11:52 PM Christian König christian.koenig@amd.com wrote:
Am 01.07.21 um 00:24 schrieb John Stultz:
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz:
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Yeah, exactly that's what I'm trying to prevent.
See if we have the same functionality used by different use cases we should not have different behavior depending on what drivers are loaded.
Is it a problem if we restrict the ION pool to 50% of system memory as well? If yes than I would rather drop the limit from TTM and only rely on the shrinker there as well.
Would having the default value as a config option (still overridable via boot argument) be an acceptable solution?
Thanks again for the feedback!
thanks -john
On Tue, Jul 6, 2021 at 11:04 PM John Stultz john.stultz@linaro.org wrote:
On Wed, Jun 30, 2021 at 11:52 PM Christian König christian.koenig@amd.com wrote:
Am 01.07.21 um 00:24 schrieb John Stultz:
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz:
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Yeah, exactly that's what I'm trying to prevent.
See if we have the same functionality used by different use cases we should not have different behavior depending on what drivers are loaded.
Is it a problem if we restrict the ION pool to 50% of system memory as well? If yes than I would rather drop the limit from TTM and only rely on the shrinker there as well.
Would having the default value as a config option (still overridable via boot argument) be an acceptable solution?
We're also trying to get ttm over to the shrinker model, and a first cut of that even landed, but didn't really work out yet. So maybe just aiming for the shrinker? I do agree this should be consistent across the board, otherwise we're just sharing code but not actually sharing functionality, which is a recipe for disaster because one side will end up breaking the other side's use-case. -Daniel
On Tue, Jul 6, 2021 at 2:15 PM Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Jul 6, 2021 at 11:04 PM John Stultz john.stultz@linaro.org wrote:
On Wed, Jun 30, 2021 at 11:52 PM Christian König christian.koenig@amd.com wrote:
Am 01.07.21 um 00:24 schrieb John Stultz:
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz:
+static unsigned long page_pool_size; /* max size of the pool */
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); +module_param(page_pool_size, ulong, 0644);
+static atomic_long_t nr_managed_pages;
+static struct mutex shrinker_lock; +static struct list_head shrinker_list; +static struct shrinker mm_shrinker;
+/**
- drm_page_pool_set_max - Sets maximum size of all pools
- Sets the maximum number of pages allows in all pools.
- This can only be set once, and the first caller wins.
- */
+void drm_page_pool_set_max(unsigned long max) +{
if (!page_pool_size)
page_pool_size = max;
+}
+/**
- drm_page_pool_get_max - Maximum size of all pools
- Return the maximum number of pages allows in all pools
- */
+unsigned long drm_page_pool_get_max(void) +{
return page_pool_size;
+}
Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Yeah, exactly that's what I'm trying to prevent.
See if we have the same functionality used by different use cases we should not have different behavior depending on what drivers are loaded.
Is it a problem if we restrict the ION pool to 50% of system memory as well? If yes than I would rather drop the limit from TTM and only rely on the shrinker there as well.
Would having the default value as a config option (still overridable via boot argument) be an acceptable solution?
We're also trying to get ttm over to the shrinker model, and a first cut of that even landed, but didn't really work out yet. So maybe just aiming for the shrinker? I do agree this should be consistent across the board, otherwise we're just sharing code but not actually sharing functionality, which is a recipe for disaster because one side will end up breaking the other side's use-case.
Fair enough, maybe it would be best to remove the default limit, but leave the logic so it can still be set via the boot argument?
thanks -john
Am 06.07.21 um 23:19 schrieb John Stultz:
On Tue, Jul 6, 2021 at 2:15 PM Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Jul 6, 2021 at 11:04 PM John Stultz john.stultz@linaro.org wrote:
On Wed, Jun 30, 2021 at 11:52 PM Christian König christian.koenig@amd.com wrote:
Am 01.07.21 um 00:24 schrieb John Stultz:
On Wed, Jun 30, 2021 at 2:10 AM Christian König christian.koenig@amd.com wrote:
Am 30.06.21 um 03:34 schrieb John Stultz: > +static unsigned long page_pool_size; /* max size of the pool */ > + > +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool"); > +module_param(page_pool_size, ulong, 0644); > + > +static atomic_long_t nr_managed_pages; > + > +static struct mutex shrinker_lock; > +static struct list_head shrinker_list; > +static struct shrinker mm_shrinker; > + > +/** > + * drm_page_pool_set_max - Sets maximum size of all pools > + * > + * Sets the maximum number of pages allows in all pools. > + * This can only be set once, and the first caller wins. > + */ > +void drm_page_pool_set_max(unsigned long max) > +{ > + if (!page_pool_size) > + page_pool_size = max; > +} > + > +/** > + * drm_page_pool_get_max - Maximum size of all pools > + * > + * Return the maximum number of pages allows in all pools > + */ > +unsigned long drm_page_pool_get_max(void) > +{ > + return page_pool_size; > +} Well in general I don't think it is a good idea to have getters/setters for one line functionality, similar applies to locking/unlocking the mutex below.
Then in this specific case what those functions do is to aid initializing the general pool manager and that in turn should absolutely not be exposed.
The TTM pool manager exposes this as function because initializing the pool manager is done in one part of the module and calculating the default value for the pages in another one. But that is not something I would like to see here.
So, I guess I'm not quite clear on what you'd like to see...
Part of what I'm balancing here is the TTM subsystem normally sets a global max size, whereas the old ION pool didn't have caps (instead just relying on the shrinker when needed). So I'm trying to come up with a solution that can serve both uses. So I've got this drm_page_pool_set_max() function to optionally set the maximum value, which is called in the TTM initialization path or set the boot argument. But for systems that use the dmabuf system heap, but don't use TTM, no global limit is enforced.
Yeah, exactly that's what I'm trying to prevent.
See if we have the same functionality used by different use cases we should not have different behavior depending on what drivers are loaded.
Is it a problem if we restrict the ION pool to 50% of system memory as well? If yes than I would rather drop the limit from TTM and only rely on the shrinker there as well.
Would having the default value as a config option (still overridable via boot argument) be an acceptable solution?
We're also trying to get ttm over to the shrinker model, and a first cut of that even landed, but didn't really work out yet. So maybe just aiming for the shrinker? I do agree this should be consistent across the board, otherwise we're just sharing code but not actually sharing functionality, which is a recipe for disaster because one side will end up breaking the other side's use-case.
Fair enough, maybe it would be best to remove the default limit, but leave the logic so it can still be set via the boot argument?
Yeah, that would work for me and the shrinker implementation should already be good enough.
Regards, Christian.
thanks -john
This patch reworks the ttm_pool logic to utilize the recently added drm_page_pool code.
This adds drm_page_pool structures to the ttm_pool_type structures, and then removes all the ttm_pool_type shrinker logic (as its handled in the drm_page_pool shrinker).
NOTE: There is one mismatch in the interfaces I'm not totally happy with. The ttm_pool tracks all of its pooled pages across a number of different pools, and tries to keep this size under the specified page_pool_size value. With the drm_page_pool, there may other users, however there is still one global shrinker list of pools. So we can't easily reduce the ttm pool under the ttm specified size without potentially doing a lot of shrinking to other non-ttm pools. So either we can: 1) Try to split it so each user of drm_page_pools manages its own pool shrinking. 2) Push the max value into the drm_page_pool, and have it manage shrinking to fit under that global max. Then share those size/max values out so the ttm_pool debug output can have more context.
I've taken the second path in this patch set, but wanted to call it out so folks could look closely.
Thoughts would be greatly appreciated here!
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org --- v7: * Major refactoring to use drm_page_pools inside the ttm_pool_type structure. This allows us to use container_of to get the needed context to free a page. This also means less code is changed overall. v8: * Reworked to use the new cleanly rewritten drm_page_pool logic v9: * Renamed functions, and dropped duplicative order tracking, as suggested by ChristianK * Use new *_(un)lock_shrinker() hooks to fix atomic calculations for debugfs --- drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/ttm/ttm_pool.c | 167 ++++++--------------------------- include/drm/ttm/ttm_pool.h | 14 +-- 3 files changed, 33 insertions(+), 149 deletions(-)
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 52d9ba92b35e..6be5344c009c 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -183,6 +183,7 @@ config DRM_PAGE_POOL config DRM_TTM tristate depends on DRM && MMU + select DRM_PAGE_POOL help GPU memory management subsystem for devices with multiple GPU memory types. Will be enabled automatically if a device driver diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index cb38b1a17b09..7ae647bce551 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -40,6 +40,7 @@ #include <asm/set_memory.h> #endif
+#include <drm/page_pool.h> #include <drm/ttm/ttm_pool.h> #include <drm/ttm/ttm_bo_driver.h> #include <drm/ttm/ttm_tt.h> @@ -70,10 +71,6 @@ static struct ttm_pool_type global_uncached[MAX_ORDER]; static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER]; static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
-static struct mutex shrinker_lock; -static struct list_head shrinker_list; -static struct shrinker mm_shrinker; - /* Allocate pages of size 1 << order with the given gfp_flags */ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, unsigned int order) @@ -158,6 +155,15 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, kfree(dma); }
+static void ttm_pool_free_callback(struct drm_page_pool *subpool, + struct page *p) +{ + struct ttm_pool_type *pt; + + pt = container_of(subpool, struct ttm_pool_type, subpool); + return ttm_pool_free_page(pt->pool, pt->caching, subpool->order, p); +} + /* Apply a new caching to an array of pages */ static int ttm_pool_apply_caching(struct page **first, struct page **last, enum ttm_caching caching) @@ -219,66 +225,20 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr, DMA_BIDIRECTIONAL); }
-/* Give pages into a specific pool_type */ -static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p) -{ - unsigned int i, num_pages = 1 << pt->order; - - for (i = 0; i < num_pages; ++i) { - if (PageHighMem(p)) - clear_highpage(p + i); - else - clear_page(page_address(p + i)); - } - - spin_lock(&pt->lock); - list_add(&p->lru, &pt->pages); - spin_unlock(&pt->lock); - atomic_long_add(1 << pt->order, &allocated_pages); -} - -/* Take pages from a specific pool_type, return NULL when nothing available */ -static struct page *ttm_pool_type_take(struct ttm_pool_type *pt) -{ - struct page *p; - - spin_lock(&pt->lock); - p = list_first_entry_or_null(&pt->pages, typeof(*p), lru); - if (p) { - atomic_long_sub(1 << pt->order, &allocated_pages); - list_del(&p->lru); - } - spin_unlock(&pt->lock); - - return p; -} - /* Initialize and add a pool type to the global shrinker list */ static void ttm_pool_type_init(struct ttm_pool_type *pt, struct ttm_pool *pool, enum ttm_caching caching, unsigned int order) { pt->pool = pool; pt->caching = caching; - pt->order = order; - spin_lock_init(&pt->lock); - INIT_LIST_HEAD(&pt->pages);
- mutex_lock(&shrinker_lock); - list_add_tail(&pt->shrinker_list, &shrinker_list); - mutex_unlock(&shrinker_lock); + drm_page_pool_init(&pt->subpool, order, ttm_pool_free_callback); }
/* Remove a pool_type from the global shrinker list and free all pages */ static void ttm_pool_type_fini(struct ttm_pool_type *pt) { - struct page *p; - - mutex_lock(&shrinker_lock); - list_del(&pt->shrinker_list); - mutex_unlock(&shrinker_lock); - - while ((p = ttm_pool_type_take(pt))) - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); + drm_page_pool_fini(&pt->subpool); }
/* Return the pool_type to use for the given caching and order */ @@ -309,30 +269,6 @@ static struct ttm_pool_type *ttm_pool_select_type(struct ttm_pool *pool, return NULL; }
-/* Free pages using the global shrinker list */ -static unsigned int ttm_pool_shrink(void) -{ - struct ttm_pool_type *pt; - unsigned int num_freed; - struct page *p; - - mutex_lock(&shrinker_lock); - pt = list_first_entry(&shrinker_list, typeof(*pt), shrinker_list); - - p = ttm_pool_type_take(pt); - if (p) { - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); - num_freed = 1 << pt->order; - } else { - num_freed = 0; - } - - list_move_tail(&pt->shrinker_list, &shrinker_list); - mutex_unlock(&shrinker_lock); - - return num_freed; -} - /* Return the allocation order based for a page */ static unsigned int ttm_pool_page_order(struct ttm_pool *pool, struct page *p) { @@ -389,7 +325,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, struct ttm_pool_type *pt;
pt = ttm_pool_select_type(pool, tt->caching, order); - p = pt ? ttm_pool_type_take(pt) : NULL; + p = pt ? drm_page_pool_remove(&pt->subpool) : NULL; if (p) { apply_caching = true; } else { @@ -471,16 +407,13 @@ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)
pt = ttm_pool_select_type(pool, tt->caching, order); if (pt) - ttm_pool_type_give(pt, tt->pages[i]); + drm_page_pool_add(&pt->subpool, tt->pages[i]); else ttm_pool_free_page(pool, tt->caching, order, tt->pages[i]);
i += num_pages; } - - while (atomic_long_read(&allocated_pages) > page_pool_size) - ttm_pool_shrink(); } EXPORT_SYMBOL(ttm_pool_free);
@@ -532,44 +465,7 @@ void ttm_pool_fini(struct ttm_pool *pool) } }
-/* As long as pages are available make sure to release at least one */ -static unsigned long ttm_pool_shrinker_scan(struct shrinker *shrink, - struct shrink_control *sc) -{ - unsigned long num_freed = 0; - - do - num_freed += ttm_pool_shrink(); - while (!num_freed && atomic_long_read(&allocated_pages)); - - return num_freed; -} - -/* Return the number of pages available or SHRINK_EMPTY if we have none */ -static unsigned long ttm_pool_shrinker_count(struct shrinker *shrink, - struct shrink_control *sc) -{ - unsigned long num_pages = atomic_long_read(&allocated_pages); - - return num_pages ? num_pages : SHRINK_EMPTY; -} - #ifdef CONFIG_DEBUG_FS -/* Count the number of pages available in a pool_type */ -static unsigned int ttm_pool_type_count(struct ttm_pool_type *pt) -{ - unsigned int count = 0; - struct page *p; - - spin_lock(&pt->lock); - /* Only used for debugfs, the overhead doesn't matter */ - list_for_each_entry(p, &pt->pages, lru) - ++count; - spin_unlock(&pt->lock); - - return count; -} - /* Print a nice header for the order */ static void ttm_pool_debugfs_header(struct seq_file *m) { @@ -588,7 +484,8 @@ static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt, unsigned int i;
for (i = 0; i < MAX_ORDER; ++i) - seq_printf(m, " %8u", ttm_pool_type_count(&pt[i])); + seq_printf(m, " %8lu", + drm_page_pool_get_size(&pt[i].subpool)); seq_puts(m, "\n"); }
@@ -596,7 +493,10 @@ static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt, static void ttm_pool_debugfs_footer(struct seq_file *m) { seq_printf(m, "\ntotal\t: %8lu of %8lu\n", - atomic_long_read(&allocated_pages), page_pool_size); + atomic_long_read(&allocated_pages), + drm_page_pool_get_max()); + seq_printf(m, "(%8lu in non-ttm pools)\n", drm_page_pool_get_total() - + atomic_long_read(&allocated_pages)); }
/* Dump the information for the global pools */ @@ -604,7 +504,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file *m, void *data) { ttm_pool_debugfs_header(m);
- mutex_lock(&shrinker_lock); + dma_page_pool_lock_shrinker(); seq_puts(m, "wc\t:"); ttm_pool_debugfs_orders(global_write_combined, m); seq_puts(m, "uc\t:"); @@ -613,7 +513,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file *m, void *data) ttm_pool_debugfs_orders(global_dma32_write_combined, m); seq_puts(m, "uc 32\t:"); ttm_pool_debugfs_orders(global_dma32_uncached, m); - mutex_unlock(&shrinker_lock); + dma_page_pool_unlock_shrinker();
ttm_pool_debugfs_footer(m);
@@ -640,7 +540,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)
ttm_pool_debugfs_header(m);
- mutex_lock(&shrinker_lock); + dma_page_pool_lock_shrinker(); for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) { seq_puts(m, "DMA "); switch (i) { @@ -656,7 +556,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m) } ttm_pool_debugfs_orders(pool->caching[i].orders, m); } - mutex_unlock(&shrinker_lock); + dma_page_pool_unlock_shrinker();
ttm_pool_debugfs_footer(m); return 0; @@ -666,13 +566,10 @@ EXPORT_SYMBOL(ttm_pool_debugfs); /* Test the shrinker functions and dump the result */ static int ttm_pool_debugfs_shrink_show(struct seq_file *m, void *data) { - struct shrink_control sc = { .gfp_mask = GFP_NOFS }; - fs_reclaim_acquire(GFP_KERNEL); - seq_printf(m, "%lu/%lu\n", ttm_pool_shrinker_count(&mm_shrinker, &sc), - ttm_pool_shrinker_scan(&mm_shrinker, &sc)); + seq_printf(m, "%lu/%lu\n", drm_page_pool_get_total(), + (unsigned long)drm_page_pool_shrink()); fs_reclaim_release(GFP_KERNEL); - return 0; } DEFINE_SHOW_ATTRIBUTE(ttm_pool_debugfs_shrink); @@ -693,8 +590,7 @@ int ttm_pool_mgr_init(unsigned long num_pages) if (!page_pool_size) page_pool_size = num_pages;
- mutex_init(&shrinker_lock); - INIT_LIST_HEAD(&shrinker_list); + drm_page_pool_set_max(page_pool_size);
for (i = 0; i < MAX_ORDER; ++i) { ttm_pool_type_init(&global_write_combined[i], NULL, @@ -713,11 +609,7 @@ int ttm_pool_mgr_init(unsigned long num_pages) debugfs_create_file("page_pool_shrink", 0400, ttm_debugfs_root, NULL, &ttm_pool_debugfs_shrink_fops); #endif - - mm_shrinker.count_objects = ttm_pool_shrinker_count; - mm_shrinker.scan_objects = ttm_pool_shrinker_scan; - mm_shrinker.seeks = 1; - return register_shrinker(&mm_shrinker); + return 0; }
/** @@ -736,7 +628,4 @@ void ttm_pool_mgr_fini(void) ttm_pool_type_fini(&global_dma32_write_combined[i]); ttm_pool_type_fini(&global_dma32_uncached[i]); } - - unregister_shrinker(&mm_shrinker); - WARN_ON(!list_empty(&shrinker_list)); } diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h index 4321728bdd11..c854a81491da 100644 --- a/include/drm/ttm/ttm_pool.h +++ b/include/drm/ttm/ttm_pool.h @@ -30,6 +30,7 @@ #include <linux/llist.h> #include <linux/spinlock.h> #include <drm/ttm/ttm_caching.h> +#include <drm/page_pool.h>
struct device; struct ttm_tt; @@ -39,22 +40,15 @@ struct ttm_operation_ctx; /** * ttm_pool_type - Pool for a certain memory type * - * @pool: the pool we belong to, might be NULL for the global ones - * @order: the allocation order our pages have + * @pool: the ttm pool we belong to, might be NULL for the global ones * @caching: the caching type our pages have - * @shrinker_list: our place on the global shrinker list - * @lock: protection of the page list - * @pages: the list of pages in the pool + * @subpool: the dma_page_pool that we use to manage the pages */ struct ttm_pool_type { struct ttm_pool *pool; - unsigned int order; enum ttm_caching caching;
- struct list_head shrinker_list; - - spinlock_t lock; - struct list_head pages; + struct drm_page_pool subpool; };
/**
Utilize the drm pagepool code to speed up allocation performance.
This is similar to the ION pagepool usage, but tries to utilize generic code instead of a custom implementation.
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org --- v2: * Fix build issue caused by selecting PAGE_POOL w/o NET as Reported-by: kernel test robot lkp@intel.com v3: * Simplify the page zeroing logic a bit by using kmap_atomic instead of vmap as suggested by Daniel Mentz v5: * Shift away from networking page pool completely to dmabuf page pool implementation v6: * Switch again to using the drm_page_pool code shared w/ ttm_pool v7: * Slight rework for drm_page_pool changes v8: * Rework to use the rewritten drm_page_pool logic * Drop explicit buffer zeroing, as the drm_page_pool handles that v9: * Fix compiler warning Reported-by: kernel test robot lkp@intel.com --- drivers/dma-buf/heaps/Kconfig | 1 + drivers/dma-buf/heaps/system_heap.c | 26 +++++++++++++++++++++++--- 2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig index a5eef06c4226..f19bf1f82bc2 100644 --- a/drivers/dma-buf/heaps/Kconfig +++ b/drivers/dma-buf/heaps/Kconfig @@ -1,6 +1,7 @@ config DMABUF_HEAPS_SYSTEM bool "DMA-BUF System Heap" depends on DMABUF_HEAPS + select DRM_PAGE_POOL help Choose this option to enable the system dmabuf heap. The system heap is backed by pages from the buddy allocator. If in doubt, say Y. diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index f57a39ddd063..85ceca2ed61d 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -21,6 +21,8 @@ #include <linux/slab.h> #include <linux/vmalloc.h>
+#include <drm/page_pool.h> + static struct dma_heap *sys_heap;
struct system_heap_buffer { @@ -54,6 +56,7 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP}; */ static const unsigned int orders[] = {8, 4, 0}; #define NUM_ORDERS ARRAY_SIZE(orders) +struct drm_page_pool pools[NUM_ORDERS];
static struct sg_table *dup_sg_table(struct sg_table *table) { @@ -282,18 +285,27 @@ static void system_heap_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map) dma_buf_map_clear(map); }
+static void system_heap_free_pages(struct drm_page_pool *pool, struct page *p) +{ + __free_pages(p, pool->order); +} + static void system_heap_dma_buf_release(struct dma_buf *dmabuf) { struct system_heap_buffer *buffer = dmabuf->priv; struct sg_table *table; struct scatterlist *sg; - int i; + int i, j;
table = &buffer->sg_table; for_each_sg(table->sgl, sg, table->nents, i) { struct page *page = sg_page(sg);
- __free_pages(page, compound_order(page)); + for (j = 0; j < NUM_ORDERS; j++) { + if (compound_order(page) == orders[j]) + break; + } + drm_page_pool_add(&pools[j], page); } sg_free_table(table); kfree(buffer); @@ -324,7 +336,9 @@ static struct page *alloc_largest_available(unsigned long size, if (max_order < orders[i]) continue;
- page = alloc_pages(order_flags[i], orders[i]); + page = drm_page_pool_remove(&pools[i]); + if (!page) + page = alloc_pages(order_flags[i], orders[i]); if (!page) continue; return page; @@ -425,6 +439,12 @@ static const struct dma_heap_ops system_heap_ops = { static int system_heap_create(void) { struct dma_heap_export_info exp_info; + int i; + + for (i = 0; i < NUM_ORDERS; i++) { + drm_page_pool_init(&pools[i], orders[i], + system_heap_free_pages); + }
exp_info.name = "system"; exp_info.ops = &system_heap_ops;
This patch provides infrastructure for deferring buffer frees.
This is a feature ION provided which when used with some form of a page pool, provides a nice performance boost in an allocation microbenchmark. The reason it helps is it allows the page-zeroing to be done out of the normal allocation/free path, and pushed off to a kthread.
As not all heaps will find this useful, its implemented as a optional helper library that heaps can utilize.
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org --- v2: * Fix sleep in atomic issue from using a mutex, by switching to a spinlock as Reported-by: kernel test robot oliver.sang@intel.com * Cleanup API to use a reason enum for clarity and add some documentation comments as suggested by Suren Baghdasaryan. v3: * Minor tweaks so it can be built as a module * A few small fixups suggested by Daniel Mentz v4: * Tweak from Daniel Mentz to make sure the shrinker count/freed values are tracked in pages not bytes v5: * Fix up page count tracking as suggested by Suren Baghdasaryan v7: * Rework accounting to use nr_pages rather then size, as suggested by Suren Baghdasaryan --- drivers/dma-buf/heaps/Kconfig | 3 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++++++++++++++++++ drivers/dma-buf/heaps/deferred-free-helper.h | 55 ++++++++ 4 files changed, 197 insertions(+) create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig index f19bf1f82bc2..7e28934e0def 100644 --- a/drivers/dma-buf/heaps/Kconfig +++ b/drivers/dma-buf/heaps/Kconfig @@ -1,3 +1,6 @@ +config DMABUF_HEAPS_DEFERRED_FREE + tristate + config DMABUF_HEAPS_SYSTEM bool "DMA-BUF System Heap" depends on DMABUF_HEAPS diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile index 974467791032..4e7839875615 100644 --- a/drivers/dma-buf/heaps/Makefile +++ b/drivers/dma-buf/heaps/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_DMABUF_HEAPS_DEFERRED_FREE) += deferred-free-helper.o obj-$(CONFIG_DMABUF_HEAPS_SYSTEM) += system_heap.o obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o diff --git a/drivers/dma-buf/heaps/deferred-free-helper.c b/drivers/dma-buf/heaps/deferred-free-helper.c new file mode 100644 index 000000000000..e19c8b68dfeb --- /dev/null +++ b/drivers/dma-buf/heaps/deferred-free-helper.c @@ -0,0 +1,138 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Deferred dmabuf freeing helper + * + * Copyright (C) 2020 Linaro, Ltd. + * + * Based on the ION page pool code + * Copyright (C) 2011 Google, Inc. + */ + +#include <linux/freezer.h> +#include <linux/list.h> +#include <linux/slab.h> +#include <linux/swap.h> +#include <linux/sched/signal.h> + +#include "deferred-free-helper.h" + +static LIST_HEAD(free_list); +static size_t list_nr_pages; +wait_queue_head_t freelist_waitqueue; +struct task_struct *freelist_task; +static DEFINE_SPINLOCK(free_list_lock); + +void deferred_free(struct deferred_freelist_item *item, + void (*free)(struct deferred_freelist_item*, + enum df_reason), + size_t nr_pages) +{ + unsigned long flags; + + INIT_LIST_HEAD(&item->list); + item->nr_pages = nr_pages; + item->free = free; + + spin_lock_irqsave(&free_list_lock, flags); + list_add(&item->list, &free_list); + list_nr_pages += nr_pages; + spin_unlock_irqrestore(&free_list_lock, flags); + wake_up(&freelist_waitqueue); +} +EXPORT_SYMBOL_GPL(deferred_free); + +static size_t free_one_item(enum df_reason reason) +{ + unsigned long flags; + size_t nr_pages; + struct deferred_freelist_item *item; + + spin_lock_irqsave(&free_list_lock, flags); + if (list_empty(&free_list)) { + spin_unlock_irqrestore(&free_list_lock, flags); + return 0; + } + item = list_first_entry(&free_list, struct deferred_freelist_item, list); + list_del(&item->list); + nr_pages = item->nr_pages; + list_nr_pages -= nr_pages; + spin_unlock_irqrestore(&free_list_lock, flags); + + item->free(item, reason); + return nr_pages; +} + +static unsigned long get_freelist_nr_pages(void) +{ + unsigned long nr_pages; + unsigned long flags; + + spin_lock_irqsave(&free_list_lock, flags); + nr_pages = list_nr_pages; + spin_unlock_irqrestore(&free_list_lock, flags); + return nr_pages; +} + +static unsigned long freelist_shrink_count(struct shrinker *shrinker, + struct shrink_control *sc) +{ + return get_freelist_nr_pages(); +} + +static unsigned long freelist_shrink_scan(struct shrinker *shrinker, + struct shrink_control *sc) +{ + unsigned long total_freed = 0; + + if (sc->nr_to_scan == 0) + return 0; + + while (total_freed < sc->nr_to_scan) { + size_t pages_freed = free_one_item(DF_UNDER_PRESSURE); + + if (!pages_freed) + break; + + total_freed += pages_freed; + } + + return total_freed; +} + +static struct shrinker freelist_shrinker = { + .count_objects = freelist_shrink_count, + .scan_objects = freelist_shrink_scan, + .seeks = DEFAULT_SEEKS, + .batch = 0, +}; + +static int deferred_free_thread(void *data) +{ + while (true) { + wait_event_freezable(freelist_waitqueue, + get_freelist_nr_pages() > 0); + + free_one_item(DF_NORMAL); + } + + return 0; +} + +static int deferred_freelist_init(void) +{ + list_nr_pages = 0; + + init_waitqueue_head(&freelist_waitqueue); + freelist_task = kthread_run(deferred_free_thread, NULL, + "%s", "dmabuf-deferred-free-worker"); + if (IS_ERR(freelist_task)) { + pr_err("Creating thread for deferred free failed\n"); + return -1; + } + sched_set_normal(freelist_task, 19); + + return register_shrinker(&freelist_shrinker); +} +module_init(deferred_freelist_init); +MODULE_LICENSE("GPL v2"); + diff --git a/drivers/dma-buf/heaps/deferred-free-helper.h b/drivers/dma-buf/heaps/deferred-free-helper.h new file mode 100644 index 000000000000..11940328ce3f --- /dev/null +++ b/drivers/dma-buf/heaps/deferred-free-helper.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef DEFERRED_FREE_HELPER_H +#define DEFERRED_FREE_HELPER_H + +/** + * df_reason - enum for reason why item was freed + * + * This provides a reason for why the free function was called + * on the item. This is useful when deferred_free is used in + * combination with a pagepool, so under pressure the page can + * be immediately freed. + * + * DF_NORMAL: Normal deferred free + * + * DF_UNDER_PRESSURE: Free was called because the system + * is under memory pressure. Usually + * from a shrinker. Avoid allocating + * memory in the free call, as it may + * fail. + */ +enum df_reason { + DF_NORMAL, + DF_UNDER_PRESSURE, +}; + +/** + * deferred_freelist_item - item structure for deferred freelist + * + * This is to be added to the structure for whatever you want to + * defer freeing on. + * + * @nr_pages: number of pages used by item to be freed + * @free: function pointer to be called when freeing the item + * @list: list entry for the deferred list + */ +struct deferred_freelist_item { + size_t nr_pages; + void (*free)(struct deferred_freelist_item *i, + enum df_reason reason); + struct list_head list; +}; + +/** + * deferred_free - call to add item to the deferred free list + * + * @item: Pointer to deferred_freelist_item field of a structure + * @free: Function pointer to the free call + * @nr_pages: number of pages to be freed + */ +void deferred_free(struct deferred_freelist_item *item, + void (*free)(struct deferred_freelist_item *i, + enum df_reason reason), + size_t nr_pages); +#endif
Utilize the deferred free helper library in the system heap.
This provides a nice performance bump and puts the system heap performance on par with ION.
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz john.stultz@linaro.org --- v2: * Rework deferred-free api to use reason enum as suggested by Suren Baghdasaryan * Rework for deferred-free api change to use nr_pages rather than size as suggsted by Suren Baghdasaryan v8: * Reworked to drop buffer zeroing logic, as the drm_page_pool now handles that. --- drivers/dma-buf/heaps/Kconfig | 1 + drivers/dma-buf/heaps/system_heap.c | 28 ++++++++++++++++++++++------ 2 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig index 7e28934e0def..10632ccfb4a5 100644 --- a/drivers/dma-buf/heaps/Kconfig +++ b/drivers/dma-buf/heaps/Kconfig @@ -5,6 +5,7 @@ config DMABUF_HEAPS_SYSTEM bool "DMA-BUF System Heap" depends on DMABUF_HEAPS select DRM_PAGE_POOL + select DMABUF_HEAPS_DEFERRED_FREE help Choose this option to enable the system dmabuf heap. The system heap is backed by pages from the buddy allocator. If in doubt, say Y. diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index 85ceca2ed61d..8a0170b0427e 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -22,6 +22,7 @@ #include <linux/vmalloc.h>
#include <drm/page_pool.h> +#include "deferred-free-helper.h"
static struct dma_heap *sys_heap;
@@ -33,6 +34,7 @@ struct system_heap_buffer { struct sg_table sg_table; int vmap_cnt; void *vaddr; + struct deferred_freelist_item deferred_free; };
struct dma_heap_attachment { @@ -290,27 +292,41 @@ static void system_heap_free_pages(struct drm_page_pool *pool, struct page *p) __free_pages(p, pool->order); }
-static void system_heap_dma_buf_release(struct dma_buf *dmabuf) +static void system_heap_buf_free(struct deferred_freelist_item *item, + enum df_reason reason) { - struct system_heap_buffer *buffer = dmabuf->priv; + struct system_heap_buffer *buffer; struct sg_table *table; struct scatterlist *sg; int i, j;
+ buffer = container_of(item, struct system_heap_buffer, deferred_free); table = &buffer->sg_table; for_each_sg(table->sgl, sg, table->nents, i) { struct page *page = sg_page(sg);
- for (j = 0; j < NUM_ORDERS; j++) { - if (compound_order(page) == orders[j]) - break; + if (reason == DF_UNDER_PRESSURE) { + __free_pages(page, compound_order(page)); + } else { + for (j = 0; j < NUM_ORDERS; j++) { + if (compound_order(page) == orders[j]) + break; + } + drm_page_pool_add(&pools[j], page); } - drm_page_pool_add(&pools[j], page); } sg_free_table(table); kfree(buffer); }
+static void system_heap_dma_buf_release(struct dma_buf *dmabuf) +{ + struct system_heap_buffer *buffer = dmabuf->priv; + int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE; + + deferred_free(&buffer->deferred_free, system_heap_buf_free, npages); +} + static const struct dma_buf_ops system_heap_buf_ops = { .attach = system_heap_attach, .detach = system_heap_detach,
Am 30.06.21 um 03:34 schrieb John Stultz:
After an unfortunately long pause (covid work-schedule burnout), I wanted to revive and resubmit this series.
As before, the point of this series is trying to add both a page pool as well as deferred-freeingto the DMA-BUF system heap to improve allocation performance (so that it can match or beat the old ION system heaps performance).
The combination of the page pool along with deferred freeing allows us to offload page-zeroing out of the allocation hot path. This was done originally with ION and this patch series allows the DMA-BUF system heap to match ION's system heap allocation performance in a simple microbenchmark [1] (ION re-added to the kernel for comparision, running on an x86 vm image):
./dmabuf-heap-bench -i 0 1 system Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
dmabuf heap: alloc 4096 bytes 5000 times in 79314244 ns 15862 ns/call ion heap: alloc 4096 bytes 5000 times in 107390769 ns 21478 ns/call dmabuf heap: alloc 1048576 bytes 5000 times in 259083419 ns 51816 ns/call ion heap: alloc 1048576 bytes 5000 times in 340497344 ns 68099 ns/call dmabuf heap: alloc 8388608 bytes 5000 times in 2603105563 ns 520621 ns/call ion heap: alloc 8388608 bytes 5000 times in 3613592860 ns 722718 ns/call dmabuf heap: alloc 33554432 bytes 5000 times in 12212492979 ns 2442498 ns/call ion heap: alloc 33554432 bytes 5000 times in 14584157792 ns 2916831 ns/call
Daniel didn't like earlier attempts to re-use the network page-pool code to achieve this, and suggested the ttm_pool be used instead, so this series pulls the page pool functionality out of the ttm_pool logic and creates a generic page pool that can be shared.
New in v9:
- Tried to address Christian König's feedback on the page pool changes (Kerneldoc, static functions, locking issues, duplicative order tracking)
- Fix up Kconfig dependency issue as Reported-by: kernel test robot lkp@intel.com
- Fix compiler warning Reported-by: kernel test robot lkp@intel.com
I know Christian had some less specific feedback on the deferred free work that I'd like to revisit, but I wanted to restart the discussion with this new series, rather then trying to dregdge up and reply to a ~4mo old thread.
I was already wondering where this was left :)
The kernel test robot pointed out quite a number of bugs. I suggest to fix those first and then take a look at my comments on patch #1.
Regards, Christian.
Input would be greatly appreciated. Testing as well, as I don't have any development hardware that utilizes the ttm pool.
Thanks -john
[1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fandroid.go...
Cc: Daniel Vetter daniel@ffwll.ch Cc: Christian Koenig christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Liam Mark lmark@codeaurora.org Cc: Chris Goldsworthy cgoldswo@codeaurora.org Cc: Laura Abbott labbott@kernel.org Cc: Brian Starkey Brian.Starkey@arm.com Cc: Hridya Valsaraju hridya@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Daniel Mentz danielmentz@google.com Cc: Ørjan Eide orjan.eide@arm.com Cc: Robin Murphy robin.murphy@arm.com Cc: Ezequiel Garcia ezequiel@collabora.com Cc: Simon Ser contact@emersion.fr Cc: James Jones jajones@nvidia.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org
John Stultz (5): drm: Add a sharable drm page-pool implementation drm: ttm_pool: Rework ttm_pool to use drm_page_pool dma-buf: system_heap: Add drm pagepool support to system heap dma-buf: heaps: Add deferred-free-helper library code dma-buf: system_heap: Add deferred freeing to the system heap
drivers/dma-buf/heaps/Kconfig | 5 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++++++++ drivers/dma-buf/heaps/deferred-free-helper.h | 55 ++++ drivers/dma-buf/heaps/system_heap.c | 46 ++- drivers/gpu/drm/Kconfig | 4 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/page_pool.c | 297 +++++++++++++++++++ drivers/gpu/drm/ttm/ttm_pool.c | 167 ++--------- include/drm/page_pool.h | 68 +++++ include/drm/ttm/ttm_pool.h | 14 +- 11 files changed, 643 insertions(+), 154 deletions(-) create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h create mode 100644 drivers/gpu/drm/page_pool.c create mode 100644 include/drm/page_pool.h
dri-devel@lists.freedesktop.org