Hi,
As suggested by Christian, we should move non-x86 definitions into one common header, and it will make the codes readable. They are based on the improvement fix of Bas (already rebase Bas's patch to drm-next).
Changes from V1 -> V2: - add ttm_ prefix at this header. - use set_pages_wb instead of set_memory_wb.
Thanks, Ray
Bas Nieuwenhuizen (1): drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)
Huang Rui (3): drm/ttm: add ttm_set_memory header (v2) drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dma drm/ttm: clean up non-x86 definitions on ttm_page_alloc
drivers/gpu/drm/ttm/ttm_page_alloc.c | 62 ++------------- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 57 ++------------ include/drm/ttm/ttm_set_memory.h | 128 +++++++++++++++++++++++++++++++ 3 files changed, 140 insertions(+), 107 deletions(-) create mode 100644 include/drm/ttm/ttm_set_memory.h
This patch moves all non-x86 abstraction to the ttm_set_memory header. It is to make function calling more clearly.
(v2): add ttm_ prefix.
Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Huang Rui ray.huang@amd.com --- include/drm/ttm/ttm_set_memory.h | 128 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 128 insertions(+) create mode 100644 include/drm/ttm/ttm_set_memory.h
diff --git a/include/drm/ttm/ttm_set_memory.h b/include/drm/ttm/ttm_set_memory.h new file mode 100644 index 0000000..a70723c --- /dev/null +++ b/include/drm/ttm/ttm_set_memory.h @@ -0,0 +1,128 @@ +/************************************************************************** + * + * Copyright (c) 2018 Advanced Micro Devices, Inc. + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + * USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ +/* + * Authors: Huang Rui ray.huang@amd.com + */ + +#ifndef TTM_SET_MEMORY +#define TTM_SET_MEMORY + +#include <linux/mm.h> + +#ifdef CONFIG_X86 + +#include <asm/set_memory.h> + +static inline int ttm_set_pages_array_wb(struct page **pages, int addrinarray) +{ + return set_pages_array_wb(pages, addrinarray); +} + +static inline int ttm_set_pages_array_wc(struct page **pages, int addrinarray) +{ + return set_pages_array_wc(pages, addrinarray); +} + +static inline int ttm_set_pages_array_uc(struct page **pages, int addrinarray) +{ + return set_pages_array_uc(pages, addrinarray); +} + +static inline int ttm_set_pages_wb(struct page *page, int numpages) +{ + return set_pages_wb(page, numpages); +} + +#else /* for CONFIG_X86 */ + +#if IS_ENABLED(CONFIG_AGP) + +#include <asm/agp.h> + +static inline int ttm_set_pages_array_wb(struct page **pages, int addrinarray) +{ + int i; + + for (i = 0; i < addrinarray; i++) + unmap_page_from_agp(pages[i]); + return 0; +} + +static inline int ttm_set_pages_array_wc(struct page **pages, int addrinarray) +{ + int i; + + for (i = 0; i < addrinarray; i++) + map_page_into_agp(pages[i]); + return 0; +} + +static inline int ttm_set_pages_array_uc(struct page **pages, int addrinarray) +{ + int i; + + for (i = 0; i < addrinarray; i++) + map_page_into_agp(pages[i]); + return 0; +} + +static inline int ttm_set_pages_wb(struct page *page, int numpages) +{ + int i; + + for (i = 0; i < numpages; i++) + unmap_page_from_agp(page++); + return 0; +} + +#else /* for CONFIG_AGP */ + +static inline int ttm_set_pages_array_wb(struct page **pages, int addrinarray) +{ + return 0; +} + +static inline int ttm_set_pages_array_wc(struct page **pages, int addrinarray) +{ + return 0; +} + +static inline int ttm_set_pages_array_uc(struct page **pages, int addrinarray) +{ + return 0; +} + +static inline int ttm_set_pages_wb(struct page *page, int numpages) +{ + return 0; +} + +#endif /* for CONFIG_AGP */ + +#endif /* for CONFIG_X86 */ + +#endif
All non-x86 definitions are moved to ttm_set_memory header, so remove it from ttm_page_alloc_dma.c.
Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Huang Rui ray.huang@amd.com --- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 48 +++----------------------------- 1 file changed, 4 insertions(+), 44 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c index 3f14c1c..f31148a 100644 --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c @@ -50,12 +50,7 @@ #include <linux/kthread.h> #include <drm/ttm/ttm_bo_driver.h> #include <drm/ttm/ttm_page_alloc.h> -#if IS_ENABLED(CONFIG_AGP) -#include <asm/agp.h> -#endif -#ifdef CONFIG_X86 -#include <asm/set_memory.h> -#endif +#include <drm/ttm/ttm_set_memory.h>
#define NUM_PAGES_TO_ALLOC (PAGE_SIZE/sizeof(struct page *)) #define SMALL_ALLOCATION 4 @@ -268,41 +263,6 @@ static struct kobj_type ttm_pool_kobj_type = { .default_attrs = ttm_pool_attrs, };
-#ifndef CONFIG_X86 -static int set_pages_array_wb(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - unmap_page_from_agp(pages[i]); -#endif - return 0; -} - -static int set_pages_array_wc(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - map_page_into_agp(pages[i]); -#endif - return 0; -} - -static int set_pages_array_uc(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - map_page_into_agp(pages[i]); -#endif - return 0; -} -#endif /* for !CONFIG_X86 */ - static int ttm_set_pages_caching(struct dma_pool *pool, struct page **pages, unsigned cpages) { @@ -315,7 +275,7 @@ static int ttm_set_pages_caching(struct dma_pool *pool, pool->dev_name, cpages); } if (pool->type & IS_WC) { - r = set_pages_array_wc(pages, cpages); + r = ttm_set_pages_array_wc(pages, cpages); if (r) pr_err("%s: Failed to set %d pages to wc!\n", pool->dev_name, cpages); @@ -395,7 +355,7 @@ static void ttm_dma_page_put(struct dma_pool *pool, struct dma_page *d_page) if (!(pool->type & IS_CACHED)) { num_pages = pool->size / PAGE_SIZE; for (i = 0; i < num_pages; ++i, ++page) { - if (set_pages_array_wb(&page, 1)) { + if (ttm_set_pages_array_wb(&page, 1)) { pr_err("%s: Failed to set %d pages to wb!\n", pool->dev_name, 1); } @@ -420,7 +380,7 @@ static void ttm_dma_pages_put(struct dma_pool *pool, struct list_head *d_pages,
/* Don't set WB on WB page pool. */ if (npages && !(pool->type & IS_CACHED) && - set_pages_array_wb(pages, npages)) + ttm_set_pages_array_wb(pages, npages)) pr_err("%s: Failed to set %d pages to wb!\n", pool->dev_name, npages);
All non-x86 definitions are moved to ttm_set_memory header, so remove it from ttm_page_alloc.c.
Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Huang Rui ray.huang@amd.com --- drivers/gpu/drm/ttm/ttm_page_alloc.c | 62 +++--------------------------------- 1 file changed, 5 insertions(+), 57 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c index 6e2d130..f841acc 100644 --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c @@ -47,13 +47,7 @@
#include <drm/ttm/ttm_bo_driver.h> #include <drm/ttm/ttm_page_alloc.h> - -#if IS_ENABLED(CONFIG_AGP) -#include <asm/agp.h> -#endif -#ifdef CONFIG_X86 -#include <asm/set_memory.h> -#endif +#include <drm/ttm/ttm_set_memory.h>
#define NUM_PAGES_TO_ALLOC (PAGE_SIZE/sizeof(struct page *)) #define SMALL_ALLOCATION 16 @@ -222,52 +216,6 @@ static struct kobj_type ttm_pool_kobj_type = {
static struct ttm_pool_manager *_manager;
-#ifndef CONFIG_X86 -static int set_pages_wb(struct page *page, int numpages) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < numpages; i++) - unmap_page_from_agp(page++); -#endif - return 0; -} - -static int set_pages_array_wb(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - unmap_page_from_agp(pages[i]); -#endif - return 0; -} - -static int set_pages_array_wc(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - map_page_into_agp(pages[i]); -#endif - return 0; -} - -static int set_pages_array_uc(struct page **pages, int addrinarray) -{ -#if IS_ENABLED(CONFIG_AGP) - int i; - - for (i = 0; i < addrinarray; i++) - map_page_into_agp(pages[i]); -#endif - return 0; -} -#endif - /** * Select the right pool or requested caching state and ttm flags. */ static struct ttm_page_pool *ttm_get_pool(int flags, bool huge, @@ -302,13 +250,13 @@ static void ttm_pages_put(struct page *pages[], unsigned npages, unsigned int i, pages_nr = (1 << order);
if (order == 0) { - if (set_pages_array_wb(pages, npages)) + if (ttm_set_pages_array_wb(pages, npages)) pr_err("Failed to set %d pages to wb!\n", npages); }
for (i = 0; i < npages; ++i) { if (order > 0) { - if (set_pages_wb(pages[i], pages_nr)) + if (ttm_set_pages_wb(pages[i], pages_nr)) pr_err("Failed to set %d pages to wb!\n", pages_nr); } __free_pages(pages[i], order); @@ -498,12 +446,12 @@ static int ttm_set_pages_caching(struct page **pages, /* Set page caching */ switch (cstate) { case tt_uncached: - r = set_pages_array_uc(pages, cpages); + r = ttm_set_pages_array_uc(pages, cpages); if (r) pr_err("Failed to set %d pages to uc!\n", cpages); break; case tt_wc: - r = set_pages_array_wc(pages, cpages); + r = ttm_set_pages_array_wc(pages, cpages); if (r) pr_err("Failed to set %d pages to wc!\n", cpages); break;
From: Bas Nieuwenhuizen basni@chromium.org
Every set_pages_array_wb call resulted in cross-core interrupts and TLB flushes. Merge more of them for less overhead.
This reduces the time needed to free a 1.6 GiB GTT WC buffer as part of Vulkan CTS from ~2 sec to < 0.25 sec. (Allocation still takes more than 2 sec though)
(v2): use set_pages_wb instead of set_memory_wb.
Signed-off-by: Bas Nieuwenhuizen basni@chromium.org Signed-off-by: Huang Rui ray.huang@amd.com --- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c index f31148a..8304917 100644 --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c @@ -349,17 +349,14 @@ static void ttm_pool_update_free_locked(struct dma_pool *pool, static void ttm_dma_page_put(struct dma_pool *pool, struct dma_page *d_page) { struct page *page = d_page->p; - unsigned i, num_pages; + unsigned num_pages;
/* Don't set WB on WB page pool. */ if (!(pool->type & IS_CACHED)) { num_pages = pool->size / PAGE_SIZE; - for (i = 0; i < num_pages; ++i, ++page) { - if (ttm_set_pages_array_wb(&page, 1)) { - pr_err("%s: Failed to set %d pages to wb!\n", - pool->dev_name, 1); - } - } + if (ttm_set_pages_wb(page, num_pages)) + pr_err("%s: Failed to set %d pages to wb!\n", + pool->dev_name, num_pages); }
list_del(&d_page->page_list);
On Thu, Jul 26, 2018 at 1:52 PM, Huang Rui ray.huang@amd.com wrote:
Hi,
As suggested by Christian, we should move non-x86 definitions into one common header, and it will make the codes readable. They are based on the improvement fix of Bas (already rebase Bas's patch to drm-next).
Changes from V1 -> V2:
- add ttm_ prefix at this header.
- use set_pages_wb instead of set_memory_wb.
Thanks! For the series
Reviewed-by: Bas Nieuwenhuizen basni@chromium.org
Thanks, Ray
Bas Nieuwenhuizen (1): drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)
Huang Rui (3): drm/ttm: add ttm_set_memory header (v2) drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dma drm/ttm: clean up non-x86 definitions on ttm_page_alloc
drivers/gpu/drm/ttm/ttm_page_alloc.c | 62 ++------------- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 57 ++------------ include/drm/ttm/ttm_set_memory.h | 128 +++++++++++++++++++++++++++++++ 3 files changed, 140 insertions(+), 107 deletions(-) create mode 100644 include/drm/ttm/ttm_set_memory.h
-- 2.7.4
Am 26.07.2018 um 13:59 schrieb Bas Nieuwenhuizen:
On Thu, Jul 26, 2018 at 1:52 PM, Huang Rui ray.huang@amd.com wrote:
Hi,
As suggested by Christian, we should move non-x86 definitions into one common header, and it will make the codes readable. They are based on the improvement fix of Bas (already rebase Bas's patch to drm-next).
Changes from V1 -> V2:
- add ttm_ prefix at this header.
- use set_pages_wb instead of set_memory_wb.
Thanks! For the series
Agreed, really nice cleanup.
Reviewed-by: Bas Nieuwenhuizen basni@chromium.org
Reviewed-by: Christian König christian.koenig@amd.com as well.
Regards, Christian.
Thanks, Ray
Bas Nieuwenhuizen (1): drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)
Huang Rui (3): drm/ttm: add ttm_set_memory header (v2) drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dma drm/ttm: clean up non-x86 definitions on ttm_page_alloc
drivers/gpu/drm/ttm/ttm_page_alloc.c | 62 ++------------- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 57 ++------------ include/drm/ttm/ttm_set_memory.h | 128 +++++++++++++++++++++++++++++++ 3 files changed, 140 insertions(+), 107 deletions(-) create mode 100644 include/drm/ttm/ttm_set_memory.h
-- 2.7.4
On Thu, Jul 26, 2018 at 08:12:05PM +0800, Koenig, Christian wrote:
Am 26.07.2018 um 13:59 schrieb Bas Nieuwenhuizen:
On Thu, Jul 26, 2018 at 1:52 PM, Huang Rui ray.huang@amd.com wrote:
Hi,
As suggested by Christian, we should move non-x86 definitions into one common header, and it will make the codes readable. They are based on the improvement fix of Bas (already rebase Bas's patch to drm-next).
Changes from V1 -> V2:
- add ttm_ prefix at this header.
- use set_pages_wb instead of set_memory_wb.
Thanks! For the series
Agreed, really nice cleanup.
Reviewed-by: Bas Nieuwenhuizen basni@chromium.org
Reviewed-by: Christian König christian.koenig@amd.com as well.
Thank you, guys. :-) Already applied them.
Ray
Regards, Christian.
Thanks, Ray
Bas Nieuwenhuizen (1): drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)
Huang Rui (3): drm/ttm: add ttm_set_memory header (v2) drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dma drm/ttm: clean up non-x86 definitions on ttm_page_alloc
drivers/gpu/drm/ttm/ttm_page_alloc.c | 62 ++------------- drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 57 ++------------ include/drm/ttm/ttm_set_memory.h | 128 +++++++++++++++++++++++++++++++ 3 files changed, 140 insertions(+), 107 deletions(-) create mode 100644 include/drm/ttm/ttm_set_memory.h
-- 2.7.4
dri-devel@lists.freedesktop.org