From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Usage of Transparent Hugepages was disabled in 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it appears majority of performance regressions reported with an enabled IOMMU can be almost eliminated by turning them on, lets do that by adding a couple of Kconfig options.
To err on the side of safety we keep the current default in cases where IOMMU is not active, and only when it is default to the "huge=within_size" mode. Although there probably would be wins to enable them throughout, more extensive testing across benchmarks and platforms would need to be done.
With the patch and IOMMU enabled my local testing on a small Skylake part shows OglVSTangent regression being reduced from ~14% to ~2%.
References: b901bb89324a ("drm/i915/gemfs: enable THP") References: 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A") References: https://gitlab.freedesktop.org/drm/intel/-/issues/430 Co-developed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Eero Tamminen eero.t.tamminen@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com --- drivers/gpu/drm/i915/Kconfig.profile | 46 +++++++++++++++++++++++++++ drivers/gpu/drm/i915/gem/i915_gemfs.c | 11 +++++-- 2 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 39328567c200..c64c3d39a0f9 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -119,3 +119,49 @@ config DRM_I915_TIMESLICE_DURATION /sys/class/drm/card?/engine/*/timeslice_duration_ms
May be 0 to disable timeslicing. + +choice + prompt "Transparent Hugepage Support (native)" + default DRM_I915_THP_NATIVE_NEVER + help + Select the preferred method for allocating from Transparent Hugepages + when IOMMU is not enabled. + + config DRM_I915_THP_NATIVE_NEVER + bool "Never" + + config DRM_I915_THP_NATIVE_WITHIN + bool "Within" + + config DRM_I915_THP_NATIVE_ALWAYS + bool "Always" +endchoice + +config DRM_I915_THP_NATIVE + string + default "always" if DRM_I915_THP_NATIVE_ALWAYS + default "within_size" if DRM_I915_THP_NATIVE_WITHIN + default "never" if DRM_I915_THP_NATIVE_NEVER + +choice + prompt "Transparent Hugepage Support (IOMMU)" + default DRM_I915_THP_IOMMU_WITHIN + help + Select the preferred method for allocating from Transparent Hugepages + with IOMMU active. + + config DRM_I915_THP_IOMMU_NEVER + bool "Never" + + config DRM_I915_THP_IOMMU_WITHIN + bool "Within" + + config DRM_I915_THP_IOMMU_ALWAYS + bool "Always" +endchoice + +config DRM_I915_THP_IOMMU + string + default "always" if DRM_I915_THP_IOMMU_ALWAYS + default "within_size" if DRM_I915_THP_IOMMU_WITHIN + default "never" if DRM_I915_THP_IOMMU_NEVER diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c index 5e6e8c91ab38..b71d2b2d2ada 100644 --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c @@ -13,8 +13,11 @@
int i915_gemfs_init(struct drm_i915_private *i915) { + char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; + char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; struct file_system_type *type; struct vfsmount *gemfs; + char *opts;
type = get_fs_type("tmpfs"); if (!type) @@ -26,15 +29,19 @@ int i915_gemfs_init(struct drm_i915_private *i915) * * One example, although it is probably better with a per-file * control, is selecting huge page allocations ("huge=within_size"). - * Currently unused due to bandwidth issues (slow reads) on Broadwell+. + * However, we only do so to offset the overhead of iommu lookups + * due to bandwidth issues (slow reads) on Broadwell+. */ + opts = intel_vtd_active() ? thp_iommu : thp_native;
- gemfs = kern_mount(type); + gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); if (IS_ERR(gemfs)) return PTR_ERR(gemfs);
i915->mm.gemfs = gemfs;
+ drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts); + return 0; }
On Wed, Jul 28, 2021 at 03:12:49PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
Usage of Transparent Hugepages was disabled in 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it appears majority of performance regressions reported with an enabled IOMMU can be almost eliminated by turning them on, lets do that by adding a couple of Kconfig options.
To err on the side of safety we keep the current default in cases where IOMMU is not active, and only when it is default to the "huge=within_size" mode. Although there probably would be wins to enable them throughout, more extensive testing across benchmarks and platforms would need to be done.
With the patch and IOMMU enabled my local testing on a small Skylake part shows OglVSTangent regression being reduced from ~14% to ~2%.
References: b901bb89324a ("drm/i915/gemfs: enable THP") References: 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A") References: https://gitlab.freedesktop.org/drm/intel/-/issues/430 Co-developed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Eero Tamminen eero.t.tamminen@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
drivers/gpu/drm/i915/Kconfig.profile | 46 +++++++++++++++++++++++++++ drivers/gpu/drm/i915/gem/i915_gemfs.c | 11 +++++-- 2 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 39328567c200..c64c3d39a0f9 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -119,3 +119,49 @@ config DRM_I915_TIMESLICE_DURATION /sys/class/drm/card?/engine/*/timeslice_duration_ms
May be 0 to disable timeslicing.
+choice
- prompt "Transparent Hugepage Support (native)"
- default DRM_I915_THP_NATIVE_NEVER
- help
Select the preferred method for allocating from Transparent Hugepages
when IOMMU is not enabled.
- config DRM_I915_THP_NATIVE_NEVER
- bool "Never"
- config DRM_I915_THP_NATIVE_WITHIN
- bool "Within"
- config DRM_I915_THP_NATIVE_ALWAYS
- bool "Always"
+endchoice
+config DRM_I915_THP_NATIVE
- string
- default "always" if DRM_I915_THP_NATIVE_ALWAYS
- default "within_size" if DRM_I915_THP_NATIVE_WITHIN
- default "never" if DRM_I915_THP_NATIVE_NEVER
+choice
- prompt "Transparent Hugepage Support (IOMMU)"
- default DRM_I915_THP_IOMMU_WITHIN
- help
Select the preferred method for allocating from Transparent Hugepages
with IOMMU active.
- config DRM_I915_THP_IOMMU_NEVER
- bool "Never"
- config DRM_I915_THP_IOMMU_WITHIN
- bool "Within"
- config DRM_I915_THP_IOMMU_ALWAYS
- bool "Always"
+endchoice
+config DRM_I915_THP_IOMMU
- string
- default "always" if DRM_I915_THP_IOMMU_ALWAYS
- default "within_size" if DRM_I915_THP_IOMMU_WITHIN
- default "never" if DRM_I915_THP_IOMMU_NEVER
diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c index 5e6e8c91ab38..b71d2b2d2ada 100644 --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c @@ -13,8 +13,11 @@
int i915_gemfs_init(struct drm_i915_private *i915) {
char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE;
char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; struct file_system_type *type; struct vfsmount *gemfs;
char *opts;
type = get_fs_type("tmpfs"); if (!type)
@@ -26,15 +29,19 @@ int i915_gemfs_init(struct drm_i915_private *i915) * * One example, although it is probably better with a per-file * control, is selecting huge page allocations ("huge=within_size").
* Currently unused due to bandwidth issues (slow reads) on Broadwell+.
* However, we only do so to offset the overhead of iommu lookups
*/* due to bandwidth issues (slow reads) on Broadwell+.
- opts = intel_vtd_active() ? thp_iommu : thp_native;
at first sight I got confused on why we where having to configs, then very few drivers using the mount option there, so it took me a few minutes to realize it... but I know understood the goal...
Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com
- gemfs = kern_mount(type);
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); if (IS_ERR(gemfs)) return PTR_ERR(gemfs);
i915->mm.gemfs = gemfs;
drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts);
return 0;
}
-- 2.30.2
Hi Tvrtko,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on v5.14-rc3 next-20210730] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Tvrtko-Ursulin/drm-i915-Use-Transpa... base: git://anongit.freedesktop.org/drm-intel for-linux-next config: i386-randconfig-a015-20210728 (attached as .config) compiler: gcc-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/26ad4b5023a428526c23ce544b593eced191... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Tvrtko-Ursulin/drm-i915-Use-Transparent-Hugepages-when-IOMMU-is-enabled/20210728-221515 git checkout 26ad4b5023a428526c23ce544b593eced1916b49 # save the attached .config to linux build tree mkdir build_dir make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
drivers/gpu/drm/i915/gem/i915_gemfs.c: In function 'i915_gemfs_init':
drivers/gpu/drm/i915/gem/i915_gemfs.c:16:30: error: expected ',' or ';' before 'CONFIG_DRM_I915_THP_NATIVE'
16 | char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; | ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/gem/i915_gemfs.c:17:29: error: expected ',' or ';' before 'CONFIG_DRM_I915_THP_IOMMU'
17 | char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; | ^~~~~~~~~~~~~~~~~~~~~~~~~
vim +16 drivers/gpu/drm/i915/gem/i915_gemfs.c
13 14 int i915_gemfs_init(struct drm_i915_private *i915) 15 {
16 char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; 17 char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU;
18 struct file_system_type *type; 19 struct vfsmount *gemfs; 20 char *opts; 21 22 type = get_fs_type("tmpfs"); 23 if (!type) 24 return -ENODEV; 25 26 /* 27 * By creating our own shmemfs mountpoint, we can pass in 28 * mount flags that better match our usecase. 29 * 30 * One example, although it is probably better with a per-file 31 * control, is selecting huge page allocations ("huge=within_size"). 32 * However, we only do so to offset the overhead of iommu lookups 33 * due to bandwidth issues (slow reads) on Broadwell+. 34 */ 35 opts = intel_vtd_active() ? thp_iommu : thp_native; 36 37 gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); 38 if (IS_ERR(gemfs)) 39 return PTR_ERR(gemfs); 40 41 i915->mm.gemfs = gemfs; 42 43 drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts); 44 45 return 0; 46 } 47
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
dri-devel@lists.freedesktop.org