From: Matthew Auld <matthew.auld@intel.com>
Since the object might still be active here, the shrink_all will simply ignore it, which blows up in the test, since the pages will still be there. Currently THP is disabled which should result in the test being skipped, but if we ever re-enable THP we might start seeing the failure. Fix this by forcing I915_SHRINK_ACTIVE.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index a094f3ce1a90..acc435f14ac9 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1572,12 +1572,15 @@ static int igt_shrink_thp(void *arg)
 		goto out_put;

 	/*
-	 * Now that the pages are *unpinned* shrink-all should invoke
+	 * Now that the pages are *unpinned* shrinking should invoke
 	 * shmem to truncate our pages.
 	 */
-	i915_gem_shrink_all(i915);
+	i915_gem_shrink(NULL, i915, -1UL, NULL,
+			I915_SHRINK_BOUND |
+			I915_SHRINK_UNBOUND |
+			I915_SHRINK_ACTIVE);
 	if (i915_gem_object_has_pages(obj)) {
-		pr_err("shrink-all didn't truncate the pages\n");
+		pr_err("shrinking didn't truncate the pages\n");
 		err = -EINVAL;
 		goto out_put;
 	}
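To illustrate why the extra flag matters, here is a minimal, self-contained sketch of a flag-gated shrinker. It is a toy model and not the i915 implementation: the `toy_*` names, the `TOY_SHRINK_*` flags and the three-field object are all hypothetical, chosen only to mirror the behaviour described in the commit message, where a still-active object is skipped unless the caller explicitly asks for active objects as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on the I915_SHRINK_* names in the patch. */
enum {
	TOY_SHRINK_UNBOUND = 1u << 0,
	TOY_SHRINK_BOUND   = 1u << 1,
	TOY_SHRINK_ACTIVE  = 1u << 2,
};

/* Hypothetical stand-in for a GEM object; only the state the sketch needs. */
struct toy_object {
	bool bound;     /* still mapped into a (pretend) GPU address space */
	bool active;    /* (pretend) GPU work still outstanding */
	bool has_pages; /* backing pages currently allocated */
};

/* Release the pages of every object the flags allow; return how many were released. */
size_t toy_shrink(struct toy_object *objs, size_t count, unsigned int flags)
{
	size_t freed = 0;

	for (size_t i = 0; i < count; i++) {
		struct toy_object *obj = &objs[i];

		if (!obj->has_pages)
			continue;
		/* A busy object is skipped unless the caller opted in to active ones. */
		if (obj->active && !(flags & TOY_SHRINK_ACTIVE))
			continue;
		if (obj->bound && !(flags & TOY_SHRINK_BOUND))
			continue;
		if (!obj->bound && !(flags & TOY_SHRINK_UNBOUND))
			continue;

		obj->has_pages = false;
		obj->active = false;
		obj->bound = false;
		freed++;
	}

	return freed;
}
```

With a single object that is bound, active and still holds pages, `toy_shrink(&obj, 1, TOY_SHRINK_BOUND | TOY_SHRINK_UNBOUND)` returns 0 and leaves the pages in place, while adding `TOY_SHRINK_ACTIVE` returns 1 and clears `has_pages`. That is the difference the selftest depends on.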
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Usage of Transparent Hugepages was disabled in 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it appears the majority of performance regressions reported with an enabled IOMMU can be almost eliminated by turning them on, let's just do that.
To err on the side of safety we keep the current default in cases where IOMMU is not active, and default to the "huge=within_size" mode only when it is active. Although there would probably be wins from enabling them throughout, more extensive testing across benchmarks and platforms would need to be done.
With the patch and IOMMU enabled my local testing on a small Skylake part shows OglVSTangent regression being reduced from ~14% (IOMMU on versus IOMMU off) to ~2% (same comparison but with THP on).
v2:
 * Add Kconfig dependency to transparent hugepages and some help text.
 * Move to helper for easier handling of kernel build options.

v3:
 * Drop Kconfig. (Daniel)

References: b901bb89324a ("drm/i915/gemfs: enable THP")
References: 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/430
Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> # v1
---
 drivers/gpu/drm/i915/gem/i915_gemfs.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index 5e6e8c91ab38..dbdbdc344d87 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -6,7 +6,6 @@

 #include <linux/fs.h>
 #include <linux/mount.h>
-#include <linux/pagemap.h>

 #include "i915_drv.h"
 #include "i915_gemfs.h"
@@ -15,6 +14,7 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 {
 	struct file_system_type *type;
 	struct vfsmount *gemfs;
+	char *opts;

 	type = get_fs_type("tmpfs");
 	if (!type)
@@ -26,10 +26,26 @@ int i915_gemfs_init(struct drm_i915_private *i915)
 	 *
 	 * One example, although it is probably better with a per-file
 	 * control, is selecting huge page allocations ("huge=within_size").
-	 * Currently unused due to bandwidth issues (slow reads) on Broadwell+.
+	 * However, we only do so to offset the overhead of iommu lookups
+	 * due to bandwidth issues (slow reads) on Broadwell+.
 	 */

-	gemfs = kern_mount(type);
+	opts = NULL;
+	if (intel_vtd_active()) {
+		if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+			static char huge_opt[] = "huge=within_size"; /* r/w */
+
+			opts = huge_opt;
+			drm_info(&i915->drm,
+				 "Transparent Hugepage mode '%s'\n",
+				 opts);
+		} else {
+			drm_notice(&i915->drm,
+				   "Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
+		}
+	}
+
+	gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts);
 	if (IS_ERR(gemfs))
 		return PTR_ERR(gemfs);
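The option selection in this patch boils down to a small decision table, which can be sketched as a pure function outside the kernel. This is only an illustration under stated assumptions: `toy_gemfs_mount_opts()` is a hypothetical helper, and its two boolean parameters stand in for `intel_vtd_active()` and `IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)` from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the tmpfs mount option string to pass to the
 * mount call, or NULL to keep the default mount options.
 */
const char *toy_gemfs_mount_opts(bool iommu_active, bool thp_built_in)
{
	if (!iommu_active)
		return NULL; /* no IOMMU: keep the existing default */
	if (!thp_built_in)
		return NULL; /* IOMMU on, no THP support: the patch only logs a notice */
	return "huge=within_size";
}
```

Only the combination of an active IOMMU and a THP-capable kernel yields `"huge=within_size"`; every other combination returns NULL, matching the commit message's intent of leaving the default untouched when the IOMMU is not active.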
On Thu, Jul 29, 2021 at 3:34 PM Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On both patches: Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On 29/07/2021 15:06, Daniel Vetter wrote:
> On both patches: Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Eero's testing results at https://gitlab.freedesktop.org/drm/intel/-/issues/430 are looking good - seem to show this to be a net win for at least Gen9 and Gen12 platforms.
Is the ack enough to merge in this case, or should I look for an r-b as well?
Regards,
Tvrtko
On Fri, Sep 03, 2021 at 01:47:52PM +0100, Tvrtko Ursulin wrote:
> Eero's testing results at https://gitlab.freedesktop.org/drm/intel/-/issues/430 are looking good - seem to show this to be a net win for at least Gen9 and Gen12 platforms.
> Is the ack enough to merge in this case, or should I look for an r-b as well?
Since you're back to de facto v1 with the 2nd patch I think you have full r-b already. So more than enough I think.
Please do record the relative perf numbers from Eero in that issue in the commit message so that we have that on the git log record too. It's easier to find there than following the link and finding the right comment in the issue.
Thanks, Daniel
On 07/09/2021 09:42, Daniel Vetter wrote:
> Since you're back to de facto v1 with the 2nd patch I think you have full r-b already. So more than enough I think.
Just in case you missed it, v1 had Kconfig. But it's the same spirit so probably indeed fine as you say.
> Please do record the relative perf numbers from Eero in that issue in the commit message so that we have that on the git log record too. It's easier to find there than following the link and finding the right comment in the issue.
Will do.
Regards,
Tvrtko
dri-devel@lists.freedesktop.org