From: Tvrtko Ursulin tvrtko.ursulin@intel.com
RC6 support cannot simply be established by looking at the static device HAS_RC6() flag. There are cases which disable RC6 at driver load time, so use the status of those checks when deciding whether to enumerate the rc6 counter.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Reported-by: Eero T Tamminen eero.t.tamminen@intel.com
---
 drivers/gpu/drm/i915/i915_pmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 41651ac255fa..a75cd1db320b 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -476,6 +476,8 @@ engine_event_status(struct intel_engine_cs *engine,
 static int
 config_status(struct drm_i915_private *i915, u64 config)
 {
+	struct intel_gt *gt = &i915->gt;
+
 	switch (config) {
 	case I915_PMU_ACTUAL_FREQUENCY:
 		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
@@ -489,7 +491,7 @@ config_status(struct drm_i915_private *i915, u64 config)
 	case I915_PMU_INTERRUPTS:
 		break;
 	case I915_PMU_RC6_RESIDENCY:
-		if (!HAS_RC6(i915))
+		if (!gt->rc6.supported)
 			return -ENODEV;
 		break;
 	case I915_PMU_SOFTWARE_GT_AWAKE_TIME:
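The distinction the patch draws can be illustrated with a small standalone model. This is only a sketch: struct model_device, struct model_gt and both helper functions are hypothetical stand-ins invented for illustration, not the real i915 types or the actual config_status() code. It shows a device whose hardware has RC6 but where the driver turned it off at load time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures, not the real i915 types. */
struct model_gt {
	bool rc6_supported;	/* runtime decision taken at driver load */
};

struct model_device {
	bool has_rc6;		/* static per-platform capability flag */
	struct model_gt gt;
};

/* Behaviour before the patch: only the static capability is consulted. */
static int rc6_event_status_old(const struct model_device *dev)
{
	assert(dev != NULL);
	return dev->has_rc6 ? 0 : -ENODEV;
}

/* Behaviour after the patch: the runtime state decides. */
static int rc6_event_status_new(const struct model_device *dev)
{
	assert(dev != NULL);
	return dev->gt.rc6_supported ? 0 : -ENODEV;
}
```

With has_rc6 set and rc6_supported cleared, the old check reports the event as available (and it would then read as zero residency), while the new check rejects it with -ENODEV.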
On Wed, Mar 31, 2021 at 11:18:50AM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin tvrtko.ursulin@intel.com
> RC6 support cannot simply be established by looking at the static device HAS_RC6() flag. There are cases which disable RC6 at driver load time, so use the status of those checks when deciding whether to enumerate the rc6 counter.
> Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
> Reported-by: Eero T Tamminen eero.t.tamminen@intel.com
> ---
>  drivers/gpu/drm/i915/i915_pmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 41651ac255fa..a75cd1db320b 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -476,6 +476,8 @@ engine_event_status(struct intel_engine_cs *engine,
>  static int
>  config_status(struct drm_i915_private *i915, u64 config)
>  {
> +	struct intel_gt *gt = &i915->gt;
> +
>  	switch (config) {
>  	case I915_PMU_ACTUAL_FREQUENCY:
>  		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> @@ -489,7 +491,7 @@ config_status(struct drm_i915_private *i915, u64 config)
>  	case I915_PMU_INTERRUPTS:
>  		break;
>  	case I915_PMU_RC6_RESIDENCY:
> -		if (!HAS_RC6(i915))
> +		if (!gt->rc6.supported)
Is this really going to remove any confusion? Right now it is there but with residency 0, but after this change the event is not there anymore so I wonder if we are not just changing to a different kind of confusion on users.
>  			return -ENODEV;
would a different return help somehow?
>  		break;
>  	case I915_PMU_SOFTWARE_GT_AWAKE_TIME:
> 2.27.0
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 01/04/2021 10:19, Rodrigo Vivi wrote:
> On Wed, Mar 31, 2021 at 11:18:50AM +0100, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin tvrtko.ursulin@intel.com
> > RC6 support cannot simply be established by looking at the static device HAS_RC6() flag. There are cases which disable RC6 at driver load time, so use the status of those checks when deciding whether to enumerate the rc6 counter.
> > Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
> > Reported-by: Eero T Tamminen eero.t.tamminen@intel.com
> > ---
> >  drivers/gpu/drm/i915/i915_pmu.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > index 41651ac255fa..a75cd1db320b 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -476,6 +476,8 @@ engine_event_status(struct intel_engine_cs *engine,
> >  static int
> >  config_status(struct drm_i915_private *i915, u64 config)
> >  {
> > +	struct intel_gt *gt = &i915->gt;
> > +
> >  	switch (config) {
> >  	case I915_PMU_ACTUAL_FREQUENCY:
> >  		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > @@ -489,7 +491,7 @@ config_status(struct drm_i915_private *i915, u64 config)
> >  	case I915_PMU_INTERRUPTS:
> >  		break;
> >  	case I915_PMU_RC6_RESIDENCY:
> > -		if (!HAS_RC6(i915))
> > +		if (!gt->rc6.supported)
> Is this really going to remove any confusion? Right now it is there but with residency 0, but after this change the event is not there anymore so I wonder if we are not just changing to a different kind of confusion on users.
I think it is possible to argue both ways.
1) HAS_RC6 means the hardware has RC6, so if we view the PMU as very low level we can say always export it.
If i915 had to turn it off (rc6->supported == false) due to firmware or GVT-g, then we could say reporting zero RC6 is accurate in that sense. Only the reason "why it is zero" is missing for PMU users.
2) Or, if we go with this patch, we could say that the presence of the PMU metric means RC6 is active and enabled, while its absence means it is either not supported due to the platform (or firmware) or due to how the platform is being used (GVT-g).
So I think the patch is a bit better. I don't see it adding more confusion.
> >  			return -ENODEV;
> would a different return help somehow?
Like distinguishing between not theoretically possible to support on this GPU, versus not active? Perhaps.. suggest an errno? :)
Regards,
Tvrtko
> >  		break;
> >  	case I915_PMU_SOFTWARE_GT_AWAKE_TIME:
> > 2.27.0
On Thu, Apr 01, 2021 at 10:38:11AM +0100, Tvrtko Ursulin wrote:
> On 01/04/2021 10:19, Rodrigo Vivi wrote:
> > On Wed, Mar 31, 2021 at 11:18:50AM +0100, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin tvrtko.ursulin@intel.com
> > > RC6 support cannot simply be established by looking at the static device HAS_RC6() flag. There are cases which disable RC6 at driver load time, so use the status of those checks when deciding whether to enumerate the rc6 counter.
> > > Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
> > > Reported-by: Eero T Tamminen eero.t.tamminen@intel.com
> > > ---
> > >  drivers/gpu/drm/i915/i915_pmu.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > > index 41651ac255fa..a75cd1db320b 100644
> > > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > > @@ -476,6 +476,8 @@ engine_event_status(struct intel_engine_cs *engine,
> > >  static int
> > >  config_status(struct drm_i915_private *i915, u64 config)
> > >  {
> > > +	struct intel_gt *gt = &i915->gt;
> > > +
> > >  	switch (config) {
> > >  	case I915_PMU_ACTUAL_FREQUENCY:
> > >  		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > > @@ -489,7 +491,7 @@ config_status(struct drm_i915_private *i915, u64 config)
> > >  	case I915_PMU_INTERRUPTS:
> > >  		break;
> > >  	case I915_PMU_RC6_RESIDENCY:
> > > -		if (!HAS_RC6(i915))
> > > +		if (!gt->rc6.supported)
> > Is this really going to remove any confusion? Right now it is there but with residency 0, but after this change the event is not there anymore so I wonder if we are not just changing to a different kind of confusion on users.
> I think it is possible to argue both ways.
> 1) HAS_RC6 means the hardware has RC6, so if we view the PMU as very low level we can say always export it.
> If i915 had to turn it off (rc6->supported == false) due to firmware or GVT-g, then we could say reporting zero RC6 is accurate in that sense. Only the reason "why it is zero" is missing for PMU users.
> 2) Or, if we go with this patch, we could say that the presence of the PMU metric means RC6 is active and enabled, while its absence means it is either not supported due to the platform (or firmware) or due to how the platform is being used (GVT-g).
yeap, these 2 cases described well my mental conflict...
> So I think the patch is a bit better. I don't see it adding more confusion.
As I said on the other patch I have no strong position on which is better, but if you and Eero feel that this works better for the current case, let's do it...
> > >  			return -ENODEV;
> > would a different return help somehow?
> Like distinguishing between not theoretically possible to support on this GPU, versus not active? Perhaps.. suggest an errno? :)
ENODATA? or EIDRM?
But only if it helps somehow... otherwise don't bother and move with this as is:
Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com
> Regards,
> Tvrtko
> > >  		break;
> > >  	case I915_PMU_SOFTWARE_GT_AWAKE_TIME:
> > > 2.27.0
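As a rough illustration of the "different return" idea raised in this exchange, the sketch below separates "hardware cannot do RC6" from "RC6 exists but the driver disabled it". It assumes ENODATA for the second case, which is only one of the two errnos floated here and was not adopted by the patch; struct rc6_model and rc6_event_status_split() are hypothetical names, not driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two inputs discussed in the thread. */
struct rc6_model {
	bool has_rc6;		/* static capability, what HAS_RC6() expresses */
	bool supported;		/* runtime state, what gt->rc6.supported expresses */
};

/* Return 0 if the RC6 event can be exposed, else a distinguishing errno. */
static int rc6_event_status_split(const struct rc6_model *m)
{
	assert(m != NULL);

	if (!m->has_rc6)
		return -ENODEV;		/* platform has no RC6 at all */
	if (!m->supported)
		return -ENODATA;	/* assumed errno: RC6 exists but was disabled */
	return 0;
}
```

A userspace tool could then tell the two failure modes apart from the error it receives when opening the event, at the cost of giving callers one more error code to handle.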
Hi,
On Thu, 2021-04-01 at 05:54 -0400, Rodrigo Vivi wrote:
> On Thu, Apr 01, 2021 at 10:38:11AM +0100, Tvrtko Ursulin wrote:
...
> > I think it is possible to argue both ways.
> > 1) HAS_RC6 means the hardware has RC6, so if we view the PMU as very low level we can say always export it.
> > If i915 had to turn it off (rc6->supported == false) due to firmware or GVT-g, then we could say reporting zero RC6 is accurate in that sense. Only the reason "why it is zero" is missing for PMU users.
> > 2) Or, if we go with this patch, we could say that the presence of the PMU metric means RC6 is active and enabled, while its absence means it is either not supported due to the platform (or firmware) or due to how the platform is being used (GVT-g).
> yeap, these 2 cases described well my mental conflict...
> > So I think the patch is a bit better. I don't see it adding more confusion.
> As I said on the other patch I have no strong position on which is better, but if you and Eero feel that this works better for the current case, let's do it...
IMHO seeing case 1), i.e. zero RC6, could be slightly better from the user's point of view than not seeing RC6 at all, because:
A) the user then knows that the GPU is not entering RC6, and
B) the question then is why it's not going to RC6 => one can see from sysfs that it has been disabled
Whereas in case 2), the question is why there's no RC6 info, and the user doesn't know whether the GPU is suspended or not (i.e. why GPU power consumption is higher than expected). It would help if i-g-t could show e.g. "RC6 OFF" in that case.
- Eero
On 01/04/2021 11:24, Tamminen, Eero T wrote:
> Hi,
> On Thu, 2021-04-01 at 05:54 -0400, Rodrigo Vivi wrote:
> > On Thu, Apr 01, 2021 at 10:38:11AM +0100, Tvrtko Ursulin wrote:
> ...
> > > I think it is possible to argue both ways.
> > > 1) HAS_RC6 means the hardware has RC6, so if we view the PMU as very low level we can say always export it.
> > > If i915 had to turn it off (rc6->supported == false) due to firmware or GVT-g, then we could say reporting zero RC6 is accurate in that sense. Only the reason "why it is zero" is missing for PMU users.
> > > 2) Or, if we go with this patch, we could say that the presence of the PMU metric means RC6 is active and enabled, while its absence means it is either not supported due to the platform (or firmware) or due to how the platform is being used (GVT-g).
> > yeap, these 2 cases described well my mental conflict...
> > > So I think the patch is a bit better. I don't see it adding more confusion.
> > As I said on the other patch I have no strong position on which is better, but if you and Eero feel that this works better for the current case, let's do it...
> IMHO seeing case 1), i.e. zero RC6, could be slightly better from the user's point of view than not seeing RC6 at all, because:
> A) the user then knows that the GPU is not entering RC6, and
> B) the question then is why it's not going to RC6 => one can see from sysfs that it has been disabled
> Whereas in case 2), the question is why there's no RC6 info, and the user doesn't know whether the GPU is suspended or not (i.e. why GPU power consumption is higher than expected). It would help if i-g-t could show e.g. "RC6 OFF" in that case.
So many options.. :)
It can be handled in the "presentation" layer (intel_gpu_top). If we go with this patch but with different errnos, it could indeed distinguish the cases and either not show RC6 or say "RC6 OFF".
If we go with the other patch (https://patchwork.freedesktop.org/patch/426589/?series=88580&rev=1) then intel_gpu_top could really still do the same by looking at /sys/class/drm/card0/power/rc6_enable.
So strictly no i915 patch is even needed to provide clarity in intel_gpu_top.
But still one of those two i915 patches is required to improve how low-level Perf/PMU RC6 counter gets exposed (or not exposed). I don't have a strong preference which one to take either. :)
Regards,
Tvrtko
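The presentation-layer option mentioned above could look roughly like the sketch below, which turns the text read from /sys/class/drm/card0/power/rc6_enable into a display label. This is not intel_gpu_top code: rc6_label() is a hypothetical helper, the "RC6 ON" and "RC6 n/a" strings are invented for illustration, and it assumes the file holds a hexadecimal mask where zero means RC6 is disabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Map the contents of the rc6_enable sysfs file to a label.
 * Assumption: the file holds a hex mask and a zero mask means RC6 is off.
 */
static const char *rc6_label(const char *rc6_enable_text)
{
	char *end;
	unsigned long mask;

	assert(rc6_enable_text != NULL);

	errno = 0;
	mask = strtoul(rc6_enable_text, &end, 16);
	if (errno || end == rc6_enable_text)
		return "RC6 n/a";	/* empty or unparsable contents */

	return mask ? "RC6 ON" : "RC6 OFF";
}
```

A caller would read the sysfs file into a buffer and pass it in; keeping the parsing separate from the file I/O makes the decision easy to check without hardware.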
dri-devel@lists.freedesktop.org