From FPU documentation, developers must not use DC_FP_START/END in dml
files, but invoke it when calling FPU-associated functions (isolated in dml folder). Therefore, the first patch renames dcn10_validate_bandwidth in dml/calcs to dcn_ for generalization, declares dcn10_validate_bandwidth in dcn10 - that calls dcn_validate_bandwidth and wraps with DC_FP_* accordingly. The second patch removes invocations of DC_FP_* from dml files and properly wraps FPU functions in dc code outside dml folder.
Melissa Wen (2): drm/amd/display: detach fpu operations from dcn10_validate_bandwidth in calcs drm/amd/display: remove DC_FP_* wrapper from dml folder
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 10 ++++++++-- .../drm/amd/display/dc/dcn10/dcn10_resource.c | 16 ++++++++++++++++ .../drm/amd/display/dc/dml/calcs/dcn_calcs.c | 19 +------------------ .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 2 -- .../gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 5 files changed, 26 insertions(+), 23 deletions(-)
dcn10_validate_bandwidth is only used on dcn10 files, but is declared in dcn_calcs files. Rename dcn10_* to dcn_* in calcs, remove DC_FP_* wrapper inside DML folder and create an specific dcn10_validate_bandwidth in dcn10_resources that calls dcn_validate_bandwidth and properly wraps that FPU function with DC_FP_* macro.
Signed-off-by: Melissa Wen mwen@igalia.com --- .../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 14 ++++++++++++++ .../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 5 +---- drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 4048908dd265..1587a060b55a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -1141,6 +1141,20 @@ static void dcn10_destroy_resource_pool(struct resource_pool **pool) *pool = NULL; }
+static bool dcn10_validate_bandwidth( + struct dc *dc, + struct dc_state *context, + bool fast_validate) +{ + bool voltage_supported; + + DC_FP_START(); + voltage_supported = dcn_validate_bandwidth(dc, context, fast_validate); + DC_FP_END(); + + return voltage_supported; +} + static enum dc_status dcn10_validate_plane(const struct dc_plane_state *plane_state, struct dc_caps *caps) { if (plane_state->format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index e447c74be713..c25023f7d604 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -764,7 +764,7 @@ static unsigned int get_highest_allowed_voltage_level(uint32_t chip_family, return 4; }
-bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate) @@ -790,7 +790,6 @@ bool dcn10_validate_bandwidth( dcn_bw_sync_calcs_and_dml(dc);
memset(v, 0, sizeof(*v)); - DC_FP_START();
v->sr_exit_time = dc->dcn_soc->sr_exit_time; v->sr_enter_plus_exit_time = dc->dcn_soc->sr_enter_plus_exit_time; @@ -1323,8 +1322,6 @@ bool dcn10_validate_bandwidth( bw_limit = dc->dcn_soc->percent_disp_bw_limit * v->fabric_and_dram_bandwidth_vmax0p9; bw_limit_pass = (v->total_data_read_bandwidth / 1000.0) < bw_limit;
- DC_FP_END(); - PERFORMANCE_TRACE_END(); BW_VAL_TRACE_FINISH();
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h index 337c0161e72d..806f3041db14 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h @@ -619,7 +619,7 @@ struct dcn_ip_params { }; extern const struct dcn_ip_params dcn10_ip_defaults;
-bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate);
Am 26.03.22 um 21:24 schrieb Melissa Wen:
dcn10_validate_bandwidth is only used on dcn10 files, but is declared in dcn_calcs files. Rename dcn10_* to dcn_* in calcs, remove DC_FP_* wrapper inside DML folder and create an specific dcn10_validate_bandwidth in dcn10_resources that calls dcn_validate_bandwidth and properly wraps that FPU function with DC_FP_* macro.
Signed-off-by: Melissa Wen mwen@igalia.com
.../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 14 ++++++++++++++ .../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 5 +---- drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 4048908dd265..1587a060b55a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -1141,6 +1141,20 @@ static void dcn10_destroy_resource_pool(struct resource_pool **pool) *pool = NULL; }
+static bool dcn10_validate_bandwidth(
struct dc *dc,
struct dc_state *context,
bool fast_validate)
+{
- bool voltage_supported;
- DC_FP_START();
- voltage_supported = dcn_validate_bandwidth(dc, context, fast_validate);
- DC_FP_END();
- return voltage_supported;
+}
- static enum dc_status dcn10_validate_plane(const struct dc_plane_state *plane_state, struct dc_caps *caps) { if (plane_state->format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index e447c74be713..c25023f7d604 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -764,7 +764,7 @@ static unsigned int get_highest_allowed_voltage_level(uint32_t chip_family, return 4; }
-bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate) @@ -790,7 +790,6 @@ bool dcn10_validate_bandwidth( dcn_bw_sync_calcs_and_dml(dc);
memset(v, 0, sizeof(*v));
DC_FP_START();
v->sr_exit_time = dc->dcn_soc->sr_exit_time; v->sr_enter_plus_exit_time = dc->dcn_soc->sr_enter_plus_exit_time;
@@ -1323,8 +1322,6 @@ bool dcn10_validate_bandwidth( bw_limit = dc->dcn_soc->percent_disp_bw_limit * v->fabric_and_dram_bandwidth_vmax0p9; bw_limit_pass = (v->total_data_read_bandwidth / 1000.0) < bw_limit;
- DC_FP_END();
- PERFORMANCE_TRACE_END(); BW_VAL_TRACE_FINISH();
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h index 337c0161e72d..806f3041db14 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h @@ -619,7 +619,7 @@ struct dcn_ip_params { }; extern const struct dcn_ip_params dcn10_ip_defaults;
-bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate);
Just for the record: That's not really usual kernel coding style, but that's not topic of this patch set.
The series is Acked-by: Christian König christian.koenig@amd.com
And it would be really nice if we could make the DC_FP_* macros somehow fail in the dml folder.
Thanks, Christian.
On 03/28, Christian König wrote:
Am 26.03.22 um 21:24 schrieb Melissa Wen:
dcn10_validate_bandwidth is only used on dcn10 files, but is declared in dcn_calcs files. Rename dcn10_* to dcn_* in calcs, remove DC_FP_* wrapper inside DML folder and create an specific dcn10_validate_bandwidth in dcn10_resources that calls dcn_validate_bandwidth and properly wraps that FPU function with DC_FP_* macro.
Signed-off-by: Melissa Wen mwen@igalia.com
.../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 14 ++++++++++++++ .../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 5 +---- drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 4048908dd265..1587a060b55a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -1141,6 +1141,20 @@ static void dcn10_destroy_resource_pool(struct resource_pool **pool) *pool = NULL; } +static bool dcn10_validate_bandwidth(
struct dc *dc,
struct dc_state *context,
bool fast_validate)
+{
- bool voltage_supported;
- DC_FP_START();
- voltage_supported = dcn_validate_bandwidth(dc, context, fast_validate);
- DC_FP_END();
- return voltage_supported;
+}
- static enum dc_status dcn10_validate_plane(const struct dc_plane_state *plane_state, struct dc_caps *caps) { if (plane_state->format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index e447c74be713..c25023f7d604 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -764,7 +764,7 @@ static unsigned int get_highest_allowed_voltage_level(uint32_t chip_family, return 4; } -bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate) @@ -790,7 +790,6 @@ bool dcn10_validate_bandwidth( dcn_bw_sync_calcs_and_dml(dc); memset(v, 0, sizeof(*v));
- DC_FP_START(); v->sr_exit_time = dc->dcn_soc->sr_exit_time; v->sr_enter_plus_exit_time = dc->dcn_soc->sr_enter_plus_exit_time;
@@ -1323,8 +1322,6 @@ bool dcn10_validate_bandwidth( bw_limit = dc->dcn_soc->percent_disp_bw_limit * v->fabric_and_dram_bandwidth_vmax0p9; bw_limit_pass = (v->total_data_read_bandwidth / 1000.0) < bw_limit;
- DC_FP_END();
- PERFORMANCE_TRACE_END(); BW_VAL_TRACE_FINISH();
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h index 337c0161e72d..806f3041db14 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h @@ -619,7 +619,7 @@ struct dcn_ip_params { }; extern const struct dcn_ip_params dcn10_ip_defaults; -bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate);
Just for the record: That's not really usual kernel coding style, but that's not topic of this patch set.
Yeah. I didn't change the code style to ease any version conflict management.
The series is Acked-by: Christian König christian.koenig@amd.com
Thanks!
And it would be really nice if we could make the DC_FP_* macros somehow fail in the dml folder.
And if we include a kind of dc_assert_fp_disabled() in the dc_fpu_begin() (DC_FP_START) - more or less the reverse of dc_assert_fp_enabled(). Does it meet the `make the DC_FP_* macros somehow fail in the dml folder` ? It is not restricted to the dml folder, but I think it would work similarly... Does it make sense?
Melissa
Thanks, Christian.
Am 28.03.22 um 19:17 schrieb Melissa Wen:
On 03/28, Christian König wrote:
Am 26.03.22 um 21:24 schrieb Melissa Wen:
dcn10_validate_bandwidth is only used on dcn10 files, but is declared in dcn_calcs files. Rename dcn10_* to dcn_* in calcs, remove DC_FP_* wrapper inside DML folder and create an specific dcn10_validate_bandwidth in dcn10_resources that calls dcn_validate_bandwidth and properly wraps that FPU function with DC_FP_* macro.
Signed-off-by: Melissa Wen mwen@igalia.com
.../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 14 ++++++++++++++ .../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 5 +---- drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 4048908dd265..1587a060b55a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -1141,6 +1141,20 @@ static void dcn10_destroy_resource_pool(struct resource_pool **pool) *pool = NULL; } +static bool dcn10_validate_bandwidth(
struct dc *dc,
struct dc_state *context,
bool fast_validate)
+{
- bool voltage_supported;
- DC_FP_START();
- voltage_supported = dcn_validate_bandwidth(dc, context, fast_validate);
- DC_FP_END();
- return voltage_supported;
+}
- static enum dc_status dcn10_validate_plane(const struct dc_plane_state *plane_state, struct dc_caps *caps) { if (plane_state->format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN
diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index e447c74be713..c25023f7d604 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -764,7 +764,7 @@ static unsigned int get_highest_allowed_voltage_level(uint32_t chip_family, return 4; } -bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate) @@ -790,7 +790,6 @@ bool dcn10_validate_bandwidth( dcn_bw_sync_calcs_and_dml(dc); memset(v, 0, sizeof(*v));
- DC_FP_START(); v->sr_exit_time = dc->dcn_soc->sr_exit_time; v->sr_enter_plus_exit_time = dc->dcn_soc->sr_enter_plus_exit_time;
@@ -1323,8 +1322,6 @@ bool dcn10_validate_bandwidth( bw_limit = dc->dcn_soc->percent_disp_bw_limit * v->fabric_and_dram_bandwidth_vmax0p9; bw_limit_pass = (v->total_data_read_bandwidth / 1000.0) < bw_limit;
- DC_FP_END();
- PERFORMANCE_TRACE_END(); BW_VAL_TRACE_FINISH();
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h index 337c0161e72d..806f3041db14 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h +++ b/drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h @@ -619,7 +619,7 @@ struct dcn_ip_params { }; extern const struct dcn_ip_params dcn10_ip_defaults; -bool dcn10_validate_bandwidth( +bool dcn_validate_bandwidth( struct dc *dc, struct dc_state *context, bool fast_validate);
Just for the record: That's not really usual kernel coding style, but that's not topic of this patch set.
Yeah. I didn't change the code style to ease any version conflict management.
The series is Acked-by: Christian König christian.koenig@amd.com
Thanks!
And it would be really nice if we could make the DC_FP_* macros somehow fail in the dml folder.
And if we include a kind of dc_assert_fp_disabled() in the dc_fpu_begin() (DC_FP_START) - more or less the reverse of dc_assert_fp_enabled(). Does it meet the `make the DC_FP_* macros somehow fail in the dml folder` ? It is not restricted to the dml folder, but I think it would work similarly... Does it make sense?
No, IIRC our display team even mentioned to me that those macros could potentially be used recursively.
What I mean here is that we somehow raise a compiler warning if somebody tries to use those defines inside the folder.
Maybe check of gcc supports hardware floating point or something like this.
Christian.
Melissa
Thanks, Christian.
FPU documentation states that developers must not use DC_FP_START/END inside dml files, but use this macro to wrap calls to FPU functions in dc folder (outside dml folder). Therefore, this patch removes DC_FP_* wrappers from dml folder and wraps calls for these FPU operations outside dml, as required.
Signed-off-by: Melissa Wen mwen@igalia.com --- .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 10 ++++++++-- .../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 2 ++ .../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 14 -------------- .../gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 2 -- 4 files changed, 10 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c index c3e141c19a77..6b4d9917933b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -2979,8 +2979,11 @@ void dcn10_prepare_bandwidth( true); dcn10_stereo_hw_frame_pack_wa(dc, context);
- if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE) + if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE) { + DC_FP_START(); dcn_bw_notify_pplib_of_wm_ranges(dc); + DC_FP_END(); + }
if (dc->debug.sanity_checks) hws->funcs.verify_allow_pstate_change_high(dc); @@ -3013,8 +3016,11 @@ void dcn10_optimize_bandwidth(
dcn10_stereo_hw_frame_pack_wa(dc, context);
- if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE) + if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE) { + DC_FP_START(); dcn_bw_notify_pplib_of_wm_ranges(dc); + DC_FP_END(); + }
if (dc->debug.sanity_checks) hws->funcs.verify_allow_pstate_change_high(dc); diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 1587a060b55a..bca049b2f867 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -1506,6 +1506,7 @@ static bool dcn10_resource_construct( && pool->base.pp_smu->rv_funcs.set_pme_wa_enable != NULL) dc->debug.az_endpoint_mute_only = false;
+ DC_FP_START(); if (!dc->debug.disable_pplib_clock_request) dcn_bw_update_from_pplib(dc); dcn_bw_sync_calcs_and_dml(dc); @@ -1513,6 +1514,7 @@ static bool dcn10_resource_construct( dc->res_pool = &pool->base; dcn_bw_notify_pplib_of_wm_ranges(dc); } + DC_FP_END();
{ struct irq_service_init_data init_data; diff --git a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c index c25023f7d604..db3b16b77034 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c +++ b/drivers/gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c @@ -639,7 +639,6 @@ static bool dcn_bw_apply_registry_override(struct dc *dc) { bool updated = false;
- DC_FP_START(); if ((int)(dc->dcn_soc->sr_exit_time * 1000) != dc->debug.sr_exit_time_ns && dc->debug.sr_exit_time_ns) { updated = true; @@ -675,7 +674,6 @@ static bool dcn_bw_apply_registry_override(struct dc *dc) dc->dcn_soc->dram_clock_change_latency = dc->debug.dram_clock_change_latency_ns / 1000.0; } - DC_FP_END();
return updated; } @@ -1492,8 +1490,6 @@ void dcn_bw_update_from_pplib(struct dc *dc) res = dm_pp_get_clock_levels_by_type_with_voltage( ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);
- DC_FP_START(); - if (res) res = verify_clock_values(&fclks);
@@ -1523,13 +1519,9 @@ void dcn_bw_update_from_pplib(struct dc *dc) } else BREAK_TO_DEBUGGER();
- DC_FP_END(); - res = dm_pp_get_clock_levels_by_type_with_voltage( ctx, DM_PP_CLOCK_TYPE_DCFCLK, &dcfclks);
- DC_FP_START(); - if (res) res = verify_clock_values(&dcfclks);
@@ -1540,8 +1532,6 @@ void dcn_bw_update_from_pplib(struct dc *dc) dc->dcn_soc->dcfclkv_max0p9 = dcfclks.data[dcfclks.num_levels - 1].clocks_in_khz / 1000.0; } else BREAK_TO_DEBUGGER(); - - DC_FP_END(); }
void dcn_bw_notify_pplib_of_wm_ranges(struct dc *dc) @@ -1556,11 +1546,9 @@ void dcn_bw_notify_pplib_of_wm_ranges(struct dc *dc) if (!pp || !pp->set_wm_ranges) return;
- DC_FP_START(); min_fclk_khz = dc->dcn_soc->fabric_and_dram_bandwidth_vmin0p65 * 1000000 / 32; min_dcfclk_khz = dc->dcn_soc->dcfclkv_min0p65 * 1000; socclk_khz = dc->dcn_soc->socclk * 1000; - DC_FP_END();
/* Now notify PPLib/SMU about which Watermarks sets they should select * depending on DPM state they are in. And update BW MGR GFX Engine and @@ -1611,7 +1599,6 @@ void dcn_bw_notify_pplib_of_wm_ranges(struct dc *dc)
void dcn_bw_sync_calcs_and_dml(struct dc *dc) { - DC_FP_START(); DC_LOG_BANDWIDTH_CALCS("sr_exit_time: %f ns\n" "sr_enter_plus_exit_time: %f ns\n" "urgent_latency: %f ns\n" @@ -1800,5 +1787,4 @@ void dcn_bw_sync_calcs_and_dml(struct dc *dc) dc->dml.ip.bug_forcing_LC_req_same_size_fixed = dc->dcn_ip->bug_forcing_luma_and_chroma_request_to_same_size_fixed == dcn_bw_yes; dc->dml.ip.dcfclk_cstate_latency = dc->dcn_ip->dcfclk_cstate_latency; - DC_FP_END(); } diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c index 2f6122153bdb..36b12a350bbd 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c @@ -1290,9 +1290,7 @@ int dcn20_populate_dml_pipes_from_context( }
/* populate writeback information */ - DC_FP_START(); dc->res_pool->funcs->populate_dml_writeback_from_context(dc, res_ctx, pipes); - DC_FP_END();
return pipe_cnt; }
On 2022-03-26 16:24, Melissa Wen wrote:
From FPU documentation, developers must not use DC_FP_START/END in dml files, but invoke it when calling FPU-associated functions (isolated in dml folder). Therefore, the first patch renames dcn10_validate_bandwidth in dml/calcs to dcn_ for generalization, declares dcn10_validate_bandwidth in dcn10 - that calls dcn_validate_bandwidth and wraps with DC_FP_* accordingly. The second patch removes invocations of DC_FP_* from dml files and properly wraps FPU functions in dc code outside dml folder.
Melissa Wen (2): drm/amd/display: detach fpu operations from dcn10_validate_bandwidth in calcs drm/amd/display: remove DC_FP_* wrapper from dml folder
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 10 ++++++++-- .../drm/amd/display/dc/dcn10/dcn10_resource.c | 16 ++++++++++++++++ .../drm/amd/display/dc/dml/calcs/dcn_calcs.c | 19 +------------------ .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 2 -- .../gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 5 files changed, 26 insertions(+), 23 deletions(-)
Hi,
Thanks a lot for your patch!
I reviewed and tested this series and lgtm. Applied to amd-staging-drm-next.
Btw, I agree with Christian. Can you try to find a way to add a compilation error or warning if the developer tries to add DC_FP_* inside DML?
Also, about the question of recursive calling to DC_FP_*, it should be safe if using DC_FP_*.
Thanks Siqueira
On 03/30, Rodrigo Siqueira Jordao wrote:
On 2022-03-26 16:24, Melissa Wen wrote:
From FPU documentation, developers must not use DC_FP_START/END in dml files, but invoke it when calling FPU-associated functions (isolated in dml folder). Therefore, the first patch renames dcn10_validate_bandwidth in dml/calcs to dcn_ for generalization, declares dcn10_validate_bandwidth in dcn10 - that calls dcn_validate_bandwidth and wraps with DC_FP_* accordingly. The second patch removes invocations of DC_FP_* from dml files and properly wraps FPU functions in dc code outside dml folder.
Melissa Wen (2): drm/amd/display: detach fpu operations from dcn10_validate_bandwidth in calcs drm/amd/display: remove DC_FP_* wrapper from dml folder
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 10 ++++++++-- .../drm/amd/display/dc/dcn10/dcn10_resource.c | 16 ++++++++++++++++ .../drm/amd/display/dc/dml/calcs/dcn_calcs.c | 19 +------------------ .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 2 -- .../gpu/drm/amd/display/dc/inc/dcn_calcs.h | 2 +- 5 files changed, 26 insertions(+), 23 deletions(-)
Hi,
Thanks a lot for your patch!
I reviewed and tested this series and lgtm. Applied to amd-staging-drm-next.
Btw, I agree with Christian. Can you try to find a way to add a compilation error or warning if the developer tries to add DC_FP_* inside DML?
Yes, I agree too. I'll try to address it as soon as possible.
Also, about the question of recursive calling to DC_FP_*, it should be safe if using DC_FP_*.
Thanks,
Melissa
Thanks Siqueira
dri-devel@lists.freedesktop.org