Hello Eric,
This is an attempt at fixing support for negative X/Y positioning of various FB formats. I kept you as the author since I started from your initial work and fixed a few things + split the changes in several commits. Let me know if that's a problem, and I change the author.
This series has been tested with a modified modetest (support for this modifier has been added).
Note that I plan to test/debug support for negative X/Y positioning of SANDXX planes.
Regards,
Boris
Eric Anholt (5): drm/vc4: Fix TILE_Y_OFFSET definitions drm/vc4: Define missing PICTH0_SINK_PIX field drm/vc4: Use drm_atomic_helper_check_plane_state() to simplify the logic drm/vc4: Move ->offsets[] adjustment out of setup_clipping_and_scaling() drm/vc4: Fix negative X/Y positioning of planes using T_TILES modifier
drivers/gpu/drm/vc4/vc4_drv.h | 9 -- drivers/gpu/drm/vc4/vc4_plane.c | 272 ++++++++++++++++++++++++---------------- drivers/gpu/drm/vc4/vc4_regs.h | 8 +- 3 files changed, 170 insertions(+), 119 deletions(-)
From: Eric Anholt eric@anholt.net
Y_OFFSET field starts at bit 8 not 7.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com --- drivers/gpu/drm/vc4/vc4_regs.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h index d6864fa4bd14..ccbd6b377ffe 100644 --- a/drivers/gpu/drm/vc4/vc4_regs.h +++ b/drivers/gpu/drm/vc4/vc4_regs.h @@ -1043,8 +1043,8 @@ enum hvs_pixel_format { #define SCALER_PITCH0_TILE_LINE_DIR BIT(15) #define SCALER_PITCH0_TILE_INITIAL_LINE_DIR BIT(14) /* Y offset within a tile. */ -#define SCALER_PITCH0_TILE_Y_OFFSET_MASK VC4_MASK(13, 7) -#define SCALER_PITCH0_TILE_Y_OFFSET_SHIFT 7 +#define SCALER_PITCH0_TILE_Y_OFFSET_MASK VC4_MASK(13, 8) +#define SCALER_PITCH0_TILE_Y_OFFSET_SHIFT 8 #define SCALER_PITCH0_TILE_WIDTH_R_MASK VC4_MASK(6, 0) #define SCALER_PITCH0_TILE_WIDTH_R_SHIFT 0
From: Eric Anholt eric@anholt.net
This is needed to support X/Y negative placement of planes using T-format buffers.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com --- drivers/gpu/drm/vc4/vc4_regs.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h index ccbd6b377ffe..931088014272 100644 --- a/drivers/gpu/drm/vc4/vc4_regs.h +++ b/drivers/gpu/drm/vc4/vc4_regs.h @@ -1037,6 +1037,10 @@ enum hvs_pixel_format { #define SCALER_TILE_HEIGHT_MASK VC4_MASK(15, 0) #define SCALER_TILE_HEIGHT_SHIFT 0
+/* Common PITCH0 fields */ +#define SCALER_PITCH0_SINK_PIX_MASK VC4_MASK(31, 26) +#define SCALER_PITCH0_SINK_PIX_SHIFT 26 + /* PITCH0 fields for T-tiled. */ #define SCALER_PITCH0_TILE_WIDTH_L_MASK VC4_MASK(22, 16) #define SCALER_PITCH0_TILE_WIDTH_L_SHIFT 16
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
In the subject, s/PICTH/PITCH/
From: Eric Anholt eric@anholt.net
drm_atomic_helper_check_plane_state() takes care of checking the scaling capabilities and calculating the clipped X/Y offsets for us.
Rely on this function instead of open-coding the logic. While at it, we get rid of a few fields in vc4_plane_state that can be easily extracted from drm_plane_state.
Incidentally, it seems to fix a problem we had with negative X/Y positioning of YUV planes.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com --- drivers/gpu/drm/vc4/vc4_drv.h | 9 -- drivers/gpu/drm/vc4/vc4_plane.c | 210 +++++++++++++++++++++------------------- 2 files changed, 111 insertions(+), 108 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h index bd6ef1f31822..eae7817837a9 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.h +++ b/drivers/gpu/drm/vc4/vc4_drv.h @@ -344,17 +344,8 @@ struct vc4_plane_state { */ u32 __iomem *hw_dlist;
- /* Clipped coordinates of the plane on the display. */ - int crtc_x, crtc_y, crtc_w, crtc_h; - /* Clipped area being scanned from in the FB. */ - u32 src_x, src_y; - - u32 src_w[2], src_h[2]; - /* Scaling selection for the RGB/Y plane and the Cb/Cr planes. */ enum vc4_scaling_mode x_scaling[2], y_scaling[2]; - bool is_unity; - bool is_yuv;
/* Offset to start scanning out from the start of the plane's * BO. diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index a3275fa66b7b..39e1fa3a8466 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -258,107 +258,108 @@ static u32 vc4_get_scl_field(struct drm_plane_state *state, int plane) } }
+static bool is_unity(struct drm_plane_state *state) +{ + u32 src_w = (state->src.x2 - state->src.x1) >> 16; + u32 src_h = (state->src.y2 - state->src.y1) >> 16; + u32 crtc_w = state->dst.x2 - state->dst.x1; + u32 crtc_h = state->dst.y2 - state->dst.y1; + + return src_w == crtc_w && src_h == crtc_h; +} + +static bool is_yuv(struct drm_plane_state *state) +{ + return state->fb->format->num_planes > 1; +} + static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state) { struct drm_plane *plane = state->plane; struct vc4_plane_state *vc4_state = to_vc4_plane_state(state); struct drm_framebuffer *fb = state->fb; struct drm_gem_cma_object *bo = drm_fb_cma_get_gem_obj(fb, 0); + u32 crtc_h, crtc_w, src_h, src_w, src_x, src_y; u32 subpixel_src_mask = (1 << 16) - 1; u32 format = fb->format->format; int num_planes = fb->format->num_planes; - u32 h_subsample = 1; - u32 v_subsample = 1; - int i; + int min_scale = 1, max_scale = INT_MAX; + struct drm_crtc_state *crtc_state; + u32 h_subsample, v_subsample; + int i, ret; + + crtc_state = drm_atomic_get_existing_crtc_state(state->state, + state->crtc); + if (!crtc_state) { + DRM_DEBUG_KMS("Invalid crtc state\n"); + return -EINVAL; + } + + /* No configuring scaling on the cursor plane, since it gets + * non-vblank-synced updates, and scaling requires LBM changes which + * have to be vblank-synced. + */ + if (plane->type == DRM_PLANE_TYPE_CURSOR) { + min_scale = DRM_PLANE_HELPER_NO_SCALING; + max_scale = DRM_PLANE_HELPER_NO_SCALING; + } else { + min_scale = 1; + max_scale = INT_MAX; + } + + ret = drm_atomic_helper_check_plane_state(state, crtc_state, + min_scale, max_scale, + true, true); + if (ret) + return ret;
for (i = 0; i < num_planes; i++) vc4_state->offsets[i] = bo->paddr + fb->offsets[i];
/* We don't support subpixel source positioning for scaling. */ - if ((state->src_x & subpixel_src_mask) || - (state->src_y & subpixel_src_mask) || - (state->src_w & subpixel_src_mask) || - (state->src_h & subpixel_src_mask)) { + if ((state->src.x1 & subpixel_src_mask) || + (state->src.x2 & subpixel_src_mask) || + (state->src.y1 & subpixel_src_mask) || + (state->src.y2 & subpixel_src_mask)) { return -EINVAL; }
- vc4_state->src_x = state->src_x >> 16; - vc4_state->src_y = state->src_y >> 16; - vc4_state->src_w[0] = state->src_w >> 16; - vc4_state->src_h[0] = state->src_h >> 16; - - vc4_state->crtc_x = state->crtc_x; - vc4_state->crtc_y = state->crtc_y; - vc4_state->crtc_w = state->crtc_w; - vc4_state->crtc_h = state->crtc_h; + src_w = (state->src.x2 - state->src.x1) >> 16; + src_h = (state->src.y2 - state->src.y1) >> 16; + src_x = state->src.x1 >> 16; + src_y = state->src.y1 >> 16; + crtc_w = state->dst.x2 - state->dst.x1; + crtc_h = state->dst.y2 - state->dst.y1;
- vc4_state->x_scaling[0] = vc4_get_scaling_mode(vc4_state->src_w[0], - vc4_state->crtc_w); - vc4_state->y_scaling[0] = vc4_get_scaling_mode(vc4_state->src_h[0], - vc4_state->crtc_h); - - vc4_state->is_unity = (vc4_state->x_scaling[0] == VC4_SCALING_NONE && - vc4_state->y_scaling[0] == VC4_SCALING_NONE); + vc4_state->x_scaling[0] = vc4_get_scaling_mode(src_w, crtc_w); + vc4_state->y_scaling[0] = vc4_get_scaling_mode(src_h, crtc_h); + h_subsample = drm_format_horz_chroma_subsampling(format); + v_subsample = drm_format_vert_chroma_subsampling(format);
if (num_planes > 1) { - vc4_state->is_yuv = true; - - h_subsample = drm_format_horz_chroma_subsampling(format); - v_subsample = drm_format_vert_chroma_subsampling(format); - vc4_state->src_w[1] = vc4_state->src_w[0] / h_subsample; - vc4_state->src_h[1] = vc4_state->src_h[0] / v_subsample; + src_w /= h_subsample; + src_h /= v_subsample;
- vc4_state->x_scaling[1] = - vc4_get_scaling_mode(vc4_state->src_w[1], - vc4_state->crtc_w); - vc4_state->y_scaling[1] = - vc4_get_scaling_mode(vc4_state->src_h[1], - vc4_state->crtc_h); + vc4_state->x_scaling[1] = vc4_get_scaling_mode(src_w, crtc_w); + vc4_state->y_scaling[1] = vc4_get_scaling_mode(src_h, crtc_h);
/* YUV conversion requires that horizontal scaling be enabled, * even on a plane that's otherwise 1:1. Looks like only PPF * works in that case, so let's pick that one. */ - if (vc4_state->is_unity) + if (is_unity(state)) vc4_state->x_scaling[0] = VC4_SCALING_PPF; } else { vc4_state->x_scaling[1] = VC4_SCALING_NONE; vc4_state->y_scaling[1] = VC4_SCALING_NONE; }
- /* No configuring scaling on the cursor plane, since it gets - non-vblank-synced updates, and scaling requires requires - LBM changes which have to be vblank-synced. - */ - if (plane->type == DRM_PLANE_TYPE_CURSOR && !vc4_state->is_unity) - return -EINVAL; - - /* Clamp the on-screen start x/y to 0. The hardware doesn't - * support negative y, and negative x wastes bandwidth. - */ - if (vc4_state->crtc_x < 0) { - for (i = 0; i < num_planes; i++) { - u32 cpp = fb->format->cpp[i]; - u32 subs = ((i == 0) ? 1 : h_subsample); - - vc4_state->offsets[i] += (cpp * - (-vc4_state->crtc_x) / subs); - } - vc4_state->src_w[0] += vc4_state->crtc_x; - vc4_state->src_w[1] += vc4_state->crtc_x / h_subsample; - vc4_state->crtc_x = 0; - } - - if (vc4_state->crtc_y < 0) { - for (i = 0; i < num_planes; i++) { - u32 subs = ((i == 0) ? 1 : v_subsample); - - vc4_state->offsets[i] += (fb->pitches[i] * - (-vc4_state->crtc_y) / subs); - } - vc4_state->src_h[0] += vc4_state->crtc_y; - vc4_state->src_h[1] += vc4_state->crtc_y / v_subsample; - vc4_state->crtc_y = 0; + /* Adjust the base pointer to the first pixel to be scanned out. */ + for (i = 0; i < num_planes; i++) { + vc4_state->offsets[i] += (src_y / (i ? v_subsample : 1)) * + fb->pitches[i]; + vc4_state->offsets[i] += (src_x / (i ? h_subsample : 1)) * + fb->format->cpp[i]; }
return 0; @@ -398,11 +399,13 @@ static u32 vc4_lbm_size(struct drm_plane_state *state) /* This is the worst case number. One of the two sizes will * be used depending on the scaling configuration. */ - u32 pix_per_line = max(vc4_state->src_w[0], (u32)vc4_state->crtc_w); + u32 crtc_w = state->dst.x2 - state->dst.x1; + u32 src_w = (state->src.x2 - state->src.x1) >> 16; + u32 pix_per_line = max(src_w, crtc_w); u32 lbm;
- if (!vc4_state->is_yuv) { - if (vc4_state->is_unity) + if (!is_yuv(state)) { + if (is_unity(state)) return 0; else if (vc4_state->y_scaling[0] == VC4_SCALING_TPZ) lbm = pix_per_line * 8; @@ -427,30 +430,34 @@ static void vc4_write_scaling_parameters(struct drm_plane_state *state, int channel) { struct vc4_plane_state *vc4_state = to_vc4_plane_state(state); + u32 src_w = (state->src.x2 - state->src.x1) >> 16; + u32 src_h = (state->src.y2 - state->src.y1) >> 16; + u32 crtc_w = state->dst.x2 - state->dst.x1; + u32 crtc_h = state->dst.y2 - state->dst.y1; + u32 fmt = state->fb->format->format; + + if (channel) { + src_w /= drm_format_horz_chroma_subsampling(fmt); + src_h /= drm_format_horz_chroma_subsampling(fmt); + }
/* Ch0 H-PPF Word 0: Scaling Parameters */ - if (vc4_state->x_scaling[channel] == VC4_SCALING_PPF) { - vc4_write_ppf(vc4_state, - vc4_state->src_w[channel], vc4_state->crtc_w); - } + if (vc4_state->x_scaling[channel] == VC4_SCALING_PPF) + vc4_write_ppf(vc4_state, src_w, crtc_w);
/* Ch0 V-PPF Words 0-1: Scaling Parameters, Context */ if (vc4_state->y_scaling[channel] == VC4_SCALING_PPF) { - vc4_write_ppf(vc4_state, - vc4_state->src_h[channel], vc4_state->crtc_h); + vc4_write_ppf(vc4_state, src_h, crtc_h); vc4_dlist_write(vc4_state, 0xc0c0c0c0); }
/* Ch0 H-TPZ Words 0-1: Scaling Parameters, Recip */ - if (vc4_state->x_scaling[channel] == VC4_SCALING_TPZ) { - vc4_write_tpz(vc4_state, - vc4_state->src_w[channel], vc4_state->crtc_w); - } + if (vc4_state->x_scaling[channel] == VC4_SCALING_TPZ) + vc4_write_tpz(vc4_state, src_w, crtc_w);
/* Ch0 V-TPZ Words 0-2: Scaling Parameters, Recip, Context */ if (vc4_state->y_scaling[channel] == VC4_SCALING_TPZ) { - vc4_write_tpz(vc4_state, - vc4_state->src_h[channel], vc4_state->crtc_h); + vc4_write_tpz(vc4_state, src_h, crtc_h); vc4_dlist_write(vc4_state, 0xc0c0c0c0); } } @@ -468,6 +475,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane, const struct hvs_format *format = vc4_get_hvs_format(fb->format->format); u64 base_format_mod = fourcc_mod_broadcom_mod(fb->modifier); int num_planes = drm_format_num_planes(format->drm); + u32 src_x, src_y, crtc_h, crtc_w, src_h, src_w; bool mix_plane_alpha; bool covers_screen; u32 scl0, scl1, pitch0; @@ -513,6 +521,13 @@ static int vc4_plane_mode_set(struct drm_plane *plane, scl1 = vc4_get_scl_field(state, 0); }
+ crtc_w = state->dst.x2 - state->dst.x1; + crtc_h = state->dst.y2 - state->dst.y1; + src_x = state->src.x1 >> 16; + src_y = state->src.y1 >> 16; + src_w = (state->src.x2 - state->src.x1) >> 16; + src_h = (state->src.y2 - state->src.y1) >> 16; + switch (base_format_mod) { case DRM_FORMAT_MOD_LINEAR: tiling = SCALER_CTL0_TILING_LINEAR; @@ -592,7 +607,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane, (format->pixel_order << SCALER_CTL0_ORDER_SHIFT) | (hvs_format << SCALER_CTL0_PIXEL_FORMAT_SHIFT) | VC4_SET_FIELD(tiling, SCALER_CTL0_TILING) | - (vc4_state->is_unity ? SCALER_CTL0_UNITY : 0) | + (is_unity(state) ? SCALER_CTL0_UNITY : 0) | VC4_SET_FIELD(scl0, SCALER_CTL0_SCL0) | VC4_SET_FIELD(scl1, SCALER_CTL0_SCL1));
@@ -600,17 +615,14 @@ static int vc4_plane_mode_set(struct drm_plane *plane, vc4_state->pos0_offset = vc4_state->dlist_count; vc4_dlist_write(vc4_state, VC4_SET_FIELD(state->alpha >> 8, SCALER_POS0_FIXED_ALPHA) | - VC4_SET_FIELD(vc4_state->crtc_x, SCALER_POS0_START_X) | - VC4_SET_FIELD(vc4_state->crtc_y, SCALER_POS0_START_Y)); + VC4_SET_FIELD(state->dst.x1, SCALER_POS0_START_X) | + VC4_SET_FIELD(state->dst.y1, SCALER_POS0_START_Y));
/* Position Word 1: Scaled Image Dimensions. */ - if (!vc4_state->is_unity) { + if (!is_unity(state)) vc4_dlist_write(vc4_state, - VC4_SET_FIELD(vc4_state->crtc_w, - SCALER_POS1_SCL_WIDTH) | - VC4_SET_FIELD(vc4_state->crtc_h, - SCALER_POS1_SCL_HEIGHT)); - } + VC4_SET_FIELD(crtc_w, SCALER_POS1_SCL_WIDTH) | + VC4_SET_FIELD(crtc_h, SCALER_POS1_SCL_HEIGHT));
/* Don't waste cycles mixing with plane alpha if the set alpha * is opaque or there is no per-pixel alpha information. @@ -628,8 +640,8 @@ static int vc4_plane_mode_set(struct drm_plane *plane, SCALER_POS2_ALPHA_MODE) | (mix_plane_alpha ? SCALER_POS2_ALPHA_MIX : 0) | (fb->format->has_alpha ? SCALER_POS2_ALPHA_PREMULT : 0) | - VC4_SET_FIELD(vc4_state->src_w[0], SCALER_POS2_WIDTH) | - VC4_SET_FIELD(vc4_state->src_h[0], SCALER_POS2_HEIGHT)); + VC4_SET_FIELD(src_w, SCALER_POS2_WIDTH) | + VC4_SET_FIELD(src_h, SCALER_POS2_HEIGHT));
/* Position Word 3: Context. Written by the HVS. */ vc4_dlist_write(vc4_state, 0xc0c0c0c0); @@ -662,7 +674,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane, }
/* Colorspace conversion words */ - if (vc4_state->is_yuv) { + if (is_yuv(state)) { vc4_dlist_write(vc4_state, SCALER_CSC0_ITR_R_601_5); vc4_dlist_write(vc4_state, SCALER_CSC1_ITR_R_601_5); vc4_dlist_write(vc4_state, SCALER_CSC2_ITR_R_601_5); @@ -712,9 +724,9 @@ static int vc4_plane_mode_set(struct drm_plane *plane, VC4_SET_FIELD(vc4_state->dlist_count, SCALER_CTL0_SIZE);
/* crtc_* are already clipped coordinates. */ - covers_screen = vc4_state->crtc_x == 0 && vc4_state->crtc_y == 0 && - vc4_state->crtc_w == state->crtc->mode.hdisplay && - vc4_state->crtc_h == state->crtc->mode.vdisplay; + covers_screen = state->crtc_x == 0 && state->crtc_y == 0 && + crtc_w == state->crtc->mode.hdisplay && + crtc_h == state->crtc->mode.vdisplay; /* Background fill might be necessary when the plane has per-pixel * alpha content or a non-opaque plane alpha and could blend from the * background or does not cover the entire screen.
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
drm_atomic_helper_check_plane_state() takes care of checking the scaling capabilities and calculating the clipped X/Y offsets for us.
Rely on this function instead of open-coding the logic. While at it, we get rid of a few fields in vc4_plane_state that can be easily extracted from drm_plane_state.
Incidentally, it seems to fix a problem we had with negative X/Y positioning of YUV planes.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
drivers/gpu/drm/vc4/vc4_drv.h | 9 -- drivers/gpu/drm/vc4/vc4_plane.c | 210 +++++++++++++++++++++------------------- 2 files changed, 111 insertions(+), 108 deletions(-)
I feel like you ought to grab authorship of this patch -- you've made a bunch of good changes since my WIP patch.
if (num_planes > 1) {
vc4_state->is_yuv = true;
h_subsample = drm_format_horz_chroma_subsampling(format);
v_subsample = drm_format_vert_chroma_subsampling(format);
vc4_state->src_w[1] = vc4_state->src_w[0] / h_subsample;
vc4_state->src_h[1] = vc4_state->src_h[0] / v_subsample;
src_w /= h_subsample;
src_h /= v_subsample;
I'm a little nervous about leaving src_w/src_h divided after this block, in case we end up using them in some other calculation in a later change. Could we just move the divide into the vc4_get_scaling_mode() call?
In general, I'm not enthusiastic about having src_w computations in 5 places now. Was keeping it in the vc4 state that much of a problem?
I'm not saying no, but copy-and-paste of these kinds of calculations are also a frequent source of bugs when you miss a x->y or w->h change.
On Fri, 27 Jul 2018 13:38:08 -0700 Eric Anholt eric@anholt.net wrote:
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
drm_atomic_helper_check_plane_state() takes care of checking the scaling capabilities and calculating the clipped X/Y offsets for us.
Rely on this function instead of open-coding the logic. While at it, we get rid of a few fields in vc4_plane_state that can be easily extracted from drm_plane_state.
Incidentally, it seems to fix a problem we had with negative X/Y positioning of YUV planes.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
drivers/gpu/drm/vc4/vc4_drv.h | 9 -- drivers/gpu/drm/vc4/vc4_plane.c | 210 +++++++++++++++++++++------------------- 2 files changed, 111 insertions(+), 108 deletions(-)
I feel like you ought to grab authorship of this patch -- you've made a bunch of good changes since my WIP patch.
if (num_planes > 1) {
vc4_state->is_yuv = true;
h_subsample = drm_format_horz_chroma_subsampling(format);
v_subsample = drm_format_vert_chroma_subsampling(format);
vc4_state->src_w[1] = vc4_state->src_w[0] / h_subsample;
vc4_state->src_h[1] = vc4_state->src_h[0] / v_subsample;
src_w /= h_subsample;
src_h /= v_subsample;
I'm a little nervous about leaving src_w/src_h divided after this block, in case we end up using them in some other calculation in a later change. Could we just move the divide into the vc4_get_scaling_mode() call?
In general, I'm not enthusiastic about having src_w computations in 5 places now. Was keeping it in the vc4 state that much of a problem?
No strong objection to keeping ->src_{h,w}[] arrays in vc4_plane_state.
I'm not saying no, but copy-and-paste of these kinds of calculations are also a frequent source of bugs when you miss a x->y or w->h change.
From: Eric Anholt eric@anholt.net
The offset adjustment depends on the framebuffer modified, so let's just move this operation in the DRM_FORMAT_MOD_LINEAR case inside vc4_plane_mode_set().
This we'll be able to fix offset calculation for DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED and DRM_FORMAT_MOD_BROADCOM_SANDXXX.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com --- drivers/gpu/drm/vc4/vc4_plane.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 39e1fa3a8466..2b8ba1c412be 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -354,14 +354,6 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state) vc4_state->y_scaling[1] = VC4_SCALING_NONE; }
- /* Adjust the base pointer to the first pixel to be scanned out. */ - for (i = 0; i < num_planes; i++) { - vc4_state->offsets[i] += (src_y / (i ? v_subsample : 1)) * - fb->pitches[i]; - vc4_state->offsets[i] += (src_x / (i ? h_subsample : 1)) * - fb->format->cpp[i]; - } - return 0; }
@@ -476,6 +468,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane, u64 base_format_mod = fourcc_mod_broadcom_mod(fb->modifier); int num_planes = drm_format_num_planes(format->drm); u32 src_x, src_y, crtc_h, crtc_w, src_h, src_w; + u32 h_subsample, v_subsample; bool mix_plane_alpha; bool covers_screen; u32 scl0, scl1, pitch0; @@ -527,11 +520,25 @@ static int vc4_plane_mode_set(struct drm_plane *plane, src_y = state->src.y1 >> 16; src_w = (state->src.x2 - state->src.x1) >> 16; src_h = (state->src.y2 - state->src.y1) >> 16; + h_subsample = drm_format_horz_chroma_subsampling(format->drm); + v_subsample = drm_format_vert_chroma_subsampling(format->drm);
switch (base_format_mod) { case DRM_FORMAT_MOD_LINEAR: tiling = SCALER_CTL0_TILING_LINEAR; pitch0 = VC4_SET_FIELD(fb->pitches[0], SCALER_SRC_PITCH); + + /* Adjust the base pointer to the first pixel to be scanned + * out. + */ + for (i = 0; i < num_planes; i++) { + vc4_state->offsets[i] += src_y / + (i ? v_subsample : 1) * + fb->pitches[i]; + vc4_state->offsets[i] += src_x / + (i ? h_subsample : 1) * + fb->format->cpp[i]; + } break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: {
On Wed, 25 Jul 2018 17:32:08 +0200 Boris Brezillon boris.brezillon@bootlin.com wrote:
From: Eric Anholt eric@anholt.net
The offset adjustment depends on the framebuffer modified, so let's
^ modifier
just move this operation in the DRM_FORMAT_MOD_LINEAR case inside vc4_plane_mode_set().
This we'll be able to fix offset calculation for
^ Thanks to this, we'll ...
DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED and DRM_FORMAT_MOD_BROADCOM_SANDXXX.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
drivers/gpu/drm/vc4/vc4_plane.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 39e1fa3a8466..2b8ba1c412be 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -354,14 +354,6 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state) vc4_state->y_scaling[1] = VC4_SCALING_NONE; }
- /* Adjust the base pointer to the first pixel to be scanned out. */
- for (i = 0; i < num_planes; i++) {
vc4_state->offsets[i] += (src_y / (i ? v_subsample : 1)) *
fb->pitches[i];
vc4_state->offsets[i] += (src_x / (i ? h_subsample : 1)) *
fb->format->cpp[i];
- }
- return 0;
}
@@ -476,6 +468,7 @@ static int vc4_plane_mode_set(struct drm_plane *plane, u64 base_format_mod = fourcc_mod_broadcom_mod(fb->modifier); int num_planes = drm_format_num_planes(format->drm); u32 src_x, src_y, crtc_h, crtc_w, src_h, src_w;
- u32 h_subsample, v_subsample; bool mix_plane_alpha; bool covers_screen; u32 scl0, scl1, pitch0;
@@ -527,11 +520,25 @@ static int vc4_plane_mode_set(struct drm_plane *plane, src_y = state->src.y1 >> 16; src_w = (state->src.x2 - state->src.x1) >> 16; src_h = (state->src.y2 - state->src.y1) >> 16;
h_subsample = drm_format_horz_chroma_subsampling(format->drm);
v_subsample = drm_format_vert_chroma_subsampling(format->drm);
switch (base_format_mod) { case DRM_FORMAT_MOD_LINEAR: tiling = SCALER_CTL0_TILING_LINEAR; pitch0 = VC4_SET_FIELD(fb->pitches[0], SCALER_SRC_PITCH);
/* Adjust the base pointer to the first pixel to be scanned
* out.
*/
for (i = 0; i < num_planes; i++) {
vc4_state->offsets[i] += src_y /
(i ? v_subsample : 1) *
fb->pitches[i];
vc4_state->offsets[i] += src_x /
(i ? h_subsample : 1) *
fb->format->cpp[i];
}
break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: {
From: Eric Anholt eric@anholt.net
X/Y positioning of T-format buffers is quite tricky and the current implementation was failing to position a plane using this format correctly when the X, Y or both X and Y offsets were negative.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com --- Hi Eric,
I kept the SoB and authorship since you're the original author, but I also significantly reworked the code, so I'd be more confident if you could have a close look at this code.
It's been tested with a modified modetest (one that supports generating T-format buffers) and it seems to work fine with any kind of negative X/Y offset (both when starting on an even or an odd tile row).
Also, I intentionally did not add a Fixes and Cc-stable to this commit because it depends on a rework we've done in vc4_plane_setup_clipping_and_scaling() which cannot be easily backported. What I could do though is add a patch that rejects all negative crtc_{x,y}.
Regards,
Boris --- drivers/gpu/drm/vc4/vc4_plane.c | 51 +++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 2b8ba1c412be..ade47c3f65d1 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -539,22 +539,59 @@ static int vc4_plane_mode_set(struct drm_plane *plane, (i ? h_subsample : 1) * fb->format->cpp[i]; } + break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: { - /* For T-tiled, the FB pitch is "how many bytes from - * one row to the next, such that pitch * tile_h == - * tile_size * tiles_per_row." - */ u32 tile_size_shift = 12; /* T tiles are 4kb */ + /* Whole-tile offsets, mostly for setting the pitch. */ + u32 tile_w_shift = fb->format->cpp[0] == 2 ? 6 : 5; u32 tile_h_shift = 5; /* 16 and 32bpp are 32 pixels high */ + u32 tile_w_mask = (1 << tile_w_shift) - 1; + /* The height mask on 32-bit-per-pixel tiles is 63, i.e. 2 + * times the height (in pixels) of a 4k tile. I just assumed + * this is also true for other RGB formats, but maybe it's not. + */ + u32 tile_h_mask = (2 << tile_h_shift) - 1; + /* For T-tiled, the FB pitch is "how many bytes from one row to + * the next, such that + * + * pitch * tile_h == tile_size * tiles_per_row + */ u32 tiles_w = fb->pitches[0] >> (tile_size_shift - tile_h_shift); + u32 tiles_l = src_x >> tile_w_shift; + u32 tiles_r = tiles_w - tiles_l; + u32 tiles_t = src_y >> tile_h_shift; + /* Intra-tile offsets, which modify the base address (the + * SCALER_PITCH0_TILE_Y_OFFSET tells HVS how to walk from that + * base address). + */ + u32 tile_y = (src_y >> 4) & 1; + u32 subtile_y = (src_y >> 2) & 3; + u32 utile_y = src_y & 3; + u32 x_off = src_x & tile_w_mask; + u32 y_off = src_y & tile_h_mask;
tiling = SCALER_CTL0_TILING_256B_OR_T; + pitch0 = (VC4_SET_FIELD(x_off, SCALER_PITCH0_SINK_PIX) | + VC4_SET_FIELD(y_off, SCALER_PITCH0_TILE_Y_OFFSET) | + VC4_SET_FIELD(tiles_l, SCALER_PITCH0_TILE_WIDTH_L) | + VC4_SET_FIELD(tiles_r, SCALER_PITCH0_TILE_WIDTH_R)); + vc4_state->offsets[0] += tiles_t * (tiles_w << tile_size_shift); + vc4_state->offsets[0] += subtile_y << 8; + vc4_state->offsets[0] += utile_y << 4; + + /* Rows of tiles alternate left-to-right and right-to-left. */ + if (tiles_t & 1) { + pitch0 |= SCALER_PITCH0_TILE_INITIAL_LINE_DIR; + vc4_state->offsets[0] += (tiles_w - tiles_l) << + tile_size_shift; + vc4_state->offsets[0] -= (1 + !tile_y) << 10; + } else { + vc4_state->offsets[0] += tiles_l << tile_size_shift; + vc4_state->offsets[0] += tile_y << 10; + }
- pitch0 = (VC4_SET_FIELD(0, SCALER_PITCH0_TILE_Y_OFFSET) | - VC4_SET_FIELD(0, SCALER_PITCH0_TILE_WIDTH_L) | - VC4_SET_FIELD(tiles_w, SCALER_PITCH0_TILE_WIDTH_R)); break; }
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
X/Y positioning of T-format buffers is quite tricky and the current implementation was failing to position a plane using this format correctly when the X, Y or both X and Y offsets were negative.
Wait, were things working for you with even postivie X/Y offsets on T? Because it wasn't for me, and I think the tile_h_mask change is important for making positive work.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
Hi Eric,
I kept the SoB and authorship since you're the original author, but I also significantly reworked the code, so I'd be more confident if you could have a close look at this code.
I think you should definitely grab authorship on this one. You did the work to make it actually work.
Also, I intentionally did not add a Fixes and Cc-stable to this commit because it depends on a rework we've done in vc4_plane_setup_clipping_and_scaling() which cannot be easily backported. What I could do though is add a patch that rejects all negative crtc_{x,y}.
Agreed. Given that we're trying to fix a bug that nobody else has reported to me yet, I think we can skip dealing with this for stable.
drivers/gpu/drm/vc4/vc4_plane.c | 51 +++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 2b8ba1c412be..ade47c3f65d1 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -539,22 +539,59 @@ static int vc4_plane_mode_set(struct drm_plane *plane, (i ? h_subsample : 1) * fb->format->cpp[i]; }
break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: {
/* For T-tiled, the FB pitch is "how many bytes from
* one row to the next, such that pitch * tile_h ==
* tile_size * tiles_per_row."
u32 tile_size_shift = 12; /* T tiles are 4kb */*/
/* Whole-tile offsets, mostly for setting the pitch. */
u32 tile_h_shift = 5; /* 16 and 32bpp are 32 pixels high */u32 tile_w_shift = fb->format->cpp[0] == 2 ? 6 : 5;
u32 tile_w_mask = (1 << tile_w_shift) - 1;
/* The height mask on 32-bit-per-pixel tiles is 63, i.e. 2
* times the height (in pixels) of a 4k tile. I just assumed
* this is also true for other RGB formats, but maybe it's not.
*/
u32 tile_h_mask = (2 << tile_h_shift) - 1;
Only 2 and 4-byte formats are supported for T format, and tiles are 32 pixels high for both of those.
Other than that,
Reviewed-by: Eric Anholt eric@anholt.net
I have a comment on patch 3 I'd like to sort out, but other than that I'm pleased with this whole series. Thanks for persevering on it!
On Fri, 27 Jul 2018 13:46:31 -0700 Eric Anholt eric@anholt.net wrote:
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
X/Y positioning of T-format buffers is quite tricky and the current implementation was failing to position a plane using this format correctly when the X, Y or both X and Y offsets were negative.
Wait, were things working for you with even postivie X/Y offsets on T? Because it wasn't for me, and I think the tile_h_mask change is important for making positive work.
Not so sure. I've done so many tests that I don't remember which exact setups were working and which ones were not. I'll test again before sending v2 and adjust the commit message accordingly.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
Hi Eric,
I kept the SoB and authorship since you're the original author, but I also significantly reworked the code, so I'd be more confident if you could have a close look at this code.
I think you should definitely grab authorship on this one. You did the work to make it actually work.
I will. Thanks.
Also, I intentionally did not add a Fixes and Cc-stable to this commit because it depends on a rework we've done in vc4_plane_setup_clipping_and_scaling() which cannot be easily backported. What I could do though is add a patch that rejects all negative crtc_{x,y}.
Agreed. Given that we're trying to fix a bug that nobody else has reported to me yet, I think we can skip dealing with this for stable.
drivers/gpu/drm/vc4/vc4_plane.c | 51 +++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 2b8ba1c412be..ade47c3f65d1 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -539,22 +539,59 @@ static int vc4_plane_mode_set(struct drm_plane *plane, (i ? h_subsample : 1) * fb->format->cpp[i]; }
break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: {
/* For T-tiled, the FB pitch is "how many bytes from
* one row to the next, such that pitch * tile_h ==
* tile_size * tiles_per_row."
u32 tile_size_shift = 12; /* T tiles are 4kb */*/
/* Whole-tile offsets, mostly for setting the pitch. */
u32 tile_h_shift = 5; /* 16 and 32bpp are 32 pixels high */u32 tile_w_shift = fb->format->cpp[0] == 2 ? 6 : 5;
u32 tile_w_mask = (1 << tile_w_shift) - 1;
/* The height mask on 32-bit-per-pixel tiles is 63, i.e. 2
* times the height (in pixels) of a 4k tile. I just assumed
* this is also true for other RGB formats, but maybe it's not.
*/
u32 tile_h_mask = (2 << tile_h_shift) - 1;
Only 2 and 4-byte formats are supported for T format, and tiles are 32 pixels high for both of those.
Other than that,
Reviewed-by: Eric Anholt eric@anholt.net
I have a comment on patch 3 I'd like to sort out,
I'll address that one in v2.
but other than that I'm pleased with this whole series. Thanks for persevering on it!
On Fri, 27 Jul 2018 13:46:31 -0700 Eric Anholt eric@anholt.net wrote:
Boris Brezillon boris.brezillon@bootlin.com writes:
From: Eric Anholt eric@anholt.net
X/Y positioning of T-format buffers is quite tricky and the current implementation was failing to position a plane using this format correctly when the X, Y or both X and Y offsets were negative.
Wait, were things working for you with even postivie X/Y offsets on T?
I think I was talking about crtc_x/y, while you were probably talking about src_x/y. If you have src_x/y = 0 and crtc_x/y >= 0 it works fine, if you have src_x/y = 0 and crtc_x/y < 0 it doesn't work, and if you have src_x/y != 0 it doesn't work either.
Because it wasn't for me, and I think the tile_h_mask change is important for making positive work.
Yes, for positive src_x/y it does matter. I'll clarify that in my commit message.
Signed-off-by: Eric Anholt eric@anholt.net Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com
Hi Eric,
I kept the SoB and authorship since you're the original author, but I also significantly reworked the code, so I'd be more confident if you could have a close look at this code.
I think you should definitely grab authorship on this one. You did the work to make it actually work.
Also, I intentionally did not add a Fixes and Cc-stable to this commit because it depends on a rework we've done in vc4_plane_setup_clipping_and_scaling() which cannot be easily backported. What I could do though is add a patch that rejects all negative crtc_{x,y}.
Agreed. Given that we're trying to fix a bug that nobody else has reported to me yet, I think we can skip dealing with this for stable.
drivers/gpu/drm/vc4/vc4_plane.c | 51 +++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c index 2b8ba1c412be..ade47c3f65d1 100644 --- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -539,22 +539,59 @@ static int vc4_plane_mode_set(struct drm_plane *plane, (i ? h_subsample : 1) * fb->format->cpp[i]; }
break;
case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED: {
/* For T-tiled, the FB pitch is "how many bytes from
* one row to the next, such that pitch * tile_h ==
* tile_size * tiles_per_row."
u32 tile_size_shift = 12; /* T tiles are 4kb */*/
/* Whole-tile offsets, mostly for setting the pitch. */
u32 tile_h_shift = 5; /* 16 and 32bpp are 32 pixels high */u32 tile_w_shift = fb->format->cpp[0] == 2 ? 6 : 5;
u32 tile_w_mask = (1 << tile_w_shift) - 1;
/* The height mask on 32-bit-per-pixel tiles is 63, i.e. 2
* times the height (in pixels) of a 4k tile. I just assumed
* this is also true for other RGB formats, but maybe it's not.
*/
u32 tile_h_mask = (2 << tile_h_shift) - 1;
Only 2 and 4-byte formats are supported for T format, and tiles are 32 pixels high for both of those.
Okay, I'll drop the second sentence then.
Other than that,
Reviewed-by: Eric Anholt eric@anholt.net
I have a comment on patch 3 I'd like to sort out, but other than that I'm pleased with this whole series. Thanks for persevering on it!
dri-devel@lists.freedesktop.org