In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org ---
drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 9b19eee0e1b4..9c9463ec5465 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) } EXPORT_SYMBOL(drm_add_override_edid_modes);
-static struct edid *drm_do_get_edid_base_block( +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, int (*get_edid_block)(void *data, u8 *buf, unsigned int block, size_t len), - void *data, bool *edid_corrupt, int *null_edid_counter) + void *data) { - int i; + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL; + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; void *edid; + int i;
edid = kmalloc(EDID_LENGTH, GFP_KERNEL); if (edid == NULL) @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( return edid;
carp: - kfree(edid); - return ERR_PTR(-EINVAL); - + if (connector) + connector_bad_edid(connector, edid, 1); out: kfree(edid); return NULL; @@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, if (override) return override;
- edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data, - &connector->edid_corrupt, - &connector->null_edid_counter); - if (IS_ERR_OR_NULL(edid)) { - if (IS_ERR(edid)) - connector_bad_edid(connector, edid, 1); + edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data); + if (!edid) return NULL; - }
/* if there's no extensions or no connector, we're done */ valid_extensions = edid[0x7e]; @@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter) struct edid *edid; u32 panel_id;
- edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter, - NULL, NULL); + edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter);
/* * There are no manufacturer IDs of 0, so if there is a problem reading * the EDID then we'll just return 0. */ - if (IS_ERR_OR_NULL(edid)) + if (!edid) return 0;
panel_id = edid_extract_panel_id(edid);
Hi Douglas,
On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson dianders@chromium.org wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
The crash is was seeing is gone, so Tested-by: Geert Uytterhoeven geert+renesas@glider.be
Gr{oetje,eeting}s,
Geert
Hi,
On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Douglas,
On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson dianders@chromium.org wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
The crash is was seeing is gone, so Tested-by: Geert Uytterhoeven geert+renesas@glider.be
Thanks for testing! I'll plan to apply tomorrow morning (California time) to balance between giving folks a chance to yell at me for my patch and the urgency of fixing the breakage.
-Doug
Hi,
On Mon, Oct 4, 2021 at 5:40 PM Doug Anderson dianders@chromium.org wrote:
Hi,
On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Douglas,
On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson dianders@chromium.org wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
The crash is was seeing is gone, so Tested-by: Geert Uytterhoeven geert+renesas@glider.be
Thanks for testing! I'll plan to apply tomorrow morning (California time) to balance between giving folks a chance to yell at me for my patch and the urgency of fixing the breakage.
Ah, doh! I can't push until I can get a review tag from someone. As soon as I see one then I'll give it a push.
-Doug
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 9b19eee0e1b4..9c9463ec5465 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) } EXPORT_SYMBOL(drm_add_override_edid_modes);
-static struct edid *drm_do_get_edid_base_block( +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, int (*get_edid_block)(void *data, u8 *buf, unsigned int block, size_t len),
- void *data, bool *edid_corrupt, int *null_edid_counter)
- void *data)
{
- int i;
int *null_edid_counter = connector ? &connector->null_edid_counter : NULL;
bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; void *edid;
int i;
edid = kmalloc(EDID_LENGTH, GFP_KERNEL); if (edid == NULL)
@@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( return edid;
carp:
- kfree(edid);
- return ERR_PTR(-EINVAL);
- if (connector)
connector_bad_edid(connector, edid, 1);
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of the kernel. I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME for me...
[AMD Official Use Only]
Hi Ville:
Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it.
Regards, Jerry
-----Original Message----- From: Ville Syrjälä ville.syrjala@linux.intel.com Sent: October 4, 2021 3:45 PM To: Douglas Anderson dianders@chromium.org Cc: dri-devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; Daniel Vetter daniel@ffwll.ch; David Airlie airlied@linux.ie; Jani Nikula jani.nikula@intel.com; Linus Walleij linus.walleij@linaro.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; Sam Ravnborg sam@ravnborg.org; Thomas Zimmermann tzimmermann@suse.de; linux-kernel@vger.kernel.org; Zuo, Jerry Jerry.Zuo@amd.com; Wentland, Harry Harry.Wentland@amd.com; Siqueira, Rodrigo Rodrigo.Siqueira@amd.com Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 9b19eee0e1b4..9c9463ec5465 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) } EXPORT_SYMBOL(drm_add_override_edid_modes);
-static struct edid *drm_do_get_edid_base_block( +static struct edid *drm_do_get_edid_base_block(struct drm_connector +*connector, int (*get_edid_block)(void *data, u8 *buf, unsigned int block, size_t len),
- void *data, bool *edid_corrupt, int *null_edid_counter)
- void *data)
{
- int i;
- int *null_edid_counter = connector ? &connector-
null_edid_counter : NULL;
bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; void *edid;
int i;
edid = kmalloc(EDID_LENGTH, GFP_KERNEL); if (edid == NULL)
@@ -1941,9 +1943,8 @@ static struct edid
*drm_do_get_edid_base_block(
return edid;
carp:
- kfree(edid);
- return ERR_PTR(-EINVAL);
- if (connector)
connector_bad_edid(connector, edid, 1);
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of the kernel. I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME for me...
-- Ville Syrjälä Intel
Hi,
On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry Jerry.Zuo@amd.com wrote:
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of the kernel. I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME for me...
[AMD Official Use Only]
Hi Ville:
Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it.
Regards, Jerry
I've cut out other bits from this email and changed the subject line since I think this is an issue unrelated to the one my original patch was fixing.
I don't actually know a ton about DP compliance testing, but I attempted to try to be helpful and revert commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard to deal with the conflicts in the revert itself, but then things didn't compile because there are two places that use `real_edid_checksum` and that goes away if I revert the patch.
I've made an attempt to fix the problem by just adding a bounds check. Perhaps you can see if that looks good to you:
https://lore.kernel.org/r/20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c...
-Doug
[AMD Official Use Only]
-----Original Message----- From: Doug Anderson dianders@chromium.org Sent: October 5, 2021 11:14 AM To: Zuo, Jerry Jerry.Zuo@amd.com Cc: Ville Syrjälä ville.syrjala@linux.intel.com; dri- devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; Daniel Vetter daniel@ffwll.ch; David Airlie airlied@linux.ie; Jani Nikula jani.nikula@intel.com; Linus Walleij linus.walleij@linaro.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; Sam Ravnborg sam@ravnborg.org; Thomas Zimmermann tzimmermann@suse.de; linux-kernel@vger.kernel.org; Wentland, Harry Harry.Wentland@amd.com; Siqueira, Rodrigo Rodrigo.Siqueira@amd.com; Kuogee Hsieh khsieh@codeaurora.org Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)
Hi,
On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry Jerry.Zuo@amd.com wrote:
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to
happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of
the kernel.
I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME
for me...
[AMD Official Use Only]
Hi Ville:
Yhea, it is pretty old change from two years ago, and it is no long valid
anymore. Please simply drop it.
Regards, Jerry
I've cut out other bits from this email and changed the subject line since I think this is an issue unrelated to the one my original patch was fixing.
I don't actually know a ton about DP compliance testing, but I attempted to try to be helpful and revert commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard to deal with the conflicts in the revert itself, but then things didn't compile because there are two places that use `real_edid_checksum` and that goes away if I revert the patch.
I've made an attempt to fix the problem by just adding a bounds check. Perhaps you can see if that looks good to you:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore. kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1 295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90 b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1 000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3 D&reserved=0
-Doug
The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch.
On 2021-10-05 11:25, Zuo, Jerry wrote:
[AMD Official Use Only]
-----Original Message----- From: Doug Anderson dianders@chromium.org Sent: October 5, 2021 11:14 AM To: Zuo, Jerry Jerry.Zuo@amd.com Cc: Ville Syrjälä ville.syrjala@linux.intel.com; dri- devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; Daniel Vetter daniel@ffwll.ch; David Airlie airlied@linux.ie; Jani Nikula jani.nikula@intel.com; Linus Walleij linus.walleij@linaro.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; Sam Ravnborg sam@ravnborg.org; Thomas Zimmermann tzimmermann@suse.de; linux-kernel@vger.kernel.org; Wentland, Harry Harry.Wentland@amd.com; Siqueira, Rodrigo Rodrigo.Siqueira@amd.com; Kuogee Hsieh khsieh@codeaurora.org Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)
Hi,
On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry Jerry.Zuo@amd.com wrote:
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to
happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of
the kernel.
I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME
for me...
[AMD Official Use Only]
Hi Ville:
Yhea, it is pretty old change from two years ago, and it is no long valid
anymore. Please simply drop it.
Regards, Jerry
I've cut out other bits from this email and changed the subject line since I think this is an issue unrelated to the one my original patch was fixing.
I don't actually know a ton about DP compliance testing, but I attempted to try to be helpful and revert commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard to deal with the conflicts in the revert itself, but then things didn't compile because there are two places that use `real_edid_checksum` and that goes away if I revert the patch.
I've made an attempt to fix the problem by just adding a bounds check. Perhaps you can see if that looks good to you:
kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1
295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90 b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1 000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3 D&reserved=0
-Doug
The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch.
Can you try the CTS test with Doug's fix?
https://patchwork.freedesktop.org/patch/457537/
Harry
[AMD Official Use Only]
-----Original Message----- From: Wentland, Harry Harry.Wentland@amd.com Sent: October 5, 2021 2:04 PM To: Zuo, Jerry Jerry.Zuo@amd.com; Doug Anderson dianders@chromium.org Cc: Ville Syrjälä ville.syrjala@linux.intel.com; dri- devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; Daniel Vetter daniel@ffwll.ch; David Airlie airlied@linux.ie; Jani Nikula jani.nikula@intel.com; Linus Walleij linus.walleij@linaro.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; Sam Ravnborg sam@ravnborg.org; Thomas Zimmermann tzimmermann@suse.de; linux-kernel@vger.kernel.org; Siqueira, Rodrigo Rodrigo.Siqueira@amd.com; Kuogee Hsieh khsieh@codeaurora.org Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)
On 2021-10-05 11:25, Zuo, Jerry wrote:
[AMD Official Use Only]
-----Original Message----- From: Doug Anderson dianders@chromium.org Sent: October 5, 2021 11:14 AM To: Zuo, Jerry Jerry.Zuo@amd.com Cc: Ville Syrjälä ville.syrjala@linux.intel.com; dri- devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; Daniel Vetter daniel@ffwll.ch; David Airlie airlied@linux.ie; Jani Nikula jani.nikula@intel.com; Linus Walleij linus.walleij@linaro.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; Sam Ravnborg sam@ravnborg.org; Thomas Zimmermann tzimmermann@suse.de; linux-kernel@vger.kernel.org; Wentland, Harry Harry.Wentland@amd.com; Siqueira, Rodrigo Rodrigo.Siqueira@amd.com; Kuogee Hsieh khsieh@codeaurora.org Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID)
Hi,
On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry Jerry.Zuo@amd.com wrote:
BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to
happen.
Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight.
The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of
the kernel.
I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME
for me...
[AMD Official Use Only]
Hi Ville:
Yhea, it is pretty old change from two years ago, and it is no
long valid
anymore. Please simply drop it.
Regards, Jerry
I've cut out other bits from this email and changed the subject line since I think this is an issue unrelated to the one my original patch was
fixing.
I don't actually know a ton about DP compliance testing, but I attempted to try to be helpful and revert commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard to deal with the conflicts in the revert itself, but then things didn't compile because there are two places that use `real_edid_checksum` and that goes away if I revert the patch.
I've made an attempt to fix the problem by just adding a bounds check. Perhaps you can see if that looks good to you:
kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1
295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90
b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d
%7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJ
WIj
oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1
000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3
D&reserved=0
-Doug
The patch used for DP1.4 compliance edid corruption test. Let me double
check if edid corruption test could be passed without the patch.
Can you try the CTS test with Doug's fix?
https://patchwork.freedesktop.org/patch/457537/
Harry
Yes, I'll give a try on that.
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
A bit of historical fallout zone this part of the code. So not the easiest thing to read in general. But looks like what you have here should work.
Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 9b19eee0e1b4..9c9463ec5465 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) } EXPORT_SYMBOL(drm_add_override_edid_modes);
-static struct edid *drm_do_get_edid_base_block( +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, int (*get_edid_block)(void *data, u8 *buf, unsigned int block, size_t len),
- void *data, bool *edid_corrupt, int *null_edid_counter)
- void *data)
{
- int i;
int *null_edid_counter = connector ? &connector->null_edid_counter : NULL;
bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; void *edid;
int i;
edid = kmalloc(EDID_LENGTH, GFP_KERNEL); if (edid == NULL)
@@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( return edid;
carp:
- kfree(edid);
- return ERR_PTR(-EINVAL);
- if (connector)
connector_bad_edid(connector, edid, 1);
out: kfree(edid); return NULL; @@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, if (override) return override;
- edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data,
&connector->edid_corrupt,
&connector->null_edid_counter);
- if (IS_ERR_OR_NULL(edid)) {
if (IS_ERR(edid))
connector_bad_edid(connector, edid, 1);
- edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data);
- if (!edid) return NULL;
}
/* if there's no extensions or no connector, we're done */ valid_extensions = edid[0x7e];
@@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter) struct edid *edid; u32 panel_id;
- edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter,
NULL, NULL);
edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter);
/*
- There are no manufacturer IDs of 0, so if there is a problem reading
- the EDID then we'll just return 0.
*/
- if (IS_ERR_OR_NULL(edid))
if (!edid) return 0;
panel_id = edid_extract_panel_id(edid);
-- 2.33.0.800.g4c38ced690-goog
Hi,
On Tue, Oct 5, 2021 at 9:43 AM Ville Syrjälä ville.syrjala@linux.intel.com wrote:
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote:
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer.
Let's re-jigger things so we can pass the bad EDID in properly.
Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot oliver.sang@intel.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Douglas Anderson dianders@chromium.org
A bit of historical fallout zone this part of the code. So not the easiest thing to read in general. But looks like what you have here should work.
Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com
Thanks! Pushed to drm-misc/drm-misc-next:
e7bd95a7ed4e drm/edid: Fix crash with zero/invalid EDID
dri-devel@lists.freedesktop.org