From: Sean Paul seanpaul@chromium.org
Hey all, Earlier this week on my 5.5 kernel (I can't seem to get a 5.6 kernel to behave on any of my devices), I ran into the multi-reply problem that Wayne fixed in January. Without realizing there was already a fix upstream, I went about solving it in a different way. It wasn't until rebasing the patches on 5.6 for the list that I realized there was already a solution.
At any rate, I think this way of handling things might be a bit more performant. I'm not super happy with the cleanliness of the code: this series should make things easier to read, but I don't think I achieved that, so suggestions on how to break this apart are welcome.
Thanks,
Sean
Sean Paul (3):
  drm/mst: Separate sideband packet header parsing from message building
  drm/mst: Support simultaneous down replies
  drm/dp_mst: Remove single tx msg restriction.
 drivers/gpu/drm/drm_dp_mst_topology.c | 196 ++++++++++++++------------
 include/drm/drm_dp_mst_helper.h       |  65 ++++-----
 2 files changed, 137 insertions(+), 124 deletions(-)
From: Sean Paul seanpaul@chromium.org
In preparation for per-branch device down message handling, separate header parsing from message building. This will allow us to peek at the header to figure out which branch device the message is from before we start parsing the message contents.
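Since the diff below is dense, here is a simplified, standalone sketch of the resulting two-step flow (hypothetical struct and function names, not the kernel code): the caller first records the decoded header, then appends payload chunks separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the driver structs; field names mirror the
 * driver, but this is a standalone sketch, not the actual kernel code. */
struct msg_hdr {
	uint8_t msg_len;
	bool somt; /* start of message transaction */
	bool eomt; /* end of message transaction */
};

struct msg_rx {
	uint8_t chunk[48];
	uint8_t curchunk_idx;
	uint8_t curchunk_len;
	bool have_somt;
	bool have_eomt;
	struct msg_hdr initial_hdr;
};

/* Step 1: record the already-decoded header. The caller can use the
 * header to pick which receiver state "msg" points at before any bytes
 * are copied. */
static bool msg_set_header(struct msg_rx *msg, const struct msg_hdr *hdr)
{
	/* ignore out-of-order chunks from a failed transaction */
	if (!hdr->somt && !msg->have_somt)
		return false;
	/* a second start-of-transaction while one is open is bogus */
	if (hdr->somt && msg->have_somt)
		return false;

	msg->curchunk_idx = 0;
	msg->curchunk_len = hdr->msg_len;
	if (hdr->somt) {
		msg->initial_hdr = *hdr;
		msg->have_somt = true;
	}
	if (hdr->eomt)
		msg->have_eomt = true;
	return true;
}

/* Step 2: append the payload bytes that followed the header. */
static bool msg_build(struct msg_rx *msg, const uint8_t *buf, uint8_t len)
{
	if ((size_t)msg->curchunk_idx + len > sizeof(msg->chunk))
		return false;
	memcpy(&msg->chunk[msg->curchunk_idx], buf, len);
	msg->curchunk_idx += len;
	return true;
}
```

The point of the split is that header decoding no longer forces a choice of receiver state; the routing decision happens between the two steps.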
Signed-off-by: Sean Paul seanpaul@chromium.org
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 102 ++++++++++++++------------
 1 file changed, 57 insertions(+), 45 deletions(-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index a811247cecfef..e8bb39bb17a0f 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -687,51 +687,45 @@ static void drm_dp_encode_sideband_reply(struct drm_dp_sideband_msg_reply_body *
 	raw->cur_len = idx;
 }
-/* this adds a chunk of msg to the builder to get the final msg */
-static bool drm_dp_sideband_msg_build(struct drm_dp_sideband_msg_rx *msg,
-				      u8 *replybuf, u8 replybuflen, bool hdr)
+static int drm_dp_sideband_msg_set_header(struct drm_dp_sideband_msg_rx *msg,
+					  struct drm_dp_sideband_msg_hdr *hdr,
+					  u8 hdrlen)
 {
-	int ret;
-	u8 crc4;
+	/*
+	 * ignore out-of-order messages or messages that are part of a
+	 * failed transaction
+	 */
+	if (!hdr->somt && !msg->have_somt)
+		return false;
-	if (hdr) {
-		u8 hdrlen;
-		struct drm_dp_sideband_msg_hdr recv_hdr;
-		ret = drm_dp_decode_sideband_msg_hdr(&recv_hdr, replybuf, replybuflen, &hdrlen);
-		if (ret == false) {
-			print_hex_dump(KERN_DEBUG, "failed hdr", DUMP_PREFIX_NONE, 16, 1, replybuf, replybuflen, false);
-			return false;
-		}
+	/* get length contained in this portion */
+	msg->curchunk_idx = 0;
+	msg->curchunk_len = hdr->msg_len;
+	msg->curchunk_hdrlen = hdrlen;
-		/*
-		 * ignore out-of-order messages or messages that are part of a
-		 * failed transaction
-		 */
-		if (!recv_hdr.somt && !msg->have_somt)
-			return false;
+	/* we have already gotten an somt - don't bother parsing */
+	if (hdr->somt && msg->have_somt)
+		return false;
-		/* get length contained in this portion */
-		msg->curchunk_len = recv_hdr.msg_len;
-		msg->curchunk_hdrlen = hdrlen;
+	if (hdr->somt) {
+		memcpy(&msg->initial_hdr, hdr,
+		       sizeof(struct drm_dp_sideband_msg_hdr));
+		msg->have_somt = true;
+	}
+	if (hdr->eomt)
+		msg->have_eomt = true;
-		/* we have already gotten an somt - don't bother parsing */
-		if (recv_hdr.somt && msg->have_somt)
-			return false;
+	return true;
+}
-		if (recv_hdr.somt) {
-			memcpy(&msg->initial_hdr, &recv_hdr, sizeof(struct drm_dp_sideband_msg_hdr));
-			msg->have_somt = true;
-		}
-		if (recv_hdr.eomt)
-			msg->have_eomt = true;
+/* this adds a chunk of msg to the builder to get the final msg */
+static bool drm_dp_sideband_msg_build(struct drm_dp_sideband_msg_rx *msg,
+				      u8 *replybuf, u8 replybuflen)
+{
+	u8 crc4;
-		/* copy the bytes for the remainder of this header chunk */
-		msg->curchunk_idx = min(msg->curchunk_len, (u8)(replybuflen - hdrlen));
-		memcpy(&msg->chunk[0], replybuf + hdrlen, msg->curchunk_idx);
-	} else {
-		memcpy(&msg->chunk[msg->curchunk_idx], replybuf, replybuflen);
-		msg->curchunk_idx += replybuflen;
-	}
+	memcpy(&msg->chunk[msg->curchunk_idx], replybuf, replybuflen);
+	msg->curchunk_idx += replybuflen;
 	if (msg->curchunk_idx >= msg->curchunk_len) {
 		/* do CRC */
@@ -3702,25 +3696,43 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
 	u8 replyblock[32];
 	int replylen, curreply;
 	int ret;
+	u8 hdrlen;
+	struct drm_dp_sideband_msg_hdr hdr;
 	struct drm_dp_sideband_msg_rx *msg;
-	int basereg = up ? DP_SIDEBAND_MSG_UP_REQ_BASE : DP_SIDEBAND_MSG_DOWN_REP_BASE;
+	int basereg = up ? DP_SIDEBAND_MSG_UP_REQ_BASE :
+			   DP_SIDEBAND_MSG_DOWN_REP_BASE;
+
 	msg = up ? &mgr->up_req_recv : &mgr->down_rep_recv;
 	len = min(mgr->max_dpcd_transaction_bytes, 16);
-	ret = drm_dp_dpcd_read(mgr->aux, basereg,
-			       replyblock, len);
+	ret = drm_dp_dpcd_read(mgr->aux, basereg, replyblock, len);
 	if (ret != len) {
 		DRM_DEBUG_KMS("failed to read DPCD down rep %d %d\n", len, ret);
 		return false;
 	}
-	ret = drm_dp_sideband_msg_build(msg, replyblock, len, true);
+
+	ret = drm_dp_decode_sideband_msg_hdr(&hdr, replyblock, len, &hdrlen);
+	if (ret == false) {
+		print_hex_dump(KERN_DEBUG, "failed hdr", DUMP_PREFIX_NONE, 16,
+			       1, replyblock, len, false);
+		DRM_DEBUG_KMS("ERROR: failed header\n");
+		return false;
+	}
+
+	if (!drm_dp_sideband_msg_set_header(msg, &hdr, hdrlen)) {
+		DRM_DEBUG_KMS("sideband msg set header failed %d\n",
+			      replyblock[0]);
+		return false;
+	}
+
+	replylen = min(msg->curchunk_len, (u8)(len - hdrlen));
+	ret = drm_dp_sideband_msg_build(msg, replyblock + hdrlen, replylen);
 	if (!ret) {
 		DRM_DEBUG_KMS("sideband msg build failed %d\n", replyblock[0]);
 		return false;
 	}
-	replylen = msg->curchunk_len + msg->curchunk_hdrlen;
-	replylen -= len;
+	replylen = msg->curchunk_len + msg->curchunk_hdrlen - len;
 	curreply = len;
 	while (replylen > 0) {
 		len = min3(replylen, mgr->max_dpcd_transaction_bytes, 16);
@@ -3732,7 +3744,7 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
 			return false;
 		}
-		ret = drm_dp_sideband_msg_build(msg, replyblock, len, false);
+		ret = drm_dp_sideband_msg_build(msg, replyblock, len);
 		if (!ret) {
 			DRM_DEBUG_KMS("failed to build sideband msg\n");
 			return false;
From: Sean Paul seanpaul@chromium.org
Currently we have one down reply message servicing the mst manager, so we need to serialize all tx msgs to ensure we only have one message in flight at a time. For obvious reasons this is suboptimal (but less suboptimal than the free-for-all we had before serialization).
This patch removes the single down_rep_recv message from the manager and adds 2 replies in the branch structure. The 2 replies mirror the tx_slots, which we use to rate-limit outgoing messages, and correspond to the seqno in the packet headers.
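As a rough illustration of the bookkeeping this introduces (hypothetical, simplified types; the real code stores full drm_dp_sideband_msg_rx state per slot), each branch device carries one receiver slot per sideband seqno, so two replies can be assembled at once:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model: each branch device gets one receiver
 * slot per sideband seqno (0 or 1), mirroring tx_slots. */
struct rx_state {
	int bytes_received; /* stand-in for the real receiver state */
};

struct branch {
	struct rx_state down_rep_recv[2]; /* indexed by header seqno */
};

/* Route an incoming reply chunk to the receiver slot named by its
 * header seqno, so replies on both seqnos assemble independently. */
static struct rx_state *reply_slot(struct branch *b, int seqno)
{
	if (seqno != 0 && seqno != 1)
		return NULL; /* the header's seqno field is a single bit */
	return &b->down_rep_recv[seqno];
}
```

Because the slots live on the branch (not the manager), replies from different branch devices, or on different seqnos of the same branch, no longer clobber one shared receiver.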
Cc: Wayne Lin Wayne.Lin@amd.com
Signed-off-by: Sean Paul seanpaul@chromium.org
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 80 ++++++++++++++++-----------
 include/drm/drm_dp_mst_helper.h       | 59 ++++++++++----------
 2 files changed, 78 insertions(+), 61 deletions(-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index e8bb39bb17a0f..7e6a82efdfc02 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3690,7 +3690,8 @@ int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr,
 }
 EXPORT_SYMBOL(drm_dp_mst_topology_mgr_resume);
-static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
+static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up,
+				  struct drm_dp_mst_branch **mstb, int *seqno)
 {
 	int len;
 	u8 replyblock[32];
@@ -3702,7 +3703,8 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
 	int basereg = up ? DP_SIDEBAND_MSG_UP_REQ_BASE :
 			   DP_SIDEBAND_MSG_DOWN_REP_BASE;
-	msg = up ? &mgr->up_req_recv : &mgr->down_rep_recv;
+	*mstb = NULL;
+	*seqno = -1;
 	len = min(mgr->max_dpcd_transaction_bytes, 16);
 	ret = drm_dp_dpcd_read(mgr->aux, basereg, replyblock, len);
@@ -3719,6 +3721,21 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
 		return false;
 	}
+	*seqno = hdr.seqno;
+
+	if (up) {
+		msg = &mgr->up_req_recv;
+	} else {
+		/* Caller is responsible for giving back this reference */
+		*mstb = drm_dp_get_mst_branch_device(mgr, hdr.lct, hdr.rad);
+		if (!*mstb) {
+			DRM_DEBUG_KMS("Got MST reply from unknown device %d\n",
+				      hdr.lct);
+			return false;
+		}
+		msg = &(*mstb)->down_rep_recv[hdr.seqno];
+	}
+
 	if (!drm_dp_sideband_msg_set_header(msg, &hdr, hdrlen)) {
 		DRM_DEBUG_KMS("sideband msg set header failed %d\n",
 			      replyblock[0]);
@@ -3759,53 +3776,52 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up)
 static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 {
 	struct drm_dp_sideband_msg_tx *txmsg;
-	struct drm_dp_mst_branch *mstb;
-	struct drm_dp_sideband_msg_hdr *hdr = &mgr->down_rep_recv.initial_hdr;
-	int slot = -1;
+	struct drm_dp_mst_branch *mstb = NULL;
+	struct drm_dp_sideband_msg_rx *msg = NULL;
+	int seqno = -1;
-	if (!drm_dp_get_one_sb_msg(mgr, false))
-		goto clear_down_rep_recv;
+	if (!drm_dp_get_one_sb_msg(mgr, false, &mstb, &seqno))
+		goto out_clear_reply;
-	if (!mgr->down_rep_recv.have_eomt)
-		return 0;
+	msg = &mstb->down_rep_recv[seqno];
-	mstb = drm_dp_get_mst_branch_device(mgr, hdr->lct, hdr->rad);
-	if (!mstb) {
-		DRM_DEBUG_KMS("Got MST reply from unknown device %d\n",
-			      hdr->lct);
-		goto clear_down_rep_recv;
-	}
+	/* Multi-packet message transmission, don't clear the reply */
+	if (!msg->have_eomt)
+		goto out;
 	/* find the message */
-	slot = hdr->seqno;
 	mutex_lock(&mgr->qlock);
-	txmsg = mstb->tx_slots[slot];
+	txmsg = mstb->tx_slots[seqno];
 	/* remove from slots */
 	mutex_unlock(&mgr->qlock);
 	if (!txmsg) {
+		struct drm_dp_sideband_msg_hdr *hdr;
+		hdr = &msg->initial_hdr;
 		DRM_DEBUG_KMS("Got MST reply with no msg %p %d %d %02x %02x\n",
 			      mstb, hdr->seqno, hdr->lct, hdr->rad[0],
-			      mgr->down_rep_recv.msg[0]);
-		goto no_msg;
+			      msg->msg[0]);
+		goto out_clear_reply;
 	}
-	drm_dp_sideband_parse_reply(&mgr->down_rep_recv, &txmsg->reply);
+	drm_dp_sideband_parse_reply(msg, &txmsg->reply);
-	if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK)
+	if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK) {
 		DRM_DEBUG_KMS("Got NAK reply: req 0x%02x (%s), reason 0x%02x (%s), nak data 0x%02x\n",
 			      txmsg->reply.req_type,
 			      drm_dp_mst_req_type_str(txmsg->reply.req_type),
 			      txmsg->reply.u.nak.reason,
 			      drm_dp_mst_nak_reason_str(txmsg->reply.u.nak.reason),
 			      txmsg->reply.u.nak.nak_data);
+		goto out_clear_reply;
+	}
-	memset(&mgr->down_rep_recv, 0, sizeof(struct drm_dp_sideband_msg_rx));
+	memset(msg, 0, sizeof(struct drm_dp_sideband_msg_rx));
 	drm_dp_mst_topology_put_mstb(mstb);
 	mutex_lock(&mgr->qlock);
 	txmsg->state = DRM_DP_SIDEBAND_TX_RX;
-	mstb->tx_slots[slot] = NULL;
+	mstb->tx_slots[seqno] = NULL;
 	mgr->is_waiting_for_dwn_reply = false;
 	mutex_unlock(&mgr->qlock);
@@ -3813,13 +3829,15 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
return 0;
-no_msg:
-	drm_dp_mst_topology_put_mstb(mstb);
-clear_down_rep_recv:
+out_clear_reply:
 	mutex_lock(&mgr->qlock);
 	mgr->is_waiting_for_dwn_reply = false;
 	mutex_unlock(&mgr->qlock);
-	memset(&mgr->down_rep_recv, 0, sizeof(struct drm_dp_sideband_msg_rx));
+	if (msg)
+		memset(msg, 0, sizeof(struct drm_dp_sideband_msg_rx));
+out:
+	if (mstb)
+		drm_dp_mst_topology_put_mstb(mstb);
 	return 0;
 }
@@ -3895,11 +3913,10 @@ static void drm_dp_mst_up_req_work(struct work_struct *work)
 static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
 {
-	struct drm_dp_sideband_msg_hdr *hdr = &mgr->up_req_recv.initial_hdr;
 	struct drm_dp_pending_up_req *up_req;
-	bool seqno;
+	int seqno;
-	if (!drm_dp_get_one_sb_msg(mgr, true))
+	if (!drm_dp_get_one_sb_msg(mgr, true, NULL, &seqno))
 		goto out;
 	if (!mgr->up_req_recv.have_eomt)
@@ -3912,7 +3929,6 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
 	}
 	INIT_LIST_HEAD(&up_req->next);
-	seqno = hdr->seqno;
 	drm_dp_sideband_parse_req(&mgr->up_req_recv, &up_req->msg);
 	if (up_req->msg.req_type != DP_CONNECTION_STATUS_NOTIFY &&
@@ -3946,7 +3962,7 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
 			      res_stat->available_pbn);
 	}
-	up_req->hdr = *hdr;
+	up_req->hdr = mgr->up_req_recv.initial_hdr;
 	mutex_lock(&mgr->up_req_lock);
 	list_add_tail(&up_req->next, &mgr->up_req_list);
 	mutex_unlock(&mgr->up_req_lock);
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 5483f888712ad..7d0341f94ce1b 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -160,6 +160,31 @@ struct drm_dp_mst_port {
 	bool fec_capable;
 };
+/* sideband msg header - not bit struct */
+struct drm_dp_sideband_msg_hdr {
+	u8 lct;
+	u8 lcr;
+	u8 rad[8];
+	bool broadcast;
+	bool path_msg;
+	u8 msg_len;
+	bool somt;
+	bool eomt;
+	bool seqno;
+};
+
+struct drm_dp_sideband_msg_rx {
+	u8 chunk[48];
+	u8 msg[256];
+	u8 curchunk_len;
+	u8 curchunk_idx; /* chunk we are parsing now */
+	u8 curchunk_hdrlen;
+	u8 curlen; /* total length of the msg */
+	bool have_somt;
+	bool have_eomt;
+	struct drm_dp_sideband_msg_hdr initial_hdr;
+};
+
 /**
  * struct drm_dp_mst_branch - MST branch device.
  * @rad: Relative Address to talk to this branch device.
@@ -232,24 +257,16 @@ struct drm_dp_mst_branch {
 	int last_seqno;
 	bool link_address_sent;
+	/**
+	 * @down_rep_recv: Message receiver state for down replies.
+	 */
+	struct drm_dp_sideband_msg_rx down_rep_recv[2];
+
 	/* global unique identifier to identify branch devices */
 	u8 guid[16];
 };
-/* sideband msg header - not bit struct */
-struct drm_dp_sideband_msg_hdr {
-	u8 lct;
-	u8 lcr;
-	u8 rad[8];
-	bool broadcast;
-	bool path_msg;
-	u8 msg_len;
-	bool somt;
-	bool eomt;
-	bool seqno;
-};
-
 struct drm_dp_nak_reply {
 	u8 guid[16];
 	u8 reason;
@@ -306,18 +323,6 @@ struct drm_dp_remote_i2c_write_ack_reply {
 };
-struct drm_dp_sideband_msg_rx {
-	u8 chunk[48];
-	u8 msg[256];
-	u8 curchunk_len;
-	u8 curchunk_idx; /* chunk we are parsing now */
-	u8 curchunk_hdrlen;
-	u8 curlen; /* total length of the msg */
-	bool have_somt;
-	bool have_eomt;
-	struct drm_dp_sideband_msg_hdr initial_hdr;
-};
-
 #define DRM_DP_MAX_SDP_STREAMS 16
 struct drm_dp_allocate_payload {
 	u8 port_number;
@@ -556,10 +561,6 @@ struct drm_dp_mst_topology_mgr {
 	 */
 	int conn_base_id;
-	/**
-	 * @down_rep_recv: Message receiver state for down replies.
-	 */
-	struct drm_dp_sideband_msg_rx down_rep_recv;
 	/**
 	 * @up_req_recv: Message receiver state for up requests.
 	 */
From: Sean Paul seanpaul@chromium.org
Now that we can support multiple simultaneous replies, remove the restrictions placed on sending new tx msgs.
This patch essentially just reverts commit 5a64967a2f3b ("drm/dp_mst: Have DP_Tx send one msg at a time") now that the problem is solved in a different way.
Cc: Wayne Lin Wayne.Lin@amd.com
Signed-off-by: Sean Paul seanpaul@chromium.org
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 14 ++------------
 include/drm/drm_dp_mst_helper.h       |  6 ------
 2 files changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 7e6a82efdfc02..cbf0bb0ddeb84 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1203,8 +1203,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
 		    txmsg->state == DRM_DP_SIDEBAND_TX_SENT) {
 			mstb->tx_slots[txmsg->seqno] = NULL;
 		}
-		mgr->is_waiting_for_dwn_reply = false;
-
 	}
 out:
 	if (unlikely(ret == -EIO) && drm_debug_enabled(DRM_UT_DP)) {
@@ -1214,7 +1212,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
 	}
 	mutex_unlock(&mgr->qlock);
-	drm_dp_mst_kick_tx(mgr);
 	return ret;
 }
@@ -2797,11 +2794,9 @@ static void process_single_down_tx_qlock(struct drm_dp_mst_topology_mgr *mgr)
 	ret = process_single_tx_qlock(mgr, txmsg, false);
 	if (ret == 1) {
 		/* txmsg is sent it should be in the slots now */
-		mgr->is_waiting_for_dwn_reply = true;
 		list_del(&txmsg->next);
 	} else if (ret) {
 		DRM_DEBUG_KMS("failed to send msg in q %d\n", ret);
-		mgr->is_waiting_for_dwn_reply = false;
 		list_del(&txmsg->next);
 		if (txmsg->seqno != -1)
 			txmsg->dst->tx_slots[txmsg->seqno] = NULL;
@@ -2841,8 +2836,7 @@ static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr,
 		drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
 	}
-	if (list_is_singular(&mgr->tx_msg_downq) &&
-	    !mgr->is_waiting_for_dwn_reply)
+	if (list_is_singular(&mgr->tx_msg_downq))
 		process_single_down_tx_qlock(mgr);
 	mutex_unlock(&mgr->qlock);
 }
@@ -3822,7 +3816,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 	mutex_lock(&mgr->qlock);
 	txmsg->state = DRM_DP_SIDEBAND_TX_RX;
 	mstb->tx_slots[seqno] = NULL;
-	mgr->is_waiting_for_dwn_reply = false;
 	mutex_unlock(&mgr->qlock);
 	wake_up_all(&mgr->tx_waitq);
@@ -3830,9 +3823,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 	return 0;
 out_clear_reply:
-	mutex_lock(&mgr->qlock);
-	mgr->is_waiting_for_dwn_reply = false;
-	mutex_unlock(&mgr->qlock);
 	if (msg)
 		memset(msg, 0, sizeof(struct drm_dp_sideband_msg_rx));
 out:
@@ -4670,7 +4660,7 @@ static void drm_dp_tx_work(struct work_struct *work)
 	struct drm_dp_mst_topology_mgr *mgr =
 		container_of(work, struct drm_dp_mst_topology_mgr, tx_work);
 	mutex_lock(&mgr->qlock);
-	if (!list_empty(&mgr->tx_msg_downq) && !mgr->is_waiting_for_dwn_reply)
+	if (!list_empty(&mgr->tx_msg_downq))
 		process_single_down_tx_qlock(mgr);
 	mutex_unlock(&mgr->qlock);
 }
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 7d0341f94ce1b..fcc30e64c8e7e 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -619,12 +619,6 @@ struct drm_dp_mst_topology_mgr {
 	 * &drm_dp_sideband_msg_tx.state once they are queued
 	 */
 	struct mutex qlock;
-
-	/**
-	 * @is_waiting_for_dwn_reply: indicate whether is waiting for down reply
-	 */
-	bool is_waiting_for_dwn_reply;
-
 	/**
 	 * @tx_msg_downq: List of pending down replies.
 	 */
[AMD Public Use]
Hi Paul,
Thanks for the mail!
I tried to solve this problem by restricting transmission to one msg at a time due to hub/dock compatibility problems.

From my experience, some branch devices don't handle interleaved replies well (a dock from HP, I think).

As a result of that, correct me if I'm wrong, I remember most GPU vendors now just send one down request at a time in the Windows environment. I would suggest the original solution :)
Thanks!
-----Original Message-----
From: Sean Paul sean@poorly.run
Sent: Friday, February 14, 2020 5:15 AM
To: dri-devel@lists.freedesktop.org
Cc: lyude@redhat.com; Lin, Wayne Wayne.Lin@amd.com; Sean Paul seanpaul@chromium.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; David Airlie airlied@linux.ie
Subject: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
[snip: quoted patch trimmed]
-- Wayne Lin
On Fri, Feb 14, 2020 at 12:58 AM Lin, Wayne Wayne.Lin@amd.com wrote:
> Hi Paul,
>
> Thanks for the mail!
>
> I tried to solve this problem by having restriction on sending one msg at
> a time due to hub/dock compatibility problems.
> From my experience, some branch devices don't handle well on interleaved
> replies (Dock from HP I think)
Hi Wayne,

Hmm, that's interesting, do you have a part number of the failing dock so I can test it?
> As the result of that, correct me if I'm wrong, I remember most gpu
> vendors just send one down request at a time now in windows environment.
> I would suggest the original solution :)
I can't really say what happens on the Windows side of the world, but I suppose that makes sense if this is a widespread issue with docks. I do worry about the performance hit.
If indeed this is a problem, could we ratelimit per branch device instead of globally? Even that would be better than serializing everything.
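To sketch what I mean (hypothetical helper names, not actual driver code): each branch would track its own in-flight count, capped at the two seqno slots, instead of one global waiting flag, so a slow branch only throttles itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-branch rate limit: replace the single global
 * is_waiting_for_dwn_reply flag with a per-branch in-flight count. */
struct branch_tx {
	int in_flight; /* down requests awaiting replies on this branch */
};

#define MAX_IN_FLIGHT 2 /* one per seqno slot */

static bool branch_can_send(const struct branch_tx *b)
{
	return b->in_flight < MAX_IN_FLIGHT;
}

/* Claim a slot before transmitting; fails when the branch is saturated. */
static bool branch_send(struct branch_tx *b)
{
	if (!branch_can_send(b))
		return false;
	b->in_flight++;
	return true;
}

/* Release the slot once the matching reply has been fully received. */
static void branch_reply_done(struct branch_tx *b)
{
	if (b->in_flight > 0)
		b->in_flight--;
}
```

The key property is that saturating one branch leaves other branches free to transmit, whereas the global flag serializes everything behind the slowest device.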
Sean
> [snip: quoted patch trimmed]
> -----Original Message-----
> From: Sean Paul sean@poorly.run
> Sent: Saturday, February 15, 2020 12:09 AM
> To: Lin, Wayne Wayne.Lin@amd.com
> Cc: dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul
> seanpaul@chromium.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com;
> Maxime Ripard mripard@kernel.org; David Airlie airlied@linux.ie
> Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
> On Fri, Feb 14, 2020 at 12:58 AM Lin, Wayne Wayne.Lin@amd.com wrote:
> > Hi Paul,
> >
> > Thanks for the mail!
> >
> > I tried to solve this problem by having restriction on sending one msg
> > at a time due to hub/dock compatibility problems.
> > From my experience, some branch devices don't handle well on interleaved
> > replies (Dock from HP I think)
>
> Hi Wayne,
> Hmm, that's interesting, do you have a part number of the failing dock so
> I can test it?
Hi Paul,
Sorry but it's been quite a while. I can't exactly tell the part number. If I remember correctly, when the specific branch device receives interleaved replies, it just doesn't reply to any requests.
> > As the result of that, correct me if I'm wrong, I remember most gpu
> > vendors just send one down request at a time now in windows environment.
> > I would suggest the original solution :)
> I can't really say what happens on the Windows side of the world, but I
> suppose that makes sense if this is a widespread issue with docks. I do
> worry about the performance hit.
>
> If indeed this is a problem, could we ratelimit per branch device instead
> of globally? Even that would be better than serializing everything.
Since the problem was that some branch devices can't handle two replies simultaneously, I'm afraid we might still encounter the same problem?
Thanks!
Sean
Thanks!
-----Original Message----- From: Sean Paul sean@poorly.run Sent: Friday, February 14, 2020 5:15 AM To: dri-devel@lists.freedesktop.org Cc: lyude@redhat.com; Lin, Wayne Wayne.Lin@amd.com; Sean Paul seanpaul@chromium.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; David Airlie airlied@linux.ie Subject: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
From: Sean Paul seanpaul@chromium.org
Now that we can support multiple simultaneous replies, remove the restrictions placed on sending new tx msgs.
This patch essentially just reverts commit 5a64967a2f3b ("drm/dp_mst: Have DP_Tx send one msg at a time")
now
that the problem is solved in a different way.
Cc: Wayne Lin Wayne.Lin@amd.com Signed-off-by: Sean Paul seanpaul@chromium.org
drivers/gpu/drm/drm_dp_mst_topology.c | 14 ++------------ include/drm/drm_dp_mst_helper.h | 6 ------ 2 files changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 7e6a82efdfc02..cbf0bb0ddeb84 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1203,8 +1203,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
 		    txmsg->state == DRM_DP_SIDEBAND_TX_SENT) {
 			mstb->tx_slots[txmsg->seqno] = NULL;
 		}
-		mgr->is_waiting_for_dwn_reply = false;
-
 	}
 out:
 	if (unlikely(ret == -EIO) && drm_debug_enabled(DRM_UT_DP)) {
@@ -1214,7 +1212,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
 	}
 	mutex_unlock(&mgr->qlock);
 
-	drm_dp_mst_kick_tx(mgr);
 	return ret;
 }
 
@@ -2797,11 +2794,9 @@ static void process_single_down_tx_qlock(struct drm_dp_mst_topology_mgr *mgr)
 	ret = process_single_tx_qlock(mgr, txmsg, false);
 	if (ret == 1) {
 		/* txmsg is sent it should be in the slots now */
-		mgr->is_waiting_for_dwn_reply = true;
 		list_del(&txmsg->next);
 	} else if (ret) {
 		DRM_DEBUG_KMS("failed to send msg in q %d\n", ret);
-		mgr->is_waiting_for_dwn_reply = false;
 		list_del(&txmsg->next);
 		if (txmsg->seqno != -1)
 			txmsg->dst->tx_slots[txmsg->seqno] = NULL;
@@ -2841,8 +2836,7 @@ static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr,
 		drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
 	}
 
-	if (list_is_singular(&mgr->tx_msg_downq) &&
-	    !mgr->is_waiting_for_dwn_reply)
+	if (list_is_singular(&mgr->tx_msg_downq))
 		process_single_down_tx_qlock(mgr);
 	mutex_unlock(&mgr->qlock);
 }
@@ -3822,7 +3816,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 	mutex_lock(&mgr->qlock);
 	txmsg->state = DRM_DP_SIDEBAND_TX_RX;
 	mstb->tx_slots[seqno] = NULL;
-	mgr->is_waiting_for_dwn_reply = false;
 	mutex_unlock(&mgr->qlock);
 
 	wake_up_all(&mgr->tx_waitq);
@@ -3830,9 +3823,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 	return 0;
 
 out_clear_reply:
-	mutex_lock(&mgr->qlock);
-	mgr->is_waiting_for_dwn_reply = false;
-	mutex_unlock(&mgr->qlock);
 	if (msg)
 		memset(msg, 0, sizeof(struct drm_dp_sideband_msg_rx));
 out:
@@ -4670,7 +4660,7 @@ static void drm_dp_tx_work(struct work_struct *work)
 	struct drm_dp_mst_topology_mgr *mgr = container_of(work, struct drm_dp_mst_topology_mgr, tx_work);
 
 	mutex_lock(&mgr->qlock);
-	if (!list_empty(&mgr->tx_msg_downq) && !mgr->is_waiting_for_dwn_reply)
+	if (!list_empty(&mgr->tx_msg_downq))
 		process_single_down_tx_qlock(mgr);
 	mutex_unlock(&mgr->qlock);
 }
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 7d0341f94ce1b..fcc30e64c8e7e 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -619,12 +619,6 @@ struct drm_dp_mst_topology_mgr {
 	 * &drm_dp_sideband_msg_tx.state once they are queued
 	 */
 	struct mutex qlock;
-
-	/**
-	 * @is_waiting_for_dwn_reply: indicate whether is waiting for down reply
-	 */
-	bool is_waiting_for_dwn_reply;
-
 	/**
 	 * @tx_msg_downq: List of pending down replies.
 	 */
-- Sean Paul, Software Engineer, Google / Chromium OS
On Mon, Feb 17, 2020 at 07:08:37AM +0000, Lin, Wayne wrote:
-----Original Message-----
From: Sean Paul <sean@poorly.run>
Sent: Saturday, February 15, 2020 12:09 AM
To: Lin, Wayne <Wayne.Lin@amd.com>
Cc: dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul <seanpaul@chromium.org>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; David Airlie <airlied@linux.ie>
Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
On Fri, Feb 14, 2020 at 12:58 AM Lin, Wayne Wayne.Lin@amd.com wrote:
Hi Paul,
Thanks for the mail!
I tried to solve this problem by restricting transmission to one msg at a time, due to hub/dock compatibility problems. From my experience, some branch devices don't handle interleaved replies well (a dock from HP, I think).
Hi Wayne, Hmm, that's interesting, do you have a part number of the failing dock so I can test it?
Hi Paul,
Sorry, but it's been quite a while and I can't recall the exact part number. If I remember correctly, when the specific branch device receives interleaved replies, it simply stops replying to any requests.
As a result (correct me if I'm wrong), I believe most GPU vendors now send just one down request at a time in the Windows environment.
I would suggest the original solution :)
I can't really say what happens on the Windows side of the world, but I suppose that makes sense if this is a widespread issue with docks. I do worry about the performance hit.
If indeed this is a problem, could we ratelimit per branch device instead of globally? Even that would be better than serializing everything.
Since the problem is that some branch devices can't handle two replies simultaneously, I'm afraid we might still encounter the same problem?
Hi Wayne, Thanks for clarifying. I'm a bit hesitant to scrap this idea without solid evidence that this is a common problem. Do you have any hubs around AMD that you could try my patchset with? Perhaps we could get some hard data? I'm happy to test things on my end, but I probably shouldn't just start buying a bunch of random HP docks in hopes one of them exhibits the issue :)
To add anecdote to anecdote, everything on my desk seems to work with interleaved replies.
Since the HDCP spec requires the host to verify each channel's encryption at least once every 2 seconds, we're going to be increasing the amount of sideband traffic by a fair amount, so I would like to ensure we're doing everything possible to maximize our sideband bandwidth.
Sean
-----Original Message-----
From: Sean Paul <sean@poorly.run>
Sent: Friday, February 14, 2020 5:15 AM
To: dri-devel@lists.freedesktop.org
Cc: lyude@redhat.com; Lin, Wayne <Wayne.Lin@amd.com>; Sean Paul <seanpaul@chromium.org>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; David Airlie <airlied@linux.ie>
Subject: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.

From: Sean Paul <seanpaul@chromium.org>

Now that we can support multiple simultaneous replies, remove the restrictions placed on sending new tx msgs.

This patch essentially just reverts commit 5a64967a2f3b ("drm/dp_mst: Have DP_Tx send one msg at a time") now that the problem is solved in a different way.

Cc: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>

[snip: patch diff quoted in full above]
On Tue, Feb 18, 2020 at 10:52:06AM -0500, Sean Paul wrote:
[snip: earlier discussion quoted above]
Hi Wayne, Thanks for clarifying. I'm a bit hesitant to scrap this idea without solid evidence that this is a common problem. Do you have any hubs around AMD that you could try my patchset with?
Oh, if you want to test the set, you'll need this patch as well :-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3814,7 +3814,9 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up,
 	int basereg = up ? DP_SIDEBAND_MSG_UP_REQ_BASE :
 			   DP_SIDEBAND_MSG_DOWN_REP_BASE;
 
-	*mstb = NULL;
+	if (mstb)
+		*mstb = NULL;
+	*seqno = -1;
 
 	len = min(mgr->max_dpcd_transaction_bytes, 16);
-----Original Message-----
From: Sean Paul <sean@poorly.run>
Sent: Wednesday, February 19, 2020 1:15 AM
To: Lin, Wayne <Wayne.Lin@amd.com>
Cc: Sean Paul <sean@poorly.run>; dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul <seanpaul@chromium.org>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; David Airlie <airlied@linux.ie>
Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
[snip: earlier discussion quoted above]
Hi Wayne, Thanks for clarifying. I'm a bit hesitant to scrap this idea without solid evidence that this is a common problem. Do you have any hubs around AMD that you could try my patchset with?
Hi Paul, Sure! I will see what I have at hand and try your patch set. It might take me some time but I will update this as soon as possible. :)
Thanks!
On Wed, Feb 19, 2020 at 6:00 AM Lin, Wayne Wayne.Lin@amd.com wrote:
[snip: earlier discussion quoted above]
Hi Paul, Sure! I will see what I have at hand and try your patch set. It might take me some time but I will update this as soon as possible. :)
Hi Wayne, I'm seeing spurious timeouts even with your serialization patch on mainline. Have you had a chance to test this set? At least on my machines it seems to be working better.
Sean
-----Original Message-----
From: Sean Paul <sean@poorly.run>
Sent: Friday, March 6, 2020 1:19 AM
To: Lin, Wayne <Wayne.Lin@amd.com>
Cc: dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul <seanpaul@chromium.org>; Maarten Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard <mripard@kernel.org>; David Airlie <airlied@linux.ie>
Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
On Wed, Feb 19, 2020 at 6:00 AM Lin, Wayne Wayne.Lin@amd.com wrote:
[AMD Public Use]
-----Original Message----- From: Sean Paul sean@poorly.run Sent: Wednesday, February 19, 2020 1:15 AM To: Lin, Wayne Wayne.Lin@amd.com Cc: Sean Paul sean@poorly.run; dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul seanpaul@chromium.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; David Airlie airlied@linux.ie Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
On Tue, Feb 18, 2020 at 10:52:06AM -0500, Sean Paul wrote:
On Mon, Feb 17, 2020 at 07:08:37AM +0000, Lin, Wayne wrote:
[AMD Public Use]
-----Original Message----- From: Sean Paul sean@poorly.run Sent: Saturday, February 15, 2020 12:09 AM To: Lin, Wayne Wayne.Lin@amd.com Cc: dri-devel@lists.freedesktop.org; lyude@redhat.com; Sean Paul seanpaul@chromium.org; Maarten Lankhorst maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org; David Airlie airlied@linux.ie Subject: Re: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
On Fri, Feb 14, 2020 at 12:58 AM Lin, Wayne Wayne.Lin@amd.com
wrote:
> [AMD Public Use]
>
> Hi Paul,
>
> Thanks for the mail!
>
> I tried to solve this problem by having restriction on sending one msg
> at a time due to hub/dock compatibility problems.
> From my experience, some branch devices don't handle well on
> interleaved replies (Dock from HP I think)
Hi Wayne, Hmm, that's interesting, do you have a part number of the failing dock so I can test it?
Hi Paul,
Sorry but it's been quite a while. I can't exactly tell the part number. If I remember correctly, when the specific branch device receives interleaved replies, it just doesn't reply to any requests.
> As the result of that, correct me if I'm wrong, I remember most gpu
> vendors just send one down request at a time now in windows environment.
> I would suggest the original solution :)
I can't really say what happens on the Windows side of the world, but I suppose that makes sense if this is a widespread issue with docks. I do worry about the performance hit.
If indeed this is a problem, could we ratelimit per branch device instead of globally? Even that would be better than serializing everything.
Since the problem was that some branch devices can't simultaneously handle two replies, I'm afraid we might still encounter the same problem?
Hi Wayne, Thanks for clarifying. I'm a bit hesitant to scrap this idea without solid evidence that this is a common problem. Do you have any hubs around AMD that you could try my patchset with?
Hi Paul, Sure! I will see what I have at hand and try your patch set. It might take me some time but I will update this as soon as possible. :)
Hi Wayne, I'm seeing spurious timeouts even with your serialization patch on mainline. Have you had a chance to test this set? At least on my machines it seems to be working better.
Sean
Hi Paul,
Sorry for the late reply. I was also concerned about the workload on the 1st branch device when all the replies come back, but at least these patches work on the hubs I have at hand now.
The whole series is reviewed. Reviewed-by: Wayne Lin waynelin@amd.com
Thanks for your time and help!
Thanks!
Oh, if you can test the set, you'll need this patch as well :-)
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3814,7 +3814,9 @@ static bool drm_dp_get_one_sb_msg(struct drm_dp_mst_topology_mgr *mgr, bool up,
 	int basereg = up ? DP_SIDEBAND_MSG_UP_REQ_BASE :
 			   DP_SIDEBAND_MSG_DOWN_REP_BASE;
 
-	*mstb = NULL;
+	if (mstb)
+		*mstb = NULL;
 	*seqno = -1;
 
 	len = min(mgr->max_dpcd_transaction_bytes, 16);
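(For context, the hunk above just NULL-guards the mstb out-parameter so callers that aren't interested in the branch device can pass NULL. A minimal, self-contained illustration of that pattern — with invented names, not the actual kernel code — would be:)

```c
#include <stddef.h>

/*
 * Illustration only: a NULL-guarded out-parameter, as in the fixup
 * above. reset_out_params/branch_out/seqno_out are made-up names for
 * this sketch; the real function is drm_dp_get_one_sb_msg().
 */
static void reset_out_params(int **branch_out, int *seqno_out)
{
	/* Only write through the pointer if the caller provided one. */
	if (branch_out)
		*branch_out = NULL;
	*seqno_out = -1;
}
```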
Perhaps we could get some hard data? I'm happy to test things on my end, but I probably shouldn't just start buying a bunch of random HP docks in hopes one of them exhibits the issue :)
To add anecdote to anecdote, everything on my desk seems to work with interleaved replies.
Since the HDCP spec requires the host to verify each channel's encryption every <2s, we're going to be increasing the amount of sideband traffic a fair amount, so I would like to ensure we're doing everything possible to maximize our sideband bandwidth.
Sean
Thanks!
Sean
> > Thanks!
> >
> > -----Original Message-----
> > From: Sean Paul sean@poorly.run
> > Sent: Friday, February 14, 2020 5:15 AM
> > To: dri-devel@lists.freedesktop.org
> > Cc: lyude@redhat.com; Lin, Wayne Wayne.Lin@amd.com; Sean Paul
> > seanpaul@chromium.org; Maarten Lankhorst
> > maarten.lankhorst@linux.intel.com; Maxime Ripard mripard@kernel.org;
> > David Airlie airlied@linux.ie
> > Subject: [PATCH 3/3] drm/dp_mst: Remove single tx msg restriction.
> >
> > From: Sean Paul seanpaul@chromium.org
> >
> > Now that we can support multiple simultaneous replies, remove the
> > restrictions placed on sending new tx msgs.
> >
> > This patch essentially just reverts commit 5a64967a2f3b ("drm/dp_mst:
> > Have DP_Tx send one msg at a time") now that the problem is solved in
> > a different way.
> >
> > Cc: Wayne Lin Wayne.Lin@amd.com
> > Signed-off-by: Sean Paul seanpaul@chromium.org
> > ---
> >  drivers/gpu/drm/drm_dp_mst_topology.c | 14 ++------------
> >  include/drm/drm_dp_mst_helper.h       |  6 ------
> >  2 files changed, 2 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 7e6a82efdfc02..cbf0bb0ddeb84 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -1203,8 +1203,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
> >  		    txmsg->state == DRM_DP_SIDEBAND_TX_SENT) {
> >  			mstb->tx_slots[txmsg->seqno] = NULL;
> >  		}
> > -		mgr->is_waiting_for_dwn_reply = false;
> > -
> >  	}
> >  out:
> >  	if (unlikely(ret == -EIO) && drm_debug_enabled(DRM_UT_DP)) {
> > @@ -1214,7 +1212,6 @@ static int drm_dp_mst_wait_tx_reply(struct drm_dp_mst_branch *mstb,
> >  	}
> >  	mutex_unlock(&mgr->qlock);
> >
> > -	drm_dp_mst_kick_tx(mgr);
> >  	return ret;
> >  }
> >
> > @@ -2797,11 +2794,9 @@ static void process_single_down_tx_qlock(struct drm_dp_mst_topology_mgr *mgr)
> >  	ret = process_single_tx_qlock(mgr, txmsg, false);
> >  	if (ret == 1) {
> >  		/* txmsg is sent it should be in the slots now */
> > -		mgr->is_waiting_for_dwn_reply = true;
> >  		list_del(&txmsg->next);
> >  	} else if (ret) {
> >  		DRM_DEBUG_KMS("failed to send msg in q %d\n", ret);
> > -		mgr->is_waiting_for_dwn_reply = false;
> >  		list_del(&txmsg->next);
> >  		if (txmsg->seqno != -1)
> >  			txmsg->dst->tx_slots[txmsg->seqno] = NULL;
> > @@ -2841,8 +2836,7 @@ static void drm_dp_queue_down_tx(struct drm_dp_mst_topology_mgr *mgr,
> >  		drm_dp_mst_dump_sideband_msg_tx(&p, txmsg);
> >  	}
> >
> > -	if (list_is_singular(&mgr->tx_msg_downq) &&
> > -	    !mgr->is_waiting_for_dwn_reply)
> > +	if (list_is_singular(&mgr->tx_msg_downq))
> >  		process_single_down_tx_qlock(mgr);
> >  	mutex_unlock(&mgr->qlock);
> >  }
> > @@ -3822,7 +3816,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
> >  	mutex_lock(&mgr->qlock);
> >  	txmsg->state = DRM_DP_SIDEBAND_TX_RX;
> >  	mstb->tx_slots[seqno] = NULL;
> > -	mgr->is_waiting_for_dwn_reply = false;
> >  	mutex_unlock(&mgr->qlock);
> >
> >  	wake_up_all(&mgr->tx_waitq);
> > @@ -3830,9 +3823,6 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
> >  	return 0;
> >
> >  out_clear_reply:
> > -	mutex_lock(&mgr->qlock);
> > -	mgr->is_waiting_for_dwn_reply = false;
> > -	mutex_unlock(&mgr->qlock);
> >  	if (msg)
> >  		memset(msg, 0, sizeof(struct drm_dp_sideband_msg_rx));
> >  out:
> > @@ -4670,7 +4660,7 @@ static void drm_dp_tx_work(struct work_struct *work)
> >  	struct drm_dp_mst_topology_mgr *mgr = container_of(work,
> >  		struct drm_dp_mst_topology_mgr, tx_work);
> >
> >  	mutex_lock(&mgr->qlock);
> > -	if (!list_empty(&mgr->tx_msg_downq) && !mgr->is_waiting_for_dwn_reply)
> > +	if (!list_empty(&mgr->tx_msg_downq))
> >  		process_single_down_tx_qlock(mgr);
> >  	mutex_unlock(&mgr->qlock);
> >  }
> > diff --git a/include/drm/drm_dp_mst_helper.h
> > b/include/drm/drm_dp_mst_helper.h
> > index 7d0341f94ce1b..fcc30e64c8e7e 100644
> > --- a/include/drm/drm_dp_mst_helper.h
> > +++ b/include/drm/drm_dp_mst_helper.h
> > @@ -619,12 +619,6 @@ struct drm_dp_mst_topology_mgr {
> >  	 * &drm_dp_sideband_msg_tx.state once they are queued
> >  	 */
> >  	struct mutex qlock;
> > -
> > -	/**
> > -	 * @is_waiting_for_dwn_reply: indicate whether is waiting for down
> > -	 * reply
> > -	 */
> > -	bool is_waiting_for_dwn_reply;
> > -
> >  	/**
> >  	 * @tx_msg_downq: List of pending down replies.
> >  	 */
> > --
> > Sean Paul, Software Engineer, Google / Chromium OS
> --
> Wayne Lin
--
Sean Paul, Software Engineer, Google / Chromium OS
On Thu, 2020-02-13 at 16:15 -0500, Sean Paul wrote:
From: Sean Paul seanpaul@chromium.org
Hey all, Earlier this week on my 5.5 kernel (I can't seem to get a 5.6 kernel to behave on any of my devices), I ran into the multi-reply problem that Wayne fixed in January. Without realizing there was already a fix upstream, I went about solving it in a different way. It wasn't until rebasing the patches on 5.6 for the list that I realized there was already a solution.
At any rate, I think this way of handling things might be a bit more performant. I'm not super happy with the cleanliness of the code; I think this series should make things easier to read, but I don't think I achieved that. So suggestions are welcome on how to break this apart.
Honestly it looks a bit cleaner to me. Sideband message parsing in MST is pretty complex, so I'd think the code's probably always going to be messy to some extent.
My only suggestion with cleaning things up - maybe we should stop calling it building a sideband reply, and call it re-assembling one? Seems a little less confusing, since we're really just taking the rx chunks and reassembling them into a full sideband message. I know at least I constantly find myself forgetting those functions are for rx and not tx.
So, I will give my r-b for the whole series, but...
Reviewed-by: Lyude Paul lyude@redhat.com
...I think we should definitely look more into what Wayne's talking about before pushing this, and see if it's widespread enough of an issue to be a concern. It does kind of suck how slow MST probing can be, so I'd wonder if we could try your idea of rate limiting. My one concern there is I'm not actually sure if there's anything in the DP MST protocol that indicates how many messages a hub can handle at the same time - it's always supposed to just be two iirc.
Thanks,
Sean
Sean Paul (3):
  drm/mst: Separate sideband packet header parsing from message building
  drm/mst: Support simultaneous down replies
  drm/dp_mst: Remove single tx msg restriction.

 drivers/gpu/drm/drm_dp_mst_topology.c | 196 ++++++++++++++------------
 include/drm/drm_dp_mst_helper.h       |  65 ++++-----
 2 files changed, 137 insertions(+), 124 deletions(-)
On Fri, Feb 14, 2020 at 06:33:17PM -0500, Lyude Paul wrote:
On Thu, 2020-02-13 at 16:15 -0500, Sean Paul wrote:
From: Sean Paul seanpaul@chromium.org
Hey all, Earlier this week on my 5.5 kernel (I can't seem to get a 5.6 kernel to behave on any of my devices), I ran into the multi-reply problem that Wayne fixed in January. Without realizing there was already a fix upstream, I went about solving it in a different way. It wasn't until rebasing the patches on 5.6 for the list that I realized there was already a solution.
At any rate, I think this way of handling things might be a bit more performant. I'm not super happy with the cleanliness of the code; I think this series should make things easier to read, but I don't think I achieved that. So suggestions are welcome on how to break this apart.
Honestly it looks a bit cleaner to me. Sideband message parsing in MST is pretty complex, so I'd think the code's probably always going to be messy to some extent.
My only suggestion with cleaning things up - maybe we should stop calling it building a sideband reply, and call it re-assembling one? Seems a little less confusing, since we're really just taking the rx chunks and reassembling them into a full sideband message. I know at least I constantly find myself forgetting those functions are for rx and not tx.
Good point, something like drm_dp_sideband_append_payload() might be more descriptive and less confusing. I'm happy to change this if we decide to go through with this set.
So, I will give my r-b for the whole series, but...
Reviewed-by: Lyude Paul lyude@redhat.com
...I think we should definitely look more into what Wayne's talking about before pushing this, and see if it's widespread enough of an issue to be a concern. It does kind of suck how slow MST probing can be, so I'd wonder if we could try your idea of rate limiting. My one concern there is I'm not actually sure if there's anything in the DP MST protocol that indicates how many messages a hub can handle at the same time - it's always supposed to just be two iirc.
I don't see an upper bound on pending downstream replies either. AFAICT from reading the spec, each endpoint must support 2 concurrent message requests. A forwarding device is responsible for reading the replies when it detects DOWN_REP_MSG_RDY. So each forwarding device has the ability (and should) rate-limit their own forwards.
I guess some forwarding devices might only read one reply when they get the IRQ and not circle back for the rest? We could probably come up with a heuristic for handling this, but it'd be a bit nasty and is probably not worth it (I'd guess the vast majority of MST usecases are depth==1).
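To make the per-branch rate-limit idea concrete, here's a rough, untested sketch of the bookkeeping it could use — the struct and helper names are invented for illustration and are not part of the actual series:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-branch rate limiting: instead of serializing all
 * down requests globally, cap the number of outstanding requests per
 * branch device. Per the spec each device must handle at least two
 * concurrent message transactions, so 2 is the conservative default.
 */
struct mst_branch_limit {
	int num_inflight;	/* down requests still awaiting a reply */
	int max_inflight;	/* per-device cap, e.g. 2 */
};

static bool branch_can_send(const struct mst_branch_limit *b)
{
	return b->num_inflight < b->max_inflight;
}

static void branch_msg_sent(struct mst_branch_limit *b)
{
	b->num_inflight++;
}

static void branch_reply_received(struct mst_branch_limit *b)
{
	assert(b->num_inflight > 0);
	b->num_inflight--;
}
```

Note this wouldn't by itself avoid the interleaved-reply problem Wayne describes, since two outstanding requests to the same branch can still produce interleaved replies.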
Sean
Thanks,
Sean
Sean Paul (3):
  drm/mst: Separate sideband packet header parsing from message building
  drm/mst: Support simultaneous down replies
  drm/dp_mst: Remove single tx msg restriction.

 drivers/gpu/drm/drm_dp_mst_topology.c | 196 ++++++++++++++------------
 include/drm/drm_dp_mst_helper.h       |  65 ++++-----
 2 files changed, 137 insertions(+), 124 deletions(-)
-- Cheers, Lyude Paul (she/her) Associate Software Engineer at Red Hat