On Tue, Jan 15, 2019 at 11:47:00PM +0100, Rafael J. Wysocki wrote:
From: Rafael J. Wysocki rafael.j.wysocki@intel.com Subject: [PATCH] driver core: Fix adding device links to probing suppliers
Currently, it is not valid to add a device link from a consumer driver ->probe() callback to a supplier that is still probing too, but generally this is a valid use case. For example, if the consumer has just acquired a resource that can only be available when the supplier is functional, adding a device link to that supplier right away should be safe (and even desirable arguably), but device_link_add() doesn't handle that case correctly and the initial state of the link created by it is wrong then.
To address this problem, change the initial state of device links added between a probing supplier and a probing consumer to DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to skip such links on the supplier side.
With this change, if the supplier probe completes first, device_links_driver_bound() called for it will skip the link state update and when it is called for the consumer, the link state will be updated to "active". In turn, if the consumer probe completes first, device_links_driver_bound() called for it will change the state of the link to "active" and when it is called for the supplier, the link status update will be skipped.
However, in principle the supplier or consumer probe may still fail after the link has been added, so modify device_links_no_driver() to change device links in the "active" or "consumer probe" state to "dormant" on the supplier side and update __device_links_no_driver() to change the link state to "available" only if it is "consumer probe" or "active".
Then, if the supplier probe fails first, the leftover link to the probing consumer will become "dormant" and device_links_no_driver() called for the consumer (when its probe fails) will clean it up. In turn, if the consumer probe fails first, it will either drop the link, or change its state to "available" and, in the latter case, when device_links_no_driver() is called for the supplier, it will update the link state to "dormant". [If the supplier probe fails, but the consumer probe succeeds, which should not happen as long as the consumer driver is correct, the link still will be around, but it will be "dormant" until the supplier is probed again.]
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Hi Rafael,
I've tried this with Armada DRM without using the component helper for bridges, and it seems to work up to a point (I have a vga bridge here to allow further testing):
[ 2.323441] armada-drm display-subsystem: assigned reserved memory node framebuffer [ 2.332001] armada-drm display-subsystem: bound f1820000.lcd-controller (ops armada_lcd_ops) [ 2.340790] armada-drm display-subsystem: bound f1810000.lcd-controller (ops armada_lcd_ops) [ 2.349345] armada-drm display-subsystem: node=/i2c-mux/i2c@0/hdmi-encoder@70 [ 2.356719] armada-drm display-subsystem: panel=fffffdfb bridge= (null) ret=0 [ 2.364038] armada-drm display-subsystem: Linked as a consumer to 1-0070 [ 2.370818] armada-drm display-subsystem: bridge=ee8cda2c ret=0 [ 2.376894] armada-drm display-subsystem: node=/vga-bridge [ 2.382453] armada-drm display-subsystem: panel=fffffdfb bridge=(null) ret=0 [ 2.389762] armada-drm display-subsystem: Linked as a consumer to vga-bridge [ 2.396883] armada-drm display-subsystem: bridge=ef3bec40 ret=0
When I remove the HDMI encoder:
root@cubox:/sys/bus/i2c/drivers/tda998x# echo 1-0070 > unbind
then I get:
[ 1013.824860] Console: switching to colour dummy device 80x30 [ 1013.866785] armada-drm display-subsystem: Dropping the link to vga-bridge [ 1013.867126] armada-drm display-subsystem: Dropping the link to 1-0070
which looks like it did what was expected - the DRM device is indeed unbound, the nodes in /dev/dri are gone. When rebinding the HDMI encoder:
[ 1015.864703] tda998x 1-0070: found TDA19988 [ 1015.898078] tda9950 1-0034: TDA9950 CEC interface, hardware version 3.3 [ 1015.941374] Registered IR keymap rc-cec [ 1015.941684] rc rc0: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0 [ 1015.942439] input: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0/input2 [ 1016.982921] alloc_contig_range: [10390, 10490) PFNs busy [ 1016.987400] kirkwood-spdif-audio audio-subsystem: snd-soc-dummy-dai <-> kirkwood-fe mapping ok [ 1016.987591] kirkwood-spdif-audio audio-subsystem: multicodec <-> kirkwood-spdif mapping ok
but the DRM stuff doesn't come back - this is what Daniel was talking about further down this thread.
I guess the kernel can't know that it should come back because the device links were dropped at the unbind stage, which means the kernel has lost the information necessary to know that the display subsystem is dependent on the presence of the HDMI encoder. I don't see an easy way around that.
If we keep the device links after an unbind event, then, because we create them during probe, we'll be attempting to recreate the link when we reattach the supplier to the consumer. If we only allow one instance, then what does device_link_add() return.
Maybe it is going to be less painful to push everything bridge-related to use the component helper after all. Dunno. Problems either way.
Documentation/driver-api/device_link.rst | 10 ++-- drivers/base/core.c | 74 +++++++++++++++++++++++++++---- 2 files changed, 73 insertions(+), 11 deletions(-)
Index: linux-pm/drivers/base/core.c
--- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -260,17 +260,26 @@ struct device_link *device_link_add(stru link->status = DL_STATE_NONE; } else { switch (supplier->links.status) {
case DL_DEV_DRIVER_BOUND:
case DL_DEV_PROBING: switch (consumer->links.status) { case DL_DEV_PROBING: /*
* Some callers expect the link creation during
* consumer driver probe to resume the supplier
* even without DL_FLAG_RPM_ACTIVE.
* A consumer driver can create a link to a
* supplier that has not completed its probing
* yet as long as it knows that the supplier is
* already functional (for example, it has just
* acquired some resources from the supplier). */
if (flags & DL_FLAG_PM_RUNTIME)
pm_runtime_resume(supplier);
link->status = DL_STATE_CONSUMER_PROBE;
break;
default:
link->status = DL_STATE_DORMANT;
break;
}
break;
case DL_DEV_DRIVER_BOUND:
switch (consumer->links.status) {
case DL_DEV_PROBING: link->status = DL_STATE_CONSUMER_PROBE; break; case DL_DEV_DRIVER_BOUND:
@@ -291,6 +300,14 @@ struct device_link *device_link_add(stru }
/*
* Some callers expect the link creation during consumer driver probe to
* resume the supplier even without DL_FLAG_RPM_ACTIVE.
*/
- if (link->status == DL_STATE_CONSUMER_PROBE &&
flags & DL_FLAG_PM_RUNTIME)
pm_runtime_resume(supplier);
- /*
- Move the consumer and all of the devices depending on it to the end
- of dpm_list and the devices_kset list.
@@ -474,6 +491,16 @@ void device_links_driver_bound(struct de if (link->flags & DL_FLAG_STATELESS) continue;
/*
* Links created during consumer probe may be in the "consumer
* probe" state to start with if the supplier is still probing
* when they are created and they may become "active" if the
* consumer probe returns first. Skip them here.
*/
if (link->status == DL_STATE_CONSUMER_PROBE ||
link->status == DL_STATE_ACTIVE)
continue;
- WARN_ON(link->status != DL_STATE_DORMANT); WRITE_ONCE(link->status, DL_STATE_AVAILABLE); }
@@ -513,17 +540,48 @@ static void __device_links_no_driver(str
if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER) kref_put(&link->kref, __device_link_del);
else if (link->status != DL_STATE_SUPPLIER_UNBIND)
else if (link->status == DL_STATE_CONSUMER_PROBE ||
link->status == DL_STATE_ACTIVE) WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
}
dev->links.status = DL_DEV_NO_DRIVER;
}
+/**
- device_links_no_driver - Update links after failing driver probe.
- @dev: Device whose driver has just failed to probe.
- Clean up leftover links to consumers for @dev and invoke
- %__device_links_no_driver() to update links to suppliers for it as
- appropriate.
- Links with the DL_FLAG_STATELESS flag set are ignored.
- */
void device_links_no_driver(struct device *dev) {
- struct device_link *link;
- device_links_write_lock();
- list_for_each_entry(link, &dev->links.consumers, s_node) {
if (link->flags & DL_FLAG_STATELESS)
continue;
/*
* The probe has failed, so if the status of the link is
* "consumer probe" or "active", it must have been added by
* a probing consumer while this device was still probing.
* Change its state to "dormant", as it represents a valid
* relationship, but it is not functionally meaningful.
*/
if (link->status == DL_STATE_CONSUMER_PROBE ||
link->status == DL_STATE_ACTIVE)
WRITE_ONCE(link->status, DL_STATE_DORMANT);
- }
- __device_links_no_driver(dev);
- device_links_write_unlock();
}
Index: linux-pm/Documentation/driver-api/device_link.rst
--- linux-pm.orig/Documentation/driver-api/device_link.rst +++ linux-pm/Documentation/driver-api/device_link.rst @@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti
Another example for an inconsistent state would be a device link that represents a driver presence dependency, yet is added from the consumer's -``->probe`` callback while the supplier hasn't probed yet: Had the driver -core known about the device link earlier, it wouldn't have probed the +``->probe`` callback while the supplier hasn't started to probe yet: Had the +driver core known about the device link earlier, it wouldn't have probed the consumer in the first place. The onus is thus on the consumer to check presence of the supplier after adding the link, and defer probing on -non-presence. +non-presence. [Note that it is valid to create a link from the consumer's +``->probe`` callback while the supplier is still probing, but the consumer must +know that the supplier is functional already at the link creation time (that is +the case, for instance, if the consumer has just acquired some resources that +would not have been available had the supplier not been functional then).]
If a device link is added in the ``->probe`` callback of the supplier or consumer driver, it is typically deleted in its ``->remove`` callback for