Re: RFC: Drm-connector properties managed by another driver / privacy screen support

15 Apr 2020

      Hi,
On 4/15/20 8:29 PM, Daniel Vetter wrote:
...
On Wed, Apr 15, 2020 at 8:19 PM Hans de Goede hdegoede@redhat.com wrote:
...
Hi,
On 4/15/20 7:54 PM, Daniel Vetter wrote:
...
On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
...
Hi,
On 4/15/20 2:01 PM, Daniel Vetter wrote:
...
On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
...
Hi,
On 4/15/20 12:22 PM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede hdegoede@redhat.com wrote:
>>
>> Hi,
>>
>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
<snip>
>>> iv. What every SoC subsystem does:
>>>
>>> - lcdshadow drivers register drivers
>>> - drm drivers look them up
>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
>>> the entire thing is assembled.
>>>
>>> That's what we're doing already for other standardized components like
>>> drm_bridge or drm_panel, and I think that's also the right approach
>>> for backlight and anything else like that. Hand-rolling our own
>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
>>> building a driver from component, which spans subsystems and is also
>>> the underlying magic that makes stuff like component.c work.
>>
>> On the SoCs we have devicetree telling us what components there
>> are, so we can wait for them to show up. The only way to figure out
>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
>> or duplicating a lot of code from thinkpad_acpi. Edit:
>> also see below for a possible solution.
>
> Yup it sucks. I think all we can do is have a small acpi match
> function (which yes will duplicate some of the thinkpad_acpi driver
> logic) to re-create that information and give us a "should we have a
> lcdshadow driver for this $pci_device" answer.
Ok, so questions about this solution:

Where should that match-function live ?

An acpi_thinkpad derived match-function will only be able to
   answer if there is an lcdshadow device/driver for the internal
   panel. It will not be able to tie this info to a certain PCI
   device. My plan is to pass NULL as dev_name when registering
   the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
   skip device-name matching (consider everything a match) for
   lcdshadow-devices registered with NULL as dev_name.
So I guess in this case the mini match function should just
   ignore the passed in device?

Yeah I think we can't really avoid that. I also expect that we'll need
ACPI and dt versions of this, and driver needs to know which one to call.
Since at least in a dt world the driver knows exactly for which dt node it
needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
more strict.
For the acpi version I'm not even sure we can do more than provide the
struct device * pointer of the gpu. I think if we ever get more than 1
lcdshadow driver on acpi systems we can add more stuff later on, for now
I'd just leave that out.
So maybe
acpi_lcdshadow_get(struct device *dev);
of_lcdshadow_get(struct device_node *np);
And with maybe a future plan to add some kind of enum or whatever to
acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
Ok, note I plan to only implement the acpi version for now, I do
expect some non ACPI/x86 devices to show up with his feature
eventually but I believe it is best to implement this once
those actually show up. Esp. since this will also involve adding
some devicetree bindings for this.
ofc, just wanted to lay out the entire thing. The DT version needs some
good bikeshed on the dt schema first anyway (so that the helper can decode
that directly).
...
...
We might also want a low-level lcdshadow_get() which only returns ENOENT
when the driver isn't there, and which leaves "do we really need one?" to
higher levels to answer.
Right, so my latest idea on that is indeed a high-level lcdshadow_get()
which takes a struct device * and a connector-name and which never
returns EPROBE_DEFER.
As for leaving things to the higher levels to answer, as explained
in my other follow-up email I think that we should probably add a
lcdshadow_probe_defer() helper for this and call that early on
in the PCI-driver probe functions for the 3 major x86 GPU drivers.
Does that sound ok to you?
Uh ... not pretty. There's still a lifetime problem that strictly speaking
there's nothing stopping the other driver from getting unloaded between
your _probe_defer and the subsequent _get. I think fixing this properly
(and screaming a bit at the error code, oh well) is better.
I would really like to separate the discussion and the work
on getting the 3 major x86 GPU drivers ready to deal with EPROBE_DEFER
from the lcdshadow discussion and work.  I expect getting these
3 drivers ready for EPROBE_DEFER is going to be a major undertaking
and I would like avoid introducing this significant scope creep
to the lcdshadow discussion, because it simply is a too big undertaking
to undertake without us getting a significant amount of manpower
specifically for this from somewhere.
Note I do agree with you that getting these 3 drivers ready
for EPROBE_DEFER handling is a worthwhile undertaking, but
it simply will take too much extra time and as such IMHO it
really is out of scope for the lcdshadow stuff.
I expect the amount of work (esp. also dealing with testing
and regressions) for the EPROBE_DEFER project by itself
to be a lot *more* work then the actual lcdshadow project.
So going with the assumption/decision that adding proper
EPROBE_DEFER handling to these 3 drivers is out of scope,
I believe that adding a lcdshadow_probe_defer() helper is
an ok solution/workaround for now.
As for your case of the other driver getting unloaded in between
the check and use of it, that can only happen by explicit user
action and in that case the worst thing what will happen
is the "privacy-screen" property missing from the connector,
which in that case is more or less exactly what the user
asked for.
For i915 we've had patches, for mei-hdcp integration. Until it became
clear that the mei subsystem is bonkers, and handles suspend/resume by
unloading! drivers.
Which forced i915 to unload and broke the world :-/
So at least for i915 the work should be done already, and just amount
to updating the patches Ram already had. No ideas about the other 2.
Ok.
...
What I don't really want is an interface with races. So if fixing the
other drivers is too much work, what we could do is roughly:

in the top-level probe function to an acpi_lcdshadow_get. This might

fail with EPROBE_DEFER.

this gives us a dangling reference, but we can drop that one when we

exit the top-level probe function (both on success and on error cases)

the 2nd acpi_lcdshadow_get call deep down should always succeed at

that point (since the top level holds a reference), so you could wrap
that in a WARNING(IS_ERR_PTR()). Ok that's a lie, if we concurrently
unload then the 2nd call still fails (the reference can never prevent
a hotunbind in the linux kernel device model), so this is exactly as
buggy as your version, so still needs a comment about that.
Adding a acpi_lcdshadow_probe_defer() function otoh just gives people
the impression that that's actually a correct way of doing this.
Then it's up to the driver maintainers whether they're ok with the
above hack of a fake reference, or not. As I said, I think for i915 it
should be fairly ok to just roll it out, but maybe the patches don't
apply at all anymore.
Ok trying to taking a ref early on and handling EPROBE_DEFER
at that first attempt to take a ref works for me. So lets go with
that. I will try to whip up a v1 patch-set for this, ETA aprox.
1-2 weeks I guess.
...
btw to make everything work you also need to set up a device_link
between the lcdshadow device (and it's driver, that's the more
important thing) and the gpu device. That device link will make sure
that

suspend/resume is done in the right order
driver load/unload is done in the right order, i.e. unloading of the

lcdshadow driver will automatically force an unbind of the gpu driver
first.
That is an interesting idea, but that does assume that their
is an actual struct device which the code handling the lcdshadow
binds to, which in case of thinkpad_acpi is not really the case.
Anyways passing in a parent device when registering a lcdshadow_dev
seems like a good idea and we can do the device_link thing if the parent
is non NULL.
Regards,
Hans

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: RFC: Drm-connector properties managed by another driver / privacy screen support