From: Rob Clark robdclark@chromium.org
One of the challenges we have to enable the aarch64 laptops upstream is dealing with the fact that the bootloader enables the display and takes the corresponding SMMU context-bank out of BYPASS. Unfortunately, currently, the IOMMU framework attaches a DMA (or potentially an IDENTITY) domain before the driver is probed and has a chance to intervene and shut down scanout. Which makes things go horribly wrong.
But in this case, drm/msm is already managing its IOMMUs directly, so the DMA API attached iommu_domain simply gets in the way. This series adds a way that a driver can indicate to drivers/iommu that it does not wish to have a DMA managed iommu_domain attached. This way, drm/msm can shut down scanout cleanly before attaching its own iommu_domain.
NOTE that to get things working with arm-smmu on the aarch64 laptops, you also need a patchset[1] from Bjorn Andersson to inherit SMMU config at boot, when it is already enabled.
[1] https://www.spinics.net/lists/arm-kernel/msg732246.html
NOTE that in discussion of previous revisions, RMRR came up. This is not really a replacement for RMRR (nor does RMRR really provide any more information than we already get from EFI GOP, or DT in the simplefb case). I also don't see how RMRR could help w/ SMMU handover of CB/SMR config (Bjorn's patchset[1]) without defining new tables.
This perhaps doesn't solve the more general case of bootloader enabled display for drivers that actually want to use DMA API managed IOMMU. But it does also happen to avoid a related problem with GPU, caused by the DMA domain claiming the context bank that the GPU firmware expects to use. And it avoids spurious TLB invalidation coming from the unused DMA domain. So IMHO this is a useful and necessary change.
Rob Clark (2):
  iommu: add support for drivers that manage iommu explicitly
  drm/msm: mark devices where iommu is managed by driver

 drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c    | 1 +
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c   | 1 +
 drivers/gpu/drm/msm/msm_drv.c              | 1 +
 drivers/iommu/iommu.c                      | 2 +-
 drivers/iommu/of_iommu.c                   | 3 +++
 include/linux/device.h                     | 3 ++-
 7 files changed, 10 insertions(+), 2 deletions(-)
From: Rob Clark robdclark@chromium.org
Avoid attaching any non-driver managed domain if the driver indicates that it manages the iommu directly.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/iommu/iommu.c    | 2 +-
 drivers/iommu/of_iommu.c | 3 +++
 include/linux/device.h   | 3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0c674d80c37f..2ac5e8d48cae 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -674,7 +674,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 
 	mutex_lock(&group->mutex);
 	list_add_tail(&device->list, &group->devices);
-	if (group->domain)
+	if (group->domain && !(dev->driver && dev->driver->driver_manages_iommu))
 		ret = __iommu_attach_device(group->domain, dev);
 	mutex_unlock(&group->mutex);
 	if (ret)
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 614a93aa5305..62b47e384a77 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -221,6 +221,9 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 	} else if (err < 0) {
 		dev_dbg(dev, "Adding to IOMMU failed: %d\n", err);
 		ops = NULL;
+	} else if (dev->driver && dev->driver->driver_manages_iommu) {
+		dev_dbg(dev, "Driver manages IOMMU\n");
+		ops = NULL;
 	}
 
 	return ops;
diff --git a/include/linux/device.h b/include/linux/device.h
index 1aa341b2a0db..b77a11b8d9bb 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -284,7 +284,8 @@ struct device_driver {
 	struct module		*owner;
 	const char		*mod_name;	/* used for built-in modules */
 
-	bool suppress_bind_attrs;	/* disables bind/unbind via sysfs */
+	bool suppress_bind_attrs:1;	/* disables bind/unbind via sysfs */
+	bool driver_manages_iommu:1;	/* driver manages IOMMU explicitly */
 	enum probe_type probe_type;
 
 	const struct of_device_id	*of_match_table;
On Fri, Sep 06, 2019 at 02:44:01PM -0700, Rob Clark wrote:
> @@ -674,7 +674,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
>
>  	mutex_lock(&group->mutex);
>  	list_add_tail(&device->list, &group->devices);
> -	if (group->domain)
> +	if (group->domain && !(dev->driver && dev->driver->driver_manages_iommu))
Hmm, this code usually runs at enumeration time when no driver is attached to the device. Actually it would be pretty dangerous when this code runs while a driver is attached to the device. How does that change make things work for you?
Regards,
Joerg
On Tue, Sep 10, 2019 at 1:14 AM Joerg Roedel joro@8bytes.org wrote:
> On Fri, Sep 06, 2019 at 02:44:01PM -0700, Rob Clark wrote:
> > @@ -674,7 +674,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
> >
> >  	mutex_lock(&group->mutex);
> >  	list_add_tail(&device->list, &group->devices);
> > -	if (group->domain)
> > +	if (group->domain && !(dev->driver && dev->driver->driver_manages_iommu))
>
> Hmm, this code usually runs at enumeration time when no driver is attached to the device. Actually it would be pretty dangerous when this code runs while a driver is attached to the device. How does that change make things work for you?
I was seeing this get called via the path driver_probe_device() -> platform_dma_configure() -> of_dma_configure() -> of_iommu_configure() -> iommu_probe_device() -> ...
The only cases I was seeing where dev->driver is NULL were a few places where drivers call of_dma_configure() on their own sub-devices. But maybe there are some other paths that I did not notice?
BR, -R
On 10/09/2019 16:34, Rob Clark wrote:
> On Tue, Sep 10, 2019 at 1:14 AM Joerg Roedel joro@8bytes.org wrote:
> > On Fri, Sep 06, 2019 at 02:44:01PM -0700, Rob Clark wrote:
> > > @@ -674,7 +674,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
> > >
> > >  	mutex_lock(&group->mutex);
> > >  	list_add_tail(&device->list, &group->devices);
> > > -	if (group->domain)
> > > +	if (group->domain && !(dev->driver && dev->driver->driver_manages_iommu))
> >
> > Hmm, this code usually runs at enumeration time when no driver is attached to the device. Actually it would be pretty dangerous when this code runs while a driver is attached to the device. How does that change make things work for you?
>
> I was seeing this get called via the path driver_probe_device() -> platform_dma_configure() -> of_dma_configure() -> of_iommu_configure() -> iommu_probe_device() -> ...
>
> The only cases I was seeing where dev->driver is NULL were a few places where drivers call of_dma_configure() on their own sub-devices. But maybe there are some other paths that I did not notice?
For the of_iommu flow, it very much depends on your DT layout and driver probe order as to whether you catch the "proper" call from iommu_bus_notifier()/iommu_bus_init() or the late "replay" from of_iommu_configure(). I wouldn't make any assumptions of consistency.
Robin.
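To make the ordering concern concrete, here is a minimal userspace sketch, not kernel code. Only the condition inside toy_group_add_device() is taken from the patch; the toy_* types and the two path functions are hypothetical stand-ins. It assumes that dev->driver is still NULL on the enumeration path and already set on the late replay path, as described in the replies above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct device_driver / struct device (hypothetical). */
struct toy_driver {
	bool driver_manages_iommu;
};

struct toy_device {
	struct toy_driver *driver;		/* NULL until a driver is bound */
	bool attached_to_default_domain;
};

/* Applies the condition from the patch when the device joins its group. */
void toy_group_add_device(struct toy_device *dev, bool group_has_domain)
{
	assert(dev != NULL);
	if (group_has_domain &&
	    !(dev->driver && dev->driver->driver_manages_iommu))
		dev->attached_to_default_domain = true;
}

/* Path 1: device added at enumeration time, before any driver is bound. */
bool toy_enumeration_path(struct toy_driver *drv)
{
	struct toy_device dev = { .driver = NULL };

	toy_group_add_device(&dev, true);
	dev.driver = drv;	/* the driver binds only afterwards */
	return dev.attached_to_default_domain;
}

/* Path 2: late replay during probe, with dev->driver already set. */
bool toy_replay_path(struct toy_driver *drv)
{
	struct toy_device dev = { .driver = drv };

	toy_group_add_device(&dev, true);
	return dev.attached_to_default_domain;
}
```

In this model a driver that sets the flag is still attached to the default domain on the enumeration path, and is skipped only on the replay path, so the flag's effect depends on which path runs first.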
From: Rob Clark robdclark@chromium.org
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c    | 1 +
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c   | 1 +
 drivers/gpu/drm/msm/msm_drv.c              | 1 +
 4 files changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 7f750a9510a5..19f2bd2d6cb4 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -452,6 +452,7 @@ static struct platform_driver adreno_driver = {
 		.name = "adreno",
 		.of_match_table = dt_match,
 		.pm = &adreno_pm_ops,
+		.driver_manages_iommu = true,
 	},
 };

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 5751815a26d7..dec8cc6b64dc 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1094,6 +1094,7 @@ static struct platform_driver dpu_driver = {
 		.name = "msm_dpu",
 		.of_match_table = dpu_dt_match,
 		.pm = &dpu_pm_ops,
+		.driver_manages_iommu = true,
 	},
 };

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
index d93de3a569b4..eff1b000258e 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -1134,6 +1134,7 @@ static struct platform_driver mdp5_driver = {
 		.name = "msm_mdp",
 		.of_match_table = mdp5_dt_match,
 		.pm = &mdp5_pm_ops,
+		.driver_manages_iommu = true,
 	},
 };

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 3a4fd20a33e8..336a6d0a4cd3 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1388,6 +1388,7 @@ static struct platform_driver msm_platform_driver = {
 		.name = "msm",
 		.of_match_table = dt_match,
 		.pm = &msm_pm_ops,
+		.driver_manages_iommu = true,
 	},
 };
On 06/09/2019 22:44, Rob Clark wrote:
> From: Rob Clark robdclark@chromium.org
>
> One of the challenges we have to enable the aarch64 laptops upstream is dealing with the fact that the bootloader enables the display and takes the corresponding SMMU context-bank out of BYPASS. Unfortunately, currently, the IOMMU framework attaches a DMA (or potentially an IDENTITY) domain before the driver is probed and has a chance to intervene and shut down scanout. Which makes things go horribly wrong.
Nope, things already went horribly wrong in arm_smmu_device_reset() - sure, sometimes for some configurations it might *seem* like they didn't and that you can fudge the context bank state at arm's length from core code later, but the truth is that impl->cfg_probe is your last chance to guarantee that any necessary SMMU state is preserved.
The remainder of the problem involves reworking default domain allocation such that we can converge on what iommu_request_dm_for_dev() currently does but without the momentary attachment to a translation domain to cause hiccups. That's starting here:
https://lore.kernel.org/linux-iommu/cover.1566353521.git.sai.praneeth.prakhy...
> But in this case, drm/msm is already managing its IOMMUs directly, so the DMA API attached iommu_domain simply gets in the way. This series adds a way that a driver can indicate to drivers/iommu that it does not wish to have a DMA managed iommu_domain attached. This way, drm/msm can shut down scanout cleanly before attaching its own iommu_domain.
>
> NOTE that to get things working with arm-smmu on the aarch64 laptops, you also need a patchset[1] from Bjorn Andersson to inherit SMMU config at boot, when it is already enabled.
>
> [1] https://www.spinics.net/lists/arm-kernel/msg732246.html
>
> NOTE that in discussion of previous revisions, RMRR came up. This is not really a replacement for RMRR (nor does RMRR really provide any more information than we already get from EFI GOP, or DT in the simplefb case). I also don't see how RMRR could help w/ SMMU handover of CB/SMR config (Bjorn's patchset[1]) without defining new tables.
The point of RMRR-like-things is that they identify not just the memory region but also the specific device accessing them, which means the IOMMU driver knows up-front which IDs etc. it must be careful not to disrupt. Obviously for SMMU that *would* be some new table (designed to encompass everything relevant) since literal RMRRs are specifically an Intel VT-d thing.
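As a rough illustration of what "region plus device" could look like, here is a hypothetical sketch in plain C. No such table is defined anywhere in this thread; the struct layout, the use of an SMMU stream ID as the device identifier, and all toy_* names are assumptions made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical firmware-provided entry: a memory range together with the
 * ID of the device (assumed here to be an SMMU stream ID) that is already
 * performing DMA to that range when the kernel takes over.
 */
struct toy_reserved_entry {
	uint64_t base;
	uint64_t size;
	uint16_t stream_id;
};

/* True if some entry names this stream ID, i.e. its mapping is live. */
bool toy_stream_is_live(const struct toy_reserved_entry *tbl, size_t n,
			uint16_t sid)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].stream_id == sid)
			return true;
	return false;
}

/* True if [addr, addr + len) lies fully inside a range reserved for sid. */
bool toy_range_reserved_for(const struct toy_reserved_entry *tbl, size_t n,
			    uint16_t sid, uint64_t addr, uint64_t len)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct toy_reserved_entry *e = &tbl[i];

		if (e->stream_id != sid)
			continue;
		/* Written this way to avoid overflow in addr + len. */
		if (addr >= e->base && len <= e->size &&
		    addr - e->base <= e->size - len)
			return true;
	}
	return false;
}
```

With entries of this shape, an IOMMU driver could check at probe time which stream IDs it must not disturb, which a bare memory range cannot tell it.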
> This perhaps doesn't solve the more general case of bootloader enabled display for drivers that actually want to use DMA API managed IOMMU. But it does also happen to avoid a related problem with GPU, caused by the DMA domain claiming the context bank that the GPU firmware expects to use.
Careful bringing that up again, or I really will rework the context bank allocator to avoid this default domain problem entirely... ;)
Robin.
> And it avoids spurious TLB invalidation coming from the unused DMA domain. So IMHO this is a useful and necessary change.
On Tue, Sep 10, 2019 at 9:34 AM Robin Murphy robin.murphy@arm.com wrote:
> On 06/09/2019 22:44, Rob Clark wrote:
> > From: Rob Clark robdclark@chromium.org
> >
> > One of the challenges we have to enable the aarch64 laptops upstream is dealing with the fact that the bootloader enables the display and takes the corresponding SMMU context-bank out of BYPASS. Unfortunately, currently, the IOMMU framework attaches a DMA (or potentially an IDENTITY) domain before the driver is probed and has a chance to intervene and shut down scanout. Which makes things go horribly wrong.
>
> Nope, things already went horribly wrong in arm_smmu_device_reset() - sure, sometimes for some configurations it might *seem* like they didn't and that you can fudge the context bank state at arm's length from core code later, but the truth is that impl->cfg_probe is your last chance to guarantee that any necessary SMMU state is preserved.
cfg_probe is where Bjorn's patch is preserving the SMMU state. So I think that should ensure device_reset() preserves the configuration, or at least if something is missing, that seems fixable.
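The interaction between a probe-time hook and reset can be pictured with a toy model. This is not the arm-smmu driver and not Bjorn's patch: the structure, the four-entry size, and the toy_* names are all assumptions for illustration. It assumes reset rewrites every stream-match entry from a software shadow copy, so whatever the probe hook did not capture is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_SMR 4	/* arbitrary size for the sketch */

/* One stream-match entry: a valid bit and the stream ID it matches. */
struct toy_smr {
	bool valid;
	uint16_t id;
};

struct toy_smmu {
	struct toy_smr hw[TOY_NUM_SMR];		/* models hardware registers */
	struct toy_smr shadow[TOY_NUM_SMR];	/* driver's software copy */
};

/* Probe-time hook: inherit whatever the bootloader left programmed. */
void toy_cfg_probe(struct toy_smmu *smmu)
{
	assert(smmu != NULL);
	for (size_t i = 0; i < TOY_NUM_SMR; i++)
		smmu->shadow[i] = smmu->hw[i];
}

/* Reset: rewrite every hardware entry from the shadow copy. */
void toy_device_reset(struct toy_smmu *smmu)
{
	assert(smmu != NULL);
	for (size_t i = 0; i < TOY_NUM_SMR; i++)
		smmu->hw[i] = smmu->shadow[i];
}
```

In this model, calling toy_device_reset() without toy_cfg_probe() first wipes the bootloader's entry, while calling the probe hook first carries it through the reset unchanged.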
> The remainder of the problem involves reworking default domain allocation such that we can converge on what iommu_request_dm_for_dev() currently does but without the momentary attachment to a translation domain to cause hiccups. That's starting here:
>
> https://lore.kernel.org/linux-iommu/cover.1566353521.git.sai.praneeth.prakhy...
I suppose if the stream-match state and bootloader-chosen context bank are preserved, then keeping it direct-mapped would avoid things starting to fault before the display driver is probed. That plus some solution for the GPU default domain would narrow the scope of what I need to just avoiding getting iommu dma_ops installed.
> > But in this case, drm/msm is already managing its IOMMUs directly, so the DMA API attached iommu_domain simply gets in the way. This series adds a way that a driver can indicate to drivers/iommu that it does not wish to have a DMA managed iommu_domain attached. This way, drm/msm can shut down scanout cleanly before attaching its own iommu_domain.
> >
> > NOTE that to get things working with arm-smmu on the aarch64 laptops, you also need a patchset[1] from Bjorn Andersson to inherit SMMU config at boot, when it is already enabled.
> >
> > [1] https://www.spinics.net/lists/arm-kernel/msg732246.html
> >
> > NOTE that in discussion of previous revisions, RMRR came up. This is not really a replacement for RMRR (nor does RMRR really provide any more information than we already get from EFI GOP, or DT in the simplefb case). I also don't see how RMRR could help w/ SMMU handover of CB/SMR config (Bjorn's patchset[1]) without defining new tables.
>
> The point of RMRR-like-things is that they identify not just the memory region but also the specific device accessing them, which means the IOMMU driver knows up-front which IDs etc. it must be careful not to disrupt. Obviously for SMMU that *would* be some new table (designed to encompass everything relevant) since literal RMRRs are specifically an Intel VT-d thing.
Perhaps I'm not looking in the right place, but the extent of what I could find about RMRR tables was:
https://github.com/tianocore/edk2/blob/master/MdePkg/Include/IndustryStandar...
I couldn't really see how that specifies the device. But entirely possible that I'm not seeing the whole picture.
I am a bit curious about how Windows handles this, since they must have the same problem on these laptops.
> > This perhaps doesn't solve the more general case of bootloader enabled display for drivers that actually want to use DMA API managed IOMMU. But it does also happen to avoid a related problem with GPU, caused by the DMA domain claiming the context bank that the GPU firmware expects to use.
>
> Careful bringing that up again, or I really will rework the context bank allocator to avoid this default domain problem entirely... ;)
That doesn't seem like a bad outcome ;-)
BR, -R
> Robin.
>
> > And it avoids spurious TLB invalidation coming from the unused DMA domain. So IMHO this is a useful and necessary change.
On Tue, Sep 10, 2019 at 10:10:49AM -0700, Rob Clark wrote:
> On Tue, Sep 10, 2019 at 9:34 AM Robin Murphy robin.murphy@arm.com wrote:
> > On 06/09/2019 22:44, Rob Clark wrote:
> > > NOTE that in discussion of previous revisions, RMRR came up. This is not really a replacement for RMRR (nor does RMRR really provide any more information than we already get from EFI GOP, or DT in the simplefb case). I also don't see how RMRR could help w/ SMMU handover of CB/SMR config (Bjorn's patchset[1]) without defining new tables.
> >
> > The point of RMRR-like-things is that they identify not just the memory region but also the specific device accessing them, which means the IOMMU driver knows up-front which IDs etc. it must be careful not to disrupt. Obviously for SMMU that *would* be some new table (designed to encompass everything relevant) since literal RMRRs are specifically an Intel VT-d thing.
>
> Perhaps I'm not looking in the right place, but the extent of what I could find about RMRR tables was:
>
> https://github.com/tianocore/edk2/blob/master/MdePkg/Include/IndustryStandar...
>
> I couldn't really see how that specifies the device. But entirely possible that I'm not seeing the whole picture.
I don't think anybody was planning to implement RMRR "as-is" for arm64, so you might be better off looking at the proposal from Thierry, although it has some issues that are still to be resolved:
http://lkml.kernel.org/r/20190829111407.17191-1-thierry.reding@gmail.com
Will
dri-devel@lists.freedesktop.org