This patch series fixes usage of the etnaviv driver with GPUs behind a IOMMU. It was tested on a NXP LS1028A SoC. Together with Lucas' MMU patches [1] there are not more (GPU internal) MMU nor (system) IOMMU faults on the LS1028A.
[1] https://lists.freedesktop.org/archives/etnaviv/2021-August/003682.html
changes since v1: - don't move the former dma_mask setup code around in patch 2/3. Will avoid any confusion.
Michael Walle (3): drm/etnaviv: use PLATFORM_DEVID_NONE drm/etnaviv: fix dma configuration of the virtual device drm/etnaviv: use a 32 bit mask as coherent DMA mask
drivers/gpu/drm/etnaviv/etnaviv_drv.c | 41 ++++++++++++++++++++------- 1 file changed, 31 insertions(+), 10 deletions(-)
There is already a macro for the magic value. Use it.
Signed-off-by: Michael Walle michael@walle.cc Reviewed-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index 7dcc6392792d..2509b3e85709 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -653,7 +653,7 @@ static int __init etnaviv_init(void) if (!of_device_is_available(np)) continue;
- pdev = platform_device_alloc("etnaviv", -1); + pdev = platform_device_alloc("etnaviv", PLATFORM_DEVID_NONE); if (!pdev) { ret = -ENOMEM; of_node_put(np);
The DMA configuration of the virtual device is inherited from the first actual etnaviv device. Unfortunately, this doesn't work with an IOMMU:
[ 5.191008] Failed to set up IOMMU for device (null); retaining platform DMA ops
This is because there is no associated iommu_group with the device. The group is set in iommu_group_add_device() which is eventually called by device_add() via the platform bus: device_add() blocking_notifier_call_chain() iommu_bus_notifier() iommu_probe_device() __iommu_probe_device() iommu_group_get_for_dev() iommu_group_add_device()
Move of_dma_configure() into the probe function, which is called after device_add(). Normally, the platform code will already call it itself if .of_node is set. Unfortunately, this isn't the case here.
Signed-off-by: Michael Walle michael@walle.cc --- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index 2509b3e85709..54eb653ca295 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -589,6 +589,7 @@ static int compare_str(struct device *dev, void *data) static int etnaviv_pdev_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; + struct device_node *first_node = NULL; struct component_match *match = NULL;
if (!dev->platform_data) { @@ -598,6 +599,9 @@ static int etnaviv_pdev_probe(struct platform_device *pdev) if (!of_device_is_available(core_node)) continue;
+ if (!first_node) + first_node = core_node; + drm_of_component_match_add(&pdev->dev, &match, compare_of, core_node); } @@ -609,6 +613,14 @@ static int etnaviv_pdev_probe(struct platform_device *pdev) component_match_add(dev, &match, compare_str, names[i]); }
+ /* + * Apply the same DMA configuration to the virtual etnaviv + * device as the GPU we found. This assumes that all Vivante + * GPUs in the system share the same DMA constraints. + */ + if (first_node) + of_dma_configure(&pdev->dev, first_node, true); + return component_master_add_with_match(dev, &etnaviv_master_ops, match); }
@@ -662,13 +674,6 @@ static int __init etnaviv_init(void) pdev->dev.coherent_dma_mask = DMA_BIT_MASK(40); pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
- /* - * Apply the same DMA configuration to the virtual etnaviv - * device as the GPU we found. This assumes that all Vivante - * GPUs in the system share the same DMA constraints. - */ - of_dma_configure(&pdev->dev, np, true); - ret = platform_device_add(pdev); if (ret) { platform_device_put(pdev);
The STLB and the first command buffer (which is used to set up the TLBs) has a 32 bit size restriction in hardware. There seems to be no way to specify addresses larger than 32 bit. Keep it simple and restict the addresses to the lower 4 GiB range for all coherent DMA memory allocations.
Please note, that platform_device_alloc() will initialize dev->dma_mask to point to pdev->platform_dma_mask, thus dma_set_mask() will work as expected.
While at it, move the dma_mask setup code to the of_dma_configure() to keep all the DMA setup code next to each other.
Suggested-by: Lucas Stach l.stach@pengutronix.de Signed-off-by: Michael Walle michael@walle.cc --- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index 54eb653ca295..0b756ecb1bc2 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -613,6 +613,24 @@ static int etnaviv_pdev_probe(struct platform_device *pdev) component_match_add(dev, &match, compare_str, names[i]); }
+ /* + * PTA and MTLB can have 40 bit base addresses, but + * unfortunately, an entry in the MTLB can only point to a + * 32 bit base address of a STLB. Moreover, to initialize the + * MMU we need a command buffer with a 32 bit address because + * without an MMU there is only an indentity mapping between + * the internal 32 bit addresses and the bus addresses. + * + * To make things easy, we set the dma_coherent_mask to 32 + * bit to make sure we are allocating the command buffers and + * TLBs in the lower 4 GiB address space. + */ + if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(40)) || + dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32))) { + dev_dbg(&pdev->dev, "No suitable DMA available\n"); + return -ENODEV; + } + /* * Apply the same DMA configuration to the virtual etnaviv * device as the GPU we found. This assumes that all Vivante @@ -671,8 +689,6 @@ static int __init etnaviv_init(void) of_node_put(np); goto unregister_platform_driver; } - pdev->dev.coherent_dma_mask = DMA_BIT_MASK(40); - pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
ret = platform_device_add(pdev); if (ret) {
Sorry I missed this earlier...
On 2021-09-07 17:49, Michael Walle wrote:
Since AFAICS you're not changing the default dma_mask pointer to point to some storage other than the coherent mask, the dma_set_mask() call effectively does nothing and both masks will end up reading back as 32 bits.
Robin.
On 2021-12-01 13:41, Lucas Stach wrote:
Urgh, apologies for the confusion - seems I had one of those mental short-circuits and was utterly convinced that that's what the platform device setup did, but of course it's only the fallback case in of_dma_configure(). Sorry for the noise!
Cheers, Robin.
dri-devel@lists.freedesktop.org