Hi,
For quite some time, we've had to hack around in our drivers to account for the fact that our DMA accesses are not done through the parent node, but through another bus that maps the RAM differently than the CPU does (at 0 instead of 0x40000000 on most SoCs).
Following the discussion around the submission of a camera device suffering from the same hacks, I've put together a series that introduces a property called dma-parent, which expresses the DMA relationship between a master and its bus even if they are not direct parents in the DT.
Let me know what you think,
Maxime
Maxime Ripard (7):
  dt-bindings: Add a dma-parent property
  dt-bindings: bus: Add binding for the Allwinner MBUS controller
  of: address: Add parent pointer to the __of_translate_address args
  of: address: Add support for the dma-parent property
  drm/sun4i: Rely on dma-parent for our RAM offset
  clk: sunxi-ng: sun5i: Export the MBUS clock
  ARM: dts: sun5i: Add the MBUS controller
 Documentation/devicetree/bindings/sunxi-mbus.txt | 35 ++++++++++++++-
 Documentation/devicetree/booting-without-of.txt  | 10 ++++-
 arch/arm/boot/dts/sun5i.dtsi                     | 11 ++++-
 drivers/clk/sunxi-ng/ccu-sun5i.h                 |  4 +--
 drivers/gpu/drm/sun4i/sun4i_backend.c            | 28 ++++++++---
 drivers/of/address.c                             | 43 +++++++++++++----
 include/dt-bindings/clock/sun5i-ccu.h            |  2 +-
 7 files changed, 111 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/sunxi-mbus.txt
base-commit: 4a3928c6f8a53fa1aed28ccba227742486e8ddcb
The current DT bindings assume that the DMA will be performed by the devices through their parent DT node, and rely on that assumption for the address translation using dma-ranges.
However, some SoCs have devices that will perform DMA through another bus, with separate address translation rules. We therefore need to express that relationship, through the dma-parent property.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 Documentation/devicetree/booting-without-of.txt | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt
index e86bd2f64117..4a65c943c02d 100644
--- a/Documentation/devicetree/booting-without-of.txt
+++ b/Documentation/devicetree/booting-without-of.txt
@@ -1403,8 +1403,15 @@ In addition, each DMA master device on the DMA bus may or may not support
 coherent DMA operations. The "dma-coherent" property is intended to be used
 for identifying devices supported coherent DMA operations in DT.
 
+Some devices will also perform DMA through another bus than their parent
+control bus. In such a case, the "dma-parent" property is intended to express
+that relationship to another device in DT that will be the DMA parent bus.
+
 * DMA Bus master
 Optional property:
+- #dma-parent-cells: <integer>
+	The #dma-parent-cells property defines the width of the cells used to
+	represent the DMA parent.
 - dma-ranges: <prop-encoded-array> encoded as arbitrary number of triplets of
 	(child-bus-address, parent-bus-address, length). Each triplet specified
 	describes a contiguous DMA address range.
@@ -1420,6 +1427,9 @@ Optional property:
 - dma-ranges: <empty> value. if present - It means that DMA addresses
 	translation has to be enabled for this device.
 - dma-coherent: Present if dma operations are coherent
+- dma-parent: List of phandles and their optional arguments according to the
+	#dma-parent-cells from the provider. Expresses the routing of DMA if it
+	doesn't go through the parent node, but some other node in the device tree.
 
 Example:
 soc {
The MBUS controller drives the MBUS that other devices in the SoC will use to perform DMA. It also has a register interface that allows monitoring and controlling the bandwidth and priorities of the masters on that bus.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 Documentation/devicetree/bindings/sunxi-mbus.txt | 35 +++++++++++++++++-
 1 file changed, 35 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sunxi-mbus.txt

diff --git a/Documentation/devicetree/bindings/sunxi-mbus.txt b/Documentation/devicetree/bindings/sunxi-mbus.txt
new file mode 100644
index 000000000000..436df0cac9d0
--- /dev/null
+++ b/Documentation/devicetree/bindings/sunxi-mbus.txt
@@ -0,0 +1,35 @@
+Allwinner Memory Bus (MBUS) controller
+
+The MBUS controller drives the MBUS that other devices in the SoC will
+use to perform DMA. It also has a register interface that allows to
+monitor and control the bandwidth and priorities for masters on that
+bus.
+
+Required properties:
+ - compatible: Must be one of:
+	- allwinner,sun5i-a13-mbus
+ - reg: Offset and length of the register set for the controller
+ - clocks: phandle to the clock driving the controller
+ - dma-ranges: see booting-without-of.txt
+ - #dma-parent-cells: Must be one, with the argument being the MBUS port
+		       ID
+
+Each device having to perform their DMA through the MBUS must have the
+dma-parent property set to the MBUS controller, as documented in
+booting-without-of.txt.
+
+Example:
+
+mbus: dram-controller@1c01000 {
+	compatible = "allwinner,sun5i-a13-mbus";
+	reg = <0x01c01000 0x1000>;
+	clocks = <&ccu CLK_MBUS>;
+	dma-ranges = <0x00000000 0x40000000 0x20000000>;
+	#dma-parent-cells = <1>;
+};
+
+fe0: display-frontend@1e00000 {
+	compatible = "allwinner,sun5i-a13-display-frontend";
+	...
+	dma-parent = <&mbus 19>;
+};
The __of_translate_address function is used to translate device tree addresses to physical addresses, using the various ranges properties to compute the offset.
However, it's shared between the CPU addresses (based on the ranges property) and the DMA addresses (based on dma-ranges). Since we're going to add support for a DMA parent node that is not the DT parent node, we need to change the logic a bit to have an optional parent node that we should use.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 drivers/of/address.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index ce4d3d8b85de..e3267728a0cb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -562,9 +562,9 @@ static int of_translate_one(struct device_node *parent, struct of_bus *bus,
  * that way, but this is traditionally the way IBM at least do things
  */
 static u64 __of_translate_address(struct device_node *dev,
+				  struct device_node *parent,
 				  const __be32 *in_addr, const char *rprop)
 {
-	struct device_node *parent = NULL;
 	struct of_bus *bus, *pbus;
 	__be32 addr[OF_MAX_ADDR_CELLS];
 	int na, ns, pna, pns;
@@ -575,10 +575,13 @@ static u64 __of_translate_address(struct device_node *dev,
 	/* Increase refcount at current level */
 	of_node_get(dev);
 
-	/* Get parent & match bus type */
-	parent = of_get_parent(dev);
-	if (parent == NULL)
-		goto bail;
+	if (!parent) {
+		/* Get parent & match bus type */
+		parent = of_get_parent(dev);
+		if (parent == NULL)
+			goto bail;
+	}
+
 	bus = of_match_bus(parent);
 
 	/* Count address cells & copy address locally */
@@ -638,13 +641,13 @@ static u64 __of_translate_address(struct device_node *dev,
 
 u64 of_translate_address(struct device_node *dev, const __be32 *in_addr)
 {
-	return __of_translate_address(dev, in_addr, "ranges");
+	return __of_translate_address(dev, NULL, in_addr, "ranges");
 }
 EXPORT_SYMBOL(of_translate_address);
 
 u64 of_translate_dma_address(struct device_node *dev, const __be32 *in_addr)
 {
-	return __of_translate_address(dev, in_addr, "dma-ranges");
+	return __of_translate_address(dev, NULL, in_addr, "dma-ranges");
 }
 EXPORT_SYMBOL(of_translate_dma_address);
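For readers who want to see the "optional parent" rule in isolation, here is a minimal user-space sketch. It is not kernel code: `toy_node` and `toy_pick_parent` are hypothetical names made up for illustration, and reference counting is left out entirely. It only shows the selection rule: use the explicit parent when the caller passes one, otherwise fall back to the DT parent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct device_node (hypothetical, not the kernel type). */
struct toy_node {
	const char *name;
	struct toy_node *parent;	/* control-side (DT) parent, NULL for the root */
};

/*
 * Choose the node used for the first translation step.
 * A non-NULL override wins; otherwise the DT parent is used.
 * A NULL result corresponds to the "goto bail" path for a root node.
 */
static struct toy_node *toy_pick_parent(struct toy_node *dev,
					struct toy_node *override)
{
	assert(dev != NULL);
	return override ? override : dev->parent;
}
```

With a `backend` node whose DT parent is `soc`, calling `toy_pick_parent(&backend, NULL)` yields `soc`, while passing an `mbus` node as the override yields `mbus`.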
Some SoCs have devices that are using a separate bus from the main bus to perform DMA.
These buses might have some restrictions and/or a different mapping than the CPU side, so we need to express those using the usual dma-ranges, but in a different DT node than the node's parent.
Add support for a dma-parent property that links to the DMA bus used by the device in such a case.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 drivers/of/address.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index e3267728a0cb..cc82f825ee03 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -647,7 +647,17 @@ EXPORT_SYMBOL(of_translate_address);
 
 u64 of_translate_dma_address(struct device_node *dev, const __be32 *in_addr)
 {
-	return __of_translate_address(dev, NULL, in_addr, "dma-ranges");
+	struct of_phandle_args args;
+	struct device_node *parent = NULL;
+	int ret;
+
+	ret = of_parse_phandle_with_args(dev, "dma-parent",
+					 "#dma-parent-cells",
+					 0, &args);
+	if (!ret)
+		parent = args.np;
+
+	return __of_translate_address(dev, parent, in_addr, "dma-ranges");
 }
 EXPORT_SYMBOL(of_translate_dma_address);
 
@@ -847,11 +857,21 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 		return -EINVAL;
 
 	while (1) {
+		struct of_phandle_args args;
+
 		naddr = of_n_addr_cells(node);
 		nsize = of_n_size_cells(node);
-		node = of_get_next_parent(node);
-		if (!node)
-			break;
+
+		ret = of_parse_phandle_with_args(node, "dma-parent",
+						 "#dma-parent-cells",
+						 0, &args);
+		if (!ret) {
+			node = args.np;
+		} else {
+			node = of_get_next_parent(node);
+			if (!node)
+				break;
+		}
 
 		ranges = of_get_property(node, "dma-ranges", &len);
 
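The walk in of_dma_get_range can be modelled in user space to see which node ends up providing the translation. The sketch below is an illustration under stated assumptions, not the kernel implementation: `walk_node`, `walk_ranges` and `walk_find_dma_ranges` are hypothetical names, refcounting is ignored, and the rule "an empty dma-ranges means keep climbing, a missing one means stop" is an assumption about the part of the loop that the hunk above does not show.

```c
#include <assert.h>
#include <stddef.h>

/* State of the dma-ranges property on a node (hypothetical model). */
enum walk_ranges { WALK_RANGES_ABSENT, WALK_RANGES_EMPTY, WALK_RANGES_PRESENT };

/* Toy stand-in for struct device_node. */
struct walk_node {
	struct walk_node *parent;	/* DT parent, NULL for the root */
	struct walk_node *dma_parent;	/* target of dma-parent, NULL if unset */
	enum walk_ranges ranges;
};

/*
 * Climb from a device towards the node holding a non-empty dma-ranges.
 * At each step dma-parent, when set, takes precedence over the DT parent.
 * Returns NULL when no translation is found. The model assumes the
 * dma-parent links do not form a cycle.
 */
static struct walk_node *walk_find_dma_ranges(struct walk_node *node)
{
	assert(node != NULL);
	for (;;) {
		node = node->dma_parent ? node->dma_parent : node->parent;
		if (!node)
			return NULL;
		if (node->ranges == WALK_RANGES_PRESENT)
			return node;
		if (node->ranges == WALK_RANGES_ABSENT)
			return NULL;
		/* Empty dma-ranges: identity mapping at this level, keep going. */
	}
}
```

In this model, a frontend whose dma-parent points at the MBUS node resolves to the MBUS node directly, while a device without dma-parent climbs through its DT parents as before.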
Now that we can express our DMA topology, rely on that property instead of hardcoding an offset from the dma_addr_t, which wasn't really great.
We still need to add some code to deal with the old DT that would lack that property, but we move the offset to the DRM device dma_pfn_offset to be able to rely on just the dma_addr_t associated to the GEM object.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 drivers/gpu/drm/sun4i/sun4i_backend.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 847eecbe4d14..04e85d3ca36e 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -222,13 +222,6 @@ int sun4i_backend_update_layer_buffer(struct sun4i_backend *backend,
 	paddr = drm_fb_cma_get_gem_addr(fb, state, 0);
 	DRM_DEBUG_DRIVER("Setting buffer address to %pad\n", &paddr);
 
-	/*
-	 * backend DMA accesses DRAM directly, bypassing the system
-	 * bus. As such, the address range is different and the buffer
-	 * address needs to be corrected.
-	 */
-	paddr -= PHYS_OFFSET;
-
 	/* Write the 32 lower bits of the address (in bits) */
 	lo_paddr = paddr << 3;
 	DRM_DEBUG_DRIVER("Setting address lower bits to 0x%x\n", lo_paddr);
@@ -361,6 +354,27 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		return -ENOMEM;
 	dev_set_drvdata(dev, backend);
 
+	if (of_find_property(dev->of_node, "dma-parent", NULL)) {
+		/*
+		 * This assume we have the same DMA constraints for all our the
+		 * devices in our pipeline (all the backends, but also the
+		 * frontends). This sounds bad, but it has always been the case
+		 * for us, and DRM doesn't do per-device allocation either, so
+		 * we would need to fix DRM first...
+		 */
+		ret = of_dma_configure(drm->dev, dev->of_node);
+		if (ret)
+			return ret;
+	} else {
+		/*
+		 * If we don't have the dma-parent property, most likely
+		 * because of an old DT, we need to set the DMA offset by hand
+		 * on our device since the RAM mapping is at 0 for the DMA bus,
+		 * unlike the CPU.
+		 */
+		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	}
+
 	backend->engine.node = dev->of_node;
 	backend->engine.ops = &sun4i_backend_engine_ops;
 	backend->engine.id = sun4i_backend_of_get_id(dev->of_node);
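To make the arithmetic behind the fallback path concrete, here is a small user-space sketch of how a page-frame offset turns a CPU physical address into the address the backend sees. The names and constants are assumptions for illustration only: `PFN_SKETCH_PAGE_SHIFT` assumes 4 KiB pages, `PFN_SKETCH_PHYS_OFFSET` is the 0x40000000 RAM base quoted in the cover letter, and the helpers are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define PFN_SKETCH_PAGE_SHIFT	12			/* assumes 4 KiB pages */
#define PFN_SKETCH_PHYS_OFFSET	UINT64_C(0x40000000)	/* CPU view of the RAM base */
#define PFN_SKETCH_PFN_OFFSET	(PFN_SKETCH_PHYS_OFFSET >> PFN_SKETCH_PAGE_SHIFT)

/* Subtract the per-device offset, stored in pages, from a CPU physical address. */
static uint64_t pfn_sketch_phys_to_dma(uint64_t paddr, uint64_t dma_pfn_offset)
{
	uint64_t offset = dma_pfn_offset << PFN_SKETCH_PAGE_SHIFT;

	assert(paddr >= offset);
	return paddr - offset;
}

/* The backend register takes the address in bits, hence the shift by 3. */
static uint32_t pfn_sketch_addr_in_bits(uint64_t dma_addr)
{
	return (uint32_t)(dma_addr << 3);
}
```

A buffer at CPU address 0x41000000 becomes 0x01000000 on the DMA side, which is the value the removed `paddr -= PHYS_OFFSET` line used to compute by hand.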
On Tue, Apr 03, 2018 at 03:29:18PM +0200, Maxime Ripard wrote:
Now that we can express our DMA topology, rely on that property instead of hardcoding an offset from the dma_addr_t, which wasn't really great.
We still need to add some code to deal with the old DT that would lack that property, but we move the offset to the DRM device dma_pfn_offset to be able to rely on just the dma_addr_t associated to the GEM object.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Yay for hiding more bus address funnies behind dma_map_* support. This should also help with cleaner dma-buf import.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The MBUS clock is used by the MBUS controller, so let's export it so that we can use it in our DT node.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 drivers/clk/sunxi-ng/ccu-sun5i.h      | 4 ----
 include/dt-bindings/clock/sun5i-ccu.h | 2 +-
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun5i.h b/drivers/clk/sunxi-ng/ccu-sun5i.h
index 93a275fbd9a9..b66abd4fd0bf 100644
--- a/drivers/clk/sunxi-ng/ccu-sun5i.h
+++ b/drivers/clk/sunxi-ng/ccu-sun5i.h
@@ -60,10 +60,6 @@
 
 /* The rest of the module clocks are exported */
 
-#define CLK_MBUS		99
-
-/* And finally the IEP clock */
-
 #define CLK_NUMBER		(CLK_IEP + 1)
 
 #endif /* _CCU_SUN5I_H_ */
diff --git a/include/dt-bindings/clock/sun5i-ccu.h b/include/dt-bindings/clock/sun5i-ccu.h
index 81f34d477aeb..2e6b9ddcc24e 100644
--- a/include/dt-bindings/clock/sun5i-ccu.h
+++ b/include/dt-bindings/clock/sun5i-ccu.h
@@ -100,7 +100,7 @@
 #define CLK_AVS			96
 #define CLK_HDMI		97
 #define CLK_GPU			98
-
+#define CLK_MBUS		99
 #define CLK_IEP			100
 
 #endif /* _DT_BINDINGS_CLK_SUN5I_H_ */
On Tue, Apr 03, 2018 at 03:29:19PM +0200, Maxime Ripard wrote:
The MBUS clock is used by the MBUS controller, so let's export it so that we can use it in our DT node.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>

 drivers/clk/sunxi-ng/ccu-sun5i.h      | 4 ----
 include/dt-bindings/clock/sun5i-ccu.h | 2 +-
 2 files changed, 1 insertion(+), 5 deletions(-)

Reviewed-by: Rob Herring <robh@kernel.org>
The MBUS (and its associated controller) is the bus in the Allwinner SoCs that DMA devices use in the system to access the memory.
Among other things (and depending on the SoC generation), it can also enforce priorities or report bandwidth usages on a per-master basis.
One of the most notable things is that instead of using the same RAM mapping as the CPU, it maps the RAM at address 0, which means we have to do address translation through the dma-ranges property.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
---
 arch/arm/boot/dts/sun5i.dtsi | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm/boot/dts/sun5i.dtsi b/arch/arm/boot/dts/sun5i.dtsi
index 07f2248ed5f8..acb24e537e0b 100644
--- a/arch/arm/boot/dts/sun5i.dtsi
+++ b/arch/arm/boot/dts/sun5i.dtsi
@@ -112,6 +112,7 @@
 		compatible = "simple-bus";
 		#address-cells = <1>;
 		#size-cells = <1>;
+		dma-ranges;
 		ranges;
 
 		sram-controller@1c00000 {
@@ -150,6 +151,14 @@
 			};
 		};
 
+		mbus: dram-controller@1c01000 {
+			compatible = "allwinner,sun5i-a13-mbus";
+			reg = <0x01c01000 0x1000>;
+			clocks = <&ccu CLK_MBUS>;
+			dma-ranges = <0x00000000 0x40000000 0x20000000>;
+			#dma-parent-cells = <1>;
+		};
+
 		dma: dma-controller@1c02000 {
 			compatible = "allwinner,sun4i-a10-dma";
 			reg = <0x01c02000 0x1000>;
@@ -677,6 +686,7 @@
 			clock-names = "ahb", "mod", "ram";
 			resets = <&ccu RST_DE_FE>;
+			dma-parent = <&mbus 19>;
 			status = "disabled";
 
 			ports {
@@ -705,6 +715,7 @@
 			clock-names = "ahb", "mod", "ram";
 			resets = <&ccu RST_DE_BE>;
+			dma-parent = <&mbus 18>;
 			status = "disabled";
 
 			assigned-clocks = <&ccu CLK_DE_BE>;
On Tue, Apr 3, 2018 at 8:29 AM, Maxime Ripard <maxime.ripard@bootlin.com> wrote:
Hi,
For quite some time, we've had to hack around in our drivers to account for the fact that our DMA accesses are not done through the parent node, but through another bus that maps the RAM differently than the CPU does (at 0 instead of 0x40000000 on most SoCs).
Following the discussion around the submission of a camera device suffering from the same hacks, I've put together a series that introduces a property called dma-parent, which expresses the DMA relationship between a master and its bus even if they are not direct parents in the DT.
Reading thru v6 of the camera driver, it seems like having intermediate buses would solve the problem in your case?
As Arnd mentioned in that thread, something new needs to address all the deficiencies with dma-ranges and describing DMA bus topologies. This doesn't address the needs of describing bus interconnects. There have been some efforts by the QCom folks with an interconnect binding. They've mostly punted (for now at least) to not describing the whole interconnect in DT and keeping the details in a driver.
On the flip side, this does mirror the established pattern used by interrupts, so maybe it's okay on its own. I'll wait for others to comment.
Rob
Hi Rob,
On Tue, Apr 03, 2018 at 11:03:30AM -0500, Rob Herring wrote:
Reading thru v6 of the camera driver, it seems like having intermediate buses would solve the problem in your case?
I guess it would yes, but I guess it wouldn't model the hardware properly since this seems to be really a bus only meant to do DMA, and you're not accessing the registers of the device through that bus.
And as far as I know, the DT implies that the topology is the one of the "control" side of the devices.
We'll also eventually need to retrieve the MBUS endpoint IDs to be able to support perf and PM QoS properly.
As Arnd mentioned in that thread, something new needs to address all the deficiencies with dma-ranges and describing DMA bus topologies. This doesn't address the needs of describing bus interconnects. There have been some efforts by the QCom folks with an interconnect binding. They've mostly punted (for now at least) to not describing the whole interconnect in DT and keeping the details in a driver.
Is it this patch series? https://lkml.org/lkml/2018/3/9/856
On the flip side, this does mirror the established pattern used by interrupts, so maybe it's okay on its own. I'll wait for others to comment.
We'll see how it turns out then :)
Maxime
On Mon, Apr 09, 2018 at 11:22:29AM +0200, Maxime Ripard wrote:
Ping?
How should we move forward on this?
Maxime