This series brings initial support for memory interconnect to Tegra20, Tegra30 and Tegra124 SoCs.
For the starter only display controllers and devfreq devices are getting interconnect API support, others could be supported later on. The display controllers have the biggest demand for interconnect API right now because dynamic memory frequency scaling can't be done safely without taking into account bandwidth requirement from the displays. In particular this series fixes distorted display output on T30 Ouya and T124 TK1 devices.
Changelog:
v9: - Squashed "memory: tegra30-emc: Factor out clk initialization" into patch "tegra30: Support interconnect framework". Suggested by Krzysztof Kozlowski.
- Improved Kconfig in the patch "memory: tegra124-emc: Make driver modular" by adding CONFIG_TEGRA124_CLK_EMC entry, which makes clk-driver changes to look a bit more cleaner. Suggested by Krzysztof Kozlowski.
- Dropped voltage regulator support from ICC and DT patches for now because there is a new discussion about using a power domain abstraction for controlling the regulator, which is likely to happen.
- Replaced direct "operating-points-v2" property checking in EMC drivers with checking of a returned error code from dev_pm_opp_of_add_table(). Note that I haven't touched T20 EMC driver because it's very likely that we'll replace that code with a common helper soon anyways. Suggested by Viresh Kumar.
- The T30 DT patches now include EMC OPP changes for Ouya board, which is available now in linux-next.
Dmitry Osipenko (17): memory: tegra30: Support interconnect framework memory: tegra124-emc: Make driver modular memory: tegra124-emc: Continue probing if timings are missing in device-tree memory: tegra124: Support interconnect framework drm/tegra: dc: Support memory bandwidth management drm/tegra: dc: Extend debug stats with total number of events PM / devfreq: tegra30: Support interconnect and OPPs from device-tree PM / devfreq: tegra30: Separate configurations per-SoC generation PM / devfreq: tegra20: Deprecate in a favor of emc-stat based driver ARM: tegra: Correct EMC registers size in Tegra20 device-tree ARM: tegra: Add interconnect properties to Tegra20 device-tree ARM: tegra: Add interconnect properties to Tegra30 device-tree ARM: tegra: Add interconnect properties to Tegra124 device-tree ARM: tegra: Add nvidia,memory-controller phandle to Tegra20 EMC device-tree ARM: tegra: Add EMC OPP properties to Tegra20 device-trees ARM: tegra: Add EMC OPP and ICC properties to Tegra30 EMC and ACTMON device-tree nodes ARM: tegra: Add EMC OPP and ICC properties to Tegra124 EMC and ACTMON device-tree nodes
MAINTAINERS | 1 - arch/arm/boot/dts/tegra124-apalis-emc.dtsi | 8 + .../arm/boot/dts/tegra124-jetson-tk1-emc.dtsi | 8 + arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi | 10 + .../arm/boot/dts/tegra124-nyan-blaze-emc.dtsi | 10 + .../boot/dts/tegra124-peripherals-opp.dtsi | 419 ++++++++++++++++++ arch/arm/boot/dts/tegra124.dtsi | 31 ++ .../boot/dts/tegra20-acer-a500-picasso.dts | 5 + arch/arm/boot/dts/tegra20-colibri.dtsi | 4 + arch/arm/boot/dts/tegra20-paz00.dts | 4 + .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 92 ++++ arch/arm/boot/dts/tegra20.dtsi | 33 +- ...30-asus-nexus7-grouper-memory-timings.dtsi | 12 + arch/arm/boot/dts/tegra30-ouya.dts | 8 + .../arm/boot/dts/tegra30-peripherals-opp.dtsi | 383 ++++++++++++++++ arch/arm/boot/dts/tegra30.dtsi | 33 +- drivers/clk/tegra/Kconfig | 3 + drivers/clk/tegra/Makefile | 2 +- drivers/clk/tegra/clk-tegra124-emc.c | 41 +- drivers/clk/tegra/clk-tegra124.c | 26 +- drivers/clk/tegra/clk.h | 18 +- drivers/devfreq/Kconfig | 10 - drivers/devfreq/Makefile | 1 - drivers/devfreq/tegra20-devfreq.c | 210 --------- drivers/devfreq/tegra30-devfreq.c | 154 ++++--- drivers/gpu/drm/tegra/Kconfig | 1 + drivers/gpu/drm/tegra/dc.c | 359 +++++++++++++++ drivers/gpu/drm/tegra/dc.h | 19 + drivers/gpu/drm/tegra/drm.c | 14 + drivers/gpu/drm/tegra/hub.c | 3 + drivers/gpu/drm/tegra/plane.c | 121 +++++ drivers/gpu/drm/tegra/plane.h | 15 + drivers/memory/tegra/Kconfig | 5 +- drivers/memory/tegra/tegra124-emc.c | 382 ++++++++++++++-- drivers/memory/tegra/tegra124.c | 82 +++- drivers/memory/tegra/tegra30-emc.c | 349 ++++++++++++++- drivers/memory/tegra/tegra30.c | 173 +++++++- include/linux/clk/tegra.h | 8 + include/soc/tegra/emc.h | 16 - 39 files changed, 2699 insertions(+), 374 deletions(-) create mode 100644 arch/arm/boot/dts/tegra124-peripherals-opp.dtsi create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi create mode 100644 arch/arm/boot/dts/tegra30-peripherals-opp.dtsi delete mode 100644 drivers/devfreq/tegra20-devfreq.c delete mode 100644 include/soc/tegra/emc.h
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS. MC driver now supports tuning of memory arbitration latency, which needs to be done for ISO memory clients, like a Display client for example.
Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/memory/tegra/Kconfig | 1 + drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++-- drivers/memory/tegra/tegra30.c | 173 +++++++++++++- 3 files changed, 501 insertions(+), 22 deletions(-)
diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig index 2a4a16bcf91c..ca7077a06f4c 100644 --- a/drivers/memory/tegra/Kconfig +++ b/drivers/memory/tegra/Kconfig @@ -24,6 +24,7 @@ config TEGRA30_EMC tristate "NVIDIA Tegra30 External Memory Controller driver" default y depends on TEGRA_MC && ARCH_TEGRA_3x_SOC + select PM_OPP help This driver is for the External Memory Controller (EMC) found on Tegra30 chips. The EMC controls the external DRAM on the board. diff --git a/drivers/memory/tegra/tegra30-emc.c b/drivers/memory/tegra/tegra30-emc.c index 3488786da03b..1974b067789f 100644 --- a/drivers/memory/tegra/tegra30-emc.c +++ b/drivers/memory/tegra/tegra30-emc.c @@ -14,16 +14,21 @@ #include <linux/debugfs.h> #include <linux/delay.h> #include <linux/err.h> +#include <linux/interconnect-provider.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/iopoll.h> #include <linux/kernel.h> #include <linux/module.h> +#include <linux/mutex.h> #include <linux/of_platform.h> #include <linux/platform_device.h> +#include <linux/pm_opp.h> +#include <linux/slab.h> #include <linux/sort.h> #include <linux/types.h>
+#include <soc/tegra/common.h> #include <soc/tegra/fuse.h>
#include "mc.h" @@ -323,9 +328,21 @@ struct emc_timing { bool emc_cfg_dyn_self_ref; };
+enum emc_rate_request_type { + EMC_RATE_DEBUG, + EMC_RATE_ICC, + EMC_RATE_TYPE_MAX, +}; + +struct emc_rate_request { + unsigned long min_rate; + unsigned long max_rate; +}; + struct tegra_emc { struct device *dev; struct tegra_mc *mc; + struct icc_provider provider; struct notifier_block clk_nb; struct clk *clk; void __iomem *regs; @@ -352,6 +369,15 @@ struct tegra_emc { unsigned long min_rate; unsigned long max_rate; } debugfs; + + /* + * There are multiple sources in the EMC driver which could request + * a min/max clock rate, these rates are contained in this array. + */ + struct emc_rate_request requested_rate[EMC_RATE_TYPE_MAX]; + + /* protect shared rate-change code path */ + struct mutex rate_lock; };
static int emc_seq_update_timing(struct tegra_emc *emc) @@ -1094,6 +1120,83 @@ static long emc_round_rate(unsigned long rate, return timing->rate; }
+static void tegra_emc_rate_requests_init(struct tegra_emc *emc) +{ + unsigned int i; + + for (i = 0; i < EMC_RATE_TYPE_MAX; i++) { + emc->requested_rate[i].min_rate = 0; + emc->requested_rate[i].max_rate = ULONG_MAX; + } +} + +static int emc_request_rate(struct tegra_emc *emc, + unsigned long new_min_rate, + unsigned long new_max_rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = emc->requested_rate; + unsigned long min_rate = 0, max_rate = ULONG_MAX; + unsigned int i; + int err; + + /* select minimum and maximum rates among the requested rates */ + for (i = 0; i < EMC_RATE_TYPE_MAX; i++, req++) { + if (i == type) { + min_rate = max(new_min_rate, min_rate); + max_rate = min(new_max_rate, max_rate); + } else { + min_rate = max(req->min_rate, min_rate); + max_rate = min(req->max_rate, max_rate); + } + } + + if (min_rate > max_rate) { + dev_err_ratelimited(emc->dev, "%s: type %u: out of range: %lu %lu\n", + __func__, type, min_rate, max_rate); + return -ERANGE; + } + + /* + * EMC rate-changes should go via OPP API because it manages voltage + * changes. + */ + err = dev_pm_opp_set_rate(emc->dev, min_rate); + if (err) + return err; + + emc->requested_rate[type].min_rate = new_min_rate; + emc->requested_rate[type].max_rate = new_max_rate; + + return 0; +} + +static int emc_set_min_rate(struct tegra_emc *emc, unsigned long rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = &emc->requested_rate[type]; + int ret; + + mutex_lock(&emc->rate_lock); + ret = emc_request_rate(emc, rate, req->max_rate, type); + mutex_unlock(&emc->rate_lock); + + return ret; +} + +static int emc_set_max_rate(struct tegra_emc *emc, unsigned long rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = &emc->requested_rate[type]; + int ret; + + mutex_lock(&emc->rate_lock); + ret = emc_request_rate(emc, req->min_rate, rate, type); + mutex_unlock(&emc->rate_lock); + + return ret; +} + /* * debugfs interface * @@ -1177,7 +1280,7 @@ static int tegra_emc_debug_min_rate_set(void *data, u64 rate) if (!tegra_emc_validate_rate(emc, rate)) return -EINVAL;
- err = clk_set_min_rate(emc->clk, rate); + err = emc_set_min_rate(emc, rate, EMC_RATE_DEBUG); if (err < 0) return err;
@@ -1207,7 +1310,7 @@ static int tegra_emc_debug_max_rate_set(void *data, u64 rate) if (!tegra_emc_validate_rate(emc, rate)) return -EINVAL;
- err = clk_set_max_rate(emc->clk, rate); + err = emc_set_max_rate(emc, rate, EMC_RATE_DEBUG); if (err < 0) return err;
@@ -1264,6 +1367,219 @@ static void tegra_emc_debugfs_init(struct tegra_emc *emc) emc, &tegra_emc_debug_max_rate_fops); }
+static inline struct tegra_emc * +to_tegra_emc_provider(struct icc_provider *provider) +{ + return container_of(provider, struct tegra_emc, provider); +} + +static struct icc_node_data * +emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data) +{ + struct icc_provider *provider = data; + struct icc_node_data *ndata; + struct icc_node *node; + + /* External Memory is the only possible ICC route */ + list_for_each_entry(node, &provider->nodes, node_list) { + if (node->id != TEGRA_ICC_EMEM) + continue; + + ndata = kzalloc(sizeof(*ndata), GFP_KERNEL); + if (!ndata) + return ERR_PTR(-ENOMEM); + + /* + * SRC and DST nodes should have matching TAG in order to have + * it set by default for a requested path. + */ + ndata->tag = TEGRA_MC_ICC_TAG_ISO; + ndata->node = node; + + return ndata; + } + + return ERR_PTR(-EPROBE_DEFER); +} + +static int emc_icc_set(struct icc_node *src, struct icc_node *dst) +{ + struct tegra_emc *emc = to_tegra_emc_provider(dst->provider); + unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw); + unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw); + unsigned long long rate = max(avg_bw, peak_bw); + const unsigned int dram_data_bus_width_bytes = 4; + const unsigned int ddr = 2; + int err; + + /* + * Tegra30 EMC runs on a clock rate of SDRAM bus. This means that + * EMC clock rate is twice smaller than the peak data rate because + * data is sampled on both EMC clock edges. + */ + do_div(rate, ddr * dram_data_bus_width_bytes); + rate = min_t(u64, rate, U32_MAX); + + err = emc_set_min_rate(emc, rate, EMC_RATE_ICC); + if (err) + return err; + + return 0; +} + +static int tegra_emc_interconnect_init(struct tegra_emc *emc) +{ + const struct tegra_mc_soc *soc = emc->mc->soc; + struct icc_node *node; + int err; + + emc->provider.dev = emc->dev; + emc->provider.set = emc_icc_set; + emc->provider.data = &emc->provider; + emc->provider.aggregate = soc->icc_ops->aggregate; + emc->provider.xlate_extended = emc_of_icc_xlate_extended; + + err = icc_provider_add(&emc->provider); + if (err) + goto err_msg; + + /* create External Memory Controller node */ + node = icc_node_create(TEGRA_ICC_EMC); + if (IS_ERR(node)) { + err = PTR_ERR(node); + goto del_provider; + } + + node->name = "External Memory Controller"; + icc_node_add(node, &emc->provider); + + /* link External Memory Controller to External Memory (DRAM) */ + err = icc_link_create(node, TEGRA_ICC_EMEM); + if (err) + goto remove_nodes; + + /* create External Memory node */ + node = icc_node_create(TEGRA_ICC_EMEM); + if (IS_ERR(node)) { + err = PTR_ERR(node); + goto remove_nodes; + } + + node->name = "External Memory (DRAM)"; + icc_node_add(node, &emc->provider); + + return 0; + +remove_nodes: + icc_nodes_remove(&emc->provider); +del_provider: + icc_provider_del(&emc->provider); +err_msg: + dev_err(emc->dev, "failed to initialize ICC: %d\n", err); + + return err; +} + +static int tegra_emc_opp_table_init(struct tegra_emc *emc) +{ + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); + struct opp_table *clk_opp_table, *hw_opp_table; + int err; + + clk_opp_table = dev_pm_opp_set_clkname(emc->dev, NULL); + err = PTR_ERR_OR_ZERO(clk_opp_table); + if (err) { + dev_err(emc->dev, "failed to set OPP clk: %d\n", err); + return err; + } + + hw_opp_table = dev_pm_opp_set_supported_hw(emc->dev, &hw_version, 1); + err = PTR_ERR_OR_ZERO(hw_opp_table); + if (err) { + dev_err(emc->dev, "failed to set OPP supported HW: %d\n", err); + goto put_clk_table; + } + + err = dev_pm_opp_of_add_table(emc->dev); + if (err) { + /* + * -ENODEV means that this is an older device-tree which + * doesn't have OPP table and EMC driver isn't very useful + * in this case. + */ + if (err == -ENODEV) + dev_err(emc->dev, "OPP table not found, please update your device tree\n"); + else + dev_err(emc->dev, "failed to add OPP table: %d\n", err); + + goto put_hw_table; + } + + dev_info(emc->dev, "OPP HW ver. 0x%x, current clock rate %lu MHz\n", + hw_version, clk_get_rate(emc->clk) / 1000000); + + /* first dummy rate-set initializes voltage state */ + err = dev_pm_opp_set_rate(emc->dev, clk_get_rate(emc->clk)); + if (err) { + dev_err(emc->dev, "failed to initialize OPP clock: %d\n", err); + goto remove_table; + } + + return 0; + +remove_table: + dev_pm_opp_of_remove_table(emc->dev); +put_hw_table: + dev_pm_opp_put_supported_hw(hw_opp_table); +put_clk_table: + dev_pm_opp_put_clkname(clk_opp_table); + + return err; +} + +static void devm_tegra_emc_unset_callback(void *data) +{ + tegra20_clk_set_emc_round_callback(NULL, NULL); +} + +static void devm_tegra_emc_unreg_clk_notifier(void *data) +{ + struct tegra_emc *emc = data; + + clk_notifier_unregister(emc->clk, &emc->clk_nb); +} + +static int tegra_emc_init_clk(struct tegra_emc *emc) +{ + int err; + + tegra20_clk_set_emc_round_callback(emc_round_rate, emc); + + err = devm_add_action_or_reset(emc->dev, devm_tegra_emc_unset_callback, + NULL); + if (err) + return err; + + emc->clk = devm_clk_get(emc->dev, NULL); + if (IS_ERR(emc->clk)) { + dev_err(emc->dev, "failed to get EMC clock: %pe\n", emc->clk); + return PTR_ERR(emc->clk); + } + + err = clk_notifier_register(emc->clk, &emc->clk_nb); + if (err) { + dev_err(emc->dev, "failed to register clk notifier: %d\n", err); + return err; + } + + err = devm_add_action_or_reset(emc->dev, + devm_tegra_emc_unreg_clk_notifier, emc); + if (err) + return err; + + return 0; +} + static int tegra_emc_probe(struct platform_device *pdev) { struct device_node *np; @@ -1280,6 +1596,7 @@ static int tegra_emc_probe(struct platform_device *pdev) if (IS_ERR(emc->mc)) return PTR_ERR(emc->mc);
+ mutex_init(&emc->rate_lock); emc->clk_nb.notifier_call = emc_clk_change_notify; emc->dev = &pdev->dev;
@@ -1312,24 +1629,18 @@ static int tegra_emc_probe(struct platform_device *pdev) return err; }
- tegra20_clk_set_emc_round_callback(emc_round_rate, emc); - - emc->clk = devm_clk_get(&pdev->dev, "emc"); - if (IS_ERR(emc->clk)) { - err = PTR_ERR(emc->clk); - dev_err(&pdev->dev, "failed to get emc clock: %d\n", err); - goto unset_cb; - } + err = tegra_emc_init_clk(emc); + if (err) + return err;
- err = clk_notifier_register(emc->clk, &emc->clk_nb); - if (err) { - dev_err(&pdev->dev, "failed to register clk notifier: %d\n", - err); - goto unset_cb; - } + err = tegra_emc_opp_table_init(emc); + if (err) + return err;
platform_set_drvdata(pdev, emc); + tegra_emc_rate_requests_init(emc); tegra_emc_debugfs_init(emc); + tegra_emc_interconnect_init(emc);
/* * Don't allow the kernel module to be unloaded. Unloading adds some @@ -1339,11 +1650,6 @@ static int tegra_emc_probe(struct platform_device *pdev) try_module_get(THIS_MODULE);
return 0; - -unset_cb: - tegra20_clk_set_emc_round_callback(NULL, NULL); - - return err; }
static int tegra_emc_suspend(struct device *dev) @@ -1397,6 +1703,7 @@ static struct platform_driver tegra_emc_driver = { .of_match_table = tegra_emc_of_match, .pm = &tegra_emc_pm_ops, .suppress_bind_attrs = true, + .sync_state = icc_sync_state, }, }; module_platform_driver(tegra_emc_driver); diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c index d0314f29608d..ea849003014b 100644 --- a/drivers/memory/tegra/tegra30.c +++ b/drivers/memory/tegra/tegra30.c @@ -4,7 +4,8 @@ */
#include <linux/of.h> -#include <linux/mm.h> +#include <linux/of_device.h> +#include <linux/slab.h>
#include <dt-bindings/memory/tegra30-mc.h>
@@ -1083,6 +1084,175 @@ static const struct tegra_mc_reset tegra30_mc_resets[] = { TEGRA30_MC_RESET(VI, 0x200, 0x204, 17), };
+static void tegra30_mc_tune_client_latency(struct tegra_mc *mc, + const struct tegra_mc_client *client, + unsigned int bandwidth_mbytes_sec) +{ + u32 arb_tolerance_compensation_nsec, arb_tolerance_compensation_div; + const struct tegra_mc_la *la = &client->la; + unsigned int fifo_size = client->fifo_size; + u32 arb_nsec, la_ticks, value; + + /* see 18.4.1 Client Configuration in Tegra3 TRM v03p */ + if (bandwidth_mbytes_sec) + arb_nsec = fifo_size * NSEC_PER_USEC / bandwidth_mbytes_sec; + else + arb_nsec = U32_MAX; + + /* + * Latency allowness should be set with consideration for the module's + * latency tolerance and internal buffering capabilities. + * + * Display memory clients use isochronous transfers and have very low + * tolerance to a belated transfers. Hence we need to compensate the + * memory arbitration imperfection for them in order to prevent FIFO + * underflow condition when memory bus is busy. + * + * VI clients also need a stronger compensation. + */ + switch (client->swgroup) { + case TEGRA_SWGROUP_MPCORE: + case TEGRA_SWGROUP_PTC: + /* + * We always want lower latency for these clients, hence + * don't touch them. + */ + return; + + case TEGRA_SWGROUP_DC: + case TEGRA_SWGROUP_DCB: + arb_tolerance_compensation_nsec = 1050; + arb_tolerance_compensation_div = 2; + break; + + case TEGRA_SWGROUP_VI: + arb_tolerance_compensation_nsec = 1050; + arb_tolerance_compensation_div = 1; + break; + + default: + arb_tolerance_compensation_nsec = 150; + arb_tolerance_compensation_div = 1; + break; + } + + if (arb_nsec > arb_tolerance_compensation_nsec) + arb_nsec -= arb_tolerance_compensation_nsec; + else + arb_nsec = 0; + + arb_nsec /= arb_tolerance_compensation_div; + + /* + * Latency allowance is a number of ticks a request from a particular + * client may wait in the EMEM arbiter before it becomes a high-priority + * request. + */ + la_ticks = arb_nsec / mc->tick; + la_ticks = min(la_ticks, la->mask); + + value = mc_readl(mc, la->reg); + value &= ~(la->mask << la->shift); + value |= la_ticks << la->shift; + mc_writel(mc, value, la->reg); +} + +static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{ + struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider); + const struct tegra_mc_client *client = &mc->soc->clients[src->id]; + u64 peak_bandwidth = icc_units_to_bps(src->peak_bw); + + /* + * Skip pre-initialization that is done by icc_node_add(), which sets + * bandwidth to maximum for all clients before drivers are loaded. + * + * This doesn't make sense for us because we don't have drivers for all + * clients and it's okay to keep configuration left from bootloader + * during boot, at least for today. + */ + if (src == dst) + return 0; + + /* convert bytes/sec to megabytes/sec */ + do_div(peak_bandwidth, 1000000); + + tegra30_mc_tune_client_latency(mc, client, peak_bandwidth); + + return 0; +} + +static int tegra30_mc_icc_aggreate(struct icc_node *node, u32 tag, u32 avg_bw, + u32 peak_bw, u32 *agg_avg, u32 *agg_peak) +{ + /* + * ISO clients need to reserve extra bandwidth up-front because + * there could be high bandwidth pressure during initial filling + * of the client's FIFO buffers. Secondly, we need to take into + * account impurities of the memory subsystem. + */ + if (tag & TEGRA_MC_ICC_TAG_ISO) + peak_bw = tegra_mc_scale_percents(peak_bw, 400); + + *agg_avg += avg_bw; + *agg_peak = max(*agg_peak, peak_bw); + + return 0; +} + +static struct icc_node_data * +tegra30_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data) +{ + struct tegra_mc *mc = icc_provider_to_tegra_mc(data); + const struct tegra_mc_client *client; + unsigned int i, idx = spec->args[0]; + struct icc_node_data *ndata; + struct icc_node *node; + + list_for_each_entry(node, &mc->provider.nodes, node_list) { + if (node->id != idx) + continue; + + ndata = kzalloc(sizeof(*ndata), GFP_KERNEL); + if (!ndata) + return ERR_PTR(-ENOMEM); + + client = &mc->soc->clients[idx]; + ndata->node = node; + + switch (client->swgroup) { + case TEGRA_SWGROUP_DC: + case TEGRA_SWGROUP_DCB: + case TEGRA_SWGROUP_PTC: + case TEGRA_SWGROUP_VI: + /* these clients are isochronous by default */ + ndata->tag = TEGRA_MC_ICC_TAG_ISO; + break; + + default: + ndata->tag = TEGRA_MC_ICC_TAG_DEFAULT; + break; + } + + return ndata; + } + + for (i = 0; i < mc->soc->num_clients; i++) { + if (mc->soc->clients[i].id == idx) + return ERR_PTR(-EPROBE_DEFER); + } + + dev_err(mc->dev, "invalid ICC client ID %u\n", idx); + + return ERR_PTR(-EINVAL); +} + +static const struct tegra_mc_icc_ops tegra30_mc_icc_ops = { + .xlate_extended = tegra30_mc_of_icc_xlate_extended, + .aggregate = tegra30_mc_icc_aggreate, + .set = tegra30_mc_icc_set, +}; + const struct tegra_mc_soc tegra30_mc_soc = { .clients = tegra30_mc_clients, .num_clients = ARRAY_SIZE(tegra30_mc_clients), @@ -1097,4 +1267,5 @@ const struct tegra_mc_soc tegra30_mc_soc = { .reset_ops = &tegra_mc_reset_ops_common, .resets = tegra30_mc_resets, .num_resets = ARRAY_SIZE(tegra30_mc_resets), + .icc_ops = &tegra30_mc_icc_ops, };
Hi Dmitry,
Thank you working on this!
On 15.11.20 23:29, Dmitry Osipenko wrote:
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS. MC driver now supports tuning of memory arbitration latency, which needs to be done for ISO memory clients, like a Display client for example.
Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/memory/tegra/Kconfig | 1 + drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++-- drivers/memory/tegra/tegra30.c | 173 +++++++++++++- 3 files changed, 501 insertions(+), 22 deletions(-)
[..]> diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c
index d0314f29608d..ea849003014b 100644 --- a/drivers/memory/tegra/tegra30.c +++ b/drivers/memory/tegra/tegra30.c
[..]
+static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{
- struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
- const struct tegra_mc_client *client = &mc->soc->clients[src->id];
- u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
- /*
* Skip pre-initialization that is done by icc_node_add(), which sets
* bandwidth to maximum for all clients before drivers are loaded.
*
* This doesn't make sense for us because we don't have drivers for all
* clients and it's okay to keep configuration left from bootloader
* during boot, at least for today.
*/
- if (src == dst)
return 0;
Nit: The "proper" way to express this should be to implement the .get_bw() callback to return zero as initial average/peak bandwidth. I'm wondering if this will work here?
The rest looks good to me!
Thanks, Georgi
17.11.2020 23:24, Georgi Djakov пишет:
Hi Dmitry,
Thank you working on this!
On 15.11.20 23:29, Dmitry Osipenko wrote:
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS. MC driver now supports tuning of memory arbitration latency, which needs to be done for ISO memory clients, like a Display client for example.
Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/memory/tegra/Kconfig      |  1 +  drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--  drivers/memory/tegra/tegra30.c    | 173 +++++++++++++-  3 files changed, 501 insertions(+), 22 deletions(-)
[..]> diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c
index d0314f29608d..ea849003014b 100644 --- a/drivers/memory/tegra/tegra30.c +++ b/drivers/memory/tegra/tegra30.c
[..]
+static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{ +Â Â Â struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider); +Â Â Â const struct tegra_mc_client *client = &mc->soc->clients[src->id]; +Â Â Â u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
+Â Â Â /* +Â Â Â Â * Skip pre-initialization that is done by icc_node_add(), which sets +Â Â Â Â * bandwidth to maximum for all clients before drivers are loaded. +Â Â Â Â * +Â Â Â Â * This doesn't make sense for us because we don't have drivers for all +Â Â Â Â * clients and it's okay to keep configuration left from bootloader +Â Â Â Â * during boot, at least for today. +Â Â Â Â */ +Â Â Â if (src == dst) +Â Â Â Â Â Â Â return 0;
Nit: The "proper" way to express this should be to implement the .get_bw() callback to return zero as initial average/peak bandwidth. I'm wondering if this will work here?
The rest looks good to me!
Hello Georgi,
Returning zeros doesn't allow us to skip the initialization that is done by provider->set(node, node) in icc_node_add(). It will reconfigure memory latency in accordance to a zero memory bandwidth, which is wrong to do.
It actually should be more preferred to preset bandwidth to a maximum before all drivers are synced, but this should be done only once we will wire up all drivers to use ICC framework. For now it's safer to keep the default hardware configuration untouched.
On 18.11.20 0:02, Dmitry Osipenko wrote:
17.11.2020 23:24, Georgi Djakov пишет:
Hi Dmitry,
Thank you working on this!
On 15.11.20 23:29, Dmitry Osipenko wrote:
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS. MC driver now supports tuning of memory arbitration latency, which needs to be done for ISO memory clients, like a Display client for example.
Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/memory/tegra/Kconfig      |  1 +  drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--  drivers/memory/tegra/tegra30.c    | 173 +++++++++++++-  3 files changed, 501 insertions(+), 22 deletions(-)
[..]> diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c
index d0314f29608d..ea849003014b 100644 --- a/drivers/memory/tegra/tegra30.c +++ b/drivers/memory/tegra/tegra30.c
[..]
+static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{ +Â Â Â struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider); +Â Â Â const struct tegra_mc_client *client = &mc->soc->clients[src->id]; +Â Â Â u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
+Â Â Â /* +Â Â Â Â * Skip pre-initialization that is done by icc_node_add(), which sets +Â Â Â Â * bandwidth to maximum for all clients before drivers are loaded. +Â Â Â Â * +Â Â Â Â * This doesn't make sense for us because we don't have drivers for all +Â Â Â Â * clients and it's okay to keep configuration left from bootloader +Â Â Â Â * during boot, at least for today. +Â Â Â Â */ +Â Â Â if (src == dst) +Â Â Â Â Â Â Â return 0;
Nit: The "proper" way to express this should be to implement the .get_bw() callback to return zero as initial average/peak bandwidth. I'm wondering if this will work here?
The rest looks good to me!
Hello Georgi,
Returning zeros doesn't allow us to skip the initialization that is done by provider->set(node, node) in icc_node_add(). It will reconfigure memory latency in accordance to a zero memory bandwidth, which is wrong to do.
It actually should be more preferred to preset bandwidth to a maximum before all drivers are synced, but this should be done only once we will wire up all drivers to use ICC framework. For now it's safer to keep the default hardware configuration untouched.
Ok, thanks for clarifying! Is there a way to read this hardware configuration and convert it to initial bandwidth? That's the idea of the get_bw() callback actually. I am just curious and trying to get a better understanding how this works and if it would be useful for Tegra.
Thanks, Georgi
18.11.2020 18:30, Georgi Djakov пишет:
On 18.11.20 0:02, Dmitry Osipenko wrote:
17.11.2020 23:24, Georgi Djakov пишет:
Hi Dmitry,
Thank you working on this!
On 15.11.20 23:29, Dmitry Osipenko wrote:
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS. MC driver now supports tuning of memory arbitration latency, which needs to be done for ISO memory clients, like a Display client for example.
Tested-by: Peter Geis pgwipeout@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/memory/tegra/Kconfig      |  1 +   drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--   drivers/memory/tegra/tegra30.c    | 173 +++++++++++++-   3 files changed, 501 insertions(+), 22 deletions(-)
[..]> diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c
index d0314f29608d..ea849003014b 100644 --- a/drivers/memory/tegra/tegra30.c +++ b/drivers/memory/tegra/tegra30.c
[..]
+static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{ +Â Â Â struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider); +Â Â Â const struct tegra_mc_client *client = &mc->soc->clients[src->id]; +Â Â Â u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
+Â Â Â /* +Â Â Â Â * Skip pre-initialization that is done by icc_node_add(), which sets +Â Â Â Â * bandwidth to maximum for all clients before drivers are loaded. +Â Â Â Â * +Â Â Â Â * This doesn't make sense for us because we don't have drivers for all +Â Â Â Â * clients and it's okay to keep configuration left from bootloader +Â Â Â Â * during boot, at least for today. +Â Â Â Â */ +Â Â Â if (src == dst) +Â Â Â Â Â Â Â return 0;
Nit: The "proper" way to express this should be to implement the .get_bw() callback to return zero as initial average/peak bandwidth. I'm wondering if this will work here?
The rest looks good to me!
Hello Georgi,
Returning zeros doesn't allow us to skip the initialization that is done by provider->set(node, node) in icc_node_add(). It will reconfigure memory latency in accordance to a zero memory bandwidth, which is wrong to do.
It actually should be more preferred to preset bandwidth to a maximum before all drivers are synced, but this should be done only once we will wire up all drivers to use ICC framework. For now it's safer to keep the default hardware configuration untouched.
Ok, thanks for clarifying! Is there a way to read this hardware configuration and convert it to initial bandwidth? That's the idea of the get_bw() callback actually. I am just curious and trying to get a better understanding how this works and if it would be useful for Tegra.
MC driver can't easily retrieve and convert initial bandwidths because they depend on knowing hardware state that is not accessible by the MC driver.
But in fact it's unnecessary to know the initial bandwidth in the case of this MC ICC driver because if configuration is re-set to the same value, then this is equal to leaving configuration unchanged.
It's okay to keep memory latency configuration unchanged if memory clock rate goes up, which is what happens here during init. Please notice that EMC ICC drivers (which control the clock rate) don't skip the initial bandwidth change.
Add modularization support to the Tegra124 EMC driver, which now can be compiled as a loadable kernel module.
Note that EMC clock must be registered at clk-init time, otherwise PLLM will be disabled as unused clock at boot time if EMC driver is compiled as a module. Hence add a prepare/complete callbacks. similarly to what is done for the Tegra20/30 EMC drivers.
Tested-by: Nicolas Chauvet kwizart@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/clk/tegra/Kconfig | 3 ++ drivers/clk/tegra/Makefile | 2 +- drivers/clk/tegra/clk-tegra124-emc.c | 41 ++++++++++++++++++++++++---- drivers/clk/tegra/clk-tegra124.c | 26 ++++++++++++++++-- drivers/clk/tegra/clk.h | 18 ++++++++---- drivers/memory/tegra/Kconfig | 3 +- drivers/memory/tegra/tegra124-emc.c | 31 ++++++++++++++------- include/linux/clk/tegra.h | 8 ++++++ include/soc/tegra/emc.h | 16 ----------- 9 files changed, 106 insertions(+), 42 deletions(-) delete mode 100644 include/soc/tegra/emc.h
diff --git a/drivers/clk/tegra/Kconfig b/drivers/clk/tegra/Kconfig index deaa4605824c..90df619dc087 100644 --- a/drivers/clk/tegra/Kconfig +++ b/drivers/clk/tegra/Kconfig @@ -7,3 +7,6 @@ config TEGRA_CLK_DFLL depends on ARCH_TEGRA_124_SOC || ARCH_TEGRA_210_SOC select PM_OPP def_bool y + +config TEGRA124_CLK_EMC + bool diff --git a/drivers/clk/tegra/Makefile b/drivers/clk/tegra/Makefile index eec2313fd37e..7b1816856eb5 100644 --- a/drivers/clk/tegra/Makefile +++ b/drivers/clk/tegra/Makefile @@ -22,7 +22,7 @@ obj-$(CONFIG_ARCH_TEGRA_3x_SOC) += clk-tegra20-emc.o obj-$(CONFIG_ARCH_TEGRA_114_SOC) += clk-tegra114.o obj-$(CONFIG_ARCH_TEGRA_124_SOC) += clk-tegra124.o obj-$(CONFIG_TEGRA_CLK_DFLL) += clk-tegra124-dfll-fcpu.o -obj-$(CONFIG_TEGRA124_EMC) += clk-tegra124-emc.o +obj-$(CONFIG_TEGRA124_CLK_EMC) += clk-tegra124-emc.o obj-$(CONFIG_ARCH_TEGRA_132_SOC) += clk-tegra124.o obj-y += cvb.o obj-$(CONFIG_ARCH_TEGRA_210_SOC) += clk-tegra210.o diff --git a/drivers/clk/tegra/clk-tegra124-emc.c b/drivers/clk/tegra/clk-tegra124-emc.c index 745f9faa98d8..bdf6f4a51617 100644 --- a/drivers/clk/tegra/clk-tegra124-emc.c +++ b/drivers/clk/tegra/clk-tegra124-emc.c @@ -11,7 +11,9 @@ #include <linux/clk-provider.h> #include <linux/clk.h> #include <linux/clkdev.h> +#include <linux/clk/tegra.h> #include <linux/delay.h> +#include <linux/export.h> #include <linux/io.h> #include <linux/module.h> #include <linux/of_address.h> @@ -21,7 +23,6 @@ #include <linux/string.h>
#include <soc/tegra/fuse.h> -#include <soc/tegra/emc.h>
#include "clk.h"
@@ -80,6 +81,9 @@ struct tegra_clk_emc { int num_timings; struct emc_timing *timings; spinlock_t *lock; + + tegra124_emc_prepare_timing_change_cb *prepare_timing_change; + tegra124_emc_complete_timing_change_cb *complete_timing_change; };
/* Common clock framework callback implementations */ @@ -176,6 +180,9 @@ static struct tegra_emc *emc_ensure_emc_driver(struct tegra_clk_emc *tegra) if (tegra->emc) return tegra->emc;
+ if (!tegra->prepare_timing_change || !tegra->complete_timing_change) + return NULL; + if (!tegra->emc_node) return NULL;
@@ -241,7 +248,7 @@ static int emc_set_timing(struct tegra_clk_emc *tegra,
div = timing->parent_rate / (timing->rate / 2) - 2;
- err = tegra_emc_prepare_timing_change(emc, timing->rate); + err = tegra->prepare_timing_change(emc, timing->rate); if (err) return err;
@@ -259,7 +266,7 @@ static int emc_set_timing(struct tegra_clk_emc *tegra,
spin_unlock_irqrestore(tegra->lock, flags);
- tegra_emc_complete_timing_change(emc, timing->rate); + tegra->complete_timing_change(emc, timing->rate);
clk_hw_reparent(&tegra->hw, __clk_get_hw(timing->parent)); clk_disable_unprepare(tegra->prev_parent); @@ -473,8 +480,8 @@ static const struct clk_ops tegra_clk_emc_ops = { .get_parent = emc_get_parent, };
-struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np, - spinlock_t *lock) +struct clk *tegra124_clk_register_emc(void __iomem *base, struct device_node *np, + spinlock_t *lock) { struct tegra_clk_emc *tegra; struct clk_init_data init; @@ -538,3 +545,27 @@ struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np,
return clk; }; + +void tegra124_clk_set_emc_callbacks(tegra124_emc_prepare_timing_change_cb *prep_cb, + tegra124_emc_complete_timing_change_cb *complete_cb) +{ + struct clk *clk = __clk_lookup("emc"); + struct tegra_clk_emc *tegra; + struct clk_hw *hw; + + if (clk) { + hw = __clk_get_hw(clk); + tegra = container_of(hw, struct tegra_clk_emc, hw); + + tegra->prepare_timing_change = prep_cb; + tegra->complete_timing_change = complete_cb; + } +} +EXPORT_SYMBOL_GPL(tegra124_clk_set_emc_callbacks); + +bool tegra124_clk_emc_driver_available(struct clk_hw *hw) +{ + struct tegra_clk_emc *tegra = container_of(hw, struct tegra_clk_emc, hw); + + return tegra->prepare_timing_change && tegra->complete_timing_change; +} diff --git a/drivers/clk/tegra/clk-tegra124.c b/drivers/clk/tegra/clk-tegra124.c index e931319dcc9d..934520aab6e3 100644 --- a/drivers/clk/tegra/clk-tegra124.c +++ b/drivers/clk/tegra/clk-tegra124.c @@ -1500,6 +1500,26 @@ static void __init tegra124_132_clock_init_pre(struct device_node *np) writel(plld_base, clk_base + PLLD_BASE); }
+static struct clk *tegra124_clk_src_onecell_get(struct of_phandle_args *clkspec, + void *data) +{ + struct clk_hw *hw; + struct clk *clk; + + clk = of_clk_src_onecell_get(clkspec, data); + if (IS_ERR(clk)) + return clk; + + hw = __clk_get_hw(clk); + + if (clkspec->args[0] == TEGRA124_CLK_EMC) { + if (!tegra124_clk_emc_driver_available(hw)) + return ERR_PTR(-EPROBE_DEFER); + } + + return clk; +} + /** * tegra124_132_clock_init_post - clock initialization postamble for T124/T132 * @np: struct device_node * of the DT node for the SoC CAR IP block @@ -1516,10 +1536,10 @@ static void __init tegra124_132_clock_init_post(struct device_node *np) &pll_x_params); tegra_init_special_resets(1, tegra124_reset_assert, tegra124_reset_deassert); - tegra_add_of_provider(np, of_clk_src_onecell_get); + tegra_add_of_provider(np, tegra124_clk_src_onecell_get);
- clks[TEGRA124_CLK_EMC] = tegra_clk_register_emc(clk_base, np, - &emc_lock); + clks[TEGRA124_CLK_EMC] = tegra124_clk_register_emc(clk_base, np, + &emc_lock);
tegra_register_devclks(devclks, ARRAY_SIZE(devclks));
diff --git a/drivers/clk/tegra/clk.h b/drivers/clk/tegra/clk.h index 6b565f6b5f66..c3e36b5dcc75 100644 --- a/drivers/clk/tegra/clk.h +++ b/drivers/clk/tegra/clk.h @@ -881,16 +881,22 @@ void tegra_super_clk_gen5_init(void __iomem *clk_base, void __iomem *pmc_base, struct tegra_clk *tegra_clks, struct tegra_clk_pll_params *pll_params);
-#ifdef CONFIG_TEGRA124_EMC -struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np, - spinlock_t *lock); +#ifdef CONFIG_TEGRA124_CLK_EMC +struct clk *tegra124_clk_register_emc(void __iomem *base, struct device_node *np, + spinlock_t *lock); +bool tegra124_clk_emc_driver_available(struct clk_hw *emc_hw); #else -static inline struct clk *tegra_clk_register_emc(void __iomem *base, - struct device_node *np, - spinlock_t *lock) +static inline struct clk * +tegra124_clk_register_emc(void __iomem *base, struct device_node *np, + spinlock_t *lock) { return NULL; } + +static inline bool tegra124_clk_emc_driver_available(struct clk_hw *emc_hw) +{ + return false; +} #endif
void tegra114_clock_tune_cpu_trimmers_high(void); diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig index ca7077a06f4c..f5b451403c58 100644 --- a/drivers/memory/tegra/Kconfig +++ b/drivers/memory/tegra/Kconfig @@ -32,9 +32,10 @@ config TEGRA30_EMC external memory.
config TEGRA124_EMC - bool "NVIDIA Tegra124 External Memory Controller driver" + tristate "NVIDIA Tegra124 External Memory Controller driver" default y depends on TEGRA_MC && ARCH_TEGRA_124_SOC + select TEGRA124_CLK_EMC help This driver is for the External Memory Controller (EMC) found on Tegra124 chips. The EMC controls the external DRAM on the board. diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c index ee8ee39e98ed..edfbf6d6d357 100644 --- a/drivers/memory/tegra/tegra124-emc.c +++ b/drivers/memory/tegra/tegra124-emc.c @@ -9,16 +9,17 @@ #include <linux/clk-provider.h> #include <linux/clk.h> #include <linux/clkdev.h> +#include <linux/clk/tegra.h> #include <linux/debugfs.h> #include <linux/delay.h> #include <linux/io.h> +#include <linux/module.h> #include <linux/of_address.h> #include <linux/of_platform.h> #include <linux/platform_device.h> #include <linux/sort.h> #include <linux/string.h>
-#include <soc/tegra/emc.h> #include <soc/tegra/fuse.h> #include <soc/tegra/mc.h>
@@ -562,8 +563,8 @@ static struct emc_timing *tegra_emc_find_timing(struct tegra_emc *emc, return timing; }
-int tegra_emc_prepare_timing_change(struct tegra_emc *emc, - unsigned long rate) +static int tegra_emc_prepare_timing_change(struct tegra_emc *emc, + unsigned long rate) { struct emc_timing *timing = tegra_emc_find_timing(emc, rate); struct emc_timing *last = &emc->last_timing; @@ -790,8 +791,8 @@ int tegra_emc_prepare_timing_change(struct tegra_emc *emc, return 0; }
-void tegra_emc_complete_timing_change(struct tegra_emc *emc, - unsigned long rate) +static void tegra_emc_complete_timing_change(struct tegra_emc *emc, + unsigned long rate) { struct emc_timing *timing = tegra_emc_find_timing(emc, rate); struct emc_timing *last = &emc->last_timing; @@ -987,6 +988,7 @@ static const struct of_device_id tegra_emc_of_match[] = { { .compatible = "nvidia,tegra132-emc" }, {} }; +MODULE_DEVICE_TABLE(of, tegra_emc_of_match);
static struct device_node * tegra_emc_find_node_by_ram_code(struct device_node *node, u32 ram_code) @@ -1226,9 +1228,19 @@ static int tegra_emc_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, emc);
+ tegra124_clk_set_emc_callbacks(tegra_emc_prepare_timing_change, + tegra_emc_complete_timing_change); + if (IS_ENABLED(CONFIG_DEBUG_FS)) emc_debugfs_init(&pdev->dev, emc);
+ /* + * Don't allow the kernel module to be unloaded. Unloading adds some + * extra complexity which doesn't really worth the effort in a case of + * this driver. + */ + try_module_get(THIS_MODULE); + return 0; };
@@ -1240,9 +1252,8 @@ static struct platform_driver tegra_emc_driver = { .suppress_bind_attrs = true, }, }; +module_platform_driver(tegra_emc_driver);
-static int tegra_emc_init(void) -{ - return platform_driver_register(&tegra_emc_driver); -} -subsys_initcall(tegra_emc_init); +MODULE_AUTHOR("Mikko Perttunen mperttunen@nvidia.com"); +MODULE_DESCRIPTION("NVIDIA Tegra124 EMC driver"); +MODULE_LICENSE("GPL v2"); diff --git a/include/linux/clk/tegra.h b/include/linux/clk/tegra.h index 3f01d43f0598..eb016fc9cc0b 100644 --- a/include/linux/clk/tegra.h +++ b/include/linux/clk/tegra.h @@ -136,6 +136,7 @@ extern void tegra210_clk_emc_dll_update_setting(u32 emc_dll_src_value); extern void tegra210_clk_emc_update_setting(u32 emc_src_value);
struct clk; +struct tegra_emc;
typedef long (tegra20_clk_emc_round_cb)(unsigned long rate, unsigned long min_rate, @@ -146,6 +147,13 @@ void tegra20_clk_set_emc_round_callback(tegra20_clk_emc_round_cb *round_cb, void *cb_arg); int tegra20_clk_prepare_emc_mc_same_freq(struct clk *emc_clk, bool same);
+typedef int (tegra124_emc_prepare_timing_change_cb)(struct tegra_emc *emc, + unsigned long rate); +typedef void (tegra124_emc_complete_timing_change_cb)(struct tegra_emc *emc, + unsigned long rate); +void tegra124_clk_set_emc_callbacks(tegra124_emc_prepare_timing_change_cb *prep_cb, + tegra124_emc_complete_timing_change_cb *complete_cb); + struct tegra210_clk_emc_config { unsigned long rate; bool same_freq; diff --git a/include/soc/tegra/emc.h b/include/soc/tegra/emc.h deleted file mode 100644 index 05199a97ccf4..000000000000 --- a/include/soc/tegra/emc.h +++ /dev/null @@ -1,16 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-only */ -/* - * Copyright (c) 2014 NVIDIA Corporation. All rights reserved. - */ - -#ifndef __SOC_TEGRA_EMC_H__ -#define __SOC_TEGRA_EMC_H__ - -struct tegra_emc; - -int tegra_emc_prepare_timing_change(struct tegra_emc *emc, - unsigned long rate); -void tegra_emc_complete_timing_change(struct tegra_emc *emc, - unsigned long rate); - -#endif /* __SOC_TEGRA_EMC_H__ */
EMC driver will become mandatory after turning it into interconnect provider because interconnect users, like display controller driver, will fail to probe using newer device-trees that have interconnect properties. Thus make EMC driver to probe even if timings are missing in device-tree.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/memory/tegra/tegra124-emc.c | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-)
diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c index edfbf6d6d357..8fb8c1af25c9 100644 --- a/drivers/memory/tegra/tegra124-emc.c +++ b/drivers/memory/tegra/tegra124-emc.c @@ -1201,23 +1201,15 @@ static int tegra_emc_probe(struct platform_device *pdev) ram_code = tegra_read_ram_code();
np = tegra_emc_find_node_by_ram_code(pdev->dev.of_node, ram_code); - if (!np) { - dev_err(&pdev->dev, - "no memory timings for RAM code %u found in DT\n", - ram_code); - return -ENOENT; - } - - err = tegra_emc_load_timings_from_dt(emc, np); - of_node_put(np); - if (err) - return err; - - if (emc->num_timings == 0) { - dev_err(&pdev->dev, - "no memory timings for RAM code %u registered\n", - ram_code); - return -ENOENT; + if (np) { + err = tegra_emc_load_timings_from_dt(emc, np); + of_node_put(np); + if (err) + return err; + } else { + dev_info(&pdev->dev, + "no memory timings for RAM code %u found in DT\n", + ram_code); }
err = emc_init(emc);
Now Internal and External memory controllers are memory interconnection providers. This allows us to use interconnect API for tuning of memory configuration. EMC driver now supports OPPs and DVFS.
Tested-by: Nicolas Chauvet kwizart@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/memory/tegra/Kconfig | 1 + drivers/memory/tegra/tegra124-emc.c | 325 +++++++++++++++++++++++++++- drivers/memory/tegra/tegra124.c | 82 ++++++- 3 files changed, 396 insertions(+), 12 deletions(-)
diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig index f5b451403c58..a70967a56e52 100644 --- a/drivers/memory/tegra/Kconfig +++ b/drivers/memory/tegra/Kconfig @@ -36,6 +36,7 @@ config TEGRA124_EMC default y depends on TEGRA_MC && ARCH_TEGRA_124_SOC select TEGRA124_CLK_EMC + select PM_OPP help This driver is for the External Memory Controller (EMC) found on Tegra124 chips. The EMC controls the external DRAM on the board. diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c index 8fb8c1af25c9..ed34ca982364 100644 --- a/drivers/memory/tegra/tegra124-emc.c +++ b/drivers/memory/tegra/tegra124-emc.c @@ -12,20 +12,26 @@ #include <linux/clk/tegra.h> #include <linux/debugfs.h> #include <linux/delay.h> +#include <linux/interconnect-provider.h> #include <linux/io.h> #include <linux/module.h> +#include <linux/mutex.h> #include <linux/of_address.h> #include <linux/of_platform.h> #include <linux/platform_device.h> +#include <linux/pm_opp.h> #include <linux/sort.h> #include <linux/string.h>
#include <soc/tegra/fuse.h> #include <soc/tegra/mc.h>
+#include "mc.h" + #define EMC_FBIO_CFG5 0x104 #define EMC_FBIO_CFG5_DRAM_TYPE_MASK 0x3 #define EMC_FBIO_CFG5_DRAM_TYPE_SHIFT 0 +#define EMC_FBIO_CFG5_DRAM_WIDTH_X64 BIT(4)
#define EMC_INTSTATUS 0x0 #define EMC_INTSTATUS_CLKCHANGE_COMPLETE BIT(4) @@ -461,6 +467,17 @@ struct emc_timing { u32 emc_zcal_interval; };
+enum emc_rate_request_type { + EMC_RATE_DEBUG, + EMC_RATE_ICC, + EMC_RATE_TYPE_MAX, +}; + +struct emc_rate_request { + unsigned long min_rate; + unsigned long max_rate; +}; + struct tegra_emc { struct device *dev;
@@ -471,6 +488,7 @@ struct tegra_emc { struct clk *clk;
enum emc_dram_type dram_type; + unsigned int dram_bus_width; unsigned int dram_num;
struct emc_timing last_timing; @@ -482,6 +500,17 @@ struct tegra_emc { unsigned long min_rate; unsigned long max_rate; } debugfs; + + struct icc_provider provider; + + /* + * There are multiple sources in the EMC driver which could request + * a min/max clock rate, these rates are contained in this array. + */ + struct emc_rate_request requested_rate[EMC_RATE_TYPE_MAX]; + + /* protect shared rate-change code path */ + struct mutex rate_lock; };
/* Timing change sequence functions */ @@ -870,6 +899,14 @@ static void emc_read_current_timing(struct tegra_emc *emc, static int emc_init(struct tegra_emc *emc) { emc->dram_type = readl(emc->regs + EMC_FBIO_CFG5); + + if (emc->dram_type & EMC_FBIO_CFG5_DRAM_WIDTH_X64) + emc->dram_bus_width = 64; + else + emc->dram_bus_width = 32; + + dev_info(emc->dev, "%ubit DRAM bus\n", emc->dram_bus_width); + emc->dram_type &= EMC_FBIO_CFG5_DRAM_TYPE_MASK; emc->dram_type >>= EMC_FBIO_CFG5_DRAM_TYPE_SHIFT;
@@ -1009,6 +1046,83 @@ tegra_emc_find_node_by_ram_code(struct device_node *node, u32 ram_code) return NULL; }
+static void tegra_emc_rate_requests_init(struct tegra_emc *emc) +{ + unsigned int i; + + for (i = 0; i < EMC_RATE_TYPE_MAX; i++) { + emc->requested_rate[i].min_rate = 0; + emc->requested_rate[i].max_rate = ULONG_MAX; + } +} + +static int emc_request_rate(struct tegra_emc *emc, + unsigned long new_min_rate, + unsigned long new_max_rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = emc->requested_rate; + unsigned long min_rate = 0, max_rate = ULONG_MAX; + unsigned int i; + int err; + + /* select minimum and maximum rates among the requested rates */ + for (i = 0; i < EMC_RATE_TYPE_MAX; i++, req++) { + if (i == type) { + min_rate = max(new_min_rate, min_rate); + max_rate = min(new_max_rate, max_rate); + } else { + min_rate = max(req->min_rate, min_rate); + max_rate = min(req->max_rate, max_rate); + } + } + + if (min_rate > max_rate) { + dev_err_ratelimited(emc->dev, "%s: type %u: out of range: %lu %lu\n", + __func__, type, min_rate, max_rate); + return -ERANGE; + } + + /* + * EMC rate-changes should go via OPP API because it manages voltage + * changes. + */ + err = dev_pm_opp_set_rate(emc->dev, min_rate); + if (err) + return err; + + emc->requested_rate[type].min_rate = new_min_rate; + emc->requested_rate[type].max_rate = new_max_rate; + + return 0; +} + +static int emc_set_min_rate(struct tegra_emc *emc, unsigned long rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = &emc->requested_rate[type]; + int ret; + + mutex_lock(&emc->rate_lock); + ret = emc_request_rate(emc, rate, req->max_rate, type); + mutex_unlock(&emc->rate_lock); + + return ret; +} + +static int emc_set_max_rate(struct tegra_emc *emc, unsigned long rate, + enum emc_rate_request_type type) +{ + struct emc_rate_request *req = &emc->requested_rate[type]; + int ret; + + mutex_lock(&emc->rate_lock); + ret = emc_request_rate(emc, req->min_rate, rate, type); + mutex_unlock(&emc->rate_lock); + + return ret; +} + /* * debugfs interface * @@ -1081,7 +1195,7 @@ static int tegra_emc_debug_min_rate_set(void *data, u64 rate) if (!tegra_emc_validate_rate(emc, rate)) return -EINVAL;
- err = clk_set_min_rate(emc->clk, rate); + err = emc_set_min_rate(emc, rate, EMC_RATE_DEBUG); if (err < 0) return err;
@@ -1111,7 +1225,7 @@ static int tegra_emc_debug_max_rate_set(void *data, u64 rate) if (!tegra_emc_validate_rate(emc, rate)) return -EINVAL;
- err = clk_set_max_rate(emc->clk, rate); + err = emc_set_max_rate(emc, rate, EMC_RATE_DEBUG); if (err < 0) return err;
@@ -1129,15 +1243,6 @@ static void emc_debugfs_init(struct device *dev, struct tegra_emc *emc) unsigned int i; int err;
- emc->clk = devm_clk_get(dev, "emc"); - if (IS_ERR(emc->clk)) { - if (PTR_ERR(emc->clk) != -ENODEV) { - dev_err(dev, "failed to get EMC clock: %ld\n", - PTR_ERR(emc->clk)); - return; - } - } - emc->debugfs.min_rate = ULONG_MAX; emc->debugfs.max_rate = 0;
@@ -1177,6 +1282,182 @@ static void emc_debugfs_init(struct device *dev, struct tegra_emc *emc) emc, &tegra_emc_debug_max_rate_fops); }
+static inline struct tegra_emc * +to_tegra_emc_provider(struct icc_provider *provider) +{ + return container_of(provider, struct tegra_emc, provider); +} + +static struct icc_node_data * +emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data) +{ + struct icc_provider *provider = data; + struct icc_node_data *ndata; + struct icc_node *node; + + /* External Memory is the only possible ICC route */ + list_for_each_entry(node, &provider->nodes, node_list) { + if (node->id != TEGRA_ICC_EMEM) + continue; + + ndata = kzalloc(sizeof(*ndata), GFP_KERNEL); + if (!ndata) + return ERR_PTR(-ENOMEM); + + /* + * SRC and DST nodes should have matching TAG in order to have + * it set by default for a requested path. + */ + ndata->tag = TEGRA_MC_ICC_TAG_ISO; + ndata->node = node; + + return ndata; + } + + return ERR_PTR(-EPROBE_DEFER); +} + +static int emc_icc_set(struct icc_node *src, struct icc_node *dst) +{ + struct tegra_emc *emc = to_tegra_emc_provider(dst->provider); + unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw); + unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw); + unsigned long long rate = max(avg_bw, peak_bw); + unsigned int dram_data_bus_width_bytes; + const unsigned int ddr = 2; + int err; + + /* + * Tegra124 EMC runs on a clock rate of SDRAM bus. This means that + * EMC clock rate is twice smaller than the peak data rate because + * data is sampled on both EMC clock edges. + */ + dram_data_bus_width_bytes = emc->dram_bus_width / 8; + do_div(rate, ddr * dram_data_bus_width_bytes); + rate = min_t(u64, rate, U32_MAX); + + err = emc_set_min_rate(emc, rate, EMC_RATE_ICC); + if (err) + return err; + + return 0; +} + +static int tegra_emc_interconnect_init(struct tegra_emc *emc) +{ + const struct tegra_mc_soc *soc = emc->mc->soc; + struct icc_node *node; + int err; + + emc->provider.dev = emc->dev; + emc->provider.set = emc_icc_set; + emc->provider.data = &emc->provider; + emc->provider.aggregate = soc->icc_ops->aggregate; + emc->provider.xlate_extended = emc_of_icc_xlate_extended; + + err = icc_provider_add(&emc->provider); + if (err) + goto err_msg; + + /* create External Memory Controller node */ + node = icc_node_create(TEGRA_ICC_EMC); + if (IS_ERR(node)) { + err = PTR_ERR(node); + goto del_provider; + } + + node->name = "External Memory Controller"; + icc_node_add(node, &emc->provider); + + /* link External Memory Controller to External Memory (DRAM) */ + err = icc_link_create(node, TEGRA_ICC_EMEM); + if (err) + goto remove_nodes; + + /* create External Memory node */ + node = icc_node_create(TEGRA_ICC_EMEM); + if (IS_ERR(node)) { + err = PTR_ERR(node); + goto remove_nodes; + } + + node->name = "External Memory (DRAM)"; + icc_node_add(node, &emc->provider); + + return 0; + +remove_nodes: + icc_nodes_remove(&emc->provider); +del_provider: + icc_provider_del(&emc->provider); +err_msg: + dev_err(emc->dev, "failed to initialize ICC: %d\n", err); + + return err; +} + +static int tegra_emc_opp_table_init(struct tegra_emc *emc) +{ + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); + struct opp_table *clk_opp_table, *hw_opp_table; + int err; + + clk_opp_table = dev_pm_opp_set_clkname(emc->dev, NULL); + err = PTR_ERR_OR_ZERO(clk_opp_table); + if (err) { + dev_err(emc->dev, "failed to set OPP clk: %d\n", err); + return err; + } + + hw_opp_table = dev_pm_opp_set_supported_hw(emc->dev, &hw_version, 1); + err = PTR_ERR_OR_ZERO(hw_opp_table); + if (err) { + dev_err(emc->dev, "failed to set OPP supported HW: %d\n", err); + goto put_clk_table; + } + + err = dev_pm_opp_of_add_table(emc->dev); + if (err) { + /* + * -ENODEV means that this is an older device-tree which + * doesn't have OPP table and EMC driver isn't very useful + * in this case. + */ + if (err == -ENODEV) + dev_err(emc->dev, "OPP table not found, please update your device tree\n"); + else + dev_err(emc->dev, "failed to add OPP table: %d\n", err); + + goto put_hw_table; + } + + dev_info(emc->dev, "OPP HW ver. 0x%x, current clock rate %lu MHz\n", + hw_version, clk_get_rate(emc->clk) / 1000000); + + /* first dummy rate-set initializes voltage state */ + err = dev_pm_opp_set_rate(emc->dev, clk_get_rate(emc->clk)); + if (err) { + dev_err(emc->dev, "failed to initialize OPP clock: %d\n", err); + goto remove_table; + } + + return 0; + +remove_table: + dev_pm_opp_of_remove_table(emc->dev); +put_hw_table: + dev_pm_opp_put_supported_hw(hw_opp_table); +put_clk_table: + dev_pm_opp_put_clkname(clk_opp_table); + + return err; +} + +static void devm_tegra_emc_unset_callback(void *data) +{ + tegra124_clk_set_emc_callbacks(NULL, NULL); +} + static int tegra_emc_probe(struct platform_device *pdev) { struct device_node *np; @@ -1188,6 +1469,7 @@ static int tegra_emc_probe(struct platform_device *pdev) if (!emc) return -ENOMEM;
+ mutex_init(&emc->rate_lock); emc->dev = &pdev->dev;
emc->regs = devm_platform_ioremap_resource(pdev, 0); @@ -1223,9 +1505,29 @@ static int tegra_emc_probe(struct platform_device *pdev) tegra124_clk_set_emc_callbacks(tegra_emc_prepare_timing_change, tegra_emc_complete_timing_change);
+ err = devm_add_action_or_reset(&pdev->dev, devm_tegra_emc_unset_callback, + NULL); + if (err) + return err; + + emc->clk = devm_clk_get(&pdev->dev, "emc"); + if (IS_ERR(emc->clk)) { + err = PTR_ERR(emc->clk); + dev_err(&pdev->dev, "failed to get EMC clock: %d\n", err); + return err; + } + + err = tegra_emc_opp_table_init(emc); + if (err) + return err; + + tegra_emc_rate_requests_init(emc); + if (IS_ENABLED(CONFIG_DEBUG_FS)) emc_debugfs_init(&pdev->dev, emc);
+ tegra_emc_interconnect_init(emc); + /* * Don't allow the kernel module to be unloaded. Unloading adds some * extra complexity which doesn't really worth the effort in a case of @@ -1242,6 +1544,7 @@ static struct platform_driver tegra_emc_driver = { .name = "tegra-emc", .of_match_table = tegra_emc_of_match, .suppress_bind_attrs = true, + .sync_state = icc_sync_state, }, }; module_platform_driver(tegra_emc_driver); diff --git a/drivers/memory/tegra/tegra124.c b/drivers/memory/tegra/tegra124.c index e2389573d3c0..459211f50c08 100644 --- a/drivers/memory/tegra/tegra124.c +++ b/drivers/memory/tegra/tegra124.c @@ -4,7 +4,8 @@ */
#include <linux/of.h> -#include <linux/mm.h> +#include <linux/of_device.h> +#include <linux/slab.h>
#include <dt-bindings/memory/tegra124-mc.h>
@@ -1010,6 +1011,83 @@ static const struct tegra_mc_reset tegra124_mc_resets[] = { TEGRA124_MC_RESET(GPU, 0x970, 0x974, 2), };
+static int tegra124_mc_icc_set(struct icc_node *src, struct icc_node *dst) +{ + /* TODO: program PTSA */ + return 0; +} + +static int tegra124_mc_icc_aggreate(struct icc_node *node, u32 tag, u32 avg_bw, + u32 peak_bw, u32 *agg_avg, u32 *agg_peak) +{ + /* + * ISO clients need to reserve extra bandwidth up-front because + * there could be high bandwidth pressure during initial filling + * of the client's FIFO buffers. Secondly, we need to take into + * account impurities of the memory subsystem. + */ + if (tag & TEGRA_MC_ICC_TAG_ISO) + peak_bw = tegra_mc_scale_percents(peak_bw, 400); + + *agg_avg += avg_bw; + *agg_peak = max(*agg_peak, peak_bw); + + return 0; +} + +static struct icc_node_data * +tegra124_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data) +{ + struct tegra_mc *mc = icc_provider_to_tegra_mc(data); + const struct tegra_mc_client *client; + unsigned int i, idx = spec->args[0]; + struct icc_node_data *ndata; + struct icc_node *node; + + list_for_each_entry(node, &mc->provider.nodes, node_list) { + if (node->id != idx) + continue; + + ndata = kzalloc(sizeof(*ndata), GFP_KERNEL); + if (!ndata) + return ERR_PTR(-ENOMEM); + + client = &mc->soc->clients[idx]; + ndata->node = node; + + switch (client->swgroup) { + case TEGRA_SWGROUP_DC: + case TEGRA_SWGROUP_DCB: + case TEGRA_SWGROUP_PTC: + case TEGRA_SWGROUP_VI: + /* these clients are isochronous by default */ + ndata->tag = TEGRA_MC_ICC_TAG_ISO; + break; + + default: + ndata->tag = TEGRA_MC_ICC_TAG_DEFAULT; + break; + } + + return ndata; + } + + for (i = 0; i < mc->soc->num_clients; i++) { + if (mc->soc->clients[i].id == idx) + return ERR_PTR(-EPROBE_DEFER); + } + + dev_err(mc->dev, "invalid ICC client ID %u\n", idx); + + return ERR_PTR(-EINVAL); +} + +static const struct tegra_mc_icc_ops tegra124_mc_icc_ops = { + .xlate_extended = tegra124_mc_of_icc_xlate_extended, + .aggregate = tegra124_mc_icc_aggreate, + .set = tegra124_mc_icc_set, +}; + #ifdef CONFIG_ARCH_TEGRA_124_SOC static const unsigned long tegra124_mc_emem_regs[] = { MC_EMEM_ARB_CFG, @@ -1061,6 +1139,7 @@ const struct tegra_mc_soc tegra124_mc_soc = { .reset_ops = &tegra_mc_reset_ops_common, .resets = tegra124_mc_resets, .num_resets = ARRAY_SIZE(tegra124_mc_resets), + .icc_ops = &tegra124_mc_icc_ops, }; #endif /* CONFIG_ARCH_TEGRA_124_SOC */
@@ -1091,5 +1170,6 @@ const struct tegra_mc_soc tegra132_mc_soc = { .reset_ops = &tegra_mc_reset_ops_common, .resets = tegra124_mc_resets, .num_resets = ARRAY_SIZE(tegra124_mc_resets), + .icc_ops = &tegra124_mc_icc_ops, }; #endif /* CONFIG_ARCH_TEGRA_132_SOC */
Display controller (DC) performs isochronous memory transfers, and thus, has a requirement for a minimum memory bandwidth that shall be fulfilled, otherwise framebuffer data can't be fetched fast enough and this results in a DC's data-FIFO underflow that follows by a visual corruption.
The Memory Controller drivers provide facility for memory bandwidth management via interconnect API. Let's wire up the interconnect API support to the DC driver in order to fix the distorted display output on T30 Ouya, T124 TK1 and other Tegra devices.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/Kconfig | 1 + drivers/gpu/drm/tegra/dc.c | 349 ++++++++++++++++++++++++++++++++++ drivers/gpu/drm/tegra/dc.h | 14 ++ drivers/gpu/drm/tegra/drm.c | 14 ++ drivers/gpu/drm/tegra/hub.c | 3 + drivers/gpu/drm/tegra/plane.c | 121 ++++++++++++ drivers/gpu/drm/tegra/plane.h | 15 ++ 7 files changed, 517 insertions(+)
diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig index 5043dcaf1cf9..1650a448eabd 100644 --- a/drivers/gpu/drm/tegra/Kconfig +++ b/drivers/gpu/drm/tegra/Kconfig @@ -9,6 +9,7 @@ config DRM_TEGRA select DRM_MIPI_DSI select DRM_PANEL select TEGRA_HOST1X + select INTERCONNECT select IOMMU_IOVA select CEC_CORE if CEC_NOTIFIER help diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 85dd7131553a..5c587cfd1bb2 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -8,6 +8,7 @@ #include <linux/debugfs.h> #include <linux/delay.h> #include <linux/iommu.h> +#include <linux/interconnect.h> #include <linux/module.h> #include <linux/of_device.h> #include <linux/pm_runtime.h> @@ -616,6 +617,9 @@ static int tegra_plane_atomic_check(struct drm_plane *plane, struct tegra_dc *dc = to_tegra_dc(state->crtc); int err;
+ plane_state->peak_memory_bandwidth = 0; + plane_state->avg_memory_bandwidth = 0; + /* no need for further checks if the plane is being disabled */ if (!state->crtc) return 0; @@ -802,6 +806,12 @@ static struct drm_plane *tegra_primary_plane_create(struct drm_device *drm, formats = dc->soc->primary_formats; modifiers = dc->soc->modifiers;
+ err = tegra_plane_interconnect_init(plane); + if (err) { + kfree(plane); + return ERR_PTR(err); + } + err = drm_universal_plane_init(drm, &plane->base, possible_crtcs, &tegra_plane_funcs, formats, num_formats, modifiers, type, NULL); @@ -833,9 +843,13 @@ static const u32 tegra_cursor_plane_formats[] = { static int tegra_cursor_atomic_check(struct drm_plane *plane, struct drm_plane_state *state) { + struct tegra_plane_state *plane_state = to_tegra_plane_state(state); struct tegra_plane *tegra = to_tegra_plane(plane); int err;
+ plane_state->peak_memory_bandwidth = 0; + plane_state->avg_memory_bandwidth = 0; + /* no need for further checks if the plane is being disabled */ if (!state->crtc) return 0; @@ -973,6 +987,12 @@ static struct drm_plane *tegra_dc_cursor_plane_create(struct drm_device *drm, num_formats = ARRAY_SIZE(tegra_cursor_plane_formats); formats = tegra_cursor_plane_formats;
+ err = tegra_plane_interconnect_init(plane); + if (err) { + kfree(plane); + return ERR_PTR(err); + } + err = drm_universal_plane_init(drm, &plane->base, possible_crtcs, &tegra_plane_funcs, formats, num_formats, NULL, @@ -1087,6 +1107,12 @@ static struct drm_plane *tegra_dc_overlay_plane_create(struct drm_device *drm, num_formats = dc->soc->num_overlay_formats; formats = dc->soc->overlay_formats;
+ err = tegra_plane_interconnect_init(plane); + if (err) { + kfree(plane); + return ERR_PTR(err); + } + if (!cursor) type = DRM_PLANE_TYPE_OVERLAY; else @@ -1204,6 +1230,7 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc) { struct tegra_dc_state *state = to_dc_state(crtc->state); struct tegra_dc_state *copy; + unsigned int i;
copy = kmalloc(sizeof(*copy), GFP_KERNEL); if (!copy) @@ -1215,6 +1242,9 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc) copy->div = state->div; copy->planes = state->planes;
+ for (i = 0; i < ARRAY_SIZE(state->plane_peak_bw); i++) + copy->plane_peak_bw[i] = state->plane_peak_bw[i]; + return ©->base; }
@@ -1741,6 +1771,106 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, unsigned long timeout) return -ETIMEDOUT; }
+static void +tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc, + struct drm_atomic_state *state, + bool prepare_bandwidth_transition) +{ + const struct tegra_plane_state *old_tegra_state, *new_tegra_state; + const struct tegra_dc_state *old_dc_state, *new_dc_state; + u32 i, new_avg_bw, old_avg_bw, new_peak_bw, old_peak_bw; + const struct drm_plane_state *old_plane_state; + const struct drm_crtc_state *old_crtc_state; + struct tegra_dc_window window, old_window; + struct tegra_dc *dc = to_tegra_dc(crtc); + struct tegra_plane *tegra; + struct drm_plane *plane; + + if (dc->soc->has_nvdisplay) + return; + + old_crtc_state = drm_atomic_get_old_crtc_state(state, crtc); + old_dc_state = to_const_dc_state(old_crtc_state); + new_dc_state = to_const_dc_state(crtc->state); + + if (!crtc->state->active) { + if (!old_crtc_state->active) + return; + + /* + * When CRTC is disabled on DPMS, the state of attached planes + * is kept unchanged. Hence we need to enforce removal of the + * bandwidths from the ICC paths. + */ + drm_atomic_crtc_for_each_plane(plane, crtc) { + tegra = to_tegra_plane(plane); + + icc_set_bw(tegra->icc_mem, 0, 0); + icc_set_bw(tegra->icc_mem_vfilter, 0, 0); + } + + return; + } + + for_each_old_plane_in_state(old_crtc_state->state, plane, + old_plane_state, i) { + old_tegra_state = to_const_tegra_plane_state(old_plane_state); + new_tegra_state = to_const_tegra_plane_state(plane->state); + tegra = to_tegra_plane(plane); + + /* + * We're iterating over the global atomic state and it contains + * planes from another CRTC, hence we need to filter out the + * planes unrelated to this CRTC. + */ + if (tegra->dc != dc) + continue; + + new_avg_bw = new_tegra_state->avg_memory_bandwidth; + old_avg_bw = old_tegra_state->avg_memory_bandwidth; + + new_peak_bw = new_dc_state->plane_peak_bw[tegra->index]; + old_peak_bw = old_dc_state->plane_peak_bw[tegra->index]; + + /* + * See the comment related to !crtc->state->active above, + * which explains why bandwidths need to be updated when + * CRTC is turning ON. + */ + if (new_avg_bw == old_avg_bw && new_peak_bw == old_peak_bw && + old_crtc_state->active) + continue; + + window.src.h = drm_rect_height(&plane->state->src) >> 16; + window.dst.h = drm_rect_height(&plane->state->dst); + + old_window.src.h = drm_rect_height(&old_plane_state->src) >> 16; + old_window.dst.h = drm_rect_height(&old_plane_state->dst); + + /* + * During the preparation phase (atomic_begin), the memory + * freq should go high before the DC changes are committed + * if bandwidth requirement goes up, otherwise memory freq + * should to stay high if BW requirement goes down. The + * opposite applies to the completion phase (post_commit). + */ + if (prepare_bandwidth_transition) { + new_avg_bw = max(old_avg_bw, new_avg_bw); + new_peak_bw = max(old_peak_bw, new_peak_bw); + + if (tegra_plane_use_vertical_filtering(tegra, &old_window)) + window = old_window; + } + + icc_set_bw(tegra->icc_mem, new_avg_bw, new_peak_bw); + + if (tegra_plane_use_vertical_filtering(tegra, &window)) + icc_set_bw(tegra->icc_mem_vfilter, new_avg_bw, new_peak_bw); + else + icc_set_bw(tegra->icc_mem_vfilter, 0, 0); + } +} + static void tegra_crtc_atomic_disable(struct drm_crtc *crtc, struct drm_atomic_state *state) { @@ -1922,6 +2052,8 @@ static void tegra_crtc_atomic_begin(struct drm_crtc *crtc, { unsigned long flags;
+ tegra_crtc_update_memory_bandwidth(crtc, state, true); + if (crtc->state->event) { spin_lock_irqsave(&crtc->dev->event_lock, flags);
@@ -1954,7 +2086,212 @@ static void tegra_crtc_atomic_flush(struct drm_crtc *crtc, value = tegra_dc_readl(dc, DC_CMD_STATE_CONTROL); }
+static bool tegra_plane_is_cursor(const struct drm_plane_state *state) +{ + const struct tegra_dc_soc_info *soc = to_tegra_dc(state->crtc)->soc; + const struct drm_format_info *fmt = state->fb->format; + unsigned int src_w = drm_rect_width(&state->src) >> 16; + unsigned int dst_w = drm_rect_width(&state->dst); + + if (state->plane->type != DRM_PLANE_TYPE_CURSOR) + return false; + + if (soc->supports_cursor) + return true; + + if (src_w != dst_w || fmt->num_planes != 1 || src_w * fmt->cpp[0] > 256) + return false; + + return true; +} + +static unsigned long +tegra_plane_overlap_mask(struct drm_crtc_state *state, + const struct drm_plane_state *plane_state) +{ + const struct drm_plane_state *other_state; + const struct tegra_plane *tegra; + unsigned long overlap_mask = 0; + struct drm_plane *plane; + struct drm_rect rect; + + if (!plane_state->visible || !plane_state->fb) + return 0; + + drm_atomic_crtc_state_for_each_plane_state(plane, other_state, state) { + rect = plane_state->dst; + + tegra = to_tegra_plane(other_state->plane); + + if (!other_state->visible || !other_state->fb) + continue; + + /* + * Ignore cursor plane overlaps because it's not practical to + * assume that it contributes to the bandwidth in overlapping + * area if window width is small. + */ + if (tegra_plane_is_cursor(other_state)) + continue; + + if (drm_rect_intersect(&rect, &other_state->dst)) + overlap_mask |= BIT(tegra->index); + } + + /* + * Data prefetch FIFO will easily help to overcome temporal memory + * pressure if other plane overlaps with the cursor plane. + */ + if (tegra_plane_is_cursor(plane_state) && overlap_mask) + return 0; + + return overlap_mask; +} + +static struct drm_plane * +tegra_crtc_get_plane_by_index(struct drm_crtc *crtc, unsigned int index) +{ + struct drm_plane *plane; + + drm_atomic_crtc_for_each_plane(plane, crtc) { + if (to_tegra_plane(plane)->index == index) + return plane; + } + + return NULL; +} + +static int tegra_crtc_check_bandwidth_state(struct drm_crtc *crtc, + struct drm_atomic_state *state) +{ + ulong overlap_mask[TEGRA_DC_LEGACY_PLANES_NUM] = {}, mask; + u32 plane_peak_bw[TEGRA_DC_LEGACY_PLANES_NUM] = {}; + bool all_planes_overlap_simultaneously = true; + const struct tegra_plane_state *tegra_state; + const struct drm_plane_state *plane_state; + const struct tegra_dc_state *old_dc_state; + struct tegra_dc *dc = to_tegra_dc(crtc); + const struct drm_crtc_state *old_state; + struct tegra_dc_state *new_dc_state; + struct drm_crtc_state *new_state; + struct tegra_plane *tegra; + struct drm_plane *plane; + u32 i, k, overlap_bw; + + /* + * The nv-display uses shared planes. The algorithm below assumes + * maximum 3 planes per-CRTC, this assumption isn't applicable to + * the nv-display. Note that T124 support has additional windows, + * but currently they aren't supported by the driver. + */ + if (dc->soc->has_nvdisplay) + return 0; + + new_state = drm_atomic_get_new_crtc_state(state, crtc); + new_dc_state = to_dc_state(new_state); + + /* + * For overlapping planes pixel's data is fetched for each plane at + * the same time, hence bandwidths are accumulated in this case. + * This needs to be taken into account for calculating total bandwidth + * consumed by all planes. + * + * Here we get the overlapping state of each plane, which is a + * bitmask of plane indices telling with what planes there is an + * overlap. Note that bitmask[plane] includes BIT(plane) in order + * to make further code nicer and simpler. + */ + drm_atomic_crtc_state_for_each_plane_state(plane, plane_state, new_state) { + tegra_state = to_const_tegra_plane_state(plane_state); + tegra = to_tegra_plane(plane); + + plane_peak_bw[tegra->index] = tegra_state->peak_memory_bandwidth; + mask = tegra_plane_overlap_mask(new_state, plane_state); + overlap_mask[tegra->index] = mask; + + if (hweight_long(mask) != 3) + all_planes_overlap_simultaneously = false; + } + + old_state = drm_atomic_get_old_crtc_state(state, crtc); + old_dc_state = to_const_dc_state(old_state); + + /* + * Then we calculate maximum bandwidth of each plane state. + * The bandwidth includes the plane BW + BW of the "simultaneously" + * overlapping planes, where "simultaneously" means areas where DC + * fetches from the planes simultaneously during of scan-out process. + * + * For example, if plane A overlaps with planes B and C, but B and C + * don't overlap, then the peak bandwidth will be either in area where + * A-and-B or A-and-C planes overlap. + * + * The plane_peak_bw[] contains peak memory bandwidth values of + * each plane, this information is needed by interconnect provider + * in order to set up latency allowness based on the peak BW, see + * tegra_crtc_update_memory_bandwidth(). + */ + for (i = 0; i < ARRAY_SIZE(plane_peak_bw); i++) { + overlap_bw = 0; + + for_each_set_bit(k, &overlap_mask[i], 3) { + if (k == i) + continue; + + if (all_planes_overlap_simultaneously) + overlap_bw += plane_peak_bw[k]; + else + overlap_bw = max(overlap_bw, plane_peak_bw[k]); + } + + new_dc_state->plane_peak_bw[i] = plane_peak_bw[i] + overlap_bw; + + /* + * If plane's peak bandwidth changed (for example plane isn't + * overlapped anymore) and plane isn't in the atomic state, + * then add plane to the state in order to have the bandwidth + * updated. + */ + if (old_dc_state->plane_peak_bw[i] != + new_dc_state->plane_peak_bw[i]) { + plane = tegra_crtc_get_plane_by_index(crtc, i); + if (!plane) + continue; + + plane_state = drm_atomic_get_plane_state(state, plane); + if (IS_ERR(plane_state)) + return PTR_ERR(plane_state); + } + } + + return 0; +} + +static int tegra_crtc_atomic_check(struct drm_crtc *crtc, + struct drm_atomic_state *state) +{ + int err; + + err = tegra_crtc_check_bandwidth_state(crtc, state); + if (err) + return err; + + return 0; +} + +void tegra_crtc_atomic_post_commit(struct drm_crtc *crtc, + struct drm_atomic_state *state) +{ + /* + * Display bandwidth is allowed to go down only once hardware state + * is known to be armed, i.e. state was committed and VBLANK event + * received. + */ + tegra_crtc_update_memory_bandwidth(crtc, state, false); +} + static const struct drm_crtc_helper_funcs tegra_crtc_helper_funcs = { + .atomic_check = tegra_crtc_atomic_check, .atomic_begin = tegra_crtc_atomic_begin, .atomic_flush = tegra_crtc_atomic_flush, .atomic_enable = tegra_crtc_atomic_enable, @@ -2245,7 +2582,9 @@ static const struct tegra_dc_soc_info tegra20_dc_soc_info = { .overlay_formats = tegra20_overlay_formats, .modifiers = tegra20_modifiers, .has_win_a_without_filters = true, + .has_win_b_vfilter_mem_client = true, .has_win_c_without_vert_filter = true, + .plane_tiled_memory_bandwidth_x2 = false, };
static const struct tegra_dc_soc_info tegra30_dc_soc_info = { @@ -2264,7 +2603,9 @@ static const struct tegra_dc_soc_info tegra30_dc_soc_info = { .overlay_formats = tegra20_overlay_formats, .modifiers = tegra20_modifiers, .has_win_a_without_filters = false, + .has_win_b_vfilter_mem_client = true, .has_win_c_without_vert_filter = false, + .plane_tiled_memory_bandwidth_x2 = true, };
static const struct tegra_dc_soc_info tegra114_dc_soc_info = { @@ -2283,7 +2624,9 @@ static const struct tegra_dc_soc_info tegra114_dc_soc_info = { .overlay_formats = tegra114_overlay_formats, .modifiers = tegra20_modifiers, .has_win_a_without_filters = false, + .has_win_b_vfilter_mem_client = false, .has_win_c_without_vert_filter = false, + .plane_tiled_memory_bandwidth_x2 = true, };
static const struct tegra_dc_soc_info tegra124_dc_soc_info = { @@ -2302,7 +2645,9 @@ static const struct tegra_dc_soc_info tegra124_dc_soc_info = { .overlay_formats = tegra124_overlay_formats, .modifiers = tegra124_modifiers, .has_win_a_without_filters = false, + .has_win_b_vfilter_mem_client = false, .has_win_c_without_vert_filter = false, + .plane_tiled_memory_bandwidth_x2 = false, };
static const struct tegra_dc_soc_info tegra210_dc_soc_info = { @@ -2321,7 +2666,9 @@ static const struct tegra_dc_soc_info tegra210_dc_soc_info = { .overlay_formats = tegra114_overlay_formats, .modifiers = tegra124_modifiers, .has_win_a_without_filters = false, + .has_win_b_vfilter_mem_client = false, .has_win_c_without_vert_filter = false, + .plane_tiled_memory_bandwidth_x2 = false, };
static const struct tegra_windowgroup_soc tegra186_dc_wgrps[] = { @@ -2370,6 +2717,7 @@ static const struct tegra_dc_soc_info tegra186_dc_soc_info = { .has_nvdisplay = true, .wgrps = tegra186_dc_wgrps, .num_wgrps = ARRAY_SIZE(tegra186_dc_wgrps), + .plane_tiled_memory_bandwidth_x2 = false, };
static const struct tegra_windowgroup_soc tegra194_dc_wgrps[] = { @@ -2418,6 +2766,7 @@ static const struct tegra_dc_soc_info tegra194_dc_soc_info = { .has_nvdisplay = true, .wgrps = tegra194_dc_wgrps, .num_wgrps = ARRAY_SIZE(tegra194_dc_wgrps), + .plane_tiled_memory_bandwidth_x2 = false, };
static const struct of_device_id tegra_dc_of_match[] = { diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h index 051d03dcb9b0..0d7bdf66a1ec 100644 --- a/drivers/gpu/drm/tegra/dc.h +++ b/drivers/gpu/drm/tegra/dc.h @@ -15,6 +15,8 @@
struct tegra_output;
+#define TEGRA_DC_LEGACY_PLANES_NUM 6 + struct tegra_dc_state { struct drm_crtc_state base;
@@ -23,6 +25,8 @@ struct tegra_dc_state { unsigned int div;
u32 planes; + + unsigned long plane_peak_bw[TEGRA_DC_LEGACY_PLANES_NUM]; };
static inline struct tegra_dc_state *to_dc_state(struct drm_crtc_state *state) @@ -33,6 +37,12 @@ static inline struct tegra_dc_state *to_dc_state(struct drm_crtc_state *state) return NULL; }
+static inline const struct tegra_dc_state * +to_const_dc_state(const struct drm_crtc_state *state) +{ + return to_dc_state((struct drm_crtc_state *)state); +} + struct tegra_dc_stats { unsigned long frames; unsigned long vblank; @@ -65,7 +75,9 @@ struct tegra_dc_soc_info { unsigned int num_overlay_formats; const u64 *modifiers; bool has_win_a_without_filters; + bool has_win_b_vfilter_mem_client; bool has_win_c_without_vert_filter; + unsigned int plane_tiled_memory_bandwidth_x2; };
struct tegra_dc { @@ -151,6 +163,8 @@ int tegra_dc_state_setup_clock(struct tegra_dc *dc, struct drm_crtc_state *crtc_state, struct clk *clk, unsigned long pclk, unsigned int div); +void tegra_crtc_atomic_post_commit(struct drm_crtc *crtc, + struct drm_atomic_state *state);
/* from rgb.c */ int tegra_dc_rgb_probe(struct tegra_dc *dc); diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index e45c8414e2a3..1472042da2a7 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -20,6 +20,7 @@ #include <drm/drm_prime.h> #include <drm/drm_vblank.h>
+#include "dc.h" #include "drm.h" #include "gem.h"
@@ -59,6 +60,17 @@ static const struct drm_mode_config_funcs tegra_drm_mode_config_funcs = { .atomic_commit = drm_atomic_helper_commit, };
+static void tegra_atomic_post_commit(struct drm_device *drm, + struct drm_atomic_state *old_state) +{ + struct drm_crtc_state *old_crtc_state __maybe_unused; + struct drm_crtc *crtc; + unsigned int i; + + for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) + tegra_crtc_atomic_post_commit(crtc, old_state); +} + static void tegra_atomic_commit_tail(struct drm_atomic_state *old_state) { struct drm_device *drm = old_state->dev; @@ -75,6 +87,8 @@ static void tegra_atomic_commit_tail(struct drm_atomic_state *old_state) } else { drm_atomic_helper_commit_tail_rpm(old_state); } + + tegra_atomic_post_commit(drm, old_state); }
static const struct drm_mode_config_helper_funcs diff --git a/drivers/gpu/drm/tegra/hub.c b/drivers/gpu/drm/tegra/hub.c index 22a03f7ffdc1..4fa338dc7eb2 100644 --- a/drivers/gpu/drm/tegra/hub.c +++ b/drivers/gpu/drm/tegra/hub.c @@ -344,6 +344,9 @@ static int tegra_shared_plane_atomic_check(struct drm_plane *plane, struct tegra_dc *dc = to_tegra_dc(state->crtc); int err;
+ plane_state->peak_memory_bandwidth = 0; + plane_state->avg_memory_bandwidth = 0; + /* no need for further checks if the plane is being disabled */ if (!state->crtc || !state->fb) return 0; diff --git a/drivers/gpu/drm/tegra/plane.c b/drivers/gpu/drm/tegra/plane.c index 539d14935728..1e589c5af143 100644 --- a/drivers/gpu/drm/tegra/plane.c +++ b/drivers/gpu/drm/tegra/plane.c @@ -4,6 +4,7 @@ */
#include <linux/iommu.h> +#include <linux/interconnect.h>
#include <drm/drm_atomic.h> #include <drm/drm_atomic_helper.h> @@ -64,6 +65,8 @@ tegra_plane_atomic_duplicate_state(struct drm_plane *plane) copy->reflect_x = state->reflect_x; copy->reflect_y = state->reflect_y; copy->opaque = state->opaque; + copy->peak_memory_bandwidth = state->peak_memory_bandwidth; + copy->avg_memory_bandwidth = state->avg_memory_bandwidth;
for (i = 0; i < 2; i++) copy->blending[i] = state->blending[i]; @@ -212,6 +215,87 @@ void tegra_plane_cleanup_fb(struct drm_plane *plane, tegra_dc_unpin(dc, to_tegra_plane_state(state)); }
+static int tegra_plane_check_memory_bandwidth(struct drm_plane_state *state) +{ + struct tegra_plane_state *tegra_state = to_tegra_plane_state(state); + unsigned int i, bpp, bpp_plane, dst_w, src_w, src_h, mul; + const struct tegra_dc_soc_info *soc; + const struct drm_format_info *fmt; + struct drm_crtc_state *crtc_state; + u32 avg_bandwidth, peak_bandwidth; + + if (!state->visible) + return 0; + + crtc_state = drm_atomic_get_new_crtc_state(state->state, state->crtc); + if (!crtc_state) + return -EINVAL; + + src_w = drm_rect_width(&state->src) >> 16; + src_h = drm_rect_height(&state->src) >> 16; + dst_w = drm_rect_width(&state->dst); + + fmt = state->fb->format; + soc = to_tegra_dc(state->crtc)->soc; + + /* + * Note that real memory bandwidth vary depending on format and + * memory layout, we are not taking that into account because small + * estimation error isn't important since bandwidth is rounded up + * anyway. + */ + for (i = 0, bpp = 0; i < fmt->num_planes; i++) { + bpp_plane = fmt->cpp[i] * 8; + + /* + * Sub-sampling is relevant for chroma planes only and vertical + * readouts are not cached, hence only horizontal sub-sampling + * matters. + */ + if (i > 0) + bpp_plane /= fmt->hsub; + + bpp += bpp_plane; + } + + /* + * Horizontal downscale takes extra bandwidth which roughly depends + * on the scaled width. + */ + if (src_w > dst_w) + mul = (src_w - dst_w) * bpp / 2048 + 1; + else + mul = 1; + + /* average bandwidth in bytes/s */ + avg_bandwidth = src_w * src_h * bpp / 8 * mul; + avg_bandwidth *= drm_mode_vrefresh(&crtc_state->mode); + + /* mode.clock in kHz, peak bandwidth in kbit/s */ + peak_bandwidth = crtc_state->mode.clock * bpp * mul; + + /* ICC bandwidth in kbyte/s */ + peak_bandwidth = kbps_to_icc(peak_bandwidth); + avg_bandwidth = Bps_to_icc(avg_bandwidth); + + /* + * Tegra30/114 Memory Controller can't interleave DC memory requests + * and DC uses 16-bytes atom for the tiled windows, while DDR3 uses 32 + * bytes atom. Hence there is x2 memory overfetch for tiled framebuffer + * and DDR3 on older SoCs. + */ + if (soc->plane_tiled_memory_bandwidth_x2 && + tegra_state->tiling.mode == TEGRA_BO_TILING_MODE_TILED) { + peak_bandwidth *= 2; + avg_bandwidth *= 2; + } + + tegra_state->peak_memory_bandwidth = peak_bandwidth; + tegra_state->avg_memory_bandwidth = avg_bandwidth; + + return 0; +} + int tegra_plane_state_add(struct tegra_plane *plane, struct drm_plane_state *state) { @@ -230,6 +314,10 @@ int tegra_plane_state_add(struct tegra_plane *plane, if (err < 0) return err;
+ err = tegra_plane_check_memory_bandwidth(state); + if (err < 0) + return err; + tegra = to_dc_state(crtc_state);
tegra->planes |= WIN_A_ACT_REQ << plane->index; @@ -595,3 +683,36 @@ int tegra_plane_setup_legacy_state(struct tegra_plane *tegra,
return 0; } + +static const char * const tegra_plane_icc_names[] = { + "wina", "winb", "winc", "", "", "", "cursor", +}; + +int tegra_plane_interconnect_init(struct tegra_plane *plane) +{ + const char *icc_name = tegra_plane_icc_names[plane->index]; + struct device *dev = plane->dc->dev; + struct tegra_dc *dc = plane->dc; + int err; + + plane->icc_mem = devm_of_icc_get(dev, icc_name); + err = PTR_ERR_OR_ZERO(plane->icc_mem); + if (err) { + dev_err_probe(dev, err, "failed to get %s interconnect\n", + icc_name); + return err; + } + + /* plane B on T20/30 has a dedicated memory client for a 6-tap vertical filter */ + if (plane->index == 1 && dc->soc->has_win_b_vfilter_mem_client) { + plane->icc_mem_vfilter = devm_of_icc_get(dev, "winb-vfilter"); + err = PTR_ERR_OR_ZERO(plane->icc_mem_vfilter); + if (err) { + dev_err_probe(dev, err, "failed to get %s interconnect\n", + "winb-vfilter"); + return err; + } + } + + return 0; +} diff --git a/drivers/gpu/drm/tegra/plane.h b/drivers/gpu/drm/tegra/plane.h index c691dd79b27b..f2731aae7d01 100644 --- a/drivers/gpu/drm/tegra/plane.h +++ b/drivers/gpu/drm/tegra/plane.h @@ -8,6 +8,7 @@
#include <drm/drm_plane.h>
+struct icc_path; struct tegra_bo; struct tegra_dc;
@@ -16,6 +17,9 @@ struct tegra_plane { struct tegra_dc *dc; unsigned int offset; unsigned int index; + + struct icc_path *icc_mem; + struct icc_path *icc_mem_vfilter; };
struct tegra_cursor { @@ -52,6 +56,10 @@ struct tegra_plane_state { /* used for legacy blending support only */ struct tegra_plane_legacy_blending_state blending[2]; bool opaque; + + /* bandwidths are in ICC units, i.e. kbytes/sec */ + u32 peak_memory_bandwidth; + u32 avg_memory_bandwidth; };
static inline struct tegra_plane_state * @@ -63,6 +71,12 @@ to_tegra_plane_state(struct drm_plane_state *state) return NULL; }
+static inline const struct tegra_plane_state * +to_const_tegra_plane_state(const struct drm_plane_state *state) +{ + return to_tegra_plane_state((struct drm_plane_state *)state); +} + extern const struct drm_plane_funcs tegra_plane_funcs;
int tegra_plane_prepare_fb(struct drm_plane *plane, @@ -77,5 +91,6 @@ int tegra_plane_format(u32 fourcc, u32 *format, u32 *swap); bool tegra_plane_format_is_yuv(unsigned int format, bool *planar); int tegra_plane_setup_legacy_state(struct tegra_plane *tegra, struct tegra_plane_state *state); +int tegra_plane_interconnect_init(struct tegra_plane *plane);
#endif /* TEGRA_PLANE_H */
It's useful to know the total number of underflow events and currently the debug stats are getting reset each time CRTC is being disabled. Let's account the overall number of events that doesn't get a reset.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/dc.c | 10 ++++++++++ drivers/gpu/drm/tegra/dc.h | 5 +++++ 2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 5c587cfd1bb2..b6676f1fe358 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -1539,6 +1539,11 @@ static int tegra_dc_show_stats(struct seq_file *s, void *data) seq_printf(s, "underflow: %lu\n", dc->stats.underflow); seq_printf(s, "overflow: %lu\n", dc->stats.overflow);
+ seq_printf(s, "frames total: %lu\n", dc->stats.frames_total); + seq_printf(s, "vblank total: %lu\n", dc->stats.vblank_total); + seq_printf(s, "underflow total: %lu\n", dc->stats.underflow_total); + seq_printf(s, "overflow total: %lu\n", dc->stats.overflow_total); + return 0; }
@@ -2310,6 +2315,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data) /* dev_dbg(dc->dev, "%s(): frame end\n", __func__); */ + dc->stats.frames_total++; dc->stats.frames++; }
@@ -2318,6 +2324,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data) dev_dbg(dc->dev, "%s(): vertical blank\n", __func__); */ drm_crtc_handle_vblank(&dc->base); + dc->stats.vblank_total++; dc->stats.vblank++; }
@@ -2325,6 +2332,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data) /* dev_dbg(dc->dev, "%s(): underflow\n", __func__); */ + dc->stats.underflow_total++; dc->stats.underflow++; }
@@ -2332,11 +2340,13 @@ static irqreturn_t tegra_dc_irq(int irq, void *data) /* dev_dbg(dc->dev, "%s(): overflow\n", __func__); */ + dc->stats.overflow_total++; dc->stats.overflow++; }
if (status & HEAD_UF_INT) { dev_dbg_ratelimited(dc->dev, "%s(): head underflow\n", __func__); + dc->stats.underflow_total++; dc->stats.underflow++; }
diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h index 0d7bdf66a1ec..ba4ed35139fb 100644 --- a/drivers/gpu/drm/tegra/dc.h +++ b/drivers/gpu/drm/tegra/dc.h @@ -48,6 +48,11 @@ struct tegra_dc_stats { unsigned long vblank; unsigned long underflow; unsigned long overflow; + + unsigned long frames_total; + unsigned long vblank_total; + unsigned long underflow_total; + unsigned long overflow_total; };
struct tegra_windowgroup_soc {
This patch moves ACTMON driver away from generating OPP table by itself, transitioning it to use the table which comes from device-tree. This change breaks compatibility with older device-trees in order to bring support for the interconnect framework to the driver. This is a mandatory change which needs to be done in order to implement interconnect-based memory DVFS. Users of legacy device-trees will get a message telling that theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++--------------- 1 file changed, 44 insertions(+), 42 deletions(-)
diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index 38cc0d014738..ed6d4469c8c7 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -19,6 +19,8 @@ #include <linux/reset.h> #include <linux/workqueue.h>
+#include <soc/tegra/fuse.h> + #include "governor.h"
#define ACTMON_GLB_STATUS 0x0 @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
struct tegra_devfreq { struct devfreq *devfreq; + struct opp_table *opp_table;
struct reset_control *reset; struct clk *clock; @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra) static int tegra_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) { - struct tegra_devfreq *tegra = dev_get_drvdata(dev); - struct devfreq *devfreq = tegra->devfreq; struct dev_pm_opp *opp; - unsigned long rate; - int err; + int ret;
opp = devfreq_recommended_opp(dev, freq, flags); if (IS_ERR(opp)) { dev_err(dev, "Failed to find opp for %lu Hz\n", *freq); return PTR_ERR(opp); } - rate = dev_pm_opp_get_freq(opp); - dev_pm_opp_put(opp); - - err = clk_set_min_rate(tegra->emc_clock, rate * KHZ); - if (err) - return err; - - err = clk_set_rate(tegra->emc_clock, 0); - if (err) - goto restore_min_rate;
- return 0; - -restore_min_rate: - clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq); + ret = dev_pm_opp_set_bw(dev, opp); + dev_pm_opp_put(opp);
- return err; + return ret; }
static int tegra_devfreq_get_dev_status(struct device *dev, @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->private_data = tegra;
/* The below are to be used by the other governors */ - stat->current_frequency = cur_freq; + stat->current_frequency = cur_freq * KHZ;
actmon_dev = &tegra->devices[MCALL];
@@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq, target_freq = max(target_freq, dev->target_freq); }
- *freq = target_freq; + /* + * tegra-devfreq driver operates with KHz units, while OPP table + * entries use Hz units. Hence we need to convert the units for the + * devfreq core. + */ + *freq = target_freq * KHZ;
return 0; } @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
static int tegra_devfreq_probe(struct platform_device *pdev) { + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); struct tegra_devfreq_device *dev; struct tegra_devfreq *tegra; struct devfreq *devfreq; @@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev) long rate; int err;
+ /* legacy device-trees don't have OPP table and must be updated */ + if (!device_property_present(&pdev->dev, "operating-points-v2")) { + dev_err(&pdev->dev, + "OPP table not found, please update your device tree\n"); + return -ENODEV; + } + tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL); if (!tegra) return -ENOMEM; @@ -822,11 +823,25 @@ static int tegra_devfreq_probe(struct platform_device *pdev) return err; }
+ tegra->opp_table = dev_pm_opp_set_supported_hw(&pdev->dev, + &hw_version, 1); + err = PTR_ERR_OR_ZERO(tegra->opp_table); + if (err) { + dev_err(&pdev->dev, "Failed to set supported HW: %d\n", err); + return err; + } + + err = dev_pm_opp_of_add_table(&pdev->dev); + if (err) { + dev_err(&pdev->dev, "Failed to add OPP table: %d\n", err); + goto put_hw; + } + err = clk_prepare_enable(tegra->clock); if (err) { dev_err(&pdev->dev, "Failed to prepare and enable ACTMON clock\n"); - return err; + goto remove_table; }
err = reset_control_reset(tegra->reset); @@ -850,23 +865,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev) dev->regs = tegra->regs + dev->config->offset; }
- for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) { - rate = clk_round_rate(tegra->emc_clock, rate); - - if (rate < 0) { - dev_err(&pdev->dev, - "Failed to round clock rate: %ld\n", rate); - err = rate; - goto remove_opps; - } - - err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0); - if (err) { - dev_err(&pdev->dev, "Failed to add OPP: %d\n", err); - goto remove_opps; - } - } - platform_set_drvdata(pdev, tegra);
tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb; @@ -882,7 +880,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev) }
tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock); - tegra_devfreq_profile.initial_freq /= KHZ;
devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile, "tegra_actmon", NULL); @@ -902,6 +899,10 @@ static int tegra_devfreq_probe(struct platform_device *pdev) reset_control_reset(tegra->reset); disable_clk: clk_disable_unprepare(tegra->clock); +remove_table: + dev_pm_opp_of_remove_table(&pdev->dev); +put_hw: + dev_pm_opp_put_supported_hw(tegra->opp_table);
return err; } @@ -913,11 +914,12 @@ static int tegra_devfreq_remove(struct platform_device *pdev) devfreq_remove_device(tegra->devfreq); devfreq_remove_governor(&tegra_devfreq_governor);
- dev_pm_opp_remove_all_dynamic(&pdev->dev); - reset_control_reset(tegra->reset); clk_disable_unprepare(tegra->clock);
+ dev_pm_opp_of_remove_table(&pdev->dev); + dev_pm_opp_put_supported_hw(tegra->opp_table); + return 0; }
On 16-11-20, 00:29, Dmitry Osipenko wrote:
This patch moves ACTMON driver away from generating OPP table by itself, transitioning it to use the table which comes from device-tree. This change breaks compatibility with older device-trees in order to bring support for the interconnect framework to the driver. This is a mandatory change which needs to be done in order to implement interconnect-based memory DVFS. Users of legacy device-trees will get a message telling that theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++--------------- 1 file changed, 44 insertions(+), 42 deletions(-)
diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index 38cc0d014738..ed6d4469c8c7 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -19,6 +19,8 @@ #include <linux/reset.h> #include <linux/workqueue.h>
+#include <soc/tegra/fuse.h>
#include "governor.h"
#define ACTMON_GLB_STATUS 0x0 @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
struct tegra_devfreq { struct devfreq *devfreq;
struct opp_table *opp_table;
struct reset_control *reset; struct clk *clock;
@@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra) static int tegra_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) {
- struct tegra_devfreq *tegra = dev_get_drvdata(dev);
- struct devfreq *devfreq = tegra->devfreq; struct dev_pm_opp *opp;
- unsigned long rate;
- int err;
int ret;
opp = devfreq_recommended_opp(dev, freq, flags); if (IS_ERR(opp)) { dev_err(dev, "Failed to find opp for %lu Hz\n", *freq); return PTR_ERR(opp); }
rate = dev_pm_opp_get_freq(opp);
dev_pm_opp_put(opp);
err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
if (err)
return err;
err = clk_set_rate(tegra->emc_clock, 0);
if (err)
goto restore_min_rate;
return 0;
-restore_min_rate:
- clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
- ret = dev_pm_opp_set_bw(dev, opp);
- dev_pm_opp_put(opp);
- return err;
- return ret;
}
static int tegra_devfreq_get_dev_status(struct device *dev, @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->private_data = tegra;
/* The below are to be used by the other governors */
- stat->current_frequency = cur_freq;
stat->current_frequency = cur_freq * KHZ;
actmon_dev = &tegra->devices[MCALL];
@@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq, target_freq = max(target_freq, dev->target_freq); }
- *freq = target_freq;
/*
* tegra-devfreq driver operates with KHz units, while OPP table
* entries use Hz units. Hence we need to convert the units for the
* devfreq core.
*/
*freq = target_freq * KHZ;
return 0;
} @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
static int tegra_devfreq_probe(struct platform_device *pdev) {
- u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); struct tegra_devfreq_device *dev; struct tegra_devfreq *tegra; struct devfreq *devfreq;
@@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev) long rate; int err;
- /* legacy device-trees don't have OPP table and must be updated */
- if (!device_property_present(&pdev->dev, "operating-points-v2")) {
dev_err(&pdev->dev,
"OPP table not found, please update your device tree\n");
return -ENODEV;
- }
You forgot to remove this ?
tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL); if (!tegra) return -ENOMEM; @@ -822,11 +823,25 @@ static int tegra_devfreq_probe(struct platform_device *pdev) return err; }
- tegra->opp_table = dev_pm_opp_set_supported_hw(&pdev->dev,
&hw_version, 1);
- err = PTR_ERR_OR_ZERO(tegra->opp_table);
- if (err) {
dev_err(&pdev->dev, "Failed to set supported HW: %d\n", err);
return err;
- }
- err = dev_pm_opp_of_add_table(&pdev->dev);
- if (err) {
dev_err(&pdev->dev, "Failed to add OPP table: %d\n", err);
goto put_hw;
- }
- err = clk_prepare_enable(tegra->clock); if (err) { dev_err(&pdev->dev, "Failed to prepare and enable ACTMON clock\n");
return err;
goto remove_table;
}
err = reset_control_reset(tegra->reset);
@@ -850,23 +865,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev) dev->regs = tegra->regs + dev->config->offset; }
for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) {
rate = clk_round_rate(tegra->emc_clock, rate);
if (rate < 0) {
dev_err(&pdev->dev,
"Failed to round clock rate: %ld\n", rate);
err = rate;
goto remove_opps;
}
err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0);
if (err) {
dev_err(&pdev->dev, "Failed to add OPP: %d\n", err);
goto remove_opps;
}
}
platform_set_drvdata(pdev, tegra);
tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb;
@@ -882,7 +880,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev) }
tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock);
tegra_devfreq_profile.initial_freq /= KHZ;
devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile, "tegra_actmon", NULL);
@@ -902,6 +899,10 @@ static int tegra_devfreq_probe(struct platform_device *pdev) reset_control_reset(tegra->reset); disable_clk: clk_disable_unprepare(tegra->clock); +remove_table:
- dev_pm_opp_of_remove_table(&pdev->dev);
+put_hw:
dev_pm_opp_put_supported_hw(tegra->opp_table);
return err;
} @@ -913,11 +914,12 @@ static int tegra_devfreq_remove(struct platform_device *pdev) devfreq_remove_device(tegra->devfreq); devfreq_remove_governor(&tegra_devfreq_governor);
- dev_pm_opp_remove_all_dynamic(&pdev->dev);
- reset_control_reset(tegra->reset); clk_disable_unprepare(tegra->clock);
- dev_pm_opp_of_remove_table(&pdev->dev);
- dev_pm_opp_put_supported_hw(tegra->opp_table);
- return 0;
}
-- 2.29.2
17.11.2020 13:07, Viresh Kumar пишет:
On 16-11-20, 00:29, Dmitry Osipenko wrote:
This patch moves ACTMON driver away from generating OPP table by itself, transitioning it to use the table which comes from device-tree. This change breaks compatibility with older device-trees in order to bring support for the interconnect framework to the driver. This is a mandatory change which needs to be done in order to implement interconnect-based memory DVFS. Users of legacy device-trees will get a message telling that theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++--------------- 1 file changed, 44 insertions(+), 42 deletions(-)
diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index 38cc0d014738..ed6d4469c8c7 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -19,6 +19,8 @@ #include <linux/reset.h> #include <linux/workqueue.h>
+#include <soc/tegra/fuse.h>
#include "governor.h"
#define ACTMON_GLB_STATUS 0x0 @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
struct tegra_devfreq { struct devfreq *devfreq;
struct opp_table *opp_table;
struct reset_control *reset; struct clk *clock;
@@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra) static int tegra_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) {
- struct tegra_devfreq *tegra = dev_get_drvdata(dev);
- struct devfreq *devfreq = tegra->devfreq; struct dev_pm_opp *opp;
- unsigned long rate;
- int err;
int ret;
opp = devfreq_recommended_opp(dev, freq, flags); if (IS_ERR(opp)) { dev_err(dev, "Failed to find opp for %lu Hz\n", *freq); return PTR_ERR(opp); }
rate = dev_pm_opp_get_freq(opp);
dev_pm_opp_put(opp);
err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
if (err)
return err;
err = clk_set_rate(tegra->emc_clock, 0);
if (err)
goto restore_min_rate;
return 0;
-restore_min_rate:
- clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
- ret = dev_pm_opp_set_bw(dev, opp);
- dev_pm_opp_put(opp);
- return err;
- return ret;
}
static int tegra_devfreq_get_dev_status(struct device *dev, @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->private_data = tegra;
/* The below are to be used by the other governors */
- stat->current_frequency = cur_freq;
stat->current_frequency = cur_freq * KHZ;
actmon_dev = &tegra->devices[MCALL];
@@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq, target_freq = max(target_freq, dev->target_freq); }
- *freq = target_freq;
/*
* tegra-devfreq driver operates with KHz units, while OPP table
* entries use Hz units. Hence we need to convert the units for the
* devfreq core.
*/
*freq = target_freq * KHZ;
return 0;
} @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
static int tegra_devfreq_probe(struct platform_device *pdev) {
- u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); struct tegra_devfreq_device *dev; struct tegra_devfreq *tegra; struct devfreq *devfreq;
@@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev) long rate; int err;
- /* legacy device-trees don't have OPP table and must be updated */
- if (!device_property_present(&pdev->dev, "operating-points-v2")) {
dev_err(&pdev->dev,
"OPP table not found, please update your device tree\n");
return -ENODEV;
- }
You forgot to remove this ?
Yes, good catch. I'm planning to replace this code with a common helper sometime soon, so if there won't be another reasons to make a new revision, then I'd prefer to keep it as-is for now.
On 17-11-20, 17:17, Dmitry Osipenko wrote:
17.11.2020 13:07, Viresh Kumar пишет:
On 16-11-20, 00:29, Dmitry Osipenko wrote:
This patch moves ACTMON driver away from generating OPP table by itself, transitioning it to use the table which comes from device-tree. This change breaks compatibility with older device-trees in order to bring support for the interconnect framework to the driver. This is a mandatory change which needs to be done in order to implement interconnect-based memory DVFS. Users of legacy device-trees will get a message telling that theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com
drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++--------------- 1 file changed, 44 insertions(+), 42 deletions(-)
diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index 38cc0d014738..ed6d4469c8c7 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -19,6 +19,8 @@ #include <linux/reset.h> #include <linux/workqueue.h>
+#include <soc/tegra/fuse.h>
#include "governor.h"
#define ACTMON_GLB_STATUS 0x0 @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
struct tegra_devfreq { struct devfreq *devfreq;
struct opp_table *opp_table;
struct reset_control *reset; struct clk *clock;
@@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra) static int tegra_devfreq_target(struct device *dev, unsigned long *freq, u32 flags) {
- struct tegra_devfreq *tegra = dev_get_drvdata(dev);
- struct devfreq *devfreq = tegra->devfreq; struct dev_pm_opp *opp;
- unsigned long rate;
- int err;
int ret;
opp = devfreq_recommended_opp(dev, freq, flags); if (IS_ERR(opp)) { dev_err(dev, "Failed to find opp for %lu Hz\n", *freq); return PTR_ERR(opp); }
rate = dev_pm_opp_get_freq(opp);
dev_pm_opp_put(opp);
err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
if (err)
return err;
err = clk_set_rate(tegra->emc_clock, 0);
if (err)
goto restore_min_rate;
return 0;
-restore_min_rate:
- clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
- ret = dev_pm_opp_set_bw(dev, opp);
- dev_pm_opp_put(opp);
- return err;
- return ret;
}
static int tegra_devfreq_get_dev_status(struct device *dev, @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev, stat->private_data = tegra;
/* The below are to be used by the other governors */
- stat->current_frequency = cur_freq;
stat->current_frequency = cur_freq * KHZ;
actmon_dev = &tegra->devices[MCALL];
@@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq, target_freq = max(target_freq, dev->target_freq); }
- *freq = target_freq;
/*
* tegra-devfreq driver operates with KHz units, while OPP table
* entries use Hz units. Hence we need to convert the units for the
* devfreq core.
*/
*freq = target_freq * KHZ;
return 0;
} @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
static int tegra_devfreq_probe(struct platform_device *pdev) {
- u32 hw_version = BIT(tegra_sku_info.soc_speedo_id); struct tegra_devfreq_device *dev; struct tegra_devfreq *tegra; struct devfreq *devfreq;
@@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev) long rate; int err;
- /* legacy device-trees don't have OPP table and must be updated */
- if (!device_property_present(&pdev->dev, "operating-points-v2")) {
dev_err(&pdev->dev,
"OPP table not found, please update your device tree\n");
return -ENODEV;
- }
You forgot to remove this ?
Yes, good catch. I'm planning to replace this code with a common helper sometime soon, so if there won't be another reasons to make a new revision, then I'd prefer to keep it as-is for now.
You should just replace this patch only with a version of V9.1 and you aren't really required to resend the whole series. And you should fix it before it gets merged. This isn't a formatting issue which we just let through. I trust you when you say that you will fix it, but this must be fixed now.
18.11.2020 07:21, Viresh Kumar пишет: ...
- /* legacy device-trees don't have OPP table and must be updated */
- if (!device_property_present(&pdev->dev, "operating-points-v2")) {
dev_err(&pdev->dev,
"OPP table not found, please update your device tree\n");
return -ENODEV;
- }
You forgot to remove this ?
Yes, good catch. I'm planning to replace this code with a common helper sometime soon, so if there won't be another reasons to make a new revision, then I'd prefer to keep it as-is for now.
You should just replace this patch only with a version of V9.1 and you aren't really required to resend the whole series. And you should fix it before it gets merged. This isn't a formatting issue which we just let through. I trust you when you say that you will fix it, but this must be fixed now.
Should be easier to resend it all. I'll update it over the weekend, thanks.
Previously we were using count-weight of the T124 for T30 in order to get EMC clock rate that was reasonable for T30. In fact the count-weight should be x2 times smaller on T30, but then devfreq was producing a bit too low EMC clock rate for ISO memory clients, like display controller for example.
Now both Tegra ACTMON and Tegra DRM display drivers support interconnect framework and display driver tells to ICC what a minimum memory bandwidth is needed, preventing FIFO underflows. Thus, now we can use a proper count-weight value for Tegra30 and MC_ALL device config needs a bit more aggressive boosting.
Add a separate ACTMON driver configuration that is specific to Tegra30.
Tested-by: Peter Geis pgwipeout@gmail.com Tested-by: Nicolas Chauvet kwizart@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/devfreq/tegra30-devfreq.c | 68 ++++++++++++++++++++++++------- 1 file changed, 54 insertions(+), 14 deletions(-)
diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c index ed6d4469c8c7..36ba82c3898c 100644 --- a/drivers/devfreq/tegra30-devfreq.c +++ b/drivers/devfreq/tegra30-devfreq.c @@ -57,13 +57,6 @@ #define ACTMON_BELOW_WMARK_WINDOW 3 #define ACTMON_BOOST_FREQ_STEP 16000
-/* - * Activity counter is incremented every 256 memory transactions, and each - * transaction takes 4 EMC clocks for Tegra124; So the COUNT_WEIGHT is - * 4 * 256 = 1024. - */ -#define ACTMON_COUNT_WEIGHT 0x400 - /* * ACTMON_AVERAGE_WINDOW_LOG2: default value for @DEV_CTRL_K_VAL, which * translates to 2 ^ (K_VAL + 1). ex: 2 ^ (6 + 1) = 128 @@ -111,7 +104,7 @@ enum tegra_actmon_device { MCCPU, };
-static const struct tegra_devfreq_device_config actmon_device_configs[] = { +static const struct tegra_devfreq_device_config tegra124_device_configs[] = { { /* MCALL: All memory accesses (including from the CPUs) */ .offset = 0x1c0, @@ -133,6 +126,28 @@ static const struct tegra_devfreq_device_config actmon_device_configs[] = { }, };
+static const struct tegra_devfreq_device_config tegra30_device_configs[] = { + { + /* MCALL: All memory accesses (including from the CPUs) */ + .offset = 0x1c0, + .irq_mask = 1 << 26, + .boost_up_coeff = 200, + .boost_down_coeff = 50, + .boost_up_threshold = 20, + .boost_down_threshold = 10, + }, + { + /* MCCPU: memory accesses from the CPUs */ + .offset = 0x200, + .irq_mask = 1 << 25, + .boost_up_coeff = 800, + .boost_down_coeff = 40, + .boost_up_threshold = 27, + .boost_down_threshold = 10, + .avg_dependency_threshold = 16000, /* 16MHz in kHz units */ + }, +}; + /** * struct tegra_devfreq_device - state specific to an ACTMON device * @@ -155,6 +170,12 @@ struct tegra_devfreq_device { unsigned long target_freq; };
+struct tegra_devfreq_soc_data { + const struct tegra_devfreq_device_config *configs; + /* Weight value for count measurements */ + unsigned int count_weight; +}; + struct tegra_devfreq { struct devfreq *devfreq; struct opp_table *opp_table; @@ -171,11 +192,13 @@ struct tegra_devfreq { struct delayed_work cpufreq_update_work; struct notifier_block cpu_rate_change_nb;
- struct tegra_devfreq_device devices[ARRAY_SIZE(actmon_device_configs)]; + struct tegra_devfreq_device devices[2];
unsigned int irq;
bool started; + + const struct tegra_devfreq_soc_data *soc; };
struct tegra_actmon_emc_ratio { @@ -488,7 +511,7 @@ static void tegra_actmon_configure_device(struct tegra_devfreq *tegra, tegra_devfreq_update_avg_wmark(tegra, dev); tegra_devfreq_update_wmark(tegra, dev);
- device_writel(dev, ACTMON_COUNT_WEIGHT, ACTMON_DEV_COUNT_WEIGHT); + device_writel(dev, tegra->soc->count_weight, ACTMON_DEV_COUNT_WEIGHT); device_writel(dev, ACTMON_INTR_STATUS_CLEAR, ACTMON_DEV_INTR_STATUS);
val |= ACTMON_DEV_CTRL_ENB_PERIODIC; @@ -786,6 +809,8 @@ static int tegra_devfreq_probe(struct platform_device *pdev) if (!tegra) return -ENOMEM;
+ tegra->soc = of_device_get_match_data(&pdev->dev); + tegra->regs = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(tegra->regs)) return PTR_ERR(tegra->regs); @@ -859,9 +884,9 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
tegra->max_freq = rate / KHZ;
- for (i = 0; i < ARRAY_SIZE(actmon_device_configs); i++) { + for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) { dev = tegra->devices + i; - dev->config = actmon_device_configs + i; + dev->config = tegra->soc->configs + i; dev->regs = tegra->regs + dev->config->offset; }
@@ -923,9 +948,24 @@ static int tegra_devfreq_remove(struct platform_device *pdev) return 0; }
+static const struct tegra_devfreq_soc_data tegra124_soc = { + .configs = tegra124_device_configs, + + /* + * Activity counter is incremented every 256 memory transactions, + * and each transaction takes 4 EMC clocks. + */ + .count_weight = 4 * 256, +}; + +static const struct tegra_devfreq_soc_data tegra30_soc = { + .configs = tegra30_device_configs, + .count_weight = 2 * 256, +}; + static const struct of_device_id tegra_devfreq_of_match[] = { - { .compatible = "nvidia,tegra30-actmon" }, - { .compatible = "nvidia,tegra124-actmon" }, + { .compatible = "nvidia,tegra30-actmon", .data = &tegra30_soc, }, + { .compatible = "nvidia,tegra124-actmon", .data = &tegra124_soc, }, { }, };
Remove tegra20-devfreq in order to replace it with a EMC_STAT based devfreq driver. Previously we were going to use MC_STAT based tegra20-devfreq driver because EMC_STAT wasn't working properly, but now that problem is resolved. This resolves complications imposed by the removed driver since it was depending on both EMC and MC drivers simultaneously.
Acked-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Dmitry Osipenko digetx@gmail.com --- MAINTAINERS | 1 - drivers/devfreq/Kconfig | 10 -- drivers/devfreq/Makefile | 1 - drivers/devfreq/tegra20-devfreq.c | 210 ------------------------------ 4 files changed, 222 deletions(-) delete mode 100644 drivers/devfreq/tegra20-devfreq.c
diff --git a/MAINTAINERS b/MAINTAINERS index 0818a5b03832..f1fd25d7e9ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11389,7 +11389,6 @@ L: linux-pm@vger.kernel.org L: linux-tegra@vger.kernel.org T: git git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git S: Maintained -F: drivers/devfreq/tegra20-devfreq.c F: drivers/devfreq/tegra30-devfreq.c
MEMORY MANAGEMENT diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig index 0ee36ae2fa79..00704efe6398 100644 --- a/drivers/devfreq/Kconfig +++ b/drivers/devfreq/Kconfig @@ -121,16 +121,6 @@ config ARM_TEGRA_DEVFREQ It reads ACTMON counters of memory controllers and adjusts the operating frequencies and voltages with OPP support.
-config ARM_TEGRA20_DEVFREQ - tristate "NVIDIA Tegra20 DEVFREQ Driver" - depends on ARCH_TEGRA_2x_SOC || COMPILE_TEST - depends on COMMON_CLK - select DEVFREQ_GOV_SIMPLE_ONDEMAND - help - This adds the DEVFREQ driver for the Tegra20 family of SoCs. - It reads Memory Controller counters and adjusts the operating - frequencies and voltages with OPP support. - config ARM_RK3399_DMC_DEVFREQ tristate "ARM RK3399 DMC DEVFREQ Driver" depends on (ARCH_ROCKCHIP && HAVE_ARM_SMCCC) || \ diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile index 3ca1ad0ecb97..a16333ea7034 100644 --- a/drivers/devfreq/Makefile +++ b/drivers/devfreq/Makefile @@ -13,7 +13,6 @@ obj-$(CONFIG_ARM_IMX_BUS_DEVFREQ) += imx-bus.o obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ) += imx8m-ddrc.o obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ) += rk3399_dmc.o obj-$(CONFIG_ARM_TEGRA_DEVFREQ) += tegra30-devfreq.o -obj-$(CONFIG_ARM_TEGRA20_DEVFREQ) += tegra20-devfreq.o
# DEVFREQ Event Drivers obj-$(CONFIG_PM_DEVFREQ_EVENT) += event/ diff --git a/drivers/devfreq/tegra20-devfreq.c b/drivers/devfreq/tegra20-devfreq.c deleted file mode 100644 index fd801534771d..000000000000 --- a/drivers/devfreq/tegra20-devfreq.c +++ /dev/null @@ -1,210 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * NVIDIA Tegra20 devfreq driver - * - * Copyright (C) 2019 GRATE-DRIVER project - */ - -#include <linux/clk.h> -#include <linux/devfreq.h> -#include <linux/io.h> -#include <linux/kernel.h> -#include <linux/module.h> -#include <linux/of_device.h> -#include <linux/platform_device.h> -#include <linux/pm_opp.h> -#include <linux/slab.h> - -#include <soc/tegra/mc.h> - -#include "governor.h" - -#define MC_STAT_CONTROL 0x90 -#define MC_STAT_EMC_CLOCK_LIMIT 0xa0 -#define MC_STAT_EMC_CLOCKS 0xa4 -#define MC_STAT_EMC_CONTROL 0xa8 -#define MC_STAT_EMC_COUNT 0xb8 - -#define EMC_GATHER_CLEAR (1 << 8) -#define EMC_GATHER_ENABLE (3 << 8) - -struct tegra_devfreq { - struct devfreq *devfreq; - struct clk *emc_clock; - void __iomem *regs; -}; - -static int tegra_devfreq_target(struct device *dev, unsigned long *freq, - u32 flags) -{ - struct tegra_devfreq *tegra = dev_get_drvdata(dev); - struct devfreq *devfreq = tegra->devfreq; - struct dev_pm_opp *opp; - unsigned long rate; - int err; - - opp = devfreq_recommended_opp(dev, freq, flags); - if (IS_ERR(opp)) - return PTR_ERR(opp); - - rate = dev_pm_opp_get_freq(opp); - dev_pm_opp_put(opp); - - err = clk_set_min_rate(tegra->emc_clock, rate); - if (err) - return err; - - err = clk_set_rate(tegra->emc_clock, 0); - if (err) - goto restore_min_rate; - - return 0; - -restore_min_rate: - clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq); - - return err; -} - -static int tegra_devfreq_get_dev_status(struct device *dev, - struct devfreq_dev_status *stat) -{ - struct tegra_devfreq *tegra = dev_get_drvdata(dev); - - /* - * EMC_COUNT returns number of memory events, that number is lower - * than the number of clocks. Conversion ratio of 1/8 results in a - * bit higher bandwidth than actually needed, it is good enough for - * the time being because drivers don't support requesting minimum - * needed memory bandwidth yet. - * - * TODO: adjust the ratio value once relevant drivers will support - * memory bandwidth management. - */ - stat->busy_time = readl_relaxed(tegra->regs + MC_STAT_EMC_COUNT); - stat->total_time = readl_relaxed(tegra->regs + MC_STAT_EMC_CLOCKS) / 8; - stat->current_frequency = clk_get_rate(tegra->emc_clock); - - writel_relaxed(EMC_GATHER_CLEAR, tegra->regs + MC_STAT_CONTROL); - writel_relaxed(EMC_GATHER_ENABLE, tegra->regs + MC_STAT_CONTROL); - - return 0; -} - -static struct devfreq_dev_profile tegra_devfreq_profile = { - .polling_ms = 500, - .target = tegra_devfreq_target, - .get_dev_status = tegra_devfreq_get_dev_status, -}; - -static struct tegra_mc *tegra_get_memory_controller(void) -{ - struct platform_device *pdev; - struct device_node *np; - struct tegra_mc *mc; - - np = of_find_compatible_node(NULL, NULL, "nvidia,tegra20-mc-gart"); - if (!np) - return ERR_PTR(-ENOENT); - - pdev = of_find_device_by_node(np); - of_node_put(np); - if (!pdev) - return ERR_PTR(-ENODEV); - - mc = platform_get_drvdata(pdev); - if (!mc) - return ERR_PTR(-EPROBE_DEFER); - - return mc; -} - -static int tegra_devfreq_probe(struct platform_device *pdev) -{ - struct tegra_devfreq *tegra; - struct tegra_mc *mc; - unsigned long max_rate; - unsigned long rate; - int err; - - mc = tegra_get_memory_controller(); - if (IS_ERR(mc)) { - err = PTR_ERR(mc); - dev_err(&pdev->dev, "failed to get memory controller: %d\n", - err); - return err; - } - - tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL); - if (!tegra) - return -ENOMEM; - - /* EMC is a system-critical clock that is always enabled */ - tegra->emc_clock = devm_clk_get(&pdev->dev, "emc"); - if (IS_ERR(tegra->emc_clock)) - return dev_err_probe(&pdev->dev, PTR_ERR(tegra->emc_clock), - "failed to get emc clock\n"); - - tegra->regs = mc->regs; - - max_rate = clk_round_rate(tegra->emc_clock, ULONG_MAX); - - for (rate = 0; rate <= max_rate; rate++) { - rate = clk_round_rate(tegra->emc_clock, rate); - - err = dev_pm_opp_add(&pdev->dev, rate, 0); - if (err) { - dev_err(&pdev->dev, "failed to add opp: %d\n", err); - goto remove_opps; - } - } - - /* - * Reset statistic gathers state, select global bandwidth for the - * statistics collection mode and set clocks counter saturation - * limit to maximum. - */ - writel_relaxed(0x00000000, tegra->regs + MC_STAT_CONTROL); - writel_relaxed(0x00000000, tegra->regs + MC_STAT_EMC_CONTROL); - writel_relaxed(0xffffffff, tegra->regs + MC_STAT_EMC_CLOCK_LIMIT); - - platform_set_drvdata(pdev, tegra); - - tegra->devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile, - DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL); - if (IS_ERR(tegra->devfreq)) { - err = PTR_ERR(tegra->devfreq); - goto remove_opps; - } - - return 0; - -remove_opps: - dev_pm_opp_remove_all_dynamic(&pdev->dev); - - return err; -} - -static int tegra_devfreq_remove(struct platform_device *pdev) -{ - struct tegra_devfreq *tegra = platform_get_drvdata(pdev); - - devfreq_remove_device(tegra->devfreq); - dev_pm_opp_remove_all_dynamic(&pdev->dev); - - return 0; -} - -static struct platform_driver tegra_devfreq_driver = { - .probe = tegra_devfreq_probe, - .remove = tegra_devfreq_remove, - .driver = { - .name = "tegra20-devfreq", - }, -}; -module_platform_driver(tegra_devfreq_driver); - -MODULE_ALIAS("platform:tegra20-devfreq"); -MODULE_AUTHOR("Dmitry Osipenko digetx@gmail.com"); -MODULE_DESCRIPTION("NVIDIA Tegra20 devfreq driver"); -MODULE_LICENSE("GPL v2");
Fix the size of Tegra20 EMC registers, which should be twice bigger.
Acked-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra20.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi index 72a4211a618f..9347f7789245 100644 --- a/arch/arm/boot/dts/tegra20.dtsi +++ b/arch/arm/boot/dts/tegra20.dtsi @@ -634,7 +634,7 @@ mc: memory-controller@7000f000 {
memory-controller@7000f400 { compatible = "nvidia,tegra20-emc"; - reg = <0x7000f400 0x200>; + reg = <0x7000f400 0x400>; interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>; clocks = <&tegra_car TEGRA20_CLK_EMC>; #address-cells = <1>;
Add interconnect properties to the Memory Controller, External Memory Controller and the Display Controller nodes in order to describe hardware interconnection.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra20.dtsi | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi index 9347f7789245..2e1304493f7d 100644 --- a/arch/arm/boot/dts/tegra20.dtsi +++ b/arch/arm/boot/dts/tegra20.dtsi @@ -111,6 +111,17 @@ dc@54200000 {
nvidia,head = <0>;
+ interconnects = <&mc TEGRA20_MC_DISPLAY0A &emc>, + <&mc TEGRA20_MC_DISPLAY0B &emc>, + <&mc TEGRA20_MC_DISPLAY1B &emc>, + <&mc TEGRA20_MC_DISPLAY0C &emc>, + <&mc TEGRA20_MC_DISPLAYHC &emc>; + interconnect-names = "wina", + "winb", + "winb-vfilter", + "winc", + "cursor"; + rgb { status = "disabled"; }; @@ -128,6 +139,17 @@ dc@54240000 {
nvidia,head = <1>;
+ interconnects = <&mc TEGRA20_MC_DISPLAY0AB &emc>, + <&mc TEGRA20_MC_DISPLAY0BB &emc>, + <&mc TEGRA20_MC_DISPLAY1BB &emc>, + <&mc TEGRA20_MC_DISPLAY0CB &emc>, + <&mc TEGRA20_MC_DISPLAYHCB &emc>; + interconnect-names = "wina", + "winb", + "winb-vfilter", + "winc", + "cursor"; + rgb { status = "disabled"; }; @@ -630,15 +652,17 @@ mc: memory-controller@7000f000 { interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>; #reset-cells = <1>; #iommu-cells = <0>; + #interconnect-cells = <1>; };
- memory-controller@7000f400 { + emc: memory-controller@7000f400 { compatible = "nvidia,tegra20-emc"; reg = <0x7000f400 0x400>; interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>; clocks = <&tegra_car TEGRA20_CLK_EMC>; #address-cells = <1>; #size-cells = <0>; + #interconnect-cells = <0>; };
fuse@7000f800 {
Add interconnect properties to the Memory Controller, External Memory Controller and the Display Controller nodes in order to describe hardware interconnection.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra30.dtsi | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi index aeae8c092d41..2caf6cc6f4b1 100644 --- a/arch/arm/boot/dts/tegra30.dtsi +++ b/arch/arm/boot/dts/tegra30.dtsi @@ -210,6 +210,17 @@ dc@54200000 {
nvidia,head = <0>;
+ interconnects = <&mc TEGRA30_MC_DISPLAY0A &emc>, + <&mc TEGRA30_MC_DISPLAY0B &emc>, + <&mc TEGRA30_MC_DISPLAY1B &emc>, + <&mc TEGRA30_MC_DISPLAY0C &emc>, + <&mc TEGRA30_MC_DISPLAYHC &emc>; + interconnect-names = "wina", + "winb", + "winb-vfilter", + "winc", + "cursor"; + rgb { status = "disabled"; }; @@ -229,6 +240,17 @@ dc@54240000 {
nvidia,head = <1>;
+ interconnects = <&mc TEGRA30_MC_DISPLAY0AB &emc>, + <&mc TEGRA30_MC_DISPLAY0BB &emc>, + <&mc TEGRA30_MC_DISPLAY1BB &emc>, + <&mc TEGRA30_MC_DISPLAY0CB &emc>, + <&mc TEGRA30_MC_DISPLAYHCB &emc>; + interconnect-names = "wina", + "winb", + "winb-vfilter", + "winc", + "cursor"; + rgb { status = "disabled"; }; @@ -748,15 +770,18 @@ mc: memory-controller@7000f000 {
#iommu-cells = <1>; #reset-cells = <1>; + #interconnect-cells = <1>; };
- memory-controller@7000f400 { + emc: memory-controller@7000f400 { compatible = "nvidia,tegra30-emc"; reg = <0x7000f400 0x400>; interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>; clocks = <&tegra_car TEGRA30_CLK_EMC>;
nvidia,memory-controller = <&mc>; + + #interconnect-cells = <0>; };
fuse@7000f800 {
Add interconnect properties to the Memory Controller, External Memory Controller and the Display Controller nodes in order to describe hardware interconnection.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra124.dtsi | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi index 64f488ba1e72..1801e30b1d3a 100644 --- a/arch/arm/boot/dts/tegra124.dtsi +++ b/arch/arm/boot/dts/tegra124.dtsi @@ -113,6 +113,19 @@ dc@54200000 { iommus = <&mc TEGRA_SWGROUP_DC>;
nvidia,head = <0>; + + interconnects = <&mc TEGRA124_MC_DISPLAY0A &emc>, + <&mc TEGRA124_MC_DISPLAY0B &emc>, + <&mc TEGRA124_MC_DISPLAY0C &emc>, + <&mc TEGRA124_MC_DISPLAYHC &emc>, + <&mc TEGRA124_MC_DISPLAYD &emc>, + <&mc TEGRA124_MC_DISPLAYT &emc>; + interconnect-names = "wina", + "winb", + "winc", + "cursor", + "wind", + "wint"; };
dc@54240000 { @@ -127,6 +140,15 @@ dc@54240000 { iommus = <&mc TEGRA_SWGROUP_DCB>;
nvidia,head = <1>; + + interconnects = <&mc TEGRA124_MC_DISPLAY0AB &emc>, + <&mc TEGRA124_MC_DISPLAY0BB &emc>, + <&mc TEGRA124_MC_DISPLAY0CB &emc>, + <&mc TEGRA124_MC_DISPLAYHCB &emc>; + interconnect-names = "wina", + "winb", + "winc", + "cursor"; };
hdmi: hdmi@54280000 { @@ -628,6 +650,7 @@ mc: memory-controller@70019000 {
#iommu-cells = <1>; #reset-cells = <1>; + #interconnect-cells = <1>; };
emc: external-memory-controller@7001b000 { @@ -637,6 +660,8 @@ emc: external-memory-controller@7001b000 { clock-names = "emc";
nvidia,memory-controller = <&mc>; + + #interconnect-cells = <0>; };
sata@70020000 {
Add nvidia,memory-controller to the Tegra20 External Memory Controller node. This allows to perform a direct lookup of the Memory Controller instead of walking up the whole tree. This puts Tegra20 device-tree on par with Tegra30+.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra20.dtsi | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi index 2e1304493f7d..8f8ad81916e7 100644 --- a/arch/arm/boot/dts/tegra20.dtsi +++ b/arch/arm/boot/dts/tegra20.dtsi @@ -663,6 +663,8 @@ emc: memory-controller@7000f400 { #address-cells = <1>; #size-cells = <0>; #interconnect-cells = <0>; + + nvidia,memory-controller = <&mc>; };
fuse@7000f800 {
Add EMC OPP DVFS tables and update board device-trees by removing unsupported OPPs.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- .../boot/dts/tegra20-acer-a500-picasso.dts | 5 + arch/arm/boot/dts/tegra20-colibri.dtsi | 4 + arch/arm/boot/dts/tegra20-paz00.dts | 4 + .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 92 +++++++++++++++++++ arch/arm/boot/dts/tegra20.dtsi | 3 + 5 files changed, 108 insertions(+) create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
diff --git a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts index dd6fb134ee39..a29b44837855 100644 --- a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts +++ b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts @@ -1451,3 +1451,8 @@ emc-table@300000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@666000000; + /delete-node/ opp@760000000; +}; diff --git a/arch/arm/boot/dts/tegra20-colibri.dtsi b/arch/arm/boot/dts/tegra20-colibri.dtsi index 6162d193e12c..585a5b441cf6 100644 --- a/arch/arm/boot/dts/tegra20-colibri.dtsi +++ b/arch/arm/boot/dts/tegra20-colibri.dtsi @@ -742,6 +742,10 @@ sound { }; };
+&emc_icc_dvfs_opp_table { + /delete-node/ opp@760000000; +}; + &gpio { lan-reset-n { gpio-hog; diff --git a/arch/arm/boot/dts/tegra20-paz00.dts b/arch/arm/boot/dts/tegra20-paz00.dts index ada2bed8b1b5..7e49112cd9a1 100644 --- a/arch/arm/boot/dts/tegra20-paz00.dts +++ b/arch/arm/boot/dts/tegra20-paz00.dts @@ -662,3 +662,7 @@ cpu@1 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@760000000; +}; diff --git a/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi b/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi new file mode 100644 index 000000000000..25b1ba73951e --- /dev/null +++ b/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 + +/ { + emc_icc_dvfs_opp_table: emc-dvfs-opp-table { + compatible = "operating-points-v2"; + + opp@36000000 { + opp-microvolt = <950000 950000 1300000>; + opp-hz = /bits/ 64 <36000000>; + }; + + opp@47500000 { + opp-microvolt = <950000 950000 1300000>; + opp-hz = /bits/ 64 <47500000>; + }; + + opp@50000000 { + opp-microvolt = <950000 950000 1300000>; + opp-hz = /bits/ 64 <50000000>; + }; + + opp@54000000 { + opp-microvolt = <950000 950000 1300000>; + opp-hz = /bits/ 64 <54000000>; + }; + + opp@57000000 { + opp-microvolt = <950000 950000 1300000>; + opp-hz = /bits/ 64 <57000000>; + }; + + opp@100000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <100000000>; + }; + + opp@108000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <108000000>; + }; + + opp@126666000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <126666000>; + }; + + opp@150000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <150000000>; + }; + + opp@190000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <190000000>; + }; + + opp@216000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <216000000>; + }; + + opp@300000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <300000000>; + }; + + opp@333000000 { + opp-microvolt = <1000000 1000000 1300000>; + opp-hz = /bits/ 64 <333000000>; + }; + + opp@380000000 { + opp-microvolt = <1100000 1100000 1300000>; + opp-hz = /bits/ 64 <380000000>; + }; + + opp@600000000 { + opp-microvolt = <1200000 1200000 1300000>; + opp-hz = /bits/ 64 <600000000>; + }; + + opp@666000000 { + opp-microvolt = <1200000 1200000 1300000>; + opp-hz = /bits/ 64 <666000000>; + }; + + opp@760000000 { + opp-microvolt = <1300000 1300000 1300000>; + opp-hz = /bits/ 64 <760000000>; + }; + }; +}; diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi index 8f8ad81916e7..6ce498178105 100644 --- a/arch/arm/boot/dts/tegra20.dtsi +++ b/arch/arm/boot/dts/tegra20.dtsi @@ -6,6 +6,8 @@ #include <dt-bindings/interrupt-controller/arm-gic.h> #include <dt-bindings/soc/tegra-pmc.h>
+#include "tegra20-peripherals-opp.dtsi" + / { compatible = "nvidia,tegra20"; interrupt-parent = <&lic>; @@ -664,6 +666,7 @@ emc: memory-controller@7000f400 { #size-cells = <0>; #interconnect-cells = <0>;
+ operating-points-v2 = <&emc_icc_dvfs_opp_table>; nvidia,memory-controller = <&mc>; };
Add EMC OPP tables and interconnect paths that will be used for dynamic memory bandwidth scaling based on memory utilization statistics. Update board device-trees by removing unsupported EMC OPPs.
Note that ACTMON watches all memory interconnect paths, but we use a single CPU-READ interconnect path for driving memory bandwidth, for simplicity.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- ...30-asus-nexus7-grouper-memory-timings.dtsi | 12 + arch/arm/boot/dts/tegra30-ouya.dts | 8 + .../arm/boot/dts/tegra30-peripherals-opp.dtsi | 383 ++++++++++++++++++ arch/arm/boot/dts/tegra30.dtsi | 6 + 4 files changed, 409 insertions(+) create mode 100644 arch/arm/boot/dts/tegra30-peripherals-opp.dtsi
diff --git a/arch/arm/boot/dts/tegra30-asus-nexus7-grouper-memory-timings.dtsi b/arch/arm/boot/dts/tegra30-asus-nexus7-grouper-memory-timings.dtsi index bc0f6f29b956..bcff0997ee51 100644 --- a/arch/arm/boot/dts/tegra30-asus-nexus7-grouper-memory-timings.dtsi +++ b/arch/arm/boot/dts/tegra30-asus-nexus7-grouper-memory-timings.dtsi @@ -1563,3 +1563,15 @@ timing-667000000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@750000000,1300; + /delete-node/ opp@800000000,1300; + /delete-node/ opp@900000000,1350; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@750000000; + /delete-node/ opp@800000000; + /delete-node/ opp@900000000; +}; diff --git a/arch/arm/boot/dts/tegra30-ouya.dts b/arch/arm/boot/dts/tegra30-ouya.dts index a5f16ad6c8f4..74da1360d297 100644 --- a/arch/arm/boot/dts/tegra30-ouya.dts +++ b/arch/arm/boot/dts/tegra30-ouya.dts @@ -4509,3 +4509,11 @@ drive_groups { nvidia,slew-rate-falling = <TEGRA_PIN_SLEW_RATE_SLOWEST>; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@900000000,1350; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@900000000; +}; diff --git a/arch/arm/boot/dts/tegra30-peripherals-opp.dtsi b/arch/arm/boot/dts/tegra30-peripherals-opp.dtsi new file mode 100644 index 000000000000..cbe84d25e726 --- /dev/null +++ b/arch/arm/boot/dts/tegra30-peripherals-opp.dtsi @@ -0,0 +1,383 @@ +// SPDX-License-Identifier: GPL-2.0 + +/ { + emc_icc_dvfs_opp_table: emc-dvfs-opp-table { + compatible = "operating-points-v2"; + + opp@12750000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0006>; + }; + + opp@12750000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0001>; + }; + + opp@12750000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0008>; + }; + + opp@25500000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <25500000>; + opp-supported-hw = <0x0006>; + }; + + opp@25500000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <25500000>; + opp-supported-hw = <0x0001>; + }; + + opp@25500000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <25500000>; + opp-supported-hw = <0x0008>; + }; + + opp@27000000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <27000000>; + opp-supported-hw = <0x0006>; + }; + + opp@27000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <27000000>; + opp-supported-hw = <0x0001>; + }; + + opp@27000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <27000000>; + opp-supported-hw = <0x0008>; + }; + + opp@51000000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <51000000>; + opp-supported-hw = <0x0006>; + }; + + opp@51000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <51000000>; + opp-supported-hw = <0x0001>; + }; + + opp@51000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <51000000>; + opp-supported-hw = <0x0008>; + }; + + opp@54000000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <54000000>; + opp-supported-hw = <0x0006>; + }; + + opp@54000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <54000000>; + opp-supported-hw = <0x0001>; + }; + + opp@54000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <54000000>; + opp-supported-hw = <0x0008>; + }; + + opp@102000000,950 { + opp-microvolt = <950000 950000 1350000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0006>; + }; + + opp@102000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0001>; + }; + + opp@102000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0008>; + }; + + opp@108000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <108000000>; + opp-supported-hw = <0x0007>; + }; + + opp@108000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <108000000>; + opp-supported-hw = <0x0008>; + }; + + opp@204000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0007>; + }; + + opp@204000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0008>; + }; + + opp@333500000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <333500000>; + opp-supported-hw = <0x0006>; + }; + + opp@333500000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <333500000>; + opp-supported-hw = <0x0001>; + }; + + opp@333500000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <333500000>; + opp-supported-hw = <0x0008>; + }; + + opp@375000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <375000000>; + opp-supported-hw = <0x0006>; + }; + + opp@375000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <375000000>; + opp-supported-hw = <0x0001>; + }; + + opp@375000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <375000000>; + opp-supported-hw = <0x0008>; + }; + + opp@400000000,1000 { + opp-microvolt = <1000000 1000000 1350000>; + opp-hz = /bits/ 64 <400000000>; + opp-supported-hw = <0x0006>; + }; + + opp@400000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <400000000>; + opp-supported-hw = <0x0001>; + }; + + opp@400000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <400000000>; + opp-supported-hw = <0x0008>; + }; + + opp@416000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <416000000>; + opp-supported-hw = <0x0007>; + }; + + opp@416000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <416000000>; + opp-supported-hw = <0x0008>; + }; + + opp@450000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <450000000>; + opp-supported-hw = <0x0007>; + }; + + opp@450000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <450000000>; + opp-supported-hw = <0x0008>; + }; + + opp@533000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <533000000>; + opp-supported-hw = <0x0007>; + }; + + opp@533000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <533000000>; + opp-supported-hw = <0x0008>; + }; + + opp@625000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <625000000>; + opp-supported-hw = <0x0006>; + }; + + opp@625000000,1250 { + opp-microvolt = <1250000 1250000 1350000>; + opp-hz = /bits/ 64 <625000000>; + opp-supported-hw = <0x0008>; + }; + + opp@667000000,1200 { + opp-microvolt = <1200000 1200000 1350000>; + opp-hz = /bits/ 64 <667000000>; + opp-supported-hw = <0x0006>; + }; + + opp@750000000,1300 { + opp-microvolt = <1300000 1300000 1350000>; + opp-hz = /bits/ 64 <750000000>; + opp-supported-hw = <0x0004>; + }; + + opp@800000000,1300 { + opp-microvolt = <1300000 1300000 1350000>; + opp-hz = /bits/ 64 <800000000>; + opp-supported-hw = <0x0004>; + }; + + opp@900000000,1350 { + opp-microvolt = <1350000 1350000 1350000>; + opp-hz = /bits/ 64 <900000000>; + opp-supported-hw = <0x0004>; + }; + }; + + emc_bw_dfs_opp_table: emc-bandwidth-opp-table { + compatible = "operating-points-v2"; + + opp@12750000 { + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <102000>; + }; + + opp@25500000 { + opp-hz = /bits/ 64 <25500000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <204000>; + }; + + opp@27000000 { + opp-hz = /bits/ 64 <27000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <216000>; + }; + + opp@51000000 { + opp-hz = /bits/ 64 <51000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <408000>; + }; + + opp@54000000 { + opp-hz = /bits/ 64 <54000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <432000>; + }; + + opp@102000000 { + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <816000>; + }; + + opp@108000000 { + opp-hz = /bits/ 64 <108000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <864000>; + }; + + opp@204000000 { + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <1632000>; + }; + + opp@333500000 { + opp-hz = /bits/ 64 <333500000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <2668000>; + }; + + opp@375000000 { + opp-hz = /bits/ 64 <375000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <3000000>; + }; + + opp@400000000 { + opp-hz = /bits/ 64 <400000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <3200000>; + }; + + opp@416000000 { + opp-hz = /bits/ 64 <416000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <3328000>; + }; + + opp@450000000 { + opp-hz = /bits/ 64 <450000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <3600000>; + }; + + opp@533000000 { + opp-hz = /bits/ 64 <533000000>; + opp-supported-hw = <0x000F>; + opp-peak-kBps = <4264000>; + }; + + opp@625000000 { + opp-hz = /bits/ 64 <625000000>; + opp-supported-hw = <0x000E>; + opp-peak-kBps = <5000000>; + }; + + opp@667000000 { + opp-hz = /bits/ 64 <667000000>; + opp-supported-hw = <0x0006>; + opp-peak-kBps = <5336000>; + }; + + opp@750000000 { + opp-hz = /bits/ 64 <750000000>; + opp-supported-hw = <0x0004>; + opp-peak-kBps = <6000000>; + }; + + opp@800000000 { + opp-hz = /bits/ 64 <800000000>; + opp-supported-hw = <0x0004>; + opp-peak-kBps = <6400000>; + }; + + opp@900000000 { + opp-hz = /bits/ 64 <900000000>; + opp-supported-hw = <0x0004>; + opp-peak-kBps = <7200000>; + }; + }; +}; diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi index 2caf6cc6f4b1..44a6dbba7081 100644 --- a/arch/arm/boot/dts/tegra30.dtsi +++ b/arch/arm/boot/dts/tegra30.dtsi @@ -6,6 +6,8 @@ #include <dt-bindings/interrupt-controller/arm-gic.h> #include <dt-bindings/soc/tegra-pmc.h>
+#include "tegra30-peripherals-opp.dtsi" + / { compatible = "nvidia,tegra30"; interrupt-parent = <&lic>; @@ -417,6 +419,9 @@ actmon@6000c800 { clock-names = "actmon", "emc"; resets = <&tegra_car TEGRA30_CLK_ACTMON>; reset-names = "actmon"; + operating-points-v2 = <&emc_bw_dfs_opp_table>; + interconnects = <&mc TEGRA30_MC_MPCORER &emc>; + interconnect-names = "cpu-read"; };
gpio: gpio@6000d000 { @@ -780,6 +785,7 @@ emc: memory-controller@7000f400 { clocks = <&tegra_car TEGRA30_CLK_EMC>;
nvidia,memory-controller = <&mc>; + operating-points-v2 = <&emc_icc_dvfs_opp_table>;
#interconnect-cells = <0>; };
Add EMC OPP DVFS/DFS tables and interconnect paths that will be used for dynamic memory bandwidth scaling based on memory utilization statistics. Update board device-trees by removing unsupported EMC OPPs.
Note that ACTMON watches all memory interconnect paths, but we use a single CPU-READ interconnect path for driving memory bandwidth, for simplicity.
Signed-off-by: Dmitry Osipenko digetx@gmail.com --- arch/arm/boot/dts/tegra124-apalis-emc.dtsi | 8 + .../arm/boot/dts/tegra124-jetson-tk1-emc.dtsi | 8 + arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi | 10 + .../arm/boot/dts/tegra124-nyan-blaze-emc.dtsi | 10 + .../boot/dts/tegra124-peripherals-opp.dtsi | 419 ++++++++++++++++++ arch/arm/boot/dts/tegra124.dtsi | 6 + 6 files changed, 461 insertions(+) create mode 100644 arch/arm/boot/dts/tegra124-peripherals-opp.dtsi
diff --git a/arch/arm/boot/dts/tegra124-apalis-emc.dtsi b/arch/arm/boot/dts/tegra124-apalis-emc.dtsi index 32401457ae71..a7ac805eeed5 100644 --- a/arch/arm/boot/dts/tegra124-apalis-emc.dtsi +++ b/arch/arm/boot/dts/tegra124-apalis-emc.dtsi @@ -1465,3 +1465,11 @@ timing-924000000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@1200000000,1100; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@1200000000; +}; diff --git a/arch/arm/boot/dts/tegra124-jetson-tk1-emc.dtsi b/arch/arm/boot/dts/tegra124-jetson-tk1-emc.dtsi index 861d3f22116b..df4e463afbd1 100644 --- a/arch/arm/boot/dts/tegra124-jetson-tk1-emc.dtsi +++ b/arch/arm/boot/dts/tegra124-jetson-tk1-emc.dtsi @@ -2420,3 +2420,11 @@ timing-924000000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@1200000000,1100; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@1200000000; +}; diff --git a/arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi b/arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi index c91647d13a50..a0f56cc9da5c 100644 --- a/arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi +++ b/arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi @@ -6649,3 +6649,13 @@ timing-792000000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@924000000,1100; + /delete-node/ opp@1200000000,1100; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@924000000; + /delete-node/ opp@1200000000; +}; diff --git a/arch/arm/boot/dts/tegra124-nyan-blaze-emc.dtsi b/arch/arm/boot/dts/tegra124-nyan-blaze-emc.dtsi index d2beea0bd15f..35c98734d35f 100644 --- a/arch/arm/boot/dts/tegra124-nyan-blaze-emc.dtsi +++ b/arch/arm/boot/dts/tegra124-nyan-blaze-emc.dtsi @@ -2048,3 +2048,13 @@ timing-792000000 { }; }; }; + +&emc_icc_dvfs_opp_table { + /delete-node/ opp@924000000,1100; + /delete-node/ opp@1200000000,1100; +}; + +&emc_bw_dfs_opp_table { + /delete-node/ opp@924000000; + /delete-node/ opp@1200000000; +}; diff --git a/arch/arm/boot/dts/tegra124-peripherals-opp.dtsi b/arch/arm/boot/dts/tegra124-peripherals-opp.dtsi new file mode 100644 index 000000000000..49d9420a3289 --- /dev/null +++ b/arch/arm/boot/dts/tegra124-peripherals-opp.dtsi @@ -0,0 +1,419 @@ +// SPDX-License-Identifier: GPL-2.0 + +/ { + emc_icc_dvfs_opp_table: emc-dvfs-opp-table { + compatible = "operating-points-v2"; + + opp@12750000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0003>; + }; + + opp@12750000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0008>; + }; + + opp@12750000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0010>; + }; + + opp@12750000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x0004>; + }; + + opp@20400000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <20400000>; + opp-supported-hw = <0x0003>; + }; + + opp@20400000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <20400000>; + opp-supported-hw = <0x0008>; + }; + + opp@20400000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <20400000>; + opp-supported-hw = <0x0010>; + }; + + opp@20400000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <20400000>; + opp-supported-hw = <0x0004>; + }; + + opp@40800000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <40800000>; + opp-supported-hw = <0x0003>; + }; + + opp@40800000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <40800000>; + opp-supported-hw = <0x0008>; + }; + + opp@40800000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <40800000>; + opp-supported-hw = <0x0010>; + }; + + opp@40800000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <40800000>; + opp-supported-hw = <0x0004>; + }; + + opp@68000000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <68000000>; + opp-supported-hw = <0x0003>; + }; + + opp@68000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <68000000>; + opp-supported-hw = <0x0008>; + }; + + opp@68000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <68000000>; + opp-supported-hw = <0x0010>; + }; + + opp@68000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <68000000>; + opp-supported-hw = <0x0004>; + }; + + opp@102000000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0003>; + }; + + opp@102000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0008>; + }; + + opp@102000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0010>; + }; + + opp@102000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x0004>; + }; + + opp@204000000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0003>; + }; + + opp@204000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0008>; + }; + + opp@204000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0010>; + }; + + opp@204000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x0004>; + }; + + opp@264000000,800 { + opp-microvolt = <800000 800000 1150000>; + opp-hz = /bits/ 64 <264000000>; + opp-supported-hw = <0x0003>; + }; + + opp@264000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <264000000>; + opp-supported-hw = <0x0008>; + }; + + opp@264000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <264000000>; + opp-supported-hw = <0x0010>; + }; + + opp@264000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <264000000>; + opp-supported-hw = <0x0004>; + }; + + opp@300000000,850 { + opp-microvolt = <850000 850000 1150000>; + opp-hz = /bits/ 64 <300000000>; + opp-supported-hw = <0x0003>; + }; + + opp@300000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <300000000>; + opp-supported-hw = <0x0008>; + }; + + opp@300000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <300000000>; + opp-supported-hw = <0x0010>; + }; + + opp@300000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <300000000>; + opp-supported-hw = <0x0004>; + }; + + opp@348000000,850 { + opp-microvolt = <850000 850000 1150000>; + opp-hz = /bits/ 64 <348000000>; + opp-supported-hw = <0x0003>; + }; + + opp@348000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <348000000>; + opp-supported-hw = <0x0008>; + }; + + opp@348000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <348000000>; + opp-supported-hw = <0x0010>; + }; + + opp@348000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <348000000>; + opp-supported-hw = <0x0004>; + }; + + opp@396000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <396000000>; + opp-supported-hw = <0x0008>; + }; + + opp@396000000,1000 { + opp-microvolt = <1000000 1000000 1150000>; + opp-hz = /bits/ 64 <396000000>; + opp-supported-hw = <0x0003>; + }; + + opp@396000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <396000000>; + opp-supported-hw = <0x0010>; + }; + + opp@396000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <396000000>; + opp-supported-hw = <0x0004>; + }; + + opp@528000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <528000000>; + opp-supported-hw = <0x0008>; + }; + + opp@528000000,1000 { + opp-microvolt = <1000000 1000000 1150000>; + opp-hz = /bits/ 64 <528000000>; + opp-supported-hw = <0x0003>; + }; + + opp@528000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <528000000>; + opp-supported-hw = <0x0010>; + }; + + opp@528000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <528000000>; + opp-supported-hw = <0x0004>; + }; + + opp@600000000,950 { + opp-microvolt = <950000 950000 1150000>; + opp-hz = /bits/ 64 <600000000>; + opp-supported-hw = <0x0008>; + }; + + opp@600000000,1000 { + opp-microvolt = <1000000 1000000 1150000>; + opp-hz = /bits/ 64 <600000000>; + opp-supported-hw = <0x0003>; + }; + + opp@600000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <600000000>; + opp-supported-hw = <0x0010>; + }; + + opp@600000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <600000000>; + opp-supported-hw = <0x0004>; + }; + + opp@792000000,1000 { + opp-microvolt = <1000000 1000000 1150000>; + opp-hz = /bits/ 64 <792000000>; + opp-supported-hw = <0x000B>; + }; + + opp@792000000,1050 { + opp-microvolt = <1050000 1050000 1150000>; + opp-hz = /bits/ 64 <792000000>; + opp-supported-hw = <0x0010>; + }; + + opp@792000000,1110 { + opp-microvolt = <1110000 1110000 1150000>; + opp-hz = /bits/ 64 <792000000>; + opp-supported-hw = <0x0004>; + }; + + opp@924000000,1100 { + opp-microvolt = <1100000 1100000 1150000>; + opp-hz = /bits/ 64 <924000000>; + opp-supported-hw = <0x0013>; + }; + + opp@1200000000,1100 { + opp-microvolt = <1100000 1100000 1150000>; + opp-hz = /bits/ 64 <1200000000>; + opp-supported-hw = <0x0003>; + }; + }; + + emc_bw_dfs_opp_table: emc-bandwidth-opp-table { + compatible = "operating-points-v2"; + + opp@12750000 { + opp-hz = /bits/ 64 <12750000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <204000>; + }; + + opp@20400000 { + opp-hz = /bits/ 64 <20400000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <326400>; + }; + + opp@40800000 { + opp-hz = /bits/ 64 <40800000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <652800>; + }; + + opp@68000000 { + opp-hz = /bits/ 64 <68000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <1088000>; + }; + + opp@102000000 { + opp-hz = /bits/ 64 <102000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <1632000>; + }; + + opp@204000000 { + opp-hz = /bits/ 64 <204000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <3264000>; + }; + + opp@264000000 { + opp-hz = /bits/ 64 <264000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <4224000>; + }; + + opp@300000000 { + opp-hz = /bits/ 64 <300000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <4800000>; + }; + + opp@348000000 { + opp-hz = /bits/ 64 <348000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <5568000>; + }; + + opp@396000000 { + opp-hz = /bits/ 64 <396000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <6336000>; + }; + + opp@528000000 { + opp-hz = /bits/ 64 <528000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <8448000>; + }; + + opp@600000000 { + opp-hz = /bits/ 64 <600000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <9600000>; + }; + + opp@792000000 { + opp-hz = /bits/ 64 <792000000>; + opp-supported-hw = <0x001F>; + opp-peak-kBps = <12672000>; + }; + + opp@924000000 { + opp-hz = /bits/ 64 <924000000>; + opp-supported-hw = <0x0013>; + opp-peak-kBps = <14784000>; + }; + + opp@1200000000 { + opp-hz = /bits/ 64 <1200000000>; + opp-supported-hw = <0x0003>; + opp-peak-kBps = <19200000>; + }; + }; +}; diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi index 1801e30b1d3a..46441d10a3fc 100644 --- a/arch/arm/boot/dts/tegra124.dtsi +++ b/arch/arm/boot/dts/tegra124.dtsi @@ -8,6 +8,8 @@ #include <dt-bindings/thermal/tegra124-soctherm.h> #include <dt-bindings/soc/tegra-pmc.h>
+#include "tegra124-peripherals-opp.dtsi" + / { compatible = "nvidia,tegra124"; interrupt-parent = <&lic>; @@ -290,6 +292,9 @@ actmon@6000c800 { clock-names = "actmon", "emc"; resets = <&tegra_car 119>; reset-names = "actmon"; + operating-points-v2 = <&emc_bw_dfs_opp_table>; + interconnects = <&mc TEGRA124_MC_MPCORER &emc>; + interconnect-names = "cpu-read"; };
gpio: gpio@6000d000 { @@ -660,6 +665,7 @@ emc: external-memory-controller@7001b000 { clock-names = "emc";
nvidia,memory-controller = <&mc>; + operating-points-v2 = <&emc_icc_dvfs_opp_table>;
#interconnect-cells = <0>; };
dri-devel@lists.freedesktop.org