In a perfect world we would be able to read GPU registers of interest via the command stream with a 'read-register' command/package. For perf counters it is a must to read them synchronized with the GPU to put the values in relation to a draw command. As Vivante GPUs do not provide this functionality we need to emulate it in software.
We need to support three different kind of perf register types:
1) normal register This is the easierst case where we can simply read the register and we are done.
2) debug register We need to configure the mux register and then read the debug register value.
3) pipeline register We need to 'iterate' over all pixel pipes and sum up the values. The 'iteration' is done by select the pipe of interest via HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE. There is also need to configure the mux register.
Allowing the userspace to do it all by its own feels quite error prone and not future-proof. Thats why the kernel exports all performance domains and their signals to the userspace via two new ioctls. So the kernel knows all performance counters and how to sample them.
At the moment all performacne domains and signals get exported to all gpu pipe types, but that can be changed in follow-up patches.
struct drm_etnaviv_gem_submit was extended to include so-called performance monitor requests (pmrs). A request defines what domain and signal should be sampled (pre/post draw cmdbuffer) and where to store the result.
The whole series can be found here: https://github.com/austriancoder/linux/tree/perfmon
Happy reviewing!
Christian Gmeiner (21): drm/etnaviv: add infrastructure to query perf counter drm/etnaviv: add uapi for perfmon feature drm/etnaviv: add internal representation of perfmon_request drm/etnaviv: extend etnaviv_gpu_cmdbuf_new(..) with nr_pmrs drm/etnaviv: add performance monitor request validation drm/etnaviv: copy pmrs from userspace drm/etnaviv: add performance monitor request processing drm/etnaviv: add 'sync point' support drm/etnaviv: clear alloced event drm/etnaviv: use 'sync points' for performance monitor requests drm/etnaviv: add HI perf domain drm/etnaviv: add PE perf domain drm/etnaviv: add SH perf domain drm/etnaviv: add PA perf domain drm/etnaviv: add SE perf domain drm/etnaviv: add RA perf domain drm/etnaviv: add TX perf domain drm/etnaviv: add MC perf domain drm/etnaviv: need to disable clock gating when doing profiling drm/etnaviv: enable debug registers on demand drm/etnaviv: submit supports performance monitor requests
drivers/gpu/drm/etnaviv/Makefile | 3 +- drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 36 +++ drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c | 13 +- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 6 +- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 39 ++- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 + drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 68 +++- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 151 ++++++++- drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 6 + drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 447 +++++++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 48 +++ include/uapi/drm/etnaviv_drm.h | 40 ++- 12 files changed, 836 insertions(+), 22 deletions(-) create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.h
Make it possible for userspace to query all performance domains and its signals. This information is needed to sample those signals via submit ioctl.
At the moment no performance domain is available.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/Makefile | 3 +- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 37 ++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 81 +++++++++++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 31 ++++++++++++ include/uapi/drm/etnaviv_drm.h | 24 ++++++++- 5 files changed, 174 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.h
diff --git a/drivers/gpu/drm/etnaviv/Makefile b/drivers/gpu/drm/etnaviv/Makefile index 4f76c99..15c3bfa 100644 --- a/drivers/gpu/drm/etnaviv/Makefile +++ b/drivers/gpu/drm/etnaviv/Makefile @@ -10,6 +10,7 @@ etnaviv-y := \ etnaviv_gpu.o \ etnaviv_iommu_v2.o \ etnaviv_iommu.o \ - etnaviv_mmu.o + etnaviv_mmu.o \ + etnaviv_perfmon.o
obj-$(CONFIG_DRM_ETNAVIV) += etnaviv.o diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index 91e17ae..536760a 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -23,6 +23,7 @@ #include "etnaviv_gpu.h" #include "etnaviv_gem.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h"
#ifdef CONFIG_DRM_ETNAVIV_REGISTER_LOGGING static bool reglog; @@ -451,6 +452,40 @@ static int etnaviv_ioctl_gem_wait(struct drm_device *dev, void *data, return ret; }
+static int etnaviv_ioctl_pm_query_dom(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct etnaviv_drm_private *priv = dev->dev_private; + struct drm_etnaviv_pm_domain *args = data; + struct etnaviv_gpu *gpu; + + if (args->pipe >= ETNA_MAX_PIPES) + return -EINVAL; + + gpu = priv->gpu[args->pipe]; + if (!gpu) + return -ENXIO; + + return etnaviv_pm_query_dom(gpu, args); +} + +static int etnaviv_ioctl_pm_query_sig(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct etnaviv_drm_private *priv = dev->dev_private; + struct drm_etnaviv_pm_signal *args = data; + struct etnaviv_gpu *gpu; + + if (args->pipe >= ETNA_MAX_PIPES) + return -EINVAL; + + gpu = priv->gpu[args->pipe]; + if (!gpu) + return -ENXIO; + + return etnaviv_pm_query_sig(gpu, args); +} + static const struct drm_ioctl_desc etnaviv_ioctls[] = { #define ETNA_IOCTL(n, func, flags) \ DRM_IOCTL_DEF_DRV(ETNAVIV_##n, etnaviv_ioctl_##func, flags) @@ -463,6 +498,8 @@ static const struct drm_ioctl_desc etnaviv_ioctls[] = { ETNA_IOCTL(WAIT_FENCE, wait_fence, DRM_AUTH|DRM_RENDER_ALLOW), ETNA_IOCTL(GEM_USERPTR, gem_userptr, DRM_AUTH|DRM_RENDER_ALLOW), ETNA_IOCTL(GEM_WAIT, gem_wait, DRM_AUTH|DRM_RENDER_ALLOW), + ETNA_IOCTL(PM_QUERY_DOM, pm_query_dom, DRM_AUTH|DRM_RENDER_ALLOW), + ETNA_IOCTL(PM_QUERY_SIG, pm_query_sig, DRM_AUTH|DRM_RENDER_ALLOW), };
static const struct vm_operations_struct vm_ops = { diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c new file mode 100644 index 0000000..9c6f284 --- /dev/null +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c @@ -0,0 +1,81 @@ +/* + * Copyright (C) 2017 Etnaviv Project + * Copyright (C) 2017 Zodiac Inflight Innovations + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#include "etnaviv_gpu.h" + +struct etnaviv_pm_domain; + +struct etnaviv_pm_signal { + char name[64]; + u32 data; + + u32 (*sample)(struct etnaviv_gpu *gpu, + const struct etnaviv_pm_domain *domain, + const struct etnaviv_pm_signal *signal); +}; + +struct etnaviv_pm_domain { + char name[64]; + u8 nr_signals; + const struct etnaviv_pm_signal *signal; +}; + +static const struct etnaviv_pm_domain doms[] = { +}; + +int etnaviv_pm_query_dom(struct etnaviv_gpu *gpu, + struct drm_etnaviv_pm_domain *domain) +{ + const struct etnaviv_pm_domain *dom; + + if (domain->iter >= ARRAY_SIZE(doms)) + return -EINVAL; + + dom = &doms[domain->iter]; + + domain->nr_signals = dom->nr_signals; + strncpy(domain->name, dom->name, sizeof(domain->name)); + + if (domain->iter + 1 == ARRAY_SIZE(doms)) + domain->iter = 0xff; + + return 0; +} + +int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu, + struct drm_etnaviv_pm_signal *signal) +{ + const struct etnaviv_pm_domain *dom; + const struct etnaviv_pm_signal *sig; + + if (signal->domain >= ARRAY_SIZE(doms)) + return -EINVAL; + + dom = &doms[signal->domain]; + + if (signal->iter > dom->nr_signals) + return -EINVAL; + + sig = &dom->signal[signal->iter]; + + strncpy(signal->name, sig->name, sizeof(signal->name)); + + if (signal->iter + 1 == dom->nr_signals) + signal->iter = 0xff; + + return 0; +} diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h new file mode 100644 index 0000000..4589cad --- /dev/null +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h @@ -0,0 +1,31 @@ +/* + * Copyright (C) 2017 Etnaviv Project + * Copyright (C) 2017 Zodiac Inflight Innovations + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef __ETNAVIV_PERFMON_H__ +#define __ETNAVIV_PERFMON_H__ + +struct etnaviv_gpu; +struct drm_etnaviv_pm_domain; +struct drm_etnaviv_pm_signal; + +int etnaviv_pm_query_dom(struct etnaviv_gpu *gpu, + struct drm_etnaviv_pm_domain *domain); + +int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu, + struct drm_etnaviv_pm_signal *signal); + +#endif /* __ETNAVIV_PERFMON_H__ */ diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h index 76f6f78..9e488a0 100644 --- a/include/uapi/drm/etnaviv_drm.h +++ b/include/uapi/drm/etnaviv_drm.h @@ -210,6 +210,24 @@ struct drm_etnaviv_gem_wait { struct drm_etnaviv_timespec timeout; /* in */ };
+/* + * Performance Monitor (PM): + */ + +struct drm_etnaviv_pm_domain { + __u32 pipe; /* in */ + __u8 iter; /* in/out, select pm domain at index iter */ + __u8 nr_signals; /* out, how many signals does this domain provide */ + char name[64]; /* out, name of domain */ +}; + +struct drm_etnaviv_pm_signal { + __u32 pipe; /* in */ + __u8 domain; /* in, pm domain index */ + __u8 iter; /* in/out, select pm source at index iter */ + char name[64]; /* out, name of domain */ +}; + #define DRM_ETNAVIV_GET_PARAM 0x00 /* placeholder: #define DRM_ETNAVIV_SET_PARAM 0x01 @@ -222,7 +240,9 @@ struct drm_etnaviv_gem_wait { #define DRM_ETNAVIV_WAIT_FENCE 0x07 #define DRM_ETNAVIV_GEM_USERPTR 0x08 #define DRM_ETNAVIV_GEM_WAIT 0x09 -#define DRM_ETNAVIV_NUM_IOCTLS 0x0a +#define DRM_ETNAVIV_PM_QUERY_DOM 0x0a +#define DRM_ETNAVIV_PM_QUERY_SIG 0x0b +#define DRM_ETNAVIV_NUM_IOCTLS 0x0c
#define DRM_IOCTL_ETNAVIV_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_ETNAVIV_GET_PARAM, struct drm_etnaviv_param) #define DRM_IOCTL_ETNAVIV_GEM_NEW DRM_IOWR(DRM_COMMAND_BASE + DRM_ETNAVIV_GEM_NEW, struct drm_etnaviv_gem_new) @@ -233,6 +253,8 @@ struct drm_etnaviv_gem_wait { #define DRM_IOCTL_ETNAVIV_WAIT_FENCE DRM_IOW(DRM_COMMAND_BASE + DRM_ETNAVIV_WAIT_FENCE, struct drm_etnaviv_wait_fence) #define DRM_IOCTL_ETNAVIV_GEM_USERPTR DRM_IOWR(DRM_COMMAND_BASE + DRM_ETNAVIV_GEM_USERPTR, struct drm_etnaviv_gem_userptr) #define DRM_IOCTL_ETNAVIV_GEM_WAIT DRM_IOW(DRM_COMMAND_BASE + DRM_ETNAVIV_GEM_WAIT, struct drm_etnaviv_gem_wait) +#define DRM_IOCTL_ETNAVIV_PM_QUERY_DOM DRM_IOWR(DRM_COMMAND_BASE + DRM_ETNAVIV_PM_QUERY_DOM, struct drm_etnaviv_pm_domain) +#define DRM_IOCTL_ETNAVIV_PM_QUERY_SIG DRM_IOWR(DRM_COMMAND_BASE + DRM_ETNAVIV_PM_QUERY_SIG, struct drm_etnaviv_pm_signal)
#if defined(__cplusplus) }
Sadly we can not read any registers via command stream so we need to extend the drm_etnaviv_gem_submit struct with performance monitor requests. Those requests gets process before and/or after the actual submitted command stream.
The Vivante kernel driver has a special ioctl to read all perfmon registers at once and return it. There is no connection between a specifc command stream and the read perf values.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- include/uapi/drm/etnaviv_drm.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h index 9e488a0..b67a7dd 100644 --- a/include/uapi/drm/etnaviv_drm.h +++ b/include/uapi/drm/etnaviv_drm.h @@ -150,6 +150,19 @@ struct drm_etnaviv_gem_submit_bo { __u64 presumed; /* in/out, presumed buffer address */ };
+/* performance monitor request (pmr) */ +#define ETNA_PM_PROCESS_PRE 0x0001 +#define ETNA_PM_PROCESS_POST 0x0002 +struct drm_etnaviv_gem_submit_pmr { + __u32 flags; /* in, when to process request (ETNA_PM_PROCESS_x) */ + __u8 domain; /* in, pm domain */ + __u8 signal; /* in, pm signal */ + __u16 pad[3]; + __u32 sequence; /* in, sequence number */ + __u32 read_offset; /* in, offset from read_bo */ + __u32 read_idx; /* in, index of read_bo buffer */ +}; + /* Each cmdstream submit consists of a table of buffers involved, and * one or more cmdstream buffers. This allows for conditional execution * (context-restore), and IB buffers needed for per tile/bin draw cmds. @@ -175,6 +188,9 @@ struct drm_etnaviv_gem_submit { __u64 stream; /* in, ptr to cmdstream */ __u32 flags; /* in, mask of ETNA_SUBMIT_x */ __s32 fence_fd; /* in/out, fence fd (see ETNA_SUBMIT_FENCE_FD_x) */ + __u64 pmrs; /* in, ptr to array of submit_pmr's */ + __u32 nr_pmrs; /* in, number of submit_pmr's */ + __u32 pad; };
/* The normal way to synchronize with the GPU is just to CPU_PREP on
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
Sadly we can not read any registers via command stream so we need to extend the drm_etnaviv_gem_submit struct with performance monitor requests. Those requests gets process before and/or after the actual submitted command stream.
The Vivante kernel driver has a special ioctl to read all perfmon registers at once and return it. There is no connection between a specifc command stream and the read perf values.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
include/uapi/drm/etnaviv_drm.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h index 9e488a0..b67a7dd 100644 --- a/include/uapi/drm/etnaviv_drm.h +++ b/include/uapi/drm/etnaviv_drm.h @@ -150,6 +150,19 @@ struct drm_etnaviv_gem_submit_bo { __u64 presumed; /* in/out, presumed buffer address */ }; +/* performance monitor request (pmr) */ +#define ETNA_PM_PROCESS_PRE 0x0001 +#define ETNA_PM_PROCESS_POST 0x0002 +struct drm_etnaviv_gem_submit_pmr {
- __u32 flags; /* in, when to process request
(ETNA_PM_PROCESS_x) */
- __u8 domain; /* in, pm domain */
- __u8 signal; /* in, pm signal */
Are we sure that no domain will ever have more than 256 signals?
- __u16 pad[3];
Unnecessary large pad. A single u16 is enough to naturally align the next member.
- __u32 sequence; /* in, sequence number */
- __u32 read_offset; /* in, offset from read_bo */
- __u32 read_idx; /* in, index of read_bo buffer */
+};
/* Each cmdstream submit consists of a table of buffers involved, and * one or more cmdstream buffers. This allows for conditional execution * (context-restore), and IB buffers needed for per tile/bin draw cmds. @@ -175,6 +188,9 @@ struct drm_etnaviv_gem_submit { __u64 stream; /* in, ptr to cmdstream */ __u32 flags; /* in, mask of ETNA_SUBMIT_x */ __s32 fence_fd; /* in/out, fence fd (see ETNA_SUBMIT_FENCE_FD_x) */
- __u64 pmrs; /* in, ptr to array of submit_pmr's */
- __u32 nr_pmrs; /* in, number of submit_pmr's */
- __u32 pad;
This pad isn't needed.
}; /* The normal way to synchronize with the GPU is just to CPU_PREP on
2017-06-26 14:56 GMT+02:00 Lucas Stach l.stach@pengutronix.de:
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
Sadly we can not read any registers via command stream so we need to extend the drm_etnaviv_gem_submit struct with performance monitor requests. Those requests gets process before and/or after the actual submitted command stream.
The Vivante kernel driver has a special ioctl to read all perfmon registers at once and return it. There is no connection between a specifc command stream and the read perf values.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
include/uapi/drm/etnaviv_drm.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h index 9e488a0..b67a7dd 100644 --- a/include/uapi/drm/etnaviv_drm.h +++ b/include/uapi/drm/etnaviv_drm.h @@ -150,6 +150,19 @@ struct drm_etnaviv_gem_submit_bo { __u64 presumed; /* in/out, presumed buffer address */ };
+/* performance monitor request (pmr) */ +#define ETNA_PM_PROCESS_PRE 0x0001 +#define ETNA_PM_PROCESS_POST 0x0002 +struct drm_etnaviv_gem_submit_pmr {
__u32 flags; /* in, when to process request
(ETNA_PM_PROCESS_x) */
__u8 domain; /* in, pm domain */
__u8 signal; /* in, pm signal */
Are we sure that no domain will ever have more than 256 signals?
My crystal ball is not working as expected so I don't know - but I will change it to a __u16.
__u16 pad[3];
Unnecessary large pad. A single u16 is enough to naturally align the next member.
Correct.
__u32 sequence; /* in, sequence number */
__u32 read_offset; /* in, offset from read_bo */
__u32 read_idx; /* in, index of read_bo buffer */
+};
/* Each cmdstream submit consists of a table of buffers involved, and
- one or more cmdstream buffers. This allows for conditional
execution
- (context-restore), and IB buffers needed for per tile/bin draw
cmds. @@ -175,6 +188,9 @@ struct drm_etnaviv_gem_submit { __u64 stream; /* in, ptr to cmdstream */ __u32 flags; /* in, mask of ETNA_SUBMIT_x */ __s32 fence_fd; /* in/out, fence fd (see ETNA_SUBMIT_FENCE_FD_x) */
__u64 pmrs; /* in, ptr to array of submit_pmr's */
__u32 nr_pmrs; /* in, number of submit_pmr's */
__u32 pad;
This pad isn't needed.
* Pad the entire struct to a multiple of 64-bits if the structure contains 64-bit types - the structure size will otherwise differ on 32-bit versus 64-bit. Having a different structure size hurts when passing arrays of structures to the kernel, or if the kernel checks the structure size, which e.g. the drm core does.
$ pahole drivers/gpu/drm/etnaviv/etnaviv.o -C drm_etnaviv_gem_submit struct drm_etnaviv_gem_submit { __u32 fence; /* 0 4 */ __u32 pipe; /* 4 4 */ __u32 exec_state; /* 8 4 */ __u32 nr_bos; /* 12 4 */ __u32 nr_relocs; /* 16 4 */ __u32 stream_size; /* 20 4 */ __u64 bos; /* 24 8 */ __u64 relocs; /* 32 8 */ __u64 stream; /* 40 8 */ __u32 flags; /* 48 4 */ __s32 fence_fd; /* 52 4 */ __u64 pmrs; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ __u32 nr_pmrs; /* 64 4 */ __u32 pad; /* 68 4 */
/* size: 72, cachelines: 2, members: 14 */ /* last cacheline: 8 bytes */ };
};
/* The normal way to synchronize with the GPU is just to CPU_PREP on
greets -- Christian Gmeiner, MSc
https://www.youtube.com/user/AloryOFFICIAL https://soundcloud.com/christian-gmeiner
Am Sonntag, den 02.07.2017, 12:04 +0200 schrieb Christian Gmeiner:
2017-06-26 14:56 GMT+02:00 Lucas Stach l.stach@pengutronix.de:
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
Sadly we can not read any registers via command stream so we need to extend the drm_etnaviv_gem_submit struct with performance monitor requests. Those requests gets process before and/or after the actual submitted command stream.
The Vivante kernel driver has a special ioctl to read all perfmon registers at once and return it. There is no connection between a specifc command stream and the read perf values.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
include/uapi/drm/etnaviv_drm.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/include/uapi/drm/etnaviv_drm.h b/include/uapi/drm/etnaviv_drm.h index 9e488a0..b67a7dd 100644 --- a/include/uapi/drm/etnaviv_drm.h +++ b/include/uapi/drm/etnaviv_drm.h @@ -150,6 +150,19 @@ struct drm_etnaviv_gem_submit_bo { __u64 presumed; /* in/out, presumed buffer address */ };
+/* performance monitor request (pmr) */ +#define ETNA_PM_PROCESS_PRE 0x0001 +#define ETNA_PM_PROCESS_POST 0x0002 +struct drm_etnaviv_gem_submit_pmr {
__u32 flags; /* in, when to process request
(ETNA_PM_PROCESS_x) */
__u8 domain; /* in, pm domain */
__u8 signal; /* in, pm signal */
Are we sure that no domain will ever have more than 256 signals?
My crystal ball is not working as expected so I don't know - but I will change it to a __u16.
__u16 pad[3];
Unnecessary large pad. A single u16 is enough to naturally align the next member.
Correct.
__u32 sequence; /* in, sequence number */
__u32 read_offset; /* in, offset from read_bo */
__u32 read_idx; /* in, index of read_bo buffer */
+};
/* Each cmdstream submit consists of a table of buffers involved, and
- one or more cmdstream buffers. This allows for conditional
execution
- (context-restore), and IB buffers needed for per tile/bin draw
cmds. @@ -175,6 +188,9 @@ struct drm_etnaviv_gem_submit { __u64 stream; /* in, ptr to cmdstream */ __u32 flags; /* in, mask of ETNA_SUBMIT_x */ __s32 fence_fd; /* in/out, fence fd (see ETNA_SUBMIT_FENCE_FD_x) */
__u64 pmrs; /* in, ptr to array of submit_pmr's */
__u32 nr_pmrs; /* in, number of submit_pmr's */
__u32 pad;
This pad isn't needed.
- Pad the entire struct to a multiple of 64-bits if the structure contains 64-bit types - the structure size will otherwise differ on 32-bit versus 64-bit. Having a different structure size hurts when passing arrays of structures to the kernel, or if the kernel checks the structure size, which e.g. the drm core does.
Okay it is still not really necessary (we don't ever have pass an array of struct drm_etnaviv_gem_submit and the worst that could happen is that the DRM core needs to zero the added struct padding on 64bit), but it also doesn't hurt and skipping the zero fill on 64bit sounds good, so please keep as-is.
Regards, Lucas
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 4 ++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 12 ++++++++++++ 2 files changed, 16 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h index 80d7807..1b549f0 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h @@ -21,6 +21,7 @@
struct etnaviv_gpu; struct etnaviv_cmdbuf_suballoc; +struct etnaviv_perfmon_request;
struct etnaviv_cmdbuf { /* suballocator this cmdbuf is allocated from */ @@ -38,6 +39,9 @@ struct etnaviv_cmdbuf { u32 exec_state; /* per GPU in-flight list */ struct list_head node; + /* perfmon requests */ + unsigned int nr_pmrs; + struct etnaviv_perfmon_request *pmrs; /* BOs attached to this command buffer */ unsigned int nr_bos; struct etnaviv_vram_mapping *bo_map[0]; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h index 4589cad..4b2b518 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h @@ -22,6 +22,18 @@ struct etnaviv_gpu; struct drm_etnaviv_pm_domain; struct drm_etnaviv_pm_signal;
+struct etnaviv_perfmon_request +{ + u32 flags; + u8 domain; + u8 signal; + u32 sequence; + + /* bo to store a value */ + u32 *bo_vma; + u32 offset; +}; + int etnaviv_pm_query_dom(struct etnaviv_gpu *gpu, struct drm_etnaviv_pm_domain *domain);
This commits extends etnaviv_gpu_cmdbuf_new(..) to define the number of struct etnaviv_perfmon elements gets stored.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c | 13 ++++++++++++- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 4 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c index 633e0f0..bb3f82e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c @@ -19,6 +19,7 @@ #include "etnaviv_cmdbuf.h" #include "etnaviv_gpu.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h"
#define SUBALLOC_SIZE SZ_256K #define SUBALLOC_GRANULE SZ_4K @@ -87,9 +88,10 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc)
struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size, - size_t nr_bos) + size_t nr_bos, size_t nr_pmrs) { struct etnaviv_cmdbuf *cmdbuf; + struct etnaviv_perfmon_request *pmrs; size_t sz = size_vstruct(nr_bos, sizeof(cmdbuf->bo_map[0]), sizeof(*cmdbuf)); int granule_offs, order, ret; @@ -98,6 +100,14 @@ etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size, if (!cmdbuf) return NULL;
+ sz = sizeof(*pmrs) * nr_pmrs; + pmrs = kzalloc(sz, GFP_KERNEL); + if (!pmrs) { + kfree(cmdbuf); + return NULL; + } + + cmdbuf->pmrs = pmrs; cmdbuf->suballoc = suballoc; cmdbuf->size = size;
@@ -139,6 +149,7 @@ void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf) suballoc->free_space = 1; mutex_unlock(&suballoc->lock); wake_up_all(&suballoc->free_event); + kfree(cmdbuf->pmrs); kfree(cmdbuf); }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h index 1b549f0..b6348b9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h @@ -53,7 +53,7 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc);
struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size, - size_t nr_bos); + size_t nr_bos, size_t nr_pmrs); void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf);
u32 etnaviv_cmdbuf_get_va(struct etnaviv_cmdbuf *buf); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index de80ee1..fdfe31d 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -350,7 +350,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8, - args->nr_bos); + args->nr_bos, 0); if (!bos || !relocs || !stream || !cmdbuf) { ret = -ENOMEM; goto err_submit_cmds; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index ada45fd..037087e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -721,7 +721,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu) }
/* Create buffer: */ - gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, PAGE_SIZE, 0); + gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, PAGE_SIZE, 0, 0); if (!gpu->buffer) { ret = -ENOMEM; dev_err(gpu->dev, "could not create command buffer\n");
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
This commits extends etnaviv_gpu_cmdbuf_new(..) to define the number of struct etnaviv_perfmon elements gets stored.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c | 13 ++++++++++++- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 4 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c index 633e0f0..bb3f82e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c @@ -19,6 +19,7 @@ #include "etnaviv_cmdbuf.h" #include "etnaviv_gpu.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h" #define SUBALLOC_SIZE SZ_256K #define SUBALLOC_GRANULE SZ_4K @@ -87,9 +88,10 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc) struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size,
size_t nr_bos)
size_t nr_bos, size_t nr_pmrs)
{ struct etnaviv_cmdbuf *cmdbuf;
- struct etnaviv_perfmon_request *pmrs;
size_t sz = size_vstruct(nr_bos, sizeof(cmdbuf->bo_map[0]), sizeof(*cmdbuf)); int granule_offs, order, ret; @@ -98,6 +100,14 @@ etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size, if (!cmdbuf) return NULL;
- sz = sizeof(*pmrs) * nr_pmrs;
- pmrs = kzalloc(sz, GFP_KERNEL);
- if (!pmrs) {
kfree(cmdbuf);
return NULL;
- }
While this is okay as-is, I have some more additions to this code planned, so I would appreciate if you could change this into a "goto out_free_cmdbuf" and move this down to the end of the function.
Regards, Lucas
- cmdbuf->pmrs = pmrs;
cmdbuf->suballoc = suballoc; cmdbuf->size = size; @@ -139,6 +149,7 @@ void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf) suballoc->free_space = 1; mutex_unlock(&suballoc->lock); wake_up_all(&suballoc->free_event);
- kfree(cmdbuf->pmrs);
kfree(cmdbuf); } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h index 1b549f0..b6348b9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h @@ -53,7 +53,7 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc); struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size,
size_t nr_bos);
size_t nr_bos, size_t nr_pmrs);
void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf); u32 etnaviv_cmdbuf_get_va(struct etnaviv_cmdbuf *buf); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index de80ee1..fdfe31d 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -350,7 +350,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8,
args->nr_bos);
args->nr_bos, 0);
if (!bos || !relocs || !stream || !cmdbuf) { ret = -ENOMEM; goto err_submit_cmds; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index ada45fd..037087e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -721,7 +721,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu) } /* Create buffer: */
- gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc,
PAGE_SIZE, 0);
- gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc,
PAGE_SIZE, 0, 0); if (!gpu->buffer) { ret = -ENOMEM; dev_err(gpu->dev, "could not create command buffer\n");
2017-06-26 15:02 GMT+02:00 Lucas Stach l.stach@pengutronix.de:
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
This commits extends etnaviv_gpu_cmdbuf_new(..) to define the number of struct etnaviv_perfmon elements gets stored.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c | 13 ++++++++++++- drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 4 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c index 633e0f0..bb3f82e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c @@ -19,6 +19,7 @@ #include "etnaviv_cmdbuf.h" #include "etnaviv_gpu.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h"
#define SUBALLOC_SIZE SZ_256K #define SUBALLOC_GRANULE SZ_4K @@ -87,9 +88,10 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc)
struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size,
size_t nr_bos)
size_t nr_bos, size_t nr_pmrs)
{ struct etnaviv_cmdbuf *cmdbuf;
struct etnaviv_perfmon_request *pmrs; size_t sz = size_vstruct(nr_bos, sizeof(cmdbuf->bo_map[0]), sizeof(*cmdbuf)); int granule_offs, order, ret;
@@ -98,6 +100,14 @@ etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size, if (!cmdbuf) return NULL;
sz = sizeof(*pmrs) * nr_pmrs;
pmrs = kzalloc(sz, GFP_KERNEL);
if (!pmrs) {
kfree(cmdbuf);
return NULL;
}
While this is okay as-is, I have some more additions to this code planned, so I would appreciate if you could change this into a "goto out_free_cmdbuf" and move this down to the end of the function.
Sure - will change it that way.
greets -- Christian Gmeiner, MSc
https://www.youtube.com/user/AloryOFFICIAL https://soundcloud.com/christian-gmeiner
Regards, Lucas
cmdbuf->pmrs = pmrs; cmdbuf->suballoc = suballoc; cmdbuf->size = size;
@@ -139,6 +149,7 @@ void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf) suballoc->free_space = 1; mutex_unlock(&suballoc->lock); wake_up_all(&suballoc->free_event);
kfree(cmdbuf->pmrs); kfree(cmdbuf);
}
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h index 1b549f0..b6348b9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h @@ -53,7 +53,7 @@ void etnaviv_cmdbuf_suballoc_destroy(struct etnaviv_cmdbuf_suballoc *suballoc);
struct etnaviv_cmdbuf * etnaviv_cmdbuf_new(struct etnaviv_cmdbuf_suballoc *suballoc, u32 size,
size_t nr_bos);
size_t nr_bos, size_t nr_pmrs);
void etnaviv_cmdbuf_free(struct etnaviv_cmdbuf *cmdbuf);
u32 etnaviv_cmdbuf_get_va(struct etnaviv_cmdbuf *buf); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index de80ee1..fdfe31d 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -350,7 +350,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8,
args->nr_bos);
args->nr_bos, 0); if (!bos || !relocs || !stream || !cmdbuf) { ret = -ENOMEM; goto err_submit_cmds;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index ada45fd..037087e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -721,7 +721,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu) }
/* Create buffer: */
gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc,
PAGE_SIZE, 0);
gpu->buffer = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc,
PAGE_SIZE, 0, 0); if (!gpu->buffer) { ret = -ENOMEM; dev_err(gpu->dev, "could not create command buffer\n");
Check if the selected domain and signal combination exists.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 15 +++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 2 ++ 2 files changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c index 9c6f284..b8dc5fa 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c @@ -79,3 +79,18 @@ int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu,
return 0; } + +int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r) +{ + const struct etnaviv_pm_domain *dom; + + if (r->domain >= ARRAY_SIZE(doms)) + return -EINVAL; + + dom = &doms[r->domain]; + + if (r->signal > dom->nr_signals) + return -EINVAL; + + return 0; +} diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h index 4b2b518..f20b69c 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h @@ -40,4 +40,6 @@ int etnaviv_pm_query_dom(struct etnaviv_gpu *gpu, int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu, struct drm_etnaviv_pm_signal *signal);
+int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r); + #endif /* __ETNAVIV_PERFMON_H__ */
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
Check if the selected domain and signal combination exists.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
Reviewed-by: Lucas Stach l.stach@pengutronix.de
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 15 +++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 2 ++ 2 files changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c index 9c6f284..b8dc5fa 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c @@ -79,3 +79,18 @@ int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu, return 0; }
+int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r) +{
- const struct etnaviv_pm_domain *dom;
- if (r->domain >= ARRAY_SIZE(doms))
return -EINVAL;
- dom = &doms[r->domain];
- if (r->signal > dom->nr_signals)
return -EINVAL;
- return 0;
+} diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h index 4b2b518..f20b69c 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h @@ -40,4 +40,6 @@ int etnaviv_pm_query_dom(struct etnaviv_gpu *gpu, int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu, struct drm_etnaviv_pm_signal *signal); +int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r);
#endif /* __ETNAVIV_PERFMON_H__ */
All performance monitor requests will be validated during this phase.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 68 +++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index fdfe31d..ed05b64 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -21,6 +21,7 @@ #include "etnaviv_drv.h" #include "etnaviv_gpu.h" #include "etnaviv_gem.h" +#include "etnaviv_perfmon.h"
/* * Cmdstream submission: @@ -283,6 +284,53 @@ static int submit_reloc(struct etnaviv_gem_submit *submit, void *stream, return 0; }
+static int submit_perfmon_request(struct etnaviv_gem_submit *submit, + struct etnaviv_cmdbuf *cmdbuf, + const struct drm_etnaviv_gem_submit_pmr *pmrs, + u32 nr_pms) +{ + u32 i; + + for (i = 0; i < nr_pms; i++) { + const struct drm_etnaviv_gem_submit_pmr *r = pmrs + i; + struct etnaviv_gem_submit_bo *bo; + int ret; + + ret = submit_bo(submit, r->read_idx, &bo); + if (ret) + return ret; + + if (r->read_offset == 0) { + DRM_ERROR("perfmon request: offset is 0"); + return -EINVAL; + } + + if (r->read_offset >= bo->obj->base.size - sizeof(u32)) { + DRM_ERROR("perfmon request: offset %u outside object", i); + return -EINVAL; + } + + if (!r->flags) { + DRM_ERROR("perfmon request: flags are 0"); + return -EINVAL; + } + + if (etnaviv_pm_req_validate(r)) { + DRM_ERROR("perfmon request: domain or signal not valid"); + return -EINVAL; + } + + cmdbuf->pmrs[i].flags = r->flags; + cmdbuf->pmrs[i].domain = r->domain; + cmdbuf->pmrs[i].signal = r->signal; + cmdbuf->pmrs[i].sequence = r->sequence; + cmdbuf->pmrs[i].offset = r->read_offset; + cmdbuf->pmrs[i].bo_vma = etnaviv_gem_vmap(&bo->obj->base); + } + + return 0; +} + static void submit_cleanup(struct etnaviv_gem_submit *submit) { unsigned i; @@ -306,6 +354,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, struct etnaviv_drm_private *priv = dev->dev_private; struct drm_etnaviv_gem_submit *args = data; struct drm_etnaviv_gem_submit_reloc *relocs; + struct drm_etnaviv_gem_submit_pmr *pmrs; struct drm_etnaviv_gem_submit_bo *bos; struct etnaviv_gem_submit *submit; struct etnaviv_cmdbuf *cmdbuf; @@ -347,11 +396,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, */ bos = drm_malloc_ab(args->nr_bos, sizeof(*bos)); relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs)); + pmrs = drm_malloc_ab(args->nr_pmrs, sizeof(*pmrs)); stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8, - args->nr_bos, 0); - if (!bos || !relocs || !stream || !cmdbuf) { + args->nr_bos, args->nr_pmrs); + if (!bos || !relocs || !pmrs || !stream || !cmdbuf) { ret = -ENOMEM; goto err_submit_cmds; } @@ -373,6 +423,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, goto err_submit_cmds; }
+ ret = copy_from_user(pmrs, u64_to_user_ptr(args->pmrs), + args->nr_pmrs * sizeof(*pmrs)); + if (ret) { + ret = -EFAULT; + goto err_submit_cmds; + } + ret = copy_from_user(stream, u64_to_user_ptr(args->stream), args->stream_size); if (ret) { @@ -441,8 +498,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
+ ret = submit_perfmon_request(submit, cmdbuf, pmrs, args->nr_pmrs); + if (ret) + goto out; + memcpy(cmdbuf->vaddr, stream, args->stream_size); cmdbuf->user_size = ALIGN(args->stream_size, 8); + cmdbuf->nr_pmrs = args->nr_pmrs;
ret = etnaviv_gpu_submit(gpu, submit, cmdbuf); if (ret == 0) @@ -494,6 +556,8 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, drm_free_large(bos); if (relocs) drm_free_large(relocs); + if (pmrs) + drm_free_large(pmrs);
return ret; }
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
All performance monitor requests will be validated during this phase.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 68 +++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index fdfe31d..ed05b64 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -21,6 +21,7 @@ #include "etnaviv_drv.h" #include "etnaviv_gpu.h" #include "etnaviv_gem.h" +#include "etnaviv_perfmon.h" /* * Cmdstream submission: @@ -283,6 +284,53 @@ static int submit_reloc(struct etnaviv_gem_submit *submit, void *stream, return 0; } +static int submit_perfmon_request(struct etnaviv_gem_submit *submit,
Please name this after what is does: submit_perfmon_validate()
struct etnaviv_cmdbuf *cmdbuf,
const struct drm_etnaviv_gem_submit_pmr *pmrs,
u32 nr_pms)
+{
- u32 i;
- for (i = 0; i < nr_pms; i++) {
const struct drm_etnaviv_gem_submit_pmr *r = pmrs +
i;
struct etnaviv_gem_submit_bo *bo;
int ret;
ret = submit_bo(submit, r->read_idx, &bo);
if (ret)
return ret;
if (r->read_offset == 0) {
DRM_ERROR("perfmon request: offset is 0");
return -EINVAL;
}
Why is a 0 offset invalid? Can't the readback be placed at the beginning of the BO?
if (r->read_offset >= bo->obj->base.size -
sizeof(u32)) {
DRM_ERROR("perfmon request: offset %u
outside object", i);
return -EINVAL;
}
if (!r->flags) {
DRM_ERROR("perfmon request: flags are 0");
return -EINVAL;
}
This needs to validate that the flags are valid, not just that there are flags -> reject any non-allocated flag bits.
if (etnaviv_pm_req_validate(r)) {
DRM_ERROR("perfmon request: domain or signal
not valid");
return -EINVAL;
}
cmdbuf->pmrs[i].flags = r->flags;
cmdbuf->pmrs[i].domain = r->domain;
cmdbuf->pmrs[i].signal = r->signal;
cmdbuf->pmrs[i].sequence = r->sequence;
cmdbuf->pmrs[i].offset = r->read_offset;
cmdbuf->pmrs[i].bo_vma = etnaviv_gem_vmap(&bo->obj-
base);
- }
- return 0;
+}
static void submit_cleanup(struct etnaviv_gem_submit *submit) { unsigned i; @@ -306,6 +354,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, struct etnaviv_drm_private *priv = dev->dev_private; struct drm_etnaviv_gem_submit *args = data; struct drm_etnaviv_gem_submit_reloc *relocs;
- struct drm_etnaviv_gem_submit_pmr *pmrs;
struct drm_etnaviv_gem_submit_bo *bos; struct etnaviv_gem_submit *submit; struct etnaviv_cmdbuf *cmdbuf; @@ -347,11 +396,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, */ bos = drm_malloc_ab(args->nr_bos, sizeof(*bos)); relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
- pmrs = drm_malloc_ab(args->nr_pmrs, sizeof(*pmrs));
stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8,
args->nr_bos, 0);
- if (!bos || !relocs || !stream || !cmdbuf) {
args->nr_bos, args->nr_pmrs);
- if (!bos || !relocs || !pmrs || !stream || !cmdbuf) {
ret = -ENOMEM; goto err_submit_cmds; } @@ -373,6 +423,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, goto err_submit_cmds; }
- ret = copy_from_user(pmrs, u64_to_user_ptr(args->pmrs),
args->nr_pmrs * sizeof(*pmrs));
- if (ret) {
ret = -EFAULT;
goto err_submit_cmds;
- }
ret = copy_from_user(stream, u64_to_user_ptr(args->stream), args->stream_size); if (ret) { @@ -441,8 +498,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
- ret = submit_perfmon_request(submit, cmdbuf, pmrs, args-
nr_pmrs);
- if (ret)
goto out;
memcpy(cmdbuf->vaddr, stream, args->stream_size); cmdbuf->user_size = ALIGN(args->stream_size, 8);
- cmdbuf->nr_pmrs = args->nr_pmrs;
Please move this below the copy_from_user of the pmrs.
ret = etnaviv_gpu_submit(gpu, submit, cmdbuf); if (ret == 0) @@ -494,6 +556,8 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, drm_free_large(bos); if (relocs) drm_free_large(relocs);
- if (pmrs)
drm_free_large(pmrs);
return ret; }
2017-06-26 15:18 GMT+02:00 Lucas Stach l.stach@pengutronix.de:
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
All performance monitor requests will be validated during this phase.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 68 +++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index fdfe31d..ed05b64 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -21,6 +21,7 @@ #include "etnaviv_drv.h" #include "etnaviv_gpu.h" #include "etnaviv_gem.h" +#include "etnaviv_perfmon.h"
/*
- Cmdstream submission:
@@ -283,6 +284,53 @@ static int submit_reloc(struct etnaviv_gem_submit *submit, void *stream, return 0; }
+static int submit_perfmon_request(struct etnaviv_gem_submit *submit,
Please name this after what is does: submit_perfmon_validate()
OK
struct etnaviv_cmdbuf *cmdbuf,
const struct drm_etnaviv_gem_submit_pmr *pmrs,
u32 nr_pms)
+{
u32 i;
for (i = 0; i < nr_pms; i++) {
const struct drm_etnaviv_gem_submit_pmr *r = pmrs +
i;
struct etnaviv_gem_submit_bo *bo;
int ret;
ret = submit_bo(submit, r->read_idx, &bo);
if (ret)
return ret;
if (r->read_offset == 0) {
DRM_ERROR("perfmon request: offset is 0");
return -EINVAL;
}
Why is a 0 offset invalid? Can't the readback be placed at the beginning of the BO?
At offset 0 the sequence number gets stored. This sequence number is used to sync kernel space and user space. See sync_point_perfmon_sample_post() in [PATCH 10/21] drm/etnaviv: use 'sync points' for performance monitor requests Will add a comment about this fact.
if (r->read_offset >= bo->obj->base.size -
sizeof(u32)) {
DRM_ERROR("perfmon request: offset %u
outside object", i);
return -EINVAL;
}
if (!r->flags) {
DRM_ERROR("perfmon request: flags are 0");
return -EINVAL;
}
This needs to validate that the flags are valid, not just that there are flags -> reject any non-allocated flag bits.
Good point - will add flags validation.
if (etnaviv_pm_req_validate(r)) {
DRM_ERROR("perfmon request: domain or signal
not valid");
return -EINVAL;
}
cmdbuf->pmrs[i].flags = r->flags;
cmdbuf->pmrs[i].domain = r->domain;
cmdbuf->pmrs[i].signal = r->signal;
cmdbuf->pmrs[i].sequence = r->sequence;
cmdbuf->pmrs[i].offset = r->read_offset;
cmdbuf->pmrs[i].bo_vma = etnaviv_gem_vmap(&bo->obj-
base);
}
return 0;
+}
static void submit_cleanup(struct etnaviv_gem_submit *submit) { unsigned i; @@ -306,6 +354,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, struct etnaviv_drm_private *priv = dev->dev_private; struct drm_etnaviv_gem_submit *args = data; struct drm_etnaviv_gem_submit_reloc *relocs;
struct drm_etnaviv_gem_submit_pmr *pmrs; struct drm_etnaviv_gem_submit_bo *bos; struct etnaviv_gem_submit *submit; struct etnaviv_cmdbuf *cmdbuf;
@@ -347,11 +396,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, */ bos = drm_malloc_ab(args->nr_bos, sizeof(*bos)); relocs = drm_malloc_ab(args->nr_relocs, sizeof(*relocs));
pmrs = drm_malloc_ab(args->nr_pmrs, sizeof(*pmrs)); stream = drm_malloc_ab(1, args->stream_size); cmdbuf = etnaviv_cmdbuf_new(gpu->cmdbuf_suballoc, ALIGN(args->stream_size, 8) + 8,
args->nr_bos, 0);
if (!bos || !relocs || !stream || !cmdbuf) {
args->nr_bos, args->nr_pmrs);
if (!bos || !relocs || !pmrs || !stream || !cmdbuf) { ret = -ENOMEM; goto err_submit_cmds; }
@@ -373,6 +423,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, goto err_submit_cmds; }
ret = copy_from_user(pmrs, u64_to_user_ptr(args->pmrs),
args->nr_pmrs * sizeof(*pmrs));
if (ret) {
ret = -EFAULT;
goto err_submit_cmds;
}
ret = copy_from_user(stream, u64_to_user_ptr(args->stream), args->stream_size); if (ret) {
@@ -441,8 +498,13 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
ret = submit_perfmon_request(submit, cmdbuf, pmrs, args-
nr_pmrs);
if (ret)
goto out;
memcpy(cmdbuf->vaddr, stream, args->stream_size); cmdbuf->user_size = ALIGN(args->stream_size, 8);
cmdbuf->nr_pmrs = args->nr_pmrs;
Please move this below the copy_from_user of the pmrs.
OK
ret = etnaviv_gpu_submit(gpu, submit, cmdbuf); if (ret == 0)
@@ -494,6 +556,8 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, drm_free_large(bos); if (relocs) drm_free_large(relocs);
if (pmrs)
drm_free_large(pmrs); return ret;
}
greets -- Christian Gmeiner, MSc
https://www.youtube.com/user/AloryOFFICIAL https://soundcloud.com/christian-gmeiner
The signal gets sampled and stored in a bo at defined bo.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 16 ++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 3 +++ 2 files changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c index b8dc5fa..a8518bd 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c @@ -16,6 +16,7 @@ */
#include "etnaviv_gpu.h" +#include "etnaviv_perfmon.h"
struct etnaviv_pm_domain;
@@ -94,3 +95,18 @@ int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r)
return 0; } + +void etnaviv_perfmon_process(struct etnaviv_gpu *gpu, + const struct etnaviv_perfmon_request *pmr) +{ + const struct etnaviv_pm_domain *dom; + const struct etnaviv_pm_signal *sig; + u32 *bo = pmr->bo_vma; + u32 val; + + dom = &doms[pmr->domain]; + sig = &dom->signal[pmr->signal]; + val = sig->sample(gpu, dom, sig); + + *(bo + pmr->offset) = val; +} diff --git a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h index f20b69c..f9c8d7e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_perfmon.h @@ -42,4 +42,7 @@ int etnaviv_pm_query_sig(struct etnaviv_gpu *gpu,
int etnaviv_pm_req_validate(const struct drm_etnaviv_gem_submit_pmr *r);
+void etnaviv_perfmon_process(struct etnaviv_gpu *gpu, + const struct etnaviv_perfmon_request *pmr); + #endif /* __ETNAVIV_PERFMON_H__ */
In order to support performance counters in a sane way we need to provide a method to sync the GPU with the CPU. The GPU can process multpile command buffers/events per irq. With the help of a 'sync point' we can trigger an event and stop the GPU/FE immediately. When the CPU is done with is processing it simply needs to restart the FE and the GPU will continue.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 36 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 + drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 14 +++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 2 ++ 4 files changed, 53 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c index ed9588f..9e7098e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c @@ -250,6 +250,42 @@ void etnaviv_buffer_end(struct etnaviv_gpu *gpu) } }
+/* Append a 'sync point' to the ring buffer. */ +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event) +{ + struct etnaviv_cmdbuf *buffer = gpu->buffer; + unsigned int waitlink_offset = buffer->user_size - 16; + u32 dwords, target; + + /* + * We need at most 3 dwords in the return target: + * 1 event + 1 end + 1 wait + 1 link. + */ + dwords = 4; + target = etnaviv_buffer_reserve(gpu, buffer, dwords); + + /* Signal sync point event */ + CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) | + VIVS_GL_EVENT_FROM_PE); + + /* Stop the FE to 'pause' the GPU */ + CMD_END(buffer); + + /* Append waitlink */ + CMD_WAIT(buffer); + CMD_LINK(buffer, 2, etnaviv_cmdbuf_get_va(buffer) + + buffer->user_size - 4); + + /* + * Kick off the 'sync point' command by replacing the previous + * WAIT with a link to the address in the ring buffer. + */ + etnaviv_buffer_replace_wait(buffer, waitlink_offset, + VIV_FE_LINK_HEADER_OP_LINK | + VIV_FE_LINK_HEADER_PREFETCH(dwords), + target); +} + /* Append a command buffer to the ring buffer. */ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h index 058389f..f6cdd69 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h @@ -101,6 +101,7 @@ int etnaviv_gem_new_userptr(struct drm_device *dev, struct drm_file *file, u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu); u16 etnaviv_buffer_config_mmuv2(struct etnaviv_gpu *gpu, u32 mtlb_addr, u32 safe_addr); void etnaviv_buffer_end(struct etnaviv_gpu *gpu); +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event); void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf); void etnaviv_validate_init(void); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 037087e..803fcf4 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -25,6 +25,7 @@ #include "etnaviv_gpu.h" #include "etnaviv_gem.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h" #include "common.xml.h" #include "state.xml.h" #include "state_hi.xml.h" @@ -1349,6 +1350,7 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, }
gpu->event[event].fence = fence; + gpu->event[event].sync_point = NULL; submit->fence = dma_fence_get(fence); gpu->active_fence = submit->fence->seqno;
@@ -1394,6 +1396,15 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, return ret; }
+static void etnaviv_process_sync_point(struct etnaviv_gpu *gpu, + struct etnaviv_event *event) +{ + u32 addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS); + + event->sync_point(gpu, event); + etnaviv_gpu_start_fe(gpu, addr + 2, 2); +} + /* * Init/Cleanup: */ @@ -1440,6 +1451,9 @@ static irqreturn_t irq_handler(int irq, void *data)
dev_dbg(gpu->dev, "event %u\n", event);
+ if (gpu->event[event].sync_point) + etnaviv_process_sync_point(gpu, &gpu->event[event]); + fence = gpu->event[event].fence; gpu->event[event].fence = NULL; dma_fence_signal(fence); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h index 689cb8f..fee6ed9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h @@ -90,6 +90,8 @@ struct etnaviv_chip_identity { struct etnaviv_event { bool used; struct dma_fence *fence; + + void (*sync_point)(struct etnaviv_gpu *gpu, struct etnaviv_event *event); };
struct etnaviv_cmdbuf_suballoc;
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
In order to support performance counters in a sane way we need to provide a method to sync the GPU with the CPU. The GPU can process multpile command buffers/events per irq. With the help of a 'sync point' we can trigger an event and stop the GPU/FE immediately. When the CPU is done with is processing it simply needs to restart the FE and the GPU will continue.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 36 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 + drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 14 +++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 2 ++ 4 files changed, 53 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c index ed9588f..9e7098e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c @@ -250,6 +250,42 @@ void etnaviv_buffer_end(struct etnaviv_gpu *gpu) } } +/* Append a 'sync point' to the ring buffer. */ +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event) +{
- struct etnaviv_cmdbuf *buffer = gpu->buffer;
- unsigned int waitlink_offset = buffer->user_size - 16;
- u32 dwords, target;
- /*
- * We need at most 3 dwords in the return target:
- * 1 event + 1 end + 1 wait + 1 link.
- */
- dwords = 4;
- target = etnaviv_buffer_reserve(gpu, buffer, dwords);
- /* Signal sync point event */
- CMD_LOAD_STATE(buffer, VIVS_GL_EVENT,
VIVS_GL_EVENT_EVENT_ID(event) |
VIVS_GL_EVENT_FROM_PE);
- /* Stop the FE to 'pause' the GPU */
- CMD_END(buffer);
- /* Append waitlink */
- CMD_WAIT(buffer);
- CMD_LINK(buffer, 2, etnaviv_cmdbuf_get_va(buffer) +
buffer->user_size - 4);
- /*
- * Kick off the 'sync point' command by replacing the
previous
- * WAIT with a link to the address in the ring buffer.
- */
- etnaviv_buffer_replace_wait(buffer, waitlink_offset,
VIV_FE_LINK_HEADER_OP_LINK |
VIV_FE_LINK_HEADER_PREFETCH(dwor
ds),
target);
+}
/* Append a command buffer to the ring buffer. */ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h index 058389f..f6cdd69 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h @@ -101,6 +101,7 @@ int etnaviv_gem_new_userptr(struct drm_device *dev, struct drm_file *file, u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu); u16 etnaviv_buffer_config_mmuv2(struct etnaviv_gpu *gpu, u32 mtlb_addr, u32 safe_addr); void etnaviv_buffer_end(struct etnaviv_gpu *gpu); +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event); void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf); void etnaviv_validate_init(void); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 037087e..803fcf4 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -25,6 +25,7 @@ #include "etnaviv_gpu.h" #include "etnaviv_gem.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h" #include "common.xml.h" #include "state.xml.h" #include "state_hi.xml.h" @@ -1349,6 +1350,7 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, } gpu->event[event].fence = fence;
- gpu->event[event].sync_point = NULL;
submit->fence = dma_fence_get(fence); gpu->active_fence = submit->fence->seqno; @@ -1394,6 +1396,15 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, return ret; } +static void etnaviv_process_sync_point(struct etnaviv_gpu *gpu,
- struct etnaviv_event *event)
+{
- u32 addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
- event->sync_point(gpu, event);
- etnaviv_gpu_start_fe(gpu, addr + 2, 2);
+}
/* * Init/Cleanup: */ @@ -1440,6 +1451,9 @@ static irqreturn_t irq_handler(int irq, void *data) dev_dbg(gpu->dev, "event %u\n", event);
if (gpu->event[event].sync_point)
etnaviv_process_sync_point(gpu,
&gpu->event[event]);
Sorry, but this part is a no-go. This means you are doing all the PM register reads/writes (which might be many) from the IRQ handler. This is a crucial fast path in the driver, which affects the realtime capabilities of the whole system, not just the GPU driver. I'll reject any change which adds significant overhead here.
In general the sync point idea is fine, but what you might need to do here is move the PM register handling and FE restart to a work item, queued on the etnaviv WQ. This might add some latency to the GPU restart after a sync point, but at least it won't affect the system as a whole.
fence = gpu->event[event].fence; gpu->event[event].fence = NULL; dma_fence_signal(fence); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h index 689cb8f..fee6ed9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h @@ -90,6 +90,8 @@ struct etnaviv_chip_identity { struct etnaviv_event { bool used; struct dma_fence *fence;
- void (*sync_point)(struct etnaviv_gpu *gpu, struct
etnaviv_event *event); }; struct etnaviv_cmdbuf_suballoc;
2017-06-26 15:30 GMT+02:00 Lucas Stach l.stach@pengutronix.de:
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
In order to support performance counters in a sane way we need to provide a method to sync the GPU with the CPU. The GPU can process multpile command buffers/events per irq. With the help of a 'sync point' we can trigger an event and stop the GPU/FE immediately. When the CPU is done with is processing it simply needs to restart the FE and the GPU will continue.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 36 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 + drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 14 +++++++++++++ drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 2 ++ 4 files changed, 53 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c index ed9588f..9e7098e 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c @@ -250,6 +250,42 @@ void etnaviv_buffer_end(struct etnaviv_gpu *gpu) } }
+/* Append a 'sync point' to the ring buffer. */ +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event) +{
struct etnaviv_cmdbuf *buffer = gpu->buffer;
unsigned int waitlink_offset = buffer->user_size - 16;
u32 dwords, target;
/*
* We need at most 3 dwords in the return target:
* 1 event + 1 end + 1 wait + 1 link.
*/
dwords = 4;
target = etnaviv_buffer_reserve(gpu, buffer, dwords);
/* Signal sync point event */
CMD_LOAD_STATE(buffer, VIVS_GL_EVENT,
VIVS_GL_EVENT_EVENT_ID(event) |
VIVS_GL_EVENT_FROM_PE);
/* Stop the FE to 'pause' the GPU */
CMD_END(buffer);
/* Append waitlink */
CMD_WAIT(buffer);
CMD_LINK(buffer, 2, etnaviv_cmdbuf_get_va(buffer) +
buffer->user_size - 4);
/*
* Kick off the 'sync point' command by replacing the
previous
* WAIT with a link to the address in the ring buffer.
*/
etnaviv_buffer_replace_wait(buffer, waitlink_offset,
VIV_FE_LINK_HEADER_OP_LINK |
VIV_FE_LINK_HEADER_PREFETCH(dwor
ds),
target);
+}
/* Append a command buffer to the ring buffer. */ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h index 058389f..f6cdd69 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h @@ -101,6 +101,7 @@ int etnaviv_gem_new_userptr(struct drm_device *dev, struct drm_file *file, u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu); u16 etnaviv_buffer_config_mmuv2(struct etnaviv_gpu *gpu, u32 mtlb_addr, u32 safe_addr); void etnaviv_buffer_end(struct etnaviv_gpu *gpu); +void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event); void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, unsigned int event, struct etnaviv_cmdbuf *cmdbuf); void etnaviv_validate_init(void); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 037087e..803fcf4 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -25,6 +25,7 @@ #include "etnaviv_gpu.h" #include "etnaviv_gem.h" #include "etnaviv_mmu.h" +#include "etnaviv_perfmon.h" #include "common.xml.h" #include "state.xml.h" #include "state_hi.xml.h" @@ -1349,6 +1350,7 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, }
gpu->event[event].fence = fence;
gpu->event[event].sync_point = NULL; submit->fence = dma_fence_get(fence); gpu->active_fence = submit->fence->seqno;
@@ -1394,6 +1396,15 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, return ret; }
+static void etnaviv_process_sync_point(struct etnaviv_gpu *gpu,
struct etnaviv_event *event)
+{
u32 addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
event->sync_point(gpu, event);
etnaviv_gpu_start_fe(gpu, addr + 2, 2);
+}
/*
- Init/Cleanup:
*/ @@ -1440,6 +1451,9 @@ static irqreturn_t irq_handler(int irq, void *data)
dev_dbg(gpu->dev, "event %u\n", event);
if (gpu->event[event].sync_point)
etnaviv_process_sync_point(gpu,
&gpu->event[event]);
Sorry, but this part is a no-go. This means you are doing all the PM register reads/writes (which might be many) from the IRQ handler. This is a crucial fast path in the driver, which affects the realtime capabilities of the whole system, not just the GPU driver. I'll reject any change which adds significant overhead here.
In general the sync point idea is fine, but what you might need to do here is move the PM register handling and FE restart to a work item, queued on the etnaviv WQ. This might add some latency to the GPU restart after a sync point, but at least it won't affect the system as a whole.
I totally get your point and will switch to a work item.
fence = gpu->event[event].fence; gpu->event[event].fence = NULL; dma_fence_signal(fence);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h index 689cb8f..fee6ed9 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h @@ -90,6 +90,8 @@ struct etnaviv_chip_identity { struct etnaviv_event { bool used; struct dma_fence *fence;
void (*sync_point)(struct etnaviv_gpu *gpu, struct
etnaviv_event *event); };
struct etnaviv_cmdbuf_suballoc;
greets -- Christian Gmeiner, MSc
https://www.youtube.com/user/AloryOFFICIAL https://soundcloud.com/christian-gmeiner
Results in less code as there is no need to set every struct member to 0/NULL.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com --- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 803fcf4..0766861 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -1156,6 +1156,7 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu) /* find first free event */ for (i = 0; i < ARRAY_SIZE(gpu->event); i++) { if (gpu->event[i].used == false) { + memset(&gpu->event[i], 0, sizeof(struct etnaviv_event)); gpu->event[i].used = true; event = i; break; @@ -1350,7 +1351,6 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, }
gpu->event[event].fence = fence; - gpu->event[event].sync_point = NULL; submit->fence = dma_fence_get(fence); gpu->active_fence = submit->fence->seqno;
Am Freitag, den 09.06.2017, 12:21 +0200 schrieb Christian Gmeiner:
Results in less code as there is no need to set every struct member to 0/NULL.
Signed-off-by: Christian Gmeiner christian.gmeiner@gmail.com
Reviewed-by: Lucas Stach l.stach@pengutronix.de
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 803fcf4..0766861 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -1156,6 +1156,7 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu) /* find first free event */ for (i = 0; i < ARRAY_SIZE(gpu->event); i++) { if (gpu->event[i].used == false) {
memset(&gpu->event[i], 0, sizeof(struct
etnaviv_event)); gpu->event[i].used = true; event = i; break; @@ -1350,7 +1351,6 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu, } gpu->event[event].fence = fence;
- gpu->event[event].sync_point = NULL;
submit->fence = dma_fence_get(fence); gpu->active_fence = submit->fence->seqno;
dri-devel@lists.freedesktop.org