This reverts commit aafa025c76dcc7d1a8c8f0bdefcbe4eb480b2f6a. That commit
attempted to fix a NULL pointer dereference, caused by the struct fb_info
associated with a framebuffer device no longer being valid when the file
descriptor was closed.
The issue was exposed by commit 27599aacbaef ("fbdev: Hot-unplug firmware
fb devices on forced removal"), which added a new path that goes through
the struct device removal instead of directly unregistering the fb.
Most fbdev drivers have issues with the fb_info lifetime, because they call
framebuffer_release() from their driver's .remove callback, rather than
doing so from the fbops.fb_destroy callback. This meant that due to this
switch, the fb_info was now destroyed too early, while references still
existed, whereas before it was simply leaked.
The patch we're reverting here reinstated that leak, hence "fixed" the
regression. But the proper solution is to fix the drivers to not release
the fb_info too soon.
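As a rough illustration of the lifetime rule described above (this is a
userspace sketch, not the fbdev code), the toy below frees its fb_info
stand-in only from a destroy callback that runs when the last reference is
dropped. Driver removal then only drops the registration reference, so an
open file descriptor keeps the structure valid. All toy_* names are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fb_info; not the kernel structure. */
struct toy_fb_info {
	int refcount;	/* registration reference + one per open file */
	void (*destroy)(struct toy_fb_info *info);
};

/* Counts how many toy_fb_info objects have been freed (for the demo). */
int toy_fb_freed;

/* Plays the role of an fb_destroy callback: the only place that frees. */
void toy_fb_destroy(struct toy_fb_info *info)
{
	free(info);
	toy_fb_freed++;
}

struct toy_fb_info *toy_fb_alloc(void)
{
	struct toy_fb_info *info = calloc(1, sizeof(*info));

	if (!info)
		return NULL;
	info->refcount = 1;	/* reference held by the registration */
	info->destroy = toy_fb_destroy;
	return info;
}

/* Taken when userspace opens the device. */
void toy_fb_get(struct toy_fb_info *info)
{
	info->refcount++;
}

/* Dropped on close or unregistration; frees on the last reference. */
void toy_fb_put(struct toy_fb_info *info)
{
	assert(info->refcount > 0);
	if (--info->refcount == 0)
		info->destroy(info);
}

/* Driver removal drops only its own reference; it never frees directly. */
void toy_driver_remove(struct toy_fb_info *info)
{
	toy_fb_put(info);
}
```

In this sketch, removing the driver while the device is still open leaves
the structure alive until the final toy_fb_put(), which is neither the
early free nor the leak discussed above.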
Suggested-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm(a)redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
---
Changes in v2:
- Add more info in the commit message about why it's crashing and how
the reverted commit was papering over the issue (Daniel Vetter).
- Add Daniel Vetter's Reviewed-by tag.
drivers/video/fbdev/core/fbmem.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 97eb0dee411c..a6bb0e438216 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1434,10 +1434,7 @@ fb_release(struct inode *inode, struct file *file)
__acquires(&info->lock)
__releases(&info->lock)
{
- struct fb_info * const info = file_fb_info(file);
-
- if (!info)
- return -ENODEV;
+ struct fb_info * const info = file->private_data;
lock_fb_info(info);
if (info->fbops->fb_release)
--
2.35.1
This series makes a handful of updates to i915's internal handling of
slice/subslice/EU (SSEU) data to handle recent platforms like Xe_HP in a
more natural manner and to prepare for some additional upcoming
platforms we have in the pipeline (the first of which I'll probably
start sending patches for in the next week or two). One key idea of
this series is that although we have a fixed ABI to convey SSEU data to
userspace (i.e., multiple u8[] arrays with data stored at different
strides), we don't need to use this cumbersome format for the driver's
own internal storage. As long as we can convert into the uapi form
properly when responding to the I915_QUERY ioctl, it's preferable to use
an internal storage format that's easier for the driver to work with.
Doing so can also save us some storage space on modern platforms since
we don't always need to replicate a bunch of data that's architecturally
guaranteed to be identical.
Another key point here is that Xe_HP platforms today have subslice (DSS)
masks that are 32 bits, which maxes out the storage of a u32. On future
platforms the architecture design is going to start spreading their DSS
masks over multiple 32-bit fuse registers. So even for platforms where
the total number of DSS doesn't actually go up, we're going to need
larger storage than just a u32 to express the mask properly. To
accommodate this, we start storing our subslice mask in a new typedef
that can be processed by the linux/bitmap.h operations.
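To make the storage problem concrete, here is a minimal userspace sketch of
a subslice mask wider than 32 bits, assembled from several 32-bit fuse
values. It only mimics the idea of a bitmap-backed typedef; the names
toy_ss_mask_t, TOY_MAX_DSS and the toy_mask_* helpers are assumptions made
for illustration and are not the types or functions used by the series.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DSS		128	/* hypothetical upper bound */
#define TOY_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS \
	((TOY_MAX_DSS + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD)

/* A mask stored as an array of words, similar in spirit to a kernel bitmap. */
typedef struct {
	unsigned long bits[TOY_WORDS];
} toy_ss_mask_t;

/* Set one bit; out-of-range bits are silently ignored. */
void toy_mask_set(toy_ss_mask_t *m, unsigned int bit)
{
	assert(m != NULL);
	if (bit >= TOY_MAX_DSS)
		return;
	m->bits[bit / TOY_BITS_PER_WORD] |= 1UL << (bit % TOY_BITS_PER_WORD);
}

bool toy_mask_test(const toy_ss_mask_t *m, unsigned int bit)
{
	if (bit >= TOY_MAX_DSS)
		return false;
	return (m->bits[bit / TOY_BITS_PER_WORD] >>
		(bit % TOY_BITS_PER_WORD)) & 1UL;
}

/* Count the set bits across all words. */
unsigned int toy_mask_weight(const toy_ss_mask_t *m)
{
	unsigned int count = 0;

	for (size_t i = 0; i < TOY_WORDS; i++) {
		unsigned long w = m->bits[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			count++;
		}
	}
	return count;
}

/* Merge one 32-bit fuse register value into the mask at a bit offset. */
void toy_mask_load_fuse(toy_ss_mask_t *m, unsigned int offset, uint32_t fuse)
{
	for (unsigned int i = 0; i < 32; i++)
		if (fuse & (UINT32_C(1) << i))
			toy_mask_set(m, offset + i);
}
```

Loading a second fuse value at offset 32 is the case a plain u32 cannot
represent, even when the total number of set bits stays small.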
Finally, since no userspace for Xe_HP or beyond is using the legacy
I915_GETPARAM ioctl lookups for I915_PARAM_SLICE_MASK and
I915_PARAM_SUBSLICE_MASK (since they've migrated to the more flexible
I915_QUERY ioctl that can return more than a simple u32 value), we take
the opportunity to officially drop support for those GETPARAM lookups on
modern platforms. Maintaining support for these GETPARAM lookups doesn't
make sense for a number of reasons:
* Traditional slices no longer exist, and newer ideas like gslices,
cslices, mslices, etc. aren't something userspace needs to query
since it can be inferred from other information.
* The GETPARAM ioctl doesn't have a way to distinguish between geometry
subslice masks and compute subslice masks, which are distinct on
Xe_HP and beyond.
* The I915_GETPARAM ioctl is limited to returning a 32-bit value, so
when subslice masks begin to exceed 32 bits, it simply can't return
the entire mask.
* The GETPARAM ioctl doesn't have a way to give sensible information
for multi-tile devices.
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Matt Roper (5):
drm/i915/sseu: Don't try to store EU mask internally in UAPI format
drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK
drm/i915/xehp: Use separate sseu init function
drm/i915/sseu: Simplify gen11+ SSEU handling
drm/i915/sseu: Disassociate internal subslice mask representation from
uapi
drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 +-
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
drivers/gpu/drm/i915/gt/intel_gt.c | 14 +-
drivers/gpu/drm/i915/gt/intel_sseu.c | 371 +++++++++++--------
drivers/gpu/drm/i915/gt/intel_sseu.h | 69 ++--
drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 28 +-
drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +-
drivers/gpu/drm/i915/i915_getparam.c | 10 +-
drivers/gpu/drm/i915/i915_query.c | 16 +-
9 files changed, 323 insertions(+), 219 deletions(-)
--
2.35.1
Many Xen PV frontends share similar code for setting up a ring page
(allocating and granting access for the backend) and for tearing it
down.
Create new service functions doing all needed steps in one go.
This requires all frontends to use a common value for an invalid
grant reference in order to make the functions idempotent.
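The role of the common invalid value can be sketched in userspace as
follows. This is not the xenbus API: toy_setup_ring(), toy_teardown_ring(),
TOY_INVALID_GRANT_REF and the fake grant counter are all hypothetical, and
the sentinel value chosen here is an assumption. The point is only that a
well-known "no grant" marker lets teardown be called any number of times,
including after a failed setup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "no grant reference held". */
#define TOY_INVALID_GRANT_REF	((uint32_t)-1)
#define TOY_PAGE_SIZE		4096

struct toy_ring {
	void *page;	/* the shared ring page */
	uint32_t gref;	/* grant reference handed to the backend */
};

/* Fake grant bookkeeping standing in for the real grant table. */
uint32_t toy_next_gref = 8;
int toy_grants_outstanding;

/* Allocate the ring page and "grant" it in one go. */
int toy_setup_ring(struct toy_ring *r)
{
	assert(r != NULL);
	r->gref = TOY_INVALID_GRANT_REF;
	r->page = calloc(1, TOY_PAGE_SIZE);
	if (!r->page)
		return -1;
	r->gref = toy_next_gref++;
	toy_grants_outstanding++;
	return 0;
}

/* Undo setup; safe to call repeatedly thanks to the sentinel. */
void toy_teardown_ring(struct toy_ring *r)
{
	assert(r != NULL);
	if (r->gref != TOY_INVALID_GRANT_REF) {
		toy_grants_outstanding--;	/* "end foreign access" */
		r->gref = TOY_INVALID_GRANT_REF;
	}
	free(r->page);	/* free(NULL) is a no-op */
	r->page = NULL;
}
```

Because every caller compares against the same sentinel, a second
teardown finds nothing left to release and does no harm.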
Changes in V3:
- new patches 1 and 2, comments addressed
Changes in V2:
- new patch 9 and related changes in patches 10-18
Juergen Gross (21):
xen: update grant_table.h
xen/grant-table: never put a reserved grant on the free list
xen/blkfront: switch blkfront to use INVALID_GRANT_REF
xen/netfront: switch netfront to use INVALID_GRANT_REF
xen/scsifront: remove unused GRANT_INVALID_REF definition
xen/usb: switch xen-hcd to use INVALID_GRANT_REF
xen/drm: switch xen_drm_front to use INVALID_GRANT_REF
xen/sound: switch xen_snd_front to use INVALID_GRANT_REF
xen/dmabuf: switch gntdev-dmabuf to use INVALID_GRANT_REF
xen/shbuf: switch xen-front-pgdir-shbuf to use INVALID_GRANT_REF
xen: update ring.h
xen/xenbus: add xenbus_setup_ring() service function
xen/blkfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/netfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/tpmfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/drmfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/pcifront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/scsifront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/usbfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/sndfront: use xenbus_setup_ring() and xenbus_teardown_ring()
xen/xenbus: eliminate xenbus_grant_ring()
drivers/block/xen-blkfront.c | 57 +++----
drivers/char/tpm/xen-tpmfront.c | 18 +--
drivers/gpu/drm/xen/xen_drm_front.h | 9 --
drivers/gpu/drm/xen/xen_drm_front_evtchnl.c | 43 ++----
drivers/net/xen-netfront.c | 85 ++++-------
drivers/pci/xen-pcifront.c | 19 +--
drivers/scsi/xen-scsifront.c | 31 +---
drivers/usb/host/xen-hcd.c | 65 ++------
drivers/xen/gntdev-dmabuf.c | 13 +-
drivers/xen/grant-table.c | 12 +-
drivers/xen/xen-front-pgdir-shbuf.c | 18 +--
drivers/xen/xenbus/xenbus_client.c | 82 +++++++---
include/xen/grant_table.h | 2 -
include/xen/interface/grant_table.h | 161 ++++++++++++--------
include/xen/interface/io/ring.h | 19 ++-
include/xen/xenbus.h | 4 +-
sound/xen/xen_snd_front_evtchnl.c | 44 ++----
sound/xen/xen_snd_front_evtchnl.h | 9 --
18 files changed, 287 insertions(+), 404 deletions(-)
--
2.35.3
From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
We have a statement from HW designers that the GPU read regression when
using 2M pages was fixed from Icelake onwards, which was also confirmed
by benchmarking Eero did last year:
"""
When IOMMU is disabled, enabling THP causes following perf changes on
TGL-H (GT1):
10-15% SynMark Batch[0-3]
5-10% MemBW GPU texture, SynMark ShMapVsm
3-5% SynMark TerrainFly* + Geom* + Fill* + CSCloth + Batch4
1-3% GpuTest Triangle, SynMark TexMem* + DeferredAA + Batch[5-7]
+ few others
-7% MemBW GPU blend
In the above 3D benchmark names, * means all the variants of tests with
the same prefix. For example "SynMark TexMem*", means both TexMem128 &
TexMem512 tests in the synthetic (Intel internal) SynMark test suite.
In the (public, but proprietary) GfxBench & GLB(enchmark) test suites,
there are both onscreen and offscreen variants of each test. Unless
explicitly stated otherwise, numbers are for both variants.
All tests are run with FullHD monitor. All tests are fullscreen except
for GLB and GpuTest ones, which are run in 1/2 screen window (GpuTest
triangle is run both in fullscreen and 1/2 screen window).
"""
Since the only regression is MemBW GPU blend, against many more gains,
it sounds like it is time to enable THP on Gen11+.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/430
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Eero Tamminen <eero.t.tamminen(a)intel.com>
---
drivers/gpu/drm/i915/gem/i915_gemfs.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index ee87874e59dc..c5a6bbc842fc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -28,12 +28,14 @@ int i915_gemfs_init(struct drm_i915_private *i915)
*
* One example, although it is probably better with a per-file
* control, is selecting huge page allocations ("huge=within_size").
- * However, we only do so to offset the overhead of iommu lookups
- * due to bandwidth issues (slow reads) on Broadwell+.
+ * However, we only do so on platforms which benefit from it, or to
+ * offset the overhead of iommu lookups, where with latter it is a net
+ * win even on platforms which would otherwise see some performance
+ * regressions such a slow reads issue on Broadwell and Skylake.
*/
opts = NULL;
- if (i915_vtd_active(i915)) {
+ if (GRAPHICS_VER(i915) >= 11 || i915_vtd_active(i915)) {
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
opts = huge_opt;
drm_info(&i915->drm,
@@ -41,7 +43,10 @@ int i915_gemfs_init(struct drm_i915_private *i915)
opts);
} else {
drm_notice(&i915->drm,
- "Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
+ "Transparent Hugepage support is recommended for optimal performance%s\n",
+ GRAPHICS_VER(i915) >= 11 ?
+ " on this platform!" :
+ " when IOMMU is enabled!");
}
}
--
2.32.0
Hello,
I'm trying to understand why our application does not play well with
the i915 driver. We're trying to drive a 4k/60Hz display using an
OpenGL ES application, the GL context being derived directly from the
DRM device. Depending on the GPU load we generate, we quite frequently
manage to get stuck with a GPU clocked just a smidge too low to
actually hit the 60Hz target.
E.g. right now I'm looking at less than ideal animations because the
time to complete a frame is 17.1ms, causing every other vsync to be
missed. This is while the GPU, according to intel_gpu_top, is 50.4%
busy with rendering and the video decoder is at 6.7%. I'm pretty sure
there is room for improvement because decoding a higher bitrate video
*will* cause the GPU to clock up, then hitting the 60Hz no problem.
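For reference, the arithmetic behind "every other vsync" can be written
down as a small sketch. It assumes a strictly vsync-locked presentation
with no frame pipelining, which may not match what the driver and
compositor actually do; the function names are made up for illustration.

```c
#include <assert.h>

/*
 * Number of refresh intervals one frame occupies when presentation is
 * locked to vsync: the frame time rounded up to whole refresh periods.
 */
unsigned int toy_vsyncs_per_frame(unsigned int frame_us,
				  unsigned int refresh_hz)
{
	unsigned int period_us, n;

	assert(refresh_hz > 0 && refresh_hz <= 1000000u);
	period_us = 1000000u / refresh_hz;	/* 16666 us at 60 Hz */
	n = (frame_us + period_us - 1) / period_us;
	return n ? n : 1;	/* even an instant frame waits for one vsync */
}

/* Resulting frame rate once missed vsyncs are accounted for. */
unsigned int toy_effective_fps(unsigned int frame_us, unsigned int refresh_hz)
{
	return refresh_hz / toy_vsyncs_per_frame(frame_us, refresh_hz);
}
```

With a frame time of 17.1 ms on a 60 Hz display, the budget of roughly
16.67 ms is exceeded by under half a millisecond, yet each frame takes two
refresh intervals and the effective rate drops to 30 frames per second.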
While investigating this I discovered the concept of "wait boost",
although I'm not really sure what triggers it. To my understanding
there are two ways to benefit from it:
* calling glFinish() (gave it a try, makes no difference)
* tying the render job to a pending vsync event using an
IN_FENCE_FD, we do that but it does not help either
Manual intervention using /sys/class/drm/card0/gt_min_freq_mhz is not
very effective either, causing the long-term average clock to jump to
501 MHz, up from 495 MHz. Those extra 6 MHz really don't cut it. The
reported maximum is 900 MHz, which was also written to gt_min_freq_mhz.
The output of /sys/kernel/debug/dri/0/i915_rps_boost_info alternates
between two states:
> RPS enabled? yes
> RPS active? no
> GPU busy? no
> Boosts outstanding? 0
> Interactive? 2
> Frequency requested 867, actual 300
> min hard:300, soft:900; max soft:900, hard:900
> idle:300, efficient:300, boost:900
> Wait boosts: 0
and
> RPS enabled? yes
> RPS active? yes
> GPU busy? yes
> Boosts outstanding? 0
> Interactive? 2
> Frequency requested 900, actual 900
> min hard:300, soft:900; max soft:900, hard:900
> idle:300, efficient:300, boost:900
> Wait boosts: 0
This is on Linux 5.17.5 with Mesa 22.0.2, but we have been seeing this
problem for at least two years, so the exact version does not seem to
matter very much. It may be worth mentioning that this behavior can be
reproduced using the Weston Wayland compositor, and according to some
mailing list there are Kodi users suffering from this too.
I'm really out of ideas here, is there anything obviously wrong with our
approach?
Kind regards,
-Matthias
This is not the full series; if you want that, look for v11.
This series merely has a last-minute change: The VOP2 driver used
platform_get_resource_byname() to get its registers, but the reg-names
property hasn't been documented in the binding. This series adds the
missing documentation and along the way renames the generic "regs"
name to "vop" and "gamma_lut" to "gamma-lut".
Sascha
Andy Yan (1):
drm: rockchip: Add VOP2 driver
Sascha Hauer (2):
arm64: dts: rockchip: rk356x: Add VOP2 nodes
dt-bindings: display: rockchip: Add binding for VOP2
.../display/rockchip/rockchip-vop2.yaml | 146 +
arch/arm64/boot/dts/rockchip/rk3566.dtsi | 4 +
arch/arm64/boot/dts/rockchip/rk3568.dtsi | 4 +
arch/arm64/boot/dts/rockchip/rk356x.dtsi | 51 +
drivers/gpu/drm/rockchip/Kconfig | 6 +
drivers/gpu/drm/rockchip/Makefile | 1 +
drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 1 +
drivers/gpu/drm/rockchip/rockchip_drm_drv.h | 6 +-
drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 2 +
drivers/gpu/drm/rockchip/rockchip_drm_vop.h | 15 +
drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 2706 +++++++++++++++++
drivers/gpu/drm/rockchip/rockchip_drm_vop2.h | 477 +++
drivers/gpu/drm/rockchip/rockchip_vop2_reg.c | 281 ++
include/dt-bindings/soc/rockchip,vop2.h | 14 +
14 files changed, 3713 insertions(+), 1 deletion(-)
create mode 100644 Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.yaml
create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_vop2.c
create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_vop2.h
create mode 100644 drivers/gpu/drm/rockchip/rockchip_vop2_reg.c
create mode 100644 include/dt-bindings/soc/rockchip,vop2.h
--
2.30.2