From: Yukimasa Sugizaki <ysugi(a)idein.jp>
Hi,
The current V3D scheduler has two issues: CSD jobs are resubmitted
regardless of the previous timed-out flag, and the timer is not
restarted for timed-out CL/CSD jobs that we want to keep running.
The second issue stems from a DRM scheduler API change and is fixed in a
similar way to [1]. This series also adds a kernel command-line option
to set the default timeout value.
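The two fixes can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `struct job_sketch`, `handle_timeout()` and `should_resubmit()` are hypothetical names, and a single progress counter stands in for whatever hardware state the driver actually compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one job; not the driver's real structure. */
struct job_sketch {
	uint32_t last_seen_progress;  /* progress value seen at the previous timeout */
	unsigned int timer_restarts;  /* how often the timeout timer was re-armed */
	bool guilty;                  /* set once the job timed out without progress */
};

enum timeout_action { TIMEOUT_KEEP_RUNNING, TIMEOUT_RESET };

/*
 * Called when the timeout fires. If the job advanced since the last
 * timeout, remember the new position and restart the timer; otherwise
 * mark the job guilty and ask for a reset.
 */
enum timeout_action handle_timeout(struct job_sketch *job, uint32_t progress_now)
{
	assert(job != NULL);

	if (job->last_seen_progress != progress_now) {
		job->last_seen_progress = progress_now;
		job->timer_restarts++;
		return TIMEOUT_KEEP_RUNNING;
	}

	job->guilty = true;
	return TIMEOUT_RESET;
}

/* After a reset, only jobs that were not found guilty are resubmitted. */
bool should_resubmit(const struct job_sketch *job)
{
	assert(job != NULL);
	return !job->guilty;
}
```

In this model a job that keeps advancing is never reset, while a job that stalls between two consecutive timeouts is reset once and then left out of resubmission.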
I tested this patchset with Piglit and our CSD programs in [2]. Because
it is hard to get the current upstream kernel to work on BCM2711, I used
the kernel from the rpi-5.8.y tree [3]. Some problems remain: some
Piglit tests take longer to finish (the total run time grew from 3610
minutes to 3650 minutes), and some result in invalid memory read
errors for unknown reasons:
[17086.230959] v3d fec00000.v3d: MMU error from client CLE (4) at 0xac1000, pte invalid
[17086.238722] v3d fec00000.v3d: MMU error from client CLE (4) at 0x1b61000, pte invalid
[18643.303188] v3d fec00000.v3d: MMU error from client L2T (0) at 0x15bff00, pte invalid
[18655.933748] v3d fec00000.v3d: MMU error from client L2T (0) at 0x15bff00, pte invalid
However, most of the CL/CSD programs are now working happily without
kernel warnings and errors.
Regards,
Sugizaki
[1] https://patchwork.kernel.org/patch/11732895/
[2] https://github.com/Idein/py-videocore6
[3] https://github.com/raspberrypi/linux/tree/rpi-5.8.y
Yukimasa Sugizaki (3):
drm/v3d: Don't resubmit guilty CSD jobs
drm/v3d: Correctly restart the timer when progress is made
drm/v3d: Add job timeout module param
drivers/gpu/drm/v3d/v3d_sched.c | 62 +++++++++++++++++++++++++++++++++--------
1 file changed, 51 insertions(+), 11 deletions(-)
--
2.7.4
This patchset adds support for simple-framebuffer platform devices and
a handover mechanism for native drivers to take over control of the
hardware.
The new driver, called simplekms, binds to a simple-framebuffer platform
device. The kernel's boot code creates such devices for firmware-provided
framebuffers, such as EFI-GOP or VESA. Typically the BIOS, UEFI or boot
loader sets up the framebuffers. Description via device tree is also an
option.
Simplekms is small enough to be linked into the kernel. The driver's main
purpose is to provide graphical output during the early phases of the boot
process, before the native DRM drivers are available. Native drivers are
typically loaded from an initrd ram disk. Occasionally simplekms can also
serve as an interim solution on graphics hardware without a native DRM driver.
So far distributions rely on fbdev drivers, such as efifb, vesafb or
simplefb, for early-boot graphical output. However fbdev is deprecated and
the drivers do not provide DRM interfaces for modern userspace.
Patches 1 and 2 prepare the DRM format helpers for simplekms.
Patches 3 to 7 add the simplekms driver. It's built on simple DRM helpers
and SHMEM. It supports 16-bit, 24-bit and 32-bit RGB framebuffers. During
pageflips, SHMEM buffers are copied into the framebuffer memory, similar
to cirrus or mgag200. The code in patches 6 and 7 handles clocks and
regulators. It's based on the simplefb drivers, but has been modified for
DRM.
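The pageflip copy described above has to respect that the SHMEM buffer and the firmware framebuffer may use different pitches (bytes per scanline), which is why the destination pitch needs to be passed along. A minimal userspace sketch of such a row-by-row copy follows; `copy_rows()` is a hypothetical name and not one of the DRM format helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `height` scanlines of `row_bytes` visible bytes each. Source and
 * destination rows start `src_pitch` and `dst_pitch` bytes apart, so the
 * two buffers may have different padding at the end of each line.
 */
void copy_rows(void *dst, size_t dst_pitch,
	       const void *src, size_t src_pitch,
	       size_t row_bytes, size_t height)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t y;

	/* A visible row must fit within one pitch on both sides. */
	assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);

	for (y = 0; y < height; y++) {
		memcpy(d, s, row_bytes);
		d += dst_pitch;
		s += src_pitch;
	}
}
```

Using the source pitch for both buffers would shear the image whenever the framebuffer's line length differs from that of the shadow buffer.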
Patches 8 and 9 add a hand-over mechanism. Simplekms acquires its
framebuffer's I/O-memory range and provides a callback function to be
removed by a native driver. The native driver will remove simplekms before
taking over the hardware. The removal is integrated into existing helpers,
so drivers use it automatically.
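The hand-over idea can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the interface added by the patches: the table, `aperture_acquire()`, `aperture_kick_out()` and `mark_detached()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OWNERS 4

/* Hypothetical record of one generic driver's claim on an I/O-memory range. */
struct aperture_owner {
	uint64_t base;
	uint64_t size;
	void (*detach)(void *data);  /* invoked when a native driver takes over */
	void *data;
	bool in_use;
};

static struct aperture_owner owners[MAX_OWNERS];

static bool ranges_overlap(uint64_t a_base, uint64_t a_size,
			   uint64_t b_base, uint64_t b_size)
{
	return a_base < b_base + b_size && b_base < a_base + a_size;
}

/* Generic driver side: claim a range and register a removal callback. */
int aperture_acquire(uint64_t base, uint64_t size,
		     void (*detach)(void *data), void *data)
{
	size_t i;

	assert(detach != NULL);

	for (i = 0; i < MAX_OWNERS; i++) {
		if (owners[i].in_use)
			continue;
		owners[i] = (struct aperture_owner){
			.base = base, .size = size,
			.detach = detach, .data = data, .in_use = true,
		};
		return 0;
	}
	return -1; /* table full */
}

/* Native driver side: remove every owner whose range overlaps ours. */
unsigned int aperture_kick_out(uint64_t base, uint64_t size)
{
	unsigned int removed = 0;
	size_t i;

	for (i = 0; i < MAX_OWNERS; i++) {
		if (!owners[i].in_use ||
		    !ranges_overlap(base, size, owners[i].base, owners[i].size))
			continue;
		owners[i].detach(owners[i].data);
		owners[i].in_use = false;
		removed++;
	}
	return removed;
}

/* Example callback: records that the generic driver was unbound. */
void mark_detached(void *data)
{
	*(bool *)data = true;
}
```

In this model the native driver calls `aperture_kick_out()` with its own memory range before touching the hardware, and any generic driver sitting on an overlapping range is detached first.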
I tested simplekms with x86 EFI and VESA framebuffers, which both work
reliably. The fbdev console and Weston work automatically. Xorg requires
manual configuration of the device. Xorg's current modesetting driver
cannot handle both a platform device and a PCI device for the same
physical hardware. Once configured, X11 works.
One cosmetic issue is that simplekms's device file is card0 and the
native driver's device file is card1. After simplekms has been kicked out,
only card1 is left. This does not seem to be a practical problem, however.
TODO/IDEAS:
* provide deferred takeover
* provide bootsplash DRM client
* make simplekms usable with ARM-EFI fbs
Thomas Zimmermann (9):
drm/format-helper: Pass destination pitch to drm_fb_memcpy_dstclip()
drm/format-helper: Add blitter functions
drm: Add simplekms driver
drm/simplekms: Add fbdev emulation
drm/simplekms: Initialize framebuffer data from device-tree node
drm/simplekms: Acquire clocks from DT device node
drm/simplekms: Acquire regulators from DT device node
drm: Add infrastructure for platform devices
drm/simplekms: Acquire memory aperture for framebuffer
MAINTAINERS | 6 +
drivers/gpu/drm/Kconfig | 6 +
drivers/gpu/drm/Makefile | 1 +
drivers/gpu/drm/drm_format_helper.c | 96 ++-
drivers/gpu/drm/drm_platform.c | 118 ++++
drivers/gpu/drm/mgag200/mgag200_mode.c | 2 +-
drivers/gpu/drm/tiny/Kconfig | 17 +
drivers/gpu/drm/tiny/Makefile | 1 +
drivers/gpu/drm/tiny/cirrus.c | 2 +-
drivers/gpu/drm/tiny/simplekms.c | 906 +++++++++++++++++++++++++
include/drm/drm_fb_helper.h | 18 +-
include/drm/drm_format_helper.h | 10 +-
include/drm/drm_platform.h | 42 ++
13 files changed, 1217 insertions(+), 8 deletions(-)
create mode 100644 drivers/gpu/drm/drm_platform.c
create mode 100644 drivers/gpu/drm/tiny/simplekms.c
create mode 100644 include/drm/drm_platform.h
--
2.27.0
https://bugzilla.kernel.org/show_bug.cgi?id=204241
Bug ID: 204241
Summary: amdgpu fails to resume from suspend
Product: Drivers
Version: 2.5
Kernel Version: 5.2.1-arch1-1-ARCH
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri(a)kernel-bugs.osdl.org
Reporter: kitaev(a)gmail.com
Regression: No
Created attachment 283863
--> https://bugzilla.kernel.org/attachment.cgi?id=283863&action=edit
dmesg
Computer fails to resume from suspend.
From the logs it looks like AMDGPU fails to resume.
--
You are receiving this mail because:
You are watching the assignee of the bug.
https://bugzilla.kernel.org/show_bug.cgi?id=202043
Bug ID: 202043
Summary: amdgpu: Vega 56 SCLK drops to 700 Mhz when
undervolting
Product: Drivers
Version: 2.5
Kernel Version: 4.19.8, 4.20.0-rc6
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri(a)kernel-bugs.osdl.org
Reporter: antifermion(a)protonmail.com
Regression: No
When undervolting my Sapphire Pulse Vega 56 by just 1mV, SCLK immediately drops
down to 700 MHz and pstate 1-2 under load (`gputest /test=fur /width=1920
/height=1080`).
Script to undervolt:
```
echo "s 7 1630 1199" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
```
Stock voltage would be 1200 on the Vega 64 Bios.
The same behavior can be observed with the stock Vega 56 Bios.
Undervolting the memory by 1mV results in similar behavior.
Overvolting by 1mV has no discernible effect.
`echo r > pp_od_clk_voltage` does not restore the normal behavior.
Instead, I need to use `echo "s 7 1630 1200" > pp_od_clk_voltage` as above.
Without undervolting, SCLK is around 1330 MHz, which matches the behavior on
Windows, where undervolting by around 150 mV is no problem and increases clock.
With an increased power limit of 300W, the clocks increase to around 1100 MHz
while the card uses the full 300W.
It even maxes that limit with a significant underclock/undervolt which would
pull around 200W on Windows.
I tested with current Manjaro (4.19.8-2-MANJARO), as well as Kubuntu 18.10 with
stock (4.18) and 4.20 from
https://github.com/M-Bab/linux-kernel-amdgpu-binaries.
--
You are receiving this mail because:
You are watching the assignee of the bug.
Hi all
Another update of my patch series to clamp down on a bunch of races and gaps
around follow_pfn and other access to iomem mmaps. Previous versions:
v1: https://lore.kernel.org/dri-devel/20201007164426.1812530-1-daniel.vetter@ff…
v2: https://lore.kernel.org/dri-devel/20201009075934.3509076-1-daniel.vetter@ff…
v3: https://lore.kernel.org/dri-devel/20201021085655.1192025-1-daniel.vetter@ff…
v4: https://lore.kernel.org/dri-devel/20201026105818.2585306-1-daniel.vetter@ff…
v5: https://lore.kernel.org/dri-devel/20201030100815.2269-1-daniel.vetter@ffwll…
v6: https://lore.kernel.org/dri-devel/20201119144146.1045202-1-daniel.vetter@ff…
And the discussion that sparked this journey:
https://lore.kernel.org/dri-devel/20201007164426.1812530-1-daniel.vetter@ff…
I think the first 12 patches are ready for landing. The parts starting
with "mm: Add unsafe_follow_pfn" probably need more baking time.
Andrew, can you please pick these up, or do you prefer I do a topic branch
and send them to Linus directly in the next merge window?
Changes in v7:
- more acks/reviews
- reordered with the ready pieces at the front
- simplified the new follow_pfn function as Jason suggested
Changes in v6:
- Tested v4l userptr as Tomasz suggested. No boom observed
- Added RFC for locking down follow_pfn, per discussion with Christoph and
Jason.
- Explain why pup_fast is safe in relevant patches; there was a bit of
confusion when discussing v5.
- Fix up the resource patch, with CONFIG_IO_STRICT_DEVMEM it crashed on
boot due to an unintended change (reported by John)
Changes in v5:
- Tomasz found some issues in the media patches
- Polish suggested by Christoph for the unsafe_follow_pfn patch
Changes in v4:
- Drop the s390 patch, that was very stand-alone and now queued up to land
through s390 trees.
- Comment polish per Dan's review.
Changes in v3:
- Bunch of polish all over, no functional changes aside from one barrier
in the resource code, for consistency.
- A few more r-b tags.
Changes in v2:
- tons of small polish&fixes all over, thanks to all the reviewers who
spotted issues
- I managed to test at least the generic_access_phys and pci mmap revoke
stuff with a few gdb sessions using our i915 debug tools (hence now also
the drm/i915 patch to properly request all the pci bar regions)
- reworked approach for the pci mmap revoke: Infrastructure moved into
kernel/resource.c, address_space mapping is now set up at open time for
everyone (which required some sysfs changes). Does indeed look a lot
cleaner and a lot less invasive than I feared at first.
Comments and review on the remaining bits very much welcome, especially
from the kvm and vfio side.
Cheers, Daniel
Daniel Vetter (17):
drm/exynos: Stop using frame_vector helpers
drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
misc/habana: Stop using frame_vector helpers
misc/habana: Use FOLL_LONGTERM for userptr
mm/frame-vector: Use FOLL_LONGTERM
media: videobuf2: Move frame_vector into media subsystem
mm: Close race in generic_access_phys
PCI: Obey iomem restrictions for procfs mmap
/dev/mem: Only set filp->f_mapping
resource: Move devmem revoke code to resource framework
sysfs: Support zapping of binary attr mmaps
PCI: Revoke mappings like devmem
mm: Add unsafe_follow_pfn
media/videobuf1|2: Mark follow_pfn usage as unsafe
vfio/type1: Mark follow_pfn as unsafe
kvm: pass kvm argument to follow_pfn callsites
mm: add mmu_notifier argument to follow_pfn
arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +-
arch/powerpc/kvm/e500_mmu_host.c | 2 +-
arch/x86/kvm/mmu/mmu.c | 8 +-
drivers/char/mem.c | 86 +-------------
drivers/gpu/drm/exynos/Kconfig | 1 -
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 ++++----
drivers/media/common/videobuf2/Kconfig | 1 -
drivers/media/common/videobuf2/Makefile | 1 +
.../media/common/videobuf2}/frame_vector.c | 57 ++++-----
.../media/common/videobuf2/videobuf2-memops.c | 3 +-
drivers/media/platform/omap/Kconfig | 1 -
drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
drivers/misc/habanalabs/Kconfig | 1 -
drivers/misc/habanalabs/common/habanalabs.h | 6 +-
drivers/misc/habanalabs/common/memory.c | 52 +++-----
drivers/pci/pci-sysfs.c | 4 +
drivers/pci/proc.c | 6 +
drivers/vfio/vfio_iommu_type1.c | 4 +-
fs/sysfs/file.c | 11 ++
include/linux/ioport.h | 6 +-
include/linux/kvm_host.h | 9 +-
include/linux/mm.h | 50 +-------
include/linux/sysfs.h | 2 +
include/media/frame_vector.h | 47 ++++++++
include/media/videobuf2-core.h | 1 +
kernel/resource.c | 98 ++++++++++++++-
mm/Kconfig | 3 -
mm/Makefile | 1 -
mm/memory.c | 112 +++++++++++++++---
mm/nommu.c | 16 ++-
security/Kconfig | 13 ++
virt/kvm/kvm_main.c | 56 +++++----
33 files changed, 413 insertions(+), 299 deletions(-)
rename {mm => drivers/media/common/videobuf2}/frame_vector.c (84%)
create mode 100644 include/media/frame_vector.h
--
2.29.2
pm_runtime_get_sync() increments the PM usage counter even when it
fails. Forgetting the matching put operation results in a
reference leak here.
pm_runtime_resume_and_get() was introduced in [0] to keep the usage
counter balanced, so fix the reference leak by replacing the call
with the new function.
[0] dd8088d5a896 ("PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter")
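The leak is easy to see in a userspace model of the usage counter. The sketch below is not kernel code: `struct fake_dev` and the `fake_*` functions are hypothetical stand-ins that only mimic the counting behaviour described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a runtime-PM usage counter. */
struct fake_dev {
	int usage_count;
	int resume_error;  /* 0 on success, negative errno to simulate failure */
};

/* Mimics pm_runtime_get_sync(): the counter is bumped even on failure. */
int fake_get_sync(struct fake_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
	return dev->resume_error;
}

/* Mimics dropping one reference without triggering an idle check. */
void fake_put_noidle(struct fake_dev *dev)
{
	assert(dev != NULL && dev->usage_count > 0);
	dev->usage_count--;
}

/* Mimics pm_runtime_resume_and_get(): undo the increment on failure. */
int fake_resume_and_get(struct fake_dev *dev)
{
	int ret = fake_get_sync(dev);

	if (ret < 0) {
		fake_put_noidle(dev);
		return ret;
	}
	return 0;
}
```

With a simulated `-EIO`, a caller that simply returns the error from `fake_get_sync()` leaves the counter at 1, whereas `fake_resume_and_get()` leaves it at 0.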
Fixes: 50de2e9ebbc0 ("drm/lima: enable runtime pm")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang(a)huawei.com>
---
drivers/gpu/drm/lima/lima_sched.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index dc6df9e9a..f6e7a88a5 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -200,7 +200,7 @@ static int lima_pm_busy(struct lima_device *ldev)
 	int ret;
 	/* resume GPU if it has been suspended by runtime PM */
-	ret = pm_runtime_get_sync(ldev->dev);
+	ret = pm_runtime_resume_and_get(ldev->dev);
 	if (ret < 0)
 		return ret;
--
2.23.0
Hey All,
So this is another revision of my patch series of performance
optimizations to the dma-buf system heap.
Unfortunately, in working these up, I realized the heap-helpers
infrastructure we tried to add to minimize code duplication is
not as generic as we intended. For some heaps it makes sense to
deal with page lists, for other heaps it makes more sense to
track things with sgtables.
So this series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked.
After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.
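As a rough illustration of that optimization, the userspace sketch below walks a list of attachments and only performs the (simulated) sync for those that currently hold a mapping. `struct attachment_sketch` and `sync_mapped_attachments()` are hypothetical names, and a counter stands in for the real cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-attachment state; the real structure is not shown here. */
struct attachment_sketch {
	bool mapped;              /* set while the device holds a DMA mapping */
	unsigned int sync_calls;  /* counts simulated cache-maintenance calls */
};

/* Sync only mapped attachments; returns how many syncs were performed. */
unsigned int sync_mapped_attachments(struct attachment_sketch *atts, size_t n)
{
	unsigned int synced = 0;
	size_t i;

	assert(atts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!atts[i].mapped)
			continue; /* nothing to sync without a mapping */
		atts[i].sync_calls++;
		synced++;
	}
	return synced;
}
```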
Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION code.
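To see why larger orders help, the following userspace sketch plans how a buffer would be split into chunks when the allocator always tries the largest order that still fits. The order list {8, 4, 0}, the 4 KiB page size and the name `plan_chunks()` are assumptions for illustration, and the model assumes every allocation succeeds.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL

/* Assumed order list, largest first: 1 MiB, 64 KiB and 4 KiB chunks. */
static const unsigned int sketch_orders[] = { 8, 4, 0 };
#define NUM_SKETCH_ORDERS (sizeof(sketch_orders) / sizeof(sketch_orders[0]))

/*
 * Split `size` bytes (rounded up to whole pages) into chunks, always
 * taking the largest order that fits into what is left. counts[i]
 * receives the number of chunks of order sketch_orders[i]; the return
 * value is the total number of chunks.
 */
size_t plan_chunks(unsigned long size, size_t counts[NUM_SKETCH_ORDERS])
{
	unsigned long remaining =
		(size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
	size_t total = 0;
	size_t i;

	assert(counts != NULL);

	for (i = 0; i < NUM_SKETCH_ORDERS; i++)
		counts[i] = 0;

	while (remaining > 0) {
		for (i = 0; i < NUM_SKETCH_ORDERS; i++) {
			unsigned long chunk = SKETCH_PAGE_SIZE << sketch_orders[i];

			if (remaining < chunk)
				continue; /* too big for what is left, try a smaller order */
			counts[i]++;
			total++;
			remaining -= chunk;
			break;
		}
	}
	return total;
}
```

Under these assumptions a 2 MiB buffer is covered by 2 chunks instead of 512 single pages, which means far fewer allocator calls and far fewer scatterlist entries to walk.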
Unfortunately, after submitting the last round, I realized that
part of the reason the page-pooling patch I had included was
providing such great performance numbers, was because the
network page-pool implementation doesn't zero pages that it
pulls from the cache. This is very inappropriate for buffers we
pass to userland and was what gave it an unfair advantage
(almost constant time performance) relative to ION's allocation
performance numbers. I added some patches to zero the buffers
manually similar to how ION does it, but I found this resulted
in basically no performance improvement from the standard page
allocator. Thus I've dropped that patch in this series for now.
Unfortunately this means we still have a performance delta from
the ION system heap as measured by my microbenchmark, and this
delta comes from ION system_heap's use of deferred freeing of
pages. So less work is done in the measured interval of the
microbenchmark. I'll be looking at adding similar code
eventually but I don't want to hold the rest of the patches up
on this, as it is still a good improvement over the current
code.
I've updated the chart I shared earlier with current numbers
(including with the unsubmitted net pagepool implementation, and
with a different unsubmitted pagepool implementation borrowed
from ION) here:
https://docs.google.com/spreadsheets/d/1-1C8ZQpmkl_0DISkI6z4xelE08MlNAN7oEu…
I did add to this series a reworked version of my uncached
system heap implementation I was submitting a few weeks back.
Since it duplicated a lot of the now reworked system heap code,
I realized it would be much simpler to add the functionality to
the system_heap implementation itself.
While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.
Feedback on these would be great!
thanks
-john
New in v3:
* Dropped page-pool patches as after correcting the code to
zero buffers, they provided no net performance gain.
* Added system-uncached implementation on top of reworked
system-heap.
* Use the new sgtable mapping functions, in the system and cma
code as Suggested-by: Daniel Mentz <danielmentz(a)google.com>
* Cleanup: Use page_size() rather than open-coding it
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Liam Mark <lmark(a)codeaurora.org>
Cc: Laura Abbott <labbott(a)kernel.org>
Cc: Brian Starkey <Brian.Starkey(a)arm.com>
Cc: Hridya Valsaraju <hridya(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Sandeep Patil <sspatil(a)google.com>
Cc: Daniel Mentz <danielmentz(a)google.com>
Cc: Chris Goldsworthy <cgoldswo(a)codeaurora.org>
Cc: Ørjan Eide <orjan.eide(a)arm.com>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Ezequiel Garcia <ezequiel(a)collabora.com>
Cc: Simon Ser <contact(a)emersion.fr>
Cc: James Jones <jajones(a)nvidia.com>
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
John Stultz (7):
dma-buf: system_heap: Rework system heap to use sgtables instead of
pagelists
dma-buf: heaps: Move heap-helper logic into the cma_heap
implementation
dma-buf: heaps: Remove heap-helpers code
dma-buf: heaps: Skip sync if not mapped
dma-buf: system_heap: Allocate higher order pages if available
dma-buf: dma-heap: Keep track of the heap device struct
dma-buf: system_heap: Add a system-uncached heap re-using the system
heap
drivers/dma-buf/dma-heap.c | 33 +-
drivers/dma-buf/heaps/Makefile | 1 -
drivers/dma-buf/heaps/cma_heap.c | 327 +++++++++++++++---
drivers/dma-buf/heaps/heap-helpers.c | 271 ---------------
drivers/dma-buf/heaps/heap-helpers.h | 53 ---
drivers/dma-buf/heaps/system_heap.c | 480 ++++++++++++++++++++++++---
include/linux/dma-heap.h | 9 +
7 files changed, 741 insertions(+), 433 deletions(-)
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h
--
2.17.1