Hi
On 02.08.19 at 09:11, Rong Chen wrote:
Hi,
On 8/1/19 7:58 PM, Thomas Zimmermann wrote:
Hi
On 01.08.19 at 13:25, Feng Tang wrote:
Hi Thomas,
On Thu, Aug 01, 2019 at 11:59:28AM +0200, Thomas Zimmermann wrote:
Hi
On 01.08.19 at 10:37, Feng Tang wrote:
On Thu, Aug 01, 2019 at 02:19:53PM +0800, Rong Chen wrote:
>>>>>>>>> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4
>>>>>>>>> ("drm/mgag200: Replace struct mga_fbdev with generic
>>>>>>>>> framebuffer emulation")
>>>>>>>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
>>>>>>>>> master
>>>>>>>> Daniel, Noralf, we may have to revert this patch.
>>>>>>>>
>>>>>>>> I expected some change in display performance, but not in VM.
>>>>>>>> Since it's a server chipset, probably no one cares much about
>>>>>>>> display performance. So that seemed like a good trade-off for
>>>>>>>> re-using shared code.
>>>>>>>>
>>>>>>>> Part of the patch set is that the generic fb emulation now maps
>>>>>>>> and unmaps the fbdev BO when updating the screen. I guess that's
>>>>>>>> the cause of the performance regression. And it should be
>>>>>>>> visible with other drivers as well if they use a shadow FB for
>>>>>>>> fbdev emulation.
>>>>>>> For fbcon we shouldn't need to do any maps/unmaps at all, this is
>>>>>>> for the fbdev mmap support only. If the testcase mentioned here
>>>>>>> tests fbdev mmap handling it's pretty badly misnamed :-) And as
>>>>>>> long as you don't have an fbdev mmap there shouldn't be any
>>>>>>> impact at all.
>>>>>> The ast and mgag200 have only a few MiB of VRAM, so we have to get
>>>>>> the fbdev BO out if it's not being displayed. If not being mapped,
>>>>>> it can be evicted and make room for X, etc.
>>>>>>
>>>>>> To make this work, the BO's memory is mapped and unmapped in
>>>>>> drm_fb_helper_dirty_work() before being updated from the shadow
>>>>>> FB. [1] That fbdev mapping is established on each screen update,
>>>>>> more or less. From my (yet unverified) understanding, this causes
>>>>>> the performance regression in the VM code.
>>>>>>
>>>>>> The original code in mgag200 used to kmap the fbdev BO while it's
>>>>>> being displayed; [2] and the drawing code only mapped it when
>>>>>> necessary (i.e., not being displayed). [3]
>>>>> Hm yeah, this vmap/vunmap is going to be pretty bad. We indeed
>>>>> should cache this.
>>>>>
>>>>>> I think this could be added for VRAM helpers as well, but it's
>>>>>> still a workaround and non-VRAM drivers might also run into such a
>>>>>> performance regression if they use the fbdev's shadow fb.
>>>>> Yeah agreed, fbdev emulation should try to cache the vmap.
>>>>>
>>>>>> Noralf mentioned that there are plans for other DRM clients
>>>>>> besides the console. They would as well run into similar problems.
>>>>>>
>>>>>>>> The thing is that we'd need another generic fbdev emulation for
>>>>>>>> ast and mgag200 that handles this issue properly.
>>>>>>> Yeah I don't think we want to jump the gun here. If you can try
>>>>>>> to repro locally and profile where we're wasting cpu time I hope
>>>>>>> that should shed a light on what's going wrong here.
>>>>>> I don't have much time ATM and I'm not even officially at work
>>>>>> until late Aug. I'd send you the revert and investigate later. I
>>>>>> agree that using generic fbdev emulation would be preferable.
>>>>> Still not sure that's the right thing to do really. Yes it's a
>>>>> regression, but vm testcases shouldn't run a single line of fbcon
>>>>> or drm code. So why this is impacted so heavily by a silly drm
>>>>> change is very confusing to me. We might be papering over a deeper
>>>>> and much more serious issue ...
>>>> It's a regression, the right thing is to revert first and then work
>>>> out the right thing to do.
>>> Sure, but I have no idea whether the testcase is doing something
>>> reasonable. If it's accidentally testing vm scalability of fbdev and
>>> there's no one else doing something this pointless, then it's not a
>>> real bug. Plus I think we're shooting the messenger here.
>>>
>>>> It's likely the test runs on the console and printfs stuff out
>>>> while running.
>>> But why did we not regress the world if a few prints on the console
>>> have such a huge impact? We didn't get an entire stream of mails
>>> about breaking stuff ...
>> The regression seems not related to the commit. But we have retested
>> and confirmed the regression. Hard to understand what happens.
> Does the regressed test cause any output on console while it's
> measuring? If so, it's probably accidentally measuring fbcon/DRM code
> in addition to the workload it's trying to measure.
>
Sorry, I'm not familiar with DRM. We enabled the console to output logs; please find the log file attached.
"Command line: ... console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw"
We did more checks and found that this test machine does use the mgag200 driver.
And we suspect the regression is caused by
commit cf1ca9aeb930df074bb5bbcde55f935fec04e529
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Wed Jul 3 09:58:24 2019 +0200
Yes, that's the commit. Unfortunately, reverting it would require reverting a handful of other patches as well.
I have a potential fix for the problem. Could you run and verify that it resolves the problem?
Sure, please send it to us. Rong and I will try it.
Fantastic, thank you! The patch set is available on dri-devel at
https://lists.freedesktop.org/archives/dri-devel/2019-August/228950.html
The patch set improves the performance slightly, but the change is not very obvious.
$ git log --oneline 8f7ec6bcc7 -5
8f7ec6bcc75a9 drm/mgag200: Map fbdev framebuffer while it's being displayed
abcb1cf24033a drm/ast: Map fbdev framebuffer while it's being displayed
a92f80044c623 drm/vram-helpers: Add kmap ref-counting to GEM VRAM objects
90f479ae51afa drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation
f1f8555dfb9a7 drm/bochs: Use shadow buffer for bochs framebuffer console
commit:
  f1f8555dfb ("drm/bochs: Use shadow buffer for bochs framebuffer console")
  90f479ae51 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
  8f7ec6bcc7 ("drm/mgag200: Map fbdev framebuffer while it's being displayed")
f1f8555dfb9a70a2  90f479ae51afa45efab97afdde  8f7ec6bcc75a996f5c6b39a9cf  testcase/testparams/testbox
----------------  --------------------------  --------------------------
         %stddev      change         %stddev      change         %stddev
             \          |                \          |                \
     43921              -18%      35884              -17%      36629        vm-scalability/performance-300s-8T-anon-cow-seq-hugetlb/lkp-knm01
     43921              -18%      35884              -17%      36629        GEO-MEAN vm-scalability.median
The regression goes from -18% to -17%, if I understand this correctly. This is strange, because the patch set restores the way that the original code worked. The heavy map/unmap calls in the fbdev code are gone. Performance should have been back to normal.
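For reference, the percentages can be re-derived from the medians in the table. The following is only a small Python sketch of that arithmetic; the variable names and the "recovered share" figure are mine and not part of the LKP report.

```python
def percent_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# vm-scalability.median values copied from the table in this mail.
baseline = 43921   # f1f8555dfb (before the mgag200 conversion)
regressed = 35884  # 90f479ae51 (generic fbdev emulation)
patched = 36629    # 8f7ec6bcc7 (with the proposed fix)

regressed_pct = percent_change(baseline, regressed)
patched_pct = percent_change(baseline, patched)

# Share of the lost throughput that the fix wins back (illustrative metric).
recovered_pct = (patched - regressed) / (baseline - regressed) * 100.0

print(f"regressed: {regressed_pct:.1f}%")
print(f"patched:   {patched_pct:.1f}%")
print(f"recovered: {recovered_pct:.1f}% of the loss")
```

With these numbers the fix wins back less than a tenth of the lost throughput, which matches the observation that the improvement is hard to see.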
I'd like to prepare a patch set for entirely reverting all changes. Can I send it to you for testing?
Best regards Thomas
Best Regards, Rong Chen
Best regards Thomas
Thanks, Feng
Best regards Thomas
    drm/fb-helper: Map DRM client buffer only when required

    This patch changes DRM clients to not map the buffer by default. The
    buffer, like any buffer object, should be mapped and unmapped when
    needed.

    An unmapped buffer object can be evicted to system memory and does
    not consume video ram until displayed. This allows to use generic
    fbdev emulation with drivers for low-memory devices, such as ast and
    mgag200.

    This change affects the generic framebuffer console. HW-based
    consoles map their console buffer once and keep it mapped. Userspace
    can mmap this buffer into its address space. The shadow-buffered
    framebuffer console only needs the buffer object to be mapped during
    updates. While not being updated from the shadow buffer, the buffer
    object can remain unmapped. Userspace will always mmap the shadow
    buffer.

which may add more load when fbcon is busy printing out messages.
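To make the suspected cost a bit more concrete, here is a toy model in Python. It is not kernel code and does not use any DRM API; ToyBuffer, run_updates and the keep_mapped flag are hypothetical names for illustration only. It just counts how often a mapping has to be set up when every update maps and unmaps the buffer, compared to holding one long-lived reference while the buffer is displayed.

```python
class ToyBuffer:
    """Hypothetical stand-in for a buffer object; not a real DRM structure."""

    def __init__(self) -> None:
        self.map_refs = 0        # outstanding map references
        self.mapping_setups = 0  # times a mapping actually had to be created

    def map(self) -> None:
        # Only the first reference pays for establishing the mapping.
        if self.map_refs == 0:
            self.mapping_setups += 1
        self.map_refs += 1

    def unmap(self) -> None:
        if self.map_refs == 0:
            raise RuntimeError("unmap without matching map")
        self.map_refs -= 1

    def can_evict(self) -> bool:
        # In this model an unmapped buffer may be moved out of video memory.
        return self.map_refs == 0


def run_updates(buf: ToyBuffer, updates: int, keep_mapped: bool) -> int:
    """Simulate screen updates and return the number of mapping setups."""
    if keep_mapped:
        buf.map()  # long-lived reference, held while the buffer is displayed
    for _ in range(updates):
        buf.map()
        # (the copy from the shadow buffer would happen here)
        buf.unmap()
    if keep_mapped:
        buf.unmap()
    return buf.mapping_setups


per_update = run_updates(ToyBuffer(), 1000, keep_mapped=False)
cached = run_updates(ToyBuffer(), 1000, keep_mapped=True)
print(per_update, cached)
```

The trade-off in this model is that a buffer held mapped can never be evicted, which is the low-VRAM concern the commit message raises for ast and mgag200.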
We are doing more tests inside 0day to confirm.
Thanks,
Feng
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)