https://bugs.freedesktop.org/show_bug.cgi?id=97075
Bug ID: 97075
Summary: VCE encoding slow when GPU is not stressed (HD 7970M)
Product: DRI
Version: unspecified
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Radeon
Assignee: dri-devel@lists.freedesktop.org
Reporter: haagch@frickel.club
This is on an intel + radeon laptop, so I need to run encoding with gstreamer with DRI_PRIME=1.
Here is an example video: http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_1mb.mp4
DRI_PRIME is doing a good job of waking up the GPU from runpm when needed for encoding via VAAPI and OMX, but for comparison I'll run glxgears both times.
I'm encoding the mentioned example video with VAAPI with this exact command:
$ time DRI_PRIME=1 LIBVA_DRIVER_NAME=radeonsi gst-launch-1.0 -e filesrc location=big_buck_bunny_720p_1mb.mp4 ! qtdemux ! h264parse ! avdec_h264 ! queue ! videoconvert ! queue ! video/x-raw,format=NV12 ! vaapih264enc ! h264parse ! matroskamux ! filesink location=output.mkv
For low GPU stress I run the gst pipeline while glxgears with vsync is running:
$ DRI_PRIME=1 glxgears
Result: 0.75s user 0.33s system 2% cpu 52.779 total
For higher GPU stress I run the gst pipeline while glxgears without vsync is running:
$ DRI_PRIME=1 vblank_mode=0 glxgears
Result: 0.99s user 0.28s system 43% cpu 2.928 total
I also tried a very similar pipeline with OMX:
$ time DRI_PRIME=1 gst-launch-1.0 -e filesrc location=big_buck_bunny_720p_1mb.mp4 ! qtdemux ! h264parse ! avdec_h264 ! queue ! videoconvert ! queue ! video/x-raw,format=NV12 ! omxh264enc ! h264parse ! matroskamux ! filesink location=output.mkv
Low GPU stress: 0.96s user 0.24s system 19% cpu 6.298 total
High GPU stress: 1.10s user 0.24s system 141% cpu 0.949 total
Overall OMX encoding does a lot better, but it's still a large difference and still below "real time" for the 5 second video.
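To put the four timings above on a common scale, here is a minimal sketch that divides the idle-GPU wall time by the loaded-GPU wall time for each encoder. The helper name slowdown is hypothetical; the inputs are the "total" values reported above.

```shell
#!/bin/sh
# slowdown IDLE_SECONDS LOADED_SECONDS
# Prints how many times slower the idle-GPU run was, to one decimal place.
# "slowdown" is an illustrative helper name, not part of any existing tool.
slowdown() {
    awk -v idle="$1" -v loaded="$2" 'BEGIN { printf "%.1f\n", idle / loaded }'
}

echo "VAAPI: $(slowdown 52.779 2.928)x slower when the GPU is otherwise idle"
echo "OMX:   $(slowdown 6.298 0.949)x slower when the GPU is otherwise idle"
```

With the figures above this gives roughly 18.0x for VAAPI and 6.6x for OMX.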
Christian König deathsimple@vodafone.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Priority|medium                      |high
--- Comment #1 from Christian König deathsimple@vodafone.de ---
Yeah, that is a known issue.
The current VA-API implementation waits for the result after sending a single frame to the hardware.
The OpenMAX implementation pipelines the whole thing and waits for a result after sending multiple frames to the hardware to chew on.
So with OpenMAX the hardware is always busy, while with VA-API it constantly turns on/off.
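The difference between the two submission strategies can be illustrated with a toy cost model. This is only a sketch under loud assumptions: it supposes that every submission to an idle engine pays a fixed wake-up cost, and the frame count and millisecond values are invented for illustration, not measurements from this bug. The function names are hypothetical.

```shell
#!/bin/sh
# Toy model only: all inputs are hypothetical integers (times in milliseconds).

# sync_total FRAMES WAKE_MS ENCODE_MS
#   One frame is submitted and then waited for, so the engine goes idle
#   between frames and every frame pays the wake-up cost again.
sync_total() {
    echo $(( $1 * ($2 + $3) ))
}

# pipelined_total FRAMES WAKE_MS ENCODE_MS
#   Several frames are queued before waiting, so the engine stays busy and
#   the wake-up cost is paid only once.
pipelined_total() {
    echo $(( $2 + $1 * $3 ))
}

echo "synchronous: $(sync_total 150 20 5) ms"
echo "pipelined:   $(pipelined_total 150 20 5) ms"
```

In this model the gap grows with the ratio of wake-up cost to per-frame encode time, which is one way to read "constantly turns on/off".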
Andy Furniss adf.lists@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adf.lists@gmail.com
--- Comment #2 from Andy Furniss adf.lists@gmail.com ---
So maybe there is also some dpm type issue on your system.
Rather than running gears maybe there is somewhere you can force gpu clocks to high.
My setup is very different but I would do -
echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
--- Comment #3 from Christoph Haag haagch@frickel.club ---
I put the issue in DRM/radeon instead of mesa/radeonsi because I thought it would be related to power management.
I tried
echo high > /sys/class/drm/card1/device/power_dpm_force_performance_level
and
echo performance > /sys/class/drm/card1/device/power_dpm_state
but it makes no difference, still just as slow.
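Since the card index differs between machines (card0 in comment #2, card1 here), it can help to read both settings back and confirm the writes landed on the intended device. A small sketch: show_dpm is a hypothetical helper that only reads the two sysfs files named in this thread, and the card1 path in the last line is an assumption taken from this comment.

```shell
#!/bin/sh
# show_dpm DEVICE_DIR
# Prints the current contents of the two radeon DPM sysfs files mentioned
# in this thread. "show_dpm" is an illustrative name; DEVICE_DIR is
# something like /sys/class/drm/card1/device (the index varies per system).
show_dpm() {
    for f in power_dpm_force_performance_level power_dpm_state; do
        if [ -r "$1/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$1/$f")"
        else
            printf '%s: not readable\n' "$f"
        fi
    done
}

# Assumes the discrete GPU is card1, as in this comment.
show_dpm /sys/class/drm/card1/device
```

Note that writing to these files needs root, and "sudo echo high > file" does not work because the redirection is done by the unprivileged shell; piping into "sudo tee file" is the usual workaround.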
Christian König deathsimple@vodafone.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         QA Contact|                            |dri-devel@lists.freedesktop.org
            Product|DRI                         |Mesa
          Component|DRM/Radeon                  |Drivers/Gallium/radeonsi
--- Comment #4 from Christian König deathsimple@vodafone.de ---
Good point, but no, the problem is clearly in the VA-API state tracker.
--- Comment #5 from Andy Furniss adf.lists@gmail.com ---
Well his omx test is 6x slower as well without load (though the test vid is very short).
So I think in addition to the vaapi issue he is seeing some prime+HD 7970M dpm problem.
Though maybe forcing CPUs to high and re-testing would help rule out cpufreq messing things up.
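One way to see what cpufreq is doing before re-testing is to list the active governor of each CPU. A sketch, assuming the standard Linux cpufreq sysfs layout (cpuN/cpufreq/scaling_governor under /sys/devices/system/cpu); list_governors is a hypothetical helper name.

```shell
#!/bin/sh
# list_governors [CPU_ROOT]
# Prints "cpuN: governor" for every CPU that exposes a cpufreq governor.
# CPU_ROOT defaults to the usual sysfs location; "list_governors" is an
# illustrative name, not an existing tool.
list_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for g in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$g" ] || continue   # skip CPUs without cpufreq (or no match)
        cpu=${g#"$root"/}         # strip the root prefix
        cpu=${cpu%%/*}            # keep only the cpuN component
        printf '%s: %s\n' "$cpu" "$(cat "$g")"
    done
}

list_governors
```

Where the performance governor is available, writing "performance" to the same files as root is the usual way to pin the CPUs high for a test run.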
--- Comment #6 from Christian König deathsimple@vodafone.de ---
I should open my eyes while reading. Indeed that is way too much to be explained by the VAAPI problems.
GitLab Migration User gitlab-migration@fdo.invalid changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #7 from GitLab Migration User gitlab-migration@fdo.invalid ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1235.