https://bugs.freedesktop.org/show_bug.cgi?id=60028
Priority: medium
Bug ID: 60028
Assignee: dri-devel@lists.freedesktop.org
Summary: Post-3.7.x memory leak, Radeon Evergreen, bisected
Severity: major
Classification: Unclassified
OS: All
Reporter: dawitbro@sbcglobal.net
Hardware: x86-64 (AMD64)
Status: NEW
Version: unspecified
Component: DRM/Radeon
Product: DRI
Created attachment 73839
  --> https://bugs.freedesktop.org/attachment.cgi?id=73839&action=edit
dmesg from my latest custom kernel
Overview:
There is a memory leak in the 3D graphics support for my Radeon Evergreen GPU (HD 5750). I have at least two programs which leak memory if I configure them to run fullscreen, but stop leaking memory if I run them as windows on my desktop.
The testing I have done so far seems to indicate a problem in the kernel DRM, but after bisecting and locating the first commit which introduced the leak, I found that attempted reverts of the problem code on top of my local git tree still resulted in kernels that leaked. I may be able to find out more when I can find time to do more testing, but I want to report what I have learned so far in case I am doing something stupid or missing something obvious.
I use a game called 'prboom-plus' as a kind of simple test program to see if the latest DRM changes, or the latest version of Mesa, are working. If 'prboom-plus' runs for a few minutes without corruption or crashing, then I (naively) conclude that everything is fine. I occasionally play for a longer period of time, and in Dec. there was an instance where the game seemed to freeze, causing the whole system to become almost unresponsive. The game then suddenly shut down without any error being logged, and the system was back to normal; I now know that this was symptomatic of the leak and of the kernel's OOM killer intervening.
There is also an old DOS game I still like to play, and I use 'dosbox' for that purpose. In Dec., the 'xine' music player I use while playing the DOS game started "skipping" after playing the 'dosbox' game for a while, and the whole system became almost totally unresponsive. I now see that these were the same symptoms as were affecting 'prboom-plus', but at the time I didn't see the connection.
Steps to reproduce:
'prboom-plus' defaults to fullscreen mode. I run it using a script which sets some audio-related environment variables and then runs the program:
prboom-plus -iwad doom.wad -vidmode gl -width 1920 -height 1200
(The "-iwad" option loads the maps from the original DOS DOOM, "-vidmode gl" forces OpenGL graphics output, and "-width" and "-height" are probably unnecessary but are set to match my monitor's resolution.) The program defaults to fullscreen; to run in a window instead, one can just add the "-window" option to the command.
With 'dosbox', it is possible to toggle between fullscreen and window using the <Alt>+<Enter> hotkey combo. 'dosbox' requires a configuration file, and I have customized mine to start my game in fullscreen mode using OpenGL graphics:
[sdl]
fullscreen=true
fulldouble=true
fullresolution=1920x1200
windowresolution=1600x1200
output=opengl
...
Actual results:
While trying to track down another issue (which is now resolved) I started using 'top' to check for processes which were hogging the CPU, and accidentally discovered the runaway RAM usage. I was doing this by running 'top' on VT1 while running 'prboom-plus' fullscreen in X. For convenience, I reconfigured the game to run as a window instead of fullscreen... and the leak disappeared!
The same behavior is seen with 'dosbox': running in fullscreen leaks memory, but running in a window does not.
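The observation made with 'top' could also be recorded as numbers, which would make the fullscreen and windowed runs easier to compare. What follows is only a minimal sketch in Python, not something used for this report: the function names are hypothetical, the "used" figure is a rough estimate (total minus free, buffers and page cache), and whether the leak shows up in that system-wide figure or only in the per-process figures shown by 'top' is an assumption that would need checking.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def used_kb(info):
    """Rough 'in use' estimate: total minus free, buffers and page cache."""
    return (info["MemTotal"] - info["MemFree"]
            - info.get("Buffers", 0) - info.get("Cached", 0))


def growth_kb_per_sample(samples):
    """Average change in kB between consecutive samples (0.0 if fewer than two)."""
    if len(samples) < 2:
        return 0.0
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def sample_system(count=10, interval=5.0, path="/proc/meminfo"):
    """Take `count` readings of used_kb(), `interval` seconds apart."""
    samples = []
    for _ in range(count):
        with open(path) as f:
            samples.append(used_kb(parse_meminfo(f.read())))
        time.sleep(interval)
    return samples
```

Calling growth_kb_per_sample(sample_system()) from another VT or an ssh session, once with the game running fullscreen and once with it running in a window, should give a steadily positive number in the leaking case and a value near zero otherwise, provided nothing else on the system is allocating heavily at the same time.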
Expected results:
These programs should run without leaking memory. They have done so for years, up to and including kernel 3.7.X.
I like to test new Radeon-related code, but I have been burned by Linux -rc kernels. For several years, I have been creating local git branches, starting with a stable release as a branch point and then cherry-picking commits from drm-next and drm-fixes which are either directly relevant to my hardware or which are relevant to all systems. I do not file bug reports if I am unable to reproduce a bug using the upstream developers' trees; in this case, I have confirmed that the HEAD of drm-fixes exhibits the memory leak as described above.
System info:
GPU: Radeon HD 5750 (Evergreen Juniper)
Linux distribution: Debian unstable + custom X stack
Software versions:
libdrm: 2.4.40 (plus git commits up to 0980633a of Nov. 27)
mesa: 9.1-devel (up to 5330c5a2 of Jan. 14)
xorg-core: 1.13.1
xf86-video-ati: 7.0.99-devel (up to commit 793e1b0e of Dec. 6)
Additional Information:
I began noticing problems in late Nov. or early Dec. There were corruption problems with 'torcs', but there were also instances of my desktop becoming unresponsive and programs freezing or suddenly closing without leaving any errors in logs.
The 'torcs' corruption was solved recently in drm-fixes:
commit 20707874
    Revert "drm/radeon: do not move bo to different placement at each cs"
I had assumed that the other issues were related, but when they continued to occur I started to investigate more seriously, and discovered the memory leak using 'top'.
[Can a developer rename this bug report for me, using your preferred naming conventions?]