https://bugzilla.kernel.org/show_bug.cgi?id=209163
Bug ID: 209163
Summary: amdgpu: The CS has been cancelled because the context is lost
Product: Drivers
Version: 2.5
Kernel Version: 4.9.118
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: satish.in@outlook.in
Regression: No
Created attachment 292355
  --> https://bugzilla.kernel.org/attachment.cgi?id=292355&action=edit
dmesg log
I am getting this error after running the application continuously.
--- Comment #1 from Satish patel (satish.in@outlook.in) ---
Created attachment 292357
  --> https://bugzilla.kernel.org/attachment.cgi?id=292357&action=edit
AMDGPU version information
--- Comment #2 from Satish patel (satish.in@outlook.in) ---
Created attachment 292359
  --> https://bugzilla.kernel.org/attachment.cgi?id=292359&action=edit
Mesa_opencl version information
--- Comment #3 from Satish patel (satish.in@outlook.in) ---
Created attachment 292361
  --> https://bugzilla.kernel.org/attachment.cgi?id=292361&action=edit
lspci information
Satish patel (satish.in@outlook.in) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Video(DRI - non Intel)      |Other
            Product|Drivers                     |Memory Management
--- Comment #4 from Christian König (christian.koenig@amd.com) ---
This is expected behavior; your application is trying to use more memory than is physically available:
[71804.930003] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for command submission!
That is most likely a bug in the application, e.g. a memory leak.
--- Comment #5 from Satish patel (satish.in@outlook.in) ---
(In reply to Christian König from comment #4)
> This is expected behavior; your application is trying to use more memory than is physically available:
> [71804.930003] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for command submission!
> That is most likely a bug in the application, e.g. a memory leak.
Dear Mr. König,
Thanks for your reply. However, the same application runs for up to 10 days on CentOS 7 (GNOME desktop) with kernel 3.10.0-1127.el7.x86_64 without exhausting physical memory or swap.
On kernel 4.9.118, the same application fails with "amdgpu: The CS has been cancelled because the context is lost" even though the system is using only 75% of its 5.83 GB of physical memory and 1% of its 15 GB swap partition. Why does the system crash (the display flickers and the touch screen stops responding) instead of using the swap area? CPU and memory utilization are still reported when I monitor the machine from another system.
--- Comment #6 from Christian König (christian.koenig@amd.com) ---
You are running out of VRAM, not system memory.
Can you test this on an up to date kernel as well?
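The distinction between VRAM and system memory can be checked directly. The following is a minimal sketch, assuming a kernel whose amdgpu driver exposes the `mem_info_vram_used` and `mem_info_vram_total` sysfs files (newer kernels do; 4.9 is not expected to) and assuming the GPU is `card0`. The function names are illustrative and not part of any driver API.

```python
from pathlib import Path


def read_vram_usage(device_dir="/sys/class/drm/card0/device"):
    """Return (used_bytes, total_bytes) of VRAM from amdgpu sysfs counters.

    Assumption: the GPU is card0 and the running kernel provides these files.
    """
    base = Path(device_dir)
    used = int((base / "mem_info_vram_used").read_text().strip())
    total = int((base / "mem_info_vram_total").read_text().strip())
    return used, total


def vram_percent(used, total):
    """Percentage of VRAM in use; 0.0 if the total is reported as zero."""
    return 100.0 * used / total if total else 0.0


if __name__ == "__main__":
    try:
        used, total = read_vram_usage()
    except (OSError, ValueError) as exc:
        # Older kernels or non-amdgpu devices do not have these files.
        print(f"VRAM counters not available: {exc}")
    else:
        print(f"VRAM: {used / 2**20:.0f} MiB of {total / 2**20:.0f} MiB "
              f"({vram_percent(used, total):.1f}%)")
```

Comparing this figure with the system memory reported by tools such as `free` shows whether VRAM is exhausted while system RAM and swap are still largely unused.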
--- Comment #7 from Satish patel (satish.in@outlook.in) ---
Created attachment 292449
  --> https://bugzilla.kernel.org/attachment.cgi?id=292449&action=edit
VRAM Utilization screen shot
Attached is a screenshot of the VRAM utilization at the time of the error, taken from the output of: cat /sys/kernel/debug/dri/0/amdgpu_vram_mm
--- Comment #8 from Satish patel (satish.in@outlook.in) ---
(In reply to Christian König from comment #6)
> You are running out of VRAM, not system memory.
> Can you test this on an up to date kernel as well?
Is there a way to prevent the full VRAM from being used, for example through an amdgpu module parameter? The same application runs on the same hardware under the GNOME desktop (CentOS 7) with kernel 3.10.xx.1127.
I get the error when running the same application in X Windows, after about 19 hours, whereas it runs for more than 7 days with the operating system and kernel version above.
--- Comment #9 from Christian König (christian.koenig@amd.com) ---
Try amdgpu.vramlimit=512 on the kernel command line to limit the available VRAM to 512MB.
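After rebooting with the parameter, it may be worth confirming that it actually reached the kernel. The sketch below parses a command line string such as the contents of `/proc/cmdline`. The helper name `parse_vramlimit` is hypothetical, and the parsing is simplified: it only matches the exact spelling `amdgpu.vramlimit=N` with a decimal value.

```python
def parse_vramlimit(cmdline):
    """Return the amdgpu.vramlimit value (in MB) from a kernel command line.

    Returns None if the parameter is absent or not a decimal integer.
    If the parameter appears more than once, the last occurrence is used.
    """
    limit = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "amdgpu.vramlimit" and sep:
            try:
                limit = int(value)
            except ValueError:
                limit = None
    return limit


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            limit = parse_vramlimit(f.read())
    except OSError:
        limit = None
    print("amdgpu.vramlimit:",
          f"{limit} MB" if limit is not None else "not set")
```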
The problem is certainly some kind of memory leak.
You need to test an up to date kernel, like 5.8 or even better the latest bleeding edge amd-staging-drm-next branch.
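One way to check the memory-leak hypothesis is to sample VRAM usage periodically and look for steady growth. The following is a minimal sketch, assuming the samples are (seconds, bytes used) pairs collected by some external means; the function names and the example numbers are illustrative only and are not measurements from this bug.

```python
def growth_rate(samples):
    """Least-squares slope of (seconds, bytes_used) samples, in bytes/second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    return cov / var_t


def seconds_until_full(samples, total_bytes):
    """Estimate the time until VRAM is exhausted, or None if usage is not growing."""
    rate = growth_rate(samples)
    if rate <= 0:
        return None
    last_used = samples[-1][1]
    return max(0.0, (total_bytes - last_used) / rate)


if __name__ == "__main__":
    # Hypothetical samples: usage grows by 1 MiB every 60 seconds.
    mib = 2**20
    samples = [(60 * i, (100 + i) * mib) for i in range(10)]
    print("growth (bytes/s):", growth_rate(samples))
    print("seconds until 512 MiB is full:", seconds_until_full(samples, 512 * mib))
```

A slope that stays positive over many hours is consistent with a leak; a slope near zero suggests usage has plateaued and the cause lies elsewhere.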