On 4/8/20 2:19 PM, Christian König wrote:
On 08.04.20 at 14:01, Thomas Hellström (VMware) wrote:
Hi, Christian,
On 4/8/20 1:53 PM, Thomas Hellström (VMware) wrote:
From: "Thomas Hellstrom (VMware)" thomas_os@shipmail.org
With amdgpu and CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y, there are errors like:
BUG: non-zero pgtables_bytes on freeing mm
and:
BUG: Bad rss-counter state
with TTM transparent huge-pages.
Until we've figured out what other TTM drivers do differently compared to vmwgfx, disable the huge_fault() callback, eliminating transhuge page-table entries.
Cc: Christian König christian.koenig@amd.com
Signed-off-by: Thomas Hellstrom (VMware) thomas_os@shipmail.org
Reported-by: Alex Xu (Hello71) alex_y_xu@yahoo.ca
Tested-by: Alex Xu (Hello71) alex_y_xu@yahoo.ca
Acked-by: Christian König christian.koenig@amd.com
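For readers following along, here is a rough userspace model of why dropping the huge_fault() callback eliminates huge entries. It is a toy sketch, not kernel or TTM code: the struct, the function names, the counters and the VM_FAULT_FALLBACK value are all placeholders invented for illustration, and the real fault path is considerably more involved.

```c
#include <stddef.h>

/* Placeholder value for illustration; the real constant is defined in the kernel's mm headers. */
#define VM_FAULT_FALLBACK 0x0800u

/* Hypothetical counters standing in for "what kind of entry got installed". */
static unsigned int huge_entries;
static unsigned int small_entries;

/* Hypothetical, stripped-down stand-in for a vm_operations_struct. */
struct sketch_vm_ops {
	unsigned int (*fault)(void);      /* regular page-sized fault handler */
	unsigned int (*huge_fault)(void); /* optional huge fault handler, may be NULL */
};

static unsigned int sketch_fault(void)
{
	small_entries++;
	return 0;
}

static unsigned int sketch_huge_fault(void)
{
	huge_entries++;
	return 0;
}

/*
 * Toy dispatcher: try the huge handler only if the driver provides one,
 * otherwise (or if it asks for fallback) use the regular handler.
 */
static unsigned int sketch_handle_fault(const struct sketch_vm_ops *ops)
{
	if (ops->huge_fault != NULL) {
		unsigned int ret = ops->huge_fault();

		if (!(ret & VM_FAULT_FALLBACK))
			return ret;
	}
	return ops->fault();
}

/* Before the patch (in this model): the callback is set. */
static const struct sketch_vm_ops ops_with_huge = {
	.fault = sketch_fault,
	.huge_fault = sketch_huge_fault,
};

/* After the patch (in this model): the callback is simply left unset. */
static const struct sketch_vm_ops ops_without_huge = {
	.fault = sketch_fault,
	.huge_fault = NULL,
};
```

In this model, calling sketch_handle_fault(&ops_without_huge) can only ever bump small_entries, which is the intended effect of the workaround.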
Without being able to test and track this down on amdgpu there's little more than this I can do at the moment. Hopefully I'll be able to test on nouveau/ttm after getting back from vacation to see if I can reproduce.
It looks like some part of the kernel mistakes a huge page-table entry for a page directory, and that would be a path that is not hit with vmwgfx.
Well that looks like an ugly one and I don't know enough about the page table handling to hunt this down either.
BTW: Have you seen the coverity warning about "WARN_ON_ONCE(ret = VM_FAULT_FALLBACK);"?
Yes, that's a false positive, but it should perhaps be rewritten for clarity like so:
ret = VM_FAULT_FALLBACK;
WARN_ON_ONCE(true);
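To spell out why the two forms behave the same, here is a small userspace sketch. SKETCH_WARN_ON() is a made-up stand-in that only counts hits (it does not model the "once" behaviour or print a backtrace), and the VM_FAULT_FALLBACK value is a placeholder; neither is the kernel's actual definition.

```c
#include <stdbool.h>

/* Placeholder value for illustration; the real constant is defined in the kernel's mm headers. */
#define VM_FAULT_FALLBACK 0x0800u

/* Counts how many times the stand-in warning fired. */
static unsigned int warn_count;

static bool warn_hit(bool cond)
{
	if (cond)
		warn_count++;
	return cond;
}

/* Hypothetical stand-in for WARN_ON_ONCE(): evaluates cond exactly once. */
#define SKETCH_WARN_ON(cond) warn_hit(!!(cond))

/*
 * Form flagged by the checker: the assignment is the macro argument.
 * ret is set, and the warning fires because the assigned value is non-zero.
 */
static unsigned int fallback_assign_in_warn(void)
{
	unsigned int ret = 0;

	SKETCH_WARN_ON(ret = VM_FAULT_FALLBACK);
	return ret;
}

/* Suggested form: same effect, with the assignment and the warning separated. */
static unsigned int fallback_explicit(void)
{
	unsigned int ret;

	ret = VM_FAULT_FALLBACK;
	SKETCH_WARN_ON(true);
	return ret;
}
```

Both functions return the fallback value and trigger the warning exactly once per call in this model. The first form only works because the assigned constant happens to be non-zero, which is what makes it look like a mistyped comparison.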
/Thomas
Regards, Christian.
/Thomas