On 12/4/19 3:42 PM, Michal Hocko wrote:
On Wed 04-12-19 15:36:58, Thomas Hellström (VMware) wrote:
On 12/4/19 3:35 PM, Michal Hocko wrote:
On Wed 04-12-19 15:16:09, Thomas Hellström (VMware) wrote:
On 12/4/19 2:52 PM, Michal Hocko wrote:
On Tue 03-12-19 11:48:53, Thomas Hellström (VMware) wrote:
From: Thomas Hellstrom <thellstrom@vmware.com>
TTM graphics buffer objects may, transparently to user-space, move between IO and system memory. When that happens, all PTEs pointing to the old location are zapped before the move and then faulted in again if needed. At that point, the caching-mode and encryption bits of the page protection may change and differ from those of struct vm_area_struct::vm_page_prot.
We were using an ugly hack to set the page protection correctly. Fix that and instead use vmf_insert_mixed_prot() and / or vmf_insert_pfn_prot(). Also get the default page protection from struct vm_area_struct::vm_page_prot rather than using vm_get_page_prot(). This way we catch modifications done by the vm system for drivers that want write-notification.
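To illustrate the approach described in the changelog, here is a minimal user-space sketch in C. It is not kernel code: every `model_*` type, function and `MODEL_*` bit is a hypothetical stand-in, and the choice of "write-combined for IO, encrypted for system memory" is an assumption made only for illustration. The point it tries to show is that the per-fault protection is derived from the vma's default protection, so changes made by the vm system (such as a cleared write bit for write-notification) carry over, while the vma itself is never modified or copied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; these are NOT the kernel's definitions. */
typedef uint64_t model_pgprot_t;

#define MODEL_PROT_WRITE     ((model_pgprot_t)1 << 0)
#define MODEL_PROT_WC        ((model_pgprot_t)1 << 1) /* write-combined caching (assumed) */
#define MODEL_PROT_ENC       ((model_pgprot_t)1 << 2) /* memory-encryption bit (assumed) */
#define MODEL_PROT_PLACEMENT (MODEL_PROT_WC | MODEL_PROT_ENC)

enum model_placement { MODEL_SYSTEM, MODEL_IO };

struct model_vma {
    model_pgprot_t vm_page_prot; /* default protection, owned by the vm system */
};

struct model_pte {
    uint64_t pfn;
    model_pgprot_t prot;
};

/*
 * Derive the protection for a single fault: start from the vma default,
 * drop the placement-dependent bits, then set the ones that match where
 * the buffer currently lives. The vma is const and is never written.
 */
static model_pgprot_t model_fault_prot(const struct model_vma *vma,
                                       enum model_placement where)
{
    model_pgprot_t prot;

    assert(vma);
    prot = vma->vm_page_prot & ~MODEL_PROT_PLACEMENT;
    if (where == MODEL_IO)
        prot |= MODEL_PROT_WC;
    else
        prot |= MODEL_PROT_ENC;
    return prot;
}

/* Stand-in for an insert helper that takes an explicit protection argument. */
static struct model_pte model_insert_pfn_prot(uint64_t pfn, model_pgprot_t prot)
{
    struct model_pte pte = { pfn, prot };
    return pte;
}

/* Modelled fault path: compute the protection per fault and pass it along. */
static struct model_pte model_fault(const struct model_vma *vma, uint64_t pfn,
                                    enum model_placement where)
{
    return model_insert_pfn_prot(pfn, model_fault_prot(vma, where));
}
```

In this model, a vma whose default protection has the write bit cleared yields read-only entries for both placements, and the same vma yields different caching and encryption bits depending on where the buffer sits at fault time.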
So essentially this should not have any new side effects on functionality; it is just making hacky/ugly code less so?
Functionality is unchanged. The use of an on-stack vma copy was severely frowned upon in an earlier thread, which also points to another similar example using vmf_insert_pfn_prot().
https://lore.kernel.org/lkml/20190905103541.4161-2-thomas_os@shipmail.org/
In other words, what are the consequences of having a page protection that is inconsistent with the vma's?
Over the years, it looks like the caching and encryption flags of vma::vm_page_prot have largely fallen out of use. From what I can tell, there are no more places left that can affect TTM. We discussed __split_huge_pmd_locked() towards the end of that thread, but that doesn't affect TTM even with huge page-table entries.
Please state all those details/assumptions you are operating on in the changelog.
Thanks. I'll update the patchset and add that.
And thinking about it, this also begs for a comment in the code to explain that some (which?) mappings might have a mismatch and that generic code has to be careful. As things stand now this seems to be really subtle: it happens to work _now_ and might break in the future. Or what prevents generic code from stumbling over this discrepancy?
Yes, we had that discussion in the thread I pointed to. I initially suggested and argued for updating vma::vm_page_prot using a WRITE_ONCE() (we only hold the mmap_sem in read mode); there seem to be other places in generic code that do the same.
But I was convinced by Andy that this was the right way, and that it was also used elsewhere.
(See also https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vma.c#L11...)
I guess to have this properly formulated, what's required is that generic code doesn't build page-table entries using vma::vm_page_prot for VM_PFNMAP and VM_MIXEDMAP outside of driver control.
/Thomas
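
The requirement formulated above can be written down as a small predicate. This is only a user-space sketch: the function name and the `MODEL_VM_*` flag values are hypothetical and do not correspond to the kernel's constants or to any existing kernel helper. It expresses the rule that generic code should not build page-table entries from the vma's default protection when the mapping is a PFN or mixed mapping, since for those the driver decides the protection.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's constants. */
#define MODEL_VM_PFNMAP   (1ul << 0)
#define MODEL_VM_MIXEDMAP (1ul << 1)

/*
 * Returns true if, under the rule stated in the thread, generic code may
 * build a page-table entry from the vma's default page protection.
 * For PFN and mixed mappings the driver controls the protection instead.
 */
static bool model_generic_may_use_vma_prot(unsigned long vm_flags)
{
    return (vm_flags & (MODEL_VM_PFNMAP | MODEL_VM_MIXEDMAP)) == 0;
}
```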