On Tue, Oct 19, 2021 at 08:49:29PM +0200, Thomas Hellström wrote:
> Hi,
>
> On 10/19/21 20:21, Jason Gunthorpe wrote:
> > PUD and PMD entries do not have a special bit.
> >
> > get_user_pages_fast() considers any page that passed pmd_huge() as
> > usable:
> >
> >   if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))) {
> >
> > And vmf_insert_pfn_pmd_prot() unconditionally sets
> >
> >   entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
> >
> > eg on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
> >
> > As such gup_huge_pmd() will try to deref a struct page:
> >
> >   head = try_grab_compound_head(pmd_page(orig), refs, flags);
> >
> > and thus crash.
> >
> > Prevent this by never using IO memory with
> > vmf_insert_pfn_pud/pmd_prot().
> Actually I think fast gup will break even page-backed memory, since
> the backing drivers don't assume anybody else takes a refcount /
> pincount. Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.

Erk, yes, that is even worse.

> (Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
> this, and basically had a patch ready, but got scared off when trying
> to handle 64-bit PTE atomic updates on x86-32.)

Right, a PMD_SPECIAL bit is needed to make this work.

> It might be that we (Intel) will try to resurrect this code using
> PTE_HUGESPECIAL in the near future for i915, but until then I think
> the safest option is to disable it completely.

Okay, do you want a patch to just delete this function?

Jason