On Wed, Feb 19, 2020 at 1:52 AM Tian, Kevin kevin.tian@intel.com wrote:
From: Paolo Bonzini Sent: Wednesday, February 19, 2020 12:29 AM
On 14/02/20 23:03, Sean Christopherson wrote:
On Fri, Feb 14, 2020 at 1:47 PM Chia-I Wu olvaffe@gmail.com wrote:
AFAICT, it is currently allowed on ARM (verified) and AMD (not verified, but svm_get_mt_mask returns 0 which supposedly means the NPT does not restrict what the guest PAT can do). This diff would do the trick for Intel without needing any uapi change:
I would be concerned about Intel CPU errata such as SKX40 and SKX59.
The part KVM cares about, #MC, is already addressed by forcing UC for MMIO.
The data corruption issue is on the guest kernel to correctly use WC and/or non-temporal writes.
What about coherency across live migration? The userspace process would use cached accesses, and also a WBINVD could potentially corrupt guest memory.
In that case the userspace process should probably be conservative and use a UC mapping, as it would for MMIO regions on a passthrough device. However, a problem remains: the definition of KVM_MEM_DMA implies favoring the guest setting, which could in principle be any type, so assuming UC is also problematic. I'm not sure whether inventing another interface to query the effective memory type from KVM is a good idea. There is no guarantee that the guest will use the same type for every page in the same slot, so such an interface might be messy. Alternatively, maybe we could just have an interface for KVM userspace to force the memory type for a given slot, if it is mainly used in para-virtualized scenarios (e.g. virtio-gpu) where the guest is enlightened to use a forced type (e.g. WC)?
KVM forcing the memory type for a given slot should work too. But the ignore-guest-pat bit seems to be Intel-specific. We will need to define how the second-level page attributes combine with the guest page attributes somehow.
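For reference, here is a rough sketch of how I understand the Intel side of that combination. It is a simplified model and not KVM code: effective_memtype() and the enum names are hypothetical, only the UC and WB EPT memory types are modeled, and CR0.CD is assumed to be 0. With the ignore-PAT bit set the EPT type wins outright; with it clear the EPT type is combined with the guest PAT type the way an MTRR type would be.

```c
#include <assert.h>
#include <stdbool.h>

/* x86 memory type encodings; UC-minus exists only in the PAT. */
enum memtype {
	MT_UC = 0,
	MT_WC = 1,
	MT_WT = 4,
	MT_WP = 5,
	MT_WB = 6,
	MT_UC_MINUS = 7,
};

/*
 * Hypothetical helper (not a KVM function): effective memory type for a
 * guest access, given the EPT leaf memory type, the EPT ignore-PAT bit,
 * and the memory type selected by the guest PAT.
 * Assumes CR0.CD = 0 and models only EPT types UC and WB.
 */
static enum memtype effective_memtype(enum memtype ept_mt, bool ipat,
				      enum memtype guest_pat)
{
	/* Ignore-PAT set: the guest PAT has no say. */
	if (ipat)
		return ept_mt;

	switch (ept_mt) {
	case MT_WB:
		/* WB underneath: the guest type applies, UC-minus acts as UC. */
		return guest_pat == MT_UC_MINUS ? MT_UC : guest_pat;
	case MT_UC:
		/* UC underneath: only a guest WC mapping is not forced to UC. */
		return guest_pat == MT_WC ? MT_WC : MT_UC;
	default:
		/* Other EPT types are deliberately left out of this sketch. */
		assert(!"EPT memory type not modeled");
		return MT_UC;
	}
}
```

So a slot mapped WB in the EPT with ignore-PAT clear lets an enlightened guest pick WC, while a forced type needs ignore-PAT set, which has no direct counterpart in this model for NPT or ARM stage 2.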
KVM should in theory be able to tell that the userspace region is mapped with a certain memory type and can force the same memory type onto the guest. The userspace does not need to be involved. But that sounds very slow? This may be a dumb question, but would it help to add KVM_SET_DMA_BUF and let KVM negotiate the memory type with the in-kernel GPU drivers?
Thanks,
Kevin