From: Chia-I Wu Sent: Saturday, February 15, 2020 5:15 AM
On Fri, Feb 14, 2020 at 2:26 AM Paolo Bonzini pbonzini@redhat.com wrote:
On 13/02/20 23:18, Chia-I Wu wrote:
The bug you mentioned was probably this one
Yes, indeed.
From what I can tell, the commit allowed the guests to create cached mappings to MMIO regions and caused MCEs. That is different from what I need, which is to allow guests to create uncached mappings to system RAM (i.e., !kvm_is_mmio_pfn) when the host userspace also has uncached mappings. But it is true that this still allows the userspace & guest kernel to create conflicting memory types.
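A minimal sketch of the idea, assuming a new KVM_MEM_DMA memslot flag and a memtype hook along the lines of vmx_get_mt_mask(); the slot_is_dma() helper and the surrounding structure are illustrative only, not the actual patch:

/*
 * Illustrative sketch only, not the actual patch.  Assume a new
 * memslot flag KVM_MEM_DMA that userspace sets on regions it has
 * mapped WC/UC, and a hypothetical slot_is_dma() helper testing it.
 */
static u8 ept_memtype_for_gfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
	struct kvm_memory_slot *slot;

	/* Real MMIO keeps the existing behavior: force UC. */
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE;

	slot = gfn_to_memslot(vcpu->kvm, gfn);

	/*
	 * For a KVM_MEM_DMA slot the host userspace already has a
	 * WC/UC mapping, so let the guest-chosen type (via its
	 * MTRRs/PAT) through instead of forcing write-back.
	 */
	if (slot && slot_is_dma(slot))
		return kvm_mtrr_get_guest_memory_type(vcpu, gfn);

	/* Everything else stays write-back, as today. */
	return MTRR_TYPE_WRBACK;
}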
Right, the question is whether the MCEs were tied to MMIO regions specifically and if so why.
An interesting remark is in the footnote of table 11-7 in the SDM. There, for the MTRR (EPT for us) memory type UC you can read:
The UC attribute comes from the MTRRs and the processors are not required to snoop their caches since the data could never have been cached. This attribute is preferred for performance reasons.
There are two possibilities:
- the footnote doesn't apply to UC mode coming from EPT page tables. That would make your change safe.
- the footnote also applies when the UC attribute comes from the EPT page tables rather than the MTRRs. In that case, the host should use UC as the EPT page attribute if and only if it's consistent with the host MTRRs; it would be more or less impossible to honor UC in the guest MTRRs.
In that case, something like the patch below would be needed.
It is not clear from the manual why the footnote would not apply to WC; that is, the manual doesn't say explicitly that the processor does not do snooping for accesses to WC memory. But I guess that must be the case, which is why I used MTRR_TYPE_WRCOMB in the patch below.
Either way, we would have an explanation of why creating cached mappings to MMIO regions would cause MCEs, and why in practice we're not seeing MCEs for guest RAM (the guest would have set WB for that memory in its MTRRs, not UC).
One thing you didn't say: how would userspace use KVM_MEM_DMA? On which regions would it be set?
It will be set for shmems that are mapped WC.
GPU/DRM drivers allocate shmems as DMA-able GPU buffers and allow userspace to map them cached or WC (I915_MMAP_WC or AMDGPU_GEM_CREATE_CPU_GTT_USWC, for example). When a shmem is mapped WC and is made available to the guest, we would like the ability to map the region WC in the guest as well.
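A minimal sketch of what that could look like on the VMM side, assuming the proposed KVM_MEM_DMA flag (the flag value, slot number, and guest physical address below are placeholders):

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Proposed flag from this thread; the value here is a placeholder. */
#ifndef KVM_MEM_DMA
#define KVM_MEM_DMA (1UL << 3)
#endif

/*
 * Sketch: register a WC buffer the VMM has already mmap'ed (e.g.
 * through the driver's WC mmap path) as a guest memory slot and
 * mark it as DMA so KVM may give the guest a WC mapping too.
 */
static int register_wc_buffer(int vm_fd, unsigned int slot, void *wc_ptr,
			      unsigned long long size,
			      unsigned long long guest_phys_addr)
{
	struct kvm_userspace_memory_region region;

	memset(&region, 0, sizeof(region));
	region.slot = slot;
	region.flags = KVM_MEM_DMA;
	region.guest_phys_addr = guest_phys_addr;
	region.memory_size = size;
	region.userspace_addr = (unsigned long)wc_ptr;

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}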
Curious... How is such a slot exposed to the guest? A reserved memory region? Is it static, or might it be added dynamically?
Thanks, Kevin