On Mon, 4 Jan 2021 21:13:53 +0100 Christian König christian.koenig@amd.com wrote:
On 04.01.21 at 19:43, Alex Williamson wrote:
On Mon, 4 Jan 2021 18:39:33 +0100 Christian König christian.koenig@amd.com wrote:
On 04.01.21 at 17:45, Alex Williamson wrote:
On Mon, 4 Jan 2021 12:34:34 +0100 Christian König christian.koenig@amd.com wrote:
[SNIP]
That's a rather bad idea. Our GPUs, for example, return way more than they actually need.
E.g. a Polaris usually returns 4GiB even when only 2GiB are installed, because 4GiB is simply the maximum amount of memory you can put together with the ASIC on a board.
Would the driver fail or misbehave if the BAR is sized larger than the amount of memory on the card or is memory size determined independently of BAR size?
Uff, good question. I have no idea.
At least the Linux driver should behave well, but no idea about the Windows driver stack.
Some devices even return a mask of all 1s even though they need only 2MiB, resulting in nearly 1TiB of wasted address space with this approach.
Ugh. I'm afraid to ask why a device with a 2MiB BAR would implement a REBAR capability, but I guess we really can't make any assumptions about the breadth of SKUs that ASIC might support (or sanity of the designers).
It's a standard feature for FPGAs these days, since how much BAR space you need depends on what you load onto the FPGA, and that in turn usually only happens after the OS has already started and you fire up your development environment.
We could probe to determine the maximum size the host can support and potentially emulate the capability to remove sizes that we can't allocate. But without any ability for the device to reject a size advertised as supported via the capability protocol, it makes me nervous how we can guarantee the resources are available when the user re-configures the device. That might mean we'd need to reserve the resources, up to what the host can support, regardless of what the device can actually use. I'm not sure how else to know how much to reserve without device-specific code in vfio-pci. Thanks,
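One way to sketch the capability emulation described here: clear, in the virtualized supported-sizes mask, every size larger than what the host can actually back. This is a hypothetical sketch, not actual vfio-pci code, and `vfio_clamp_rebar_sizes` is a made-up name:

```c
#include <stdint.h>

/* Hypothetical sketch, not vfio-pci code: hide REBAR sizes the host
 * cannot back.  Bit n of size_mask means 2^(n+20) bytes (PCIe REBAR
 * encoding); host_avail is the largest window the host could allocate
 * for this BAR. */
static uint32_t vfio_clamp_rebar_sizes(uint32_t size_mask, uint64_t host_avail)
{
	for (int n = 0; n < 28; n++)
		if ((1ull << (n + 20)) > host_avail)
			size_mask &= ~(1u << n);
	return size_mask;
}
```

Even with such a clamp, the guest can still choose a combination of sizes across several BARs that the host cannot satisfy together, which is why reserving up front (or rejecting at configuration time) still has to be solved.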
Well, in the FPGA case I outlined above, you don't really know how much BAR space you need until the setup is completed.
E.g. you could need one BAR with just 2MiB and another with 128GB, or two with 64GB each, or.... That's the reason why somebody came up with the REBAR standard in the first place.
Yes, I suppose without a full bus-reset and soft-hotplug event, resizable BARs are the best way to reconfigure a device based on FPGA programming. Anyway, thanks for the insights here.
I think I can summarize that static resizing might work for some devices like our GPUs, but it doesn't solve the problem in general.
Yup, I don't have a good approach for the general case for a VM yet. We could add a sysfs or other side-channel mechanism to preconfigure a BAR size, but once we're dealing with a VM interacting with the REBAR capability itself, it's far too easy for the guest to create a configuration that the host might not have the bus resources to support, especially if there are multiple resizable BARs under a bridge. Thanks,
Alex