On Tue, Sep 17, 2019 at 04:07:54PM +1000, Ben Skeggs wrote:
On Tue, 17 Sep 2019 at 00:36, Thierry Reding thierry.reding@gmail.com wrote:
From: Thierry Reding treding@nvidia.com
Hi Ben,
I messed up the ordering of patches in my tree a bit, so these two fixes got separated from the others. I don't consider these particularily urgent because the crash that the first one fixes only happens on gp10b which we don't enable by default yet and the second patch fixes a crash that only happens on module unload (or driver unbind, more accurately), which isn't a terribly common thing to do.
I'll be sending out fixes shortly to make the GP10B work more properly on a wider range of Jetson TX2 devices and enable it by default.
One thing to mention is that I'm not exactly sure if the first patch is the right thing to do. I haven't seen any issues after that change, but I'm also not exactly sure I understand what BAR2 is used for, so I don't know if I would've even covered those code paths (other than the one causing the crash at probe time) in my tests.
BAR2 on dGPUs is used to map kernel-level GPU objects in VRAM so they can be accessed by the driver. It's pretty much a smaller version of BAR1, but intended for a different purpose.
On dGPUs, there's a couple of places (fault buffer address, and fault method buffer on volta) where the GPU wants PRI regs to be poked with an offset within BAR2 rather than an aperture+offset combination. I'm not 100% sure what Tegra parts do here, but presumably if it's working for you, they're happy to just accept a system memory address instead.
I guess this would be the right thing to do here in that situation. The more obvious (from a "reading the code" POV) thing to do would be to write Tegra-specific versions of the functions that use nvkm_memory_bar2() to perform this mapping, and use nvkm_memory_addr() instead but I'm not sure if we need/want to go to that effort. It's conceivable it could be required at some point.
Yeah, that sounds slightly more correct. I'll look into it and see if I can come up with something.
Thierry
Ben.
It'd be great to get Lyude's feedback on the second patch, since that call to pci_disable_device() was rather oddly placed and I'm not sure if that was essential for things to work or whether the slightly different point in time where it's called after this patch is also okay. It looks to me like it should work fine, but I don't currently have a way to test this on desktop GPUs.
Thierry
Thierry Reding (2): drm/nouveau: tegra: Fix NULL pointer dereference drm/nouveau: tegra: Do not try to disable PCI device
drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +- .../drm/nouveau/nvkm/subdev/instmem/gk20a.c | 30 +++++++++++++++++++ 2 files changed, 31 insertions(+), 2 deletions(-)
-- 2.23.0
Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau