On 2020-05-29 6:45 a.m., Christoph Hellwig wrote:
On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote:
This issue is most likely in the i915 driver and is most likely caused by the driver not respecting the return value of the dma_map_ops::map_sg function. You can see the driver ignoring the return value here: https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3...
Previously this didn’t cause issues because the intel map_sg always returned the same number of elements as the input scatter gather list but with the change to this dma-iommu api this is no longer the case. I wasn’t able to track the bug down to a specific line of code unfortunately.
Mark did a big audit into the map_sg API abuse and initially had some i915 patches, but then gave up on them with this comment:
"The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix it fully. The driver creatively uses sg_table->orig_nents to store the size of the allocate scatterlist and ignores the number of the entries returned by dma_map_sg function. In this patchset I only fixed the sg_table objects exported by dmabuf related functions. I hope that I didn't break anything there."
it would be really nice if the i915 maintainers could help with sorting that API abuse out.
I agree completely that the API abuse should be sorted out, but I think that's much larger than just the i915 driver. Pretty much every dma-buf map_dma_buf implementation I looked at ignores the returned nents of sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See:
amdgpu_dma_buf_map armada_gem_prime_map_dma_buf drm_gem_map_dma_buf i915_gem_map_dma_buf tegra_gem_prime_map_dma_buf
So this should probably be addressed by the whole GPU community.
However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA.
As I tried to point out in my previous email, i915 does not do this trick. In fact, it completely ignores sg_dma_len() and is thus completely broken. See i915_scatterlist.h and the __sgt_iter() function. So it doesn't sound to me like Mark's fix would address the issue at all. Per my previous email, I'd guess that it can be fixed simply by adjusting the __sgt_iter() function to do something more sensible. (Better yet, if possible, ditch __sgt_iter() and use the common DRM function that AMD uses).
This would at least allow us to make progress with Tom's IOMMU-DMA patch set and once that gets in, it will be harder for other drivers to make the same mistake.
Logan
Hi Logan,
On 29.05.2020 21:05, Logan Gunthorpe wrote:
On 2020-05-29 6:45 a.m., Christoph Hellwig wrote:
On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote:
This issue is most likely in the i915 driver and is most likely caused by the driver not respecting the return value of the dma_map_ops::map_sg function. You can see the driver ignoring the return value here: https://protect2.fireeye.com/url?k=ca25a34b-97f7b813-ca242804-0cc47a31c8b4-0...
Previously this didn’t cause issues because the intel map_sg always returned the same number of elements as the input scatter gather list but with the change to this dma-iommu api this is no longer the case. I wasn’t able to track the bug down to a specific line of code unfortunately.
Mark did a big audit into the map_sg API abuse and initially had some i915 patches, but then gave up on them with this comment:
"The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix it fully. The driver creatively uses sg_table->orig_nents to store the size of the allocate scatterlist and ignores the number of the entries returned by dma_map_sg function. In this patchset I only fixed the sg_table objects exported by dmabuf related functions. I hope that I didn't break anything there."
it would be really nice if the i915 maintainers could help with sorting that API abuse out.
I agree completely that the API abuse should be sorted out, but I think that's much larger than just the i915 driver. Pretty much every dma-buf map_dma_buf implementation I looked at ignores the returned nents of sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See:
amdgpu_dma_buf_map armada_gem_prime_map_dma_buf drm_gem_map_dma_buf i915_gem_map_dma_buf tegra_gem_prime_map_dma_buf
So this should probably be addressed by the whole GPU community.
Patches are pending: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@samsu...
However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA.
I'm not sure that this is a trick. Stopping at zero sg_dma_len() was somewhere documented.
As I tried to point out in my previous email, i915 does not do this trick. In fact, it completely ignores sg_dma_len() and is thus completely broken. See i915_scatterlist.h and the __sgt_iter() function. So it doesn't sound to me like Mark's fix would address the issue at all. Per my previous email, I'd guess that it can be fixed simply by adjusting the __sgt_iter() function to do something more sensible. (Better yet, if possible, ditch __sgt_iter() and use the common DRM function that AMD uses).
This would at least allow us to make progress with Tom's IOMMU-DMA patch set and once that gets in, it will be harder for other drivers to make the same mistake.
Best regards
On 2020-05-29 3:11 p.m., Marek Szyprowski wrote:
Patches are pending: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@samsu...
Cool, nice! Though, I still don't think that fixes the issue in i915_scatterlist.h given it still ignores sg_dma_len() and strictly relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this is the bug that got in Tom's way.
However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA.
I'm not sure that this is a trick. Stopping at zero sg_dma_len() was somewhere documented.
Well whatever you want to call it, it is ugly to have some drivers doing one thing with the returned value and others assuming there's an extra zero at the end. It just causes confusion for people reading/copying the code. It would be better if they are all consistent. However, I concede stopping at zero should not be broken, presently.
Logan
Hi Logan/All,
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
+ if (sgl && sg_dma_len(sgl) == 0) + s.sgp = NULL;
if (s.sgp) { ..... """ at location [1]. but it doens't fix the problem.
You're right though, this change does need to be made, this code doesn't handle pages of sg_dma_len(sg) == 0 correctly So my guess is that we have more bugs in other parts of the i915 driver (or there is a problem with my "sg_dma_len == 0" fix above). I have been trying to spot where else the code might be buggy but I haven't had any luck so far.
I'm doing a microconfernce (at LPC 2020) this wednesdays [1] on this if you're interested in attending. I'm hoping I can chat about it with a few people and find how can reproduce and fix this issues. I don't have any more time I can give to this unfortunately and it would be a shame for the work to go to waste.
[0] https://github.com/torvalds/linux/blob/d012a7190fc1fd72ed48911e77ca97ba4521b... [1] https://linuxplumbersconf.org/event/7/contributions/846/
On Fri, 29 May 2020 at 22:21, Logan Gunthorpe logang@deltatee.com wrote:
On 2020-05-29 3:11 p.m., Marek Szyprowski wrote:
Patches are pending: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@samsu...
Cool, nice! Though, I still don't think that fixes the issue in i915_scatterlist.h given it still ignores sg_dma_len() and strictly relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this is the bug that got in Tom's way.
However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA.
I'm not sure that this is a trick. Stopping at zero sg_dma_len() was somewhere documented.
Well whatever you want to call it, it is ugly to have some drivers doing one thing with the returned value and others assuming there's an extra zero at the end. It just causes confusion for people reading/copying the code. It would be better if they are all consistent. However, I concede stopping at zero should not be broken, presently.
Logan
On Mon, Aug 24, 2020 at 2:56 AM Tom Murphy murphyt7@tcd.ie wrote:
Hi Logan/All,
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && sg_dma_len(sgl) == 0)
s.sgp = NULL; if (s.sgp) { .....
""" at location [1]. but it doens't fix the problem.
You're right though, this change does need to be made, this code doesn't handle pages of sg_dma_len(sg) == 0 correctly So my guess is that we have more bugs in other parts of the i915 driver (or there is a problem with my "sg_dma_len == 0" fix above). I have been trying to spot where else the code might be buggy but I haven't had any luck so far.
I'm doing a microconfernce (at LPC 2020) this wednesdays [1] on this if you're interested in attending. I'm hoping I can chat about it with a few people and find how can reproduce and fix this issues. I don't have any more time I can give to this unfortunately and it would be a shame for the work to go to waste.
[0] https://github.com/torvalds/linux/blob/d012a7190fc1fd72ed48911e77ca97ba4521b... [1] https://linuxplumbersconf.org/event/7/contributions/846/
On Fri, 29 May 2020 at 22:21, Logan Gunthorpe logang@deltatee.com wrote:
On 2020-05-29 3:11 p.m., Marek Szyprowski wrote:
Patches are pending: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprowski@samsu...
Cool, nice! Though, I still don't think that fixes the issue in i915_scatterlist.h given it still ignores sg_dma_len() and strictly relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this is the bug that got in Tom's way.
However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA.
We ran into the same issue with amdgpu and radeon when the AMD IOMMU driver was converted and had to fix it as well. The relevant fixes were: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Alex
I'm not sure that this is a trick. Stopping at zero sg_dma_len() was somewhere documented.
Well whatever you want to call it, it is ugly to have some drivers doing one thing with the returned value and others assuming there's an extra zero at the end. It just causes confusion for people reading/copying the code. It would be better if they are all consistent. However, I concede stopping at zero should not be broken, presently.
Logan
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2020-08-23 6:04 p.m., Tom Murphy wrote:
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && sg_dma_len(sgl) == 0)
s.sgp = NULL; if (s.sgp) { .....
""" at location [1]. but it doens't fix the problem.
Based on my read of the code, it looks like we also need to change usage of sgl->length... Something like the rough patch below, maybe?
Also, Tom, do you have an updated version of the patchset to convert the Intel IOMMU to dma-iommu available? The last one I've found doesn't apply cleanly (I'm assuming parts of it have been merged in slightly modified forms).
Thanks,
Logan
--
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
+ if (sgl && !sg_dma_len(s.sgp)) + s.sgp = NULL; + if (s.sgp) { s.max = s.curr = s.sgp->offset; - s.max += s.sgp->length; - if (dma) + + if (dma) { + s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp); - else + } else { + s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp)); + } }
return s;
On Thu, 27 Aug 2020 at 22:36, Logan Gunthorpe logang@deltatee.com wrote:
On 2020-08-23 6:04 p.m., Tom Murphy wrote:
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && sg_dma_len(sgl) == 0)
s.sgp = NULL; if (s.sgp) { .....
""" at location [1]. but it doens't fix the problem.
Based on my read of the code, it looks like we also need to change usage of sgl->length... Something like the rough patch below, maybe?
Also, Tom, do you have an updated version of the patchset to convert the Intel IOMMU to dma-iommu available? The last one I've found doesn't apply cleanly (I'm assuming parts of it have been merged in slightly modified forms).
I'll try and post one in the next 24hours
Thanks,
Logan
--
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && !sg_dma_len(s.sgp))
s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset;
s.max += s.sgp->length;
if (dma)
if (dma) {
s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp);
else
} else {
s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp));
} } return s;
On Fri, 28 Aug 2020 at 00:34, Tom Murphy murphyt7@tcd.ie wrote:
On Thu, 27 Aug 2020 at 22:36, Logan Gunthorpe logang@deltatee.com wrote:
On 2020-08-23 6:04 p.m., Tom Murphy wrote:
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && sg_dma_len(sgl) == 0)
s.sgp = NULL; if (s.sgp) { .....
""" at location [1]. but it doens't fix the problem.
Based on my read of the code, it looks like we also need to change usage of sgl->length... Something like the rough patch below, maybe?
Also, Tom, do you have an updated version of the patchset to convert the Intel IOMMU to dma-iommu available? The last one I've found doesn't apply cleanly (I'm assuming parts of it have been merged in slightly modified forms).
I'll try and post one in the next 24hours
I have just posted this now: The subject of the cover letter is: "[PATCH V2 0/5] Convert the intel iommu driver to the dma-iommu api"
Thanks,
Logan
--
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && !sg_dma_len(s.sgp))
s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset;
s.max += s.sgp->length;
if (dma)
if (dma) {
s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp);
else
} else {
s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp));
} } return s;
Hi,
On 27/08/2020 22:36, Logan Gunthorpe wrote:
On 2020-08-23 6:04 p.m., Tom Murphy wrote:
I have added a check for the sg_dma_len == 0 : """ } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && sg_dma_len(sgl) == 0)
s.sgp = NULL; if (s.sgp) { .....
""" at location [1]. but it doens't fix the problem.
Based on my read of the code, it looks like we also need to change usage of sgl->length... Something like the rough patch below, maybe?
This thread was brought to my attention and I initially missed this reply. Essentially I came to the same conclusion about the need to use sg_dma_len. One small correction below:
Also, Tom, do you have an updated version of the patchset to convert the Intel IOMMU to dma-iommu available? The last one I've found doesn't apply cleanly (I'm assuming parts of it have been merged in slightly modified forms).
Thanks,
Logan
--
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe:
if (dma && sgl && !sg_dma_len(s.sgp))
s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset;
s.max += s.sgp->length;
if (dma)
if (dma) {
s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp);
else
} else {
s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp));
}
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
Regards,
Tvrtko
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
+ if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
+ s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset; - s.max += s.sgp->length; - if (dma)
+ if (dma) { + s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp); - else + } else { + s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp)); + }
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Thanks,
Logan
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
+ if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
+ s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset; - s.max += s.sgp->length; - if (dma)
+ if (dma) { + s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp); - else + } else { + s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp)); + }
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Regards,
Tvrtko
On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset;
s.max += s.sgp->length;
if (dma)
if (dma) {
s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp);
else
} else {
s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp));
}
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
I don't unfortunately. I'm working locally with poor internet.
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Is this enough info?:
$ lspci -vnn | grep VGA -A 12 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f] Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at eb000000 (64-bit, non-prefetchable) [size=16M] Memory at 60000000 (64-bit, prefetchable) [size=256M] I/O ports at e000 [size=64] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [40] Vendor Specific Information: Len=0c <?> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [d0] Power Management version 2 Capabilities: [100] Process Address Space ID (PASID) Capabilities: [200] Address Translation Service (ATS)
Regards,
Tvrtko
On 08/09/2020 23:43, Tom Murphy wrote:
On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset;
s.max += s.sgp->length;
if (dma)
if (dma) {
s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp);
else
} else {
s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp));
}
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
I don't unfortunately. I'm working locally with poor internet.
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Is this enough info?:
$ lspci -vnn | grep VGA -A 12 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f] Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at eb000000 (64-bit, non-prefetchable) [size=16M] Memory at 60000000 (64-bit, prefetchable) [size=256M] I/O ports at e000 [size=64] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [40] Vendor Specific Information: Len=0c <?> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [d0] Power Management version 2 Capabilities: [100] Process Address Space ID (PASID) Capabilities: [200] Address Translation Service (ATS)
Works for a start. What about the steps to repro? Any desktop environment and it is just visual corruption, no hangs/stalls or such?
I've submitted a series consisting of what I understood are the patches needed to repro the issue to our automated CI here:
https://patchwork.freedesktop.org/series/81489/
So will see if it will catch something, or more targeted testing will be required. Hopefully it does trip over in which case I can add the patch suggested by Logan on top and see if that fixes it. Or I'll need to write a new test case.
If you could glance over my series to check I identified the patches correctly it would be appreciated.
Regards,
Tvrtko
On 09/09/2020 10:16, Tvrtko Ursulin wrote:
On 08/09/2020 23:43, Tom Murphy wrote:
On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915 index b7b59328cb76..9367ac801f0c 100644 --- a/drivers/gpu/drm/i915/i915_scatterlist.h +++ b/drivers/gpu/drm/i915/i915_scatterlist.h @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { } __sgt_iter(struct scatterlist *sgl, bool dma) { struct sgt_iter s = { .sgp = sgl };
+ if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
+ s.sgp = NULL;
if (s.sgp) { s.max = s.curr = s.sgp->offset; - s.max += s.sgp->length; - if (dma)
+ if (dma) { + s.max += sg_dma_len(s.sgp); s.dma = sg_dma_address(s.sgp); - else + } else { + s.max += s.sgp->length; s.pfn = page_to_pfn(sg_page(s.sgp)); + }
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
I don't unfortunately. I'm working locally with poor internet.
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Is this enough info?:
$ lspci -vnn | grep VGA -A 12 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f] Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at eb000000 (64-bit, non-prefetchable) [size=16M] Memory at 60000000 (64-bit, prefetchable) [size=256M] I/O ports at e000 [size=64] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [40] Vendor Specific Information: Len=0c <?> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [d0] Power Management version 2 Capabilities: [100] Process Address Space ID (PASID) Capabilities: [200] Address Translation Service (ATS)
Works for a start. What about the steps to repro? Any desktop environment and it is just visual corruption, no hangs/stalls or such?
I've submitted a series consisting of what I understood are the patches needed to repro the issue to our automated CI here:
https://patchwork.freedesktop.org/series/81489/
So will see if it will catch something, or more targeted testing will be required. Hopefully it does trip over in which case I can add the patch suggested by Logan on top and see if that fixes it. Or I'll need to write a new test case.
If you could glance over my series to check I identified the patches correctly it would be appreciated.
Our CI was more than capable at catching the breakage so I've copied you on a patch (https://patchwork.freedesktop.org/series/81497/) which has a good potential to fix this. (Or improve the robustness of our sg walks, depends how you look at it.)
Would you be able to test it in your environment by any chance? If it works I understand it unblocks your IOMMU work, right?
Regards,
Tvrtko
On Wed, 9 Sep 2020 at 13:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 09/09/2020 10:16, Tvrtko Ursulin wrote:
On 08/09/2020 23:43, Tom Murphy wrote:
On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h > b/drivers/gpu/drm/i915/i915 > index b7b59328cb76..9367ac801f0c 100644 > --- a/drivers/gpu/drm/i915/i915_scatterlist.h > +++ b/drivers/gpu/drm/i915/i915_scatterlist.h > @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { > } __sgt_iter(struct scatterlist *sgl, bool dma) { > struct sgt_iter s = { .sgp = sgl }; > > + if (sgl && !sg_dma_len(s.sgp))
I'd extend the condition to be, just to be safe: if (dma && sgl && !sg_dma_len(s.sgp))
Right, good catch, that's definitely necessary.
> + s.sgp = NULL; > + > if (s.sgp) { > s.max = s.curr = s.sgp->offset; > - s.max += s.sgp->length; > - if (dma) > + > + if (dma) { > + s.max += sg_dma_len(s.sgp); > s.dma = sg_dma_address(s.sgp); > - else > + } else { > + s.max += s.sgp->length; > s.pfn = page_to_pfn(sg_page(s.sgp)); > + }
Otherwise has this been tested or alternatively how to test it? (How to repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
I don't unfortunately. I'm working locally with poor internet.
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Is this enough info?:
$ lspci -vnn | grep VGA -A 12 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f] Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at eb000000 (64-bit, non-prefetchable) [size=16M] Memory at 60000000 (64-bit, prefetchable) [size=256M] I/O ports at e000 [size=64] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [40] Vendor Specific Information: Len=0c <?> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [d0] Power Management version 2 Capabilities: [100] Process Address Space ID (PASID) Capabilities: [200] Address Translation Service (ATS)
Works for a start. What about the steps to repro? Any desktop environment and it is just visual corruption, no hangs/stalls or such?
I've submitted a series consisting of what I understood are the patches needed to repro the issue to our automated CI here:
https://patchwork.freedesktop.org/series/81489/
So will see if it will catch something, or more targeted testing will be required. Hopefully it does trip over in which case I can add the patch suggested by Logan on top and see if that fixes it. Or I'll need to write a new test case.
If you could glance over my series to check I identified the patches correctly it would be appreciated.
Our CI was more than capable at catching the breakage so I've copied you on a patch (https://patchwork.freedesktop.org/series/81497/) which has a good potential to fix this. (Or improve the robustness of our sg walks, depends how you look at it.)
Would you be able to test it in your environment by any chance? If it works I understand it unblocks your IOMMU work, right?
I tested your latest patch set ([PATCH 1/2] drm/i915: Fix DMA mapped scatterlist walks) and it fixes the issue. great work!
Regards,
Tvrtko
On Thu, 10 Sep 2020 at 14:33, Tom Murphy murphyt7@tcd.ie wrote:
On Wed, 9 Sep 2020 at 13:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 09/09/2020 10:16, Tvrtko Ursulin wrote:
On 08/09/2020 23:43, Tom Murphy wrote:
On Tue, 8 Sep 2020 at 16:56, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 08/09/2020 16:44, Logan Gunthorpe wrote:
On 2020-09-08 9:28 a.m., Tvrtko Ursulin wrote: >> >> diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h >> b/drivers/gpu/drm/i915/i915 >> index b7b59328cb76..9367ac801f0c 100644 >> --- a/drivers/gpu/drm/i915/i915_scatterlist.h >> +++ b/drivers/gpu/drm/i915/i915_scatterlist.h >> @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter { >> } __sgt_iter(struct scatterlist *sgl, bool dma) { >> struct sgt_iter s = { .sgp = sgl }; >> >> + if (sgl && !sg_dma_len(s.sgp)) > > I'd extend the condition to be, just to be safe: > if (dma && sgl && !sg_dma_len(s.sgp)) >
Right, good catch, that's definitely necessary.
>> + s.sgp = NULL; >> + >> if (s.sgp) { >> s.max = s.curr = s.sgp->offset; >> - s.max += s.sgp->length; >> - if (dma) >> + >> + if (dma) { >> + s.max += sg_dma_len(s.sgp); >> s.dma = sg_dma_address(s.sgp); >> - else >> + } else { >> + s.max += s.sgp->length; >> s.pfn = page_to_pfn(sg_page(s.sgp)); >> + } > > Otherwise has this been tested or alternatively how to test it? > (How to > repro the issue.)
It has not been tested. To test it, you need Tom's patch set without the last "DO NOT MERGE" patch:
https://lkml.kernel.org/lkml/20200907070035.GA25114@infradead.org/T/
Tom, do you have a branch somewhere I could pull from? (Just being lazy about downloading a bunch of messages from the archives.)
I don't unfortunately. I'm working locally with poor internet.
What GPU is in your Lenovo x1 carbon 5th generation and what graphical/desktop setup I need to repro?
Is this enough info?:
$ lspci -vnn | grep VGA -A 12 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo ThinkPad X1 Carbon 5th Gen [17aa:224f] Flags: bus master, fast devsel, latency 0, IRQ 148 Memory at eb000000 (64-bit, non-prefetchable) [size=16M] Memory at 60000000 (64-bit, prefetchable) [size=256M] I/O ports at e000 [size=64] [virtual] Expansion ROM at 000c0000 [disabled] [size=128K] Capabilities: [40] Vendor Specific Information: Len=0c <?> Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00 Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit- Capabilities: [d0] Power Management version 2 Capabilities: [100] Process Address Space ID (PASID) Capabilities: [200] Address Translation Service (ATS)
Works for a start. What about the steps to repro? Any desktop environment and it is just visual corruption, no hangs/stalls or such?
I've submitted a series consisting of what I understood are the patches needed to repro the issue to our automated CI here:
https://patchwork.freedesktop.org/series/81489/
So will see if it will catch something, or more targeted testing will be required. Hopefully it does trip over in which case I can add the patch suggested by Logan on top and see if that fixes it. Or I'll need to write a new test case.
If you could glance over my series to check I identified the patches correctly it would be appreciated.
Our CI was more than capable at catching the breakage so I've copied you on a patch (https://patchwork.freedesktop.org/series/81497/) which has a good potential to fix this. (Or improve the robustness of our sg walks, depends how you look at it.)
Would you be able to test it in your environment by any chance? If it works I understand it unblocks your IOMMU work, right?
And yes this does unblock the iommu work
I tested your latest patch set ([PATCH 1/2] drm/i915: Fix DMA mapped scatterlist walks) and it fixes the issue. great work!
Regards,
Tvrtko
dri-devel@lists.freedesktop.org