https://bugzilla.kernel.org/show_bug.cgi?id=203879
Bug ID: 203879
Summary: hard freeze on high single threaded load when Xorg is
active (AMD Ryzen 7 2700X CPU, AMD Radeon RX 580 GPU)
Product: Drivers
Version: 2.5
Kernel Version: 4.19.37-3 (Debian 4.19.0-5-amd64) and others
(including mainline versions)
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri(a)kernel-bugs.osdl.org
Reporter: claude(a)mathr.co.uk
Regression: No
Created attachment 283233
--> https://bugzilla.kernel.org/attachment.cgi?id=283233&action=edit
dmesg from 4.19.0-5-amd64 with amdgpu.dc=1 (no freeze yet)
I am developing a CPU-based program to render fractals, which I usually run
with "nice -n 20". The main calculations are multi-threaded, using 16 threads
on AMD Ryzen 7 2700X Eight-Core Processor. However, final image PNG saving is
single-threaded. During the single-threaded workload only (as observed by htop
and program status prints), it can happen that the system freezes hard (no ssh,
stuck mouse pointer, no NumLock LED toggle, no magic SysRq, only physical power
button for hard power-off works).
This freeze only happens when Xorg is running on the active virtual terminal: I
tried to see if some kernel log messages would be displayed before freeze by
switching to a console with Ctrl-Alt-F1 after launching my program, but with
the terminal active it doesn't seem to freeze.
The freeze does not always occur, but usually happens before a dozen images are
saved (sequential process is full-threaded workload, followed by
single-threaded workload, repeated). This can take a few hours.
With the virtual terminal active instead of Xorg, I have rendered 100+ images
in a row without any issues. Of course, I can't use other X applications at
the same time, so this is an annoying workaround.
I mostly run the regular Debian Buster kernel but I have had this freeze occur
with other self-compiled kernels of various versions (newer than the Debian
kernel, without Debian patches). I also had the freeze with both amdgpu.dc=1
(default) and amdgpu.dc=0 options.
$ uname -a
Linux eiskaffee 4.19.0-5-amd64 #1 SMP Debian 4.19.37-3 (2019-05-15) x86_64
GNU/Linux
$ apt-cache policy linux-image-4.19.0-5-amd64
linux-image-4.19.0-5-amd64:
Installed: 4.19.37-3
Candidate: 4.19.37-3
Version table:
*** 4.19.37-3 990
990 http://ftp.uk.debian.org/debian buster/main amd64 Packages
500 http://ftp.uk.debian.org/debian unstable/main amd64 Packages
100 /var/lib/dpkg/status
--
You are receiving this mail because:
You are watching the assignee of the bug.
From: Dave Airlie <airlied(a)redhat.com>
Purelink FX-D120 (DVI over fibre extenders) drive the HPD line
low on the GPU side when the monitor side device is unplugged
or loses the connection. However the GPU side device seems to cache
EDID in this case. Per DVI spec the HPD line must be driven in order
for EDID to be done, but we've met enough broken devices (mainly
VGA->DVI convertors) that do the wrong thing with HPD that we ignore
it if a DDC probe succeeds.
This patch adds an option to the radeon driver to always respect HPD
on DVI connectors such that if the HPD line isn't driven then EDID
isn't probed.
Signed-off-by: Dave Airlie <airlied(a)redhat.com>
---
drivers/gpu/drm/radeon/radeon.h | 1 +
drivers/gpu/drm/radeon/radeon_connectors.c | 7 +++++++
drivers/gpu/drm/radeon/radeon_drv.c | 4 ++++
3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 32808e50be12..d572e8ded9b9 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -117,6 +117,7 @@ extern int radeon_uvd;
extern int radeon_vce;
extern int radeon_si_support;
extern int radeon_cik_support;
+extern int radeon_respect_hpd;
/*
* Copy from radeon_drv.h so we don't have to include both and have conflicting
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c
index c60d1a44d22a..e9b3924df06e 100644
--- a/drivers/gpu/drm/radeon/radeon_connectors.c
+++ b/drivers/gpu/drm/radeon/radeon_connectors.c
@@ -1265,6 +1265,13 @@ radeon_dvi_detect(struct drm_connector *connector, bool force)
goto exit;
}
+ if (radeon_respect_hpd && radeon_connector->hpd.hpd != RADEON_HPD_NONE) {
+ if (!radeon_hpd_sense(rdev, radeon_connector->hpd.hpd)) {
+ ret = connector_status_disconnected;
+ goto exit;
+ }
+ }
+
if (radeon_connector->ddc_bus) {
dret = radeon_ddc_probe(radeon_connector, false);
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index a6cbe11f79c6..556ae381ea86 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -207,6 +207,7 @@ int radeon_auxch = -1;
int radeon_mst = 0;
int radeon_uvd = 1;
int radeon_vce = 1;
+int radeon_respect_hpd = 0;
MODULE_PARM_DESC(no_wb, "Disable AGP writeback for scratch registers");
module_param_named(no_wb, radeon_no_wb, int, 0444);
@@ -312,6 +313,9 @@ int radeon_cik_support = 1;
MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)");
module_param_named(cik_support, radeon_cik_support, int, 0444);
+MODULE_PARM_DESC(respect_hpd, "For DVI always believe HPD");
+module_param_named(respect_hpd, radeon_respect_hpd, int, 0644);
+
static struct pci_device_id pciidlist[] = {
radeon_PCI_IDS
};
--
2.20.1
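
The ordering this patch introduces can be modelled outside the kernel. The
sketch below is an illustration only: fake_connector, dvi_detect and the
enum are hypothetical stand-ins, not the radeon API, and the real
radeon_dvi_detect() does considerably more than a single DDC probe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's state; not the real radeon types. */
enum conn_status { CONN_DISCONNECTED, CONN_CONNECTED };

struct fake_connector {
	bool has_hpd_pin;   /* models hpd.hpd != RADEON_HPD_NONE */
	bool hpd_asserted;  /* models the result of radeon_hpd_sense() */
	bool ddc_answers;   /* models the result of radeon_ddc_probe() */
};

/*
 * Simplified detect path: with respect_hpd set, a connector that has an
 * HPD pin but does not see it asserted is reported as disconnected before
 * any DDC probe is attempted. Otherwise the DDC probe alone decides.
 */
static enum conn_status dvi_detect(const struct fake_connector *c,
				   bool respect_hpd)
{
	if (respect_hpd && c->has_hpd_pin && !c->hpd_asserted)
		return CONN_DISCONNECTED;

	return c->ddc_answers ? CONN_CONNECTED : CONN_DISCONNECTED;
}
```

An extender that holds HPD low but still answers DDC from its cached EDID
(has_hpd_pin and ddc_answers true, hpd_asserted false) is reported as
connected with respect_hpd off and as disconnected with it on. A connector
without an HPD pin behaves the same either way.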
From: Colin Ian King <colin.king(a)canonical.com>
Variable val is initialized to a value in a for-loop that is
never read and hence it is redundant. Remove it.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
drivers/gpu/drm/panel/panel-tpo-td043mtea1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panel/panel-tpo-td043mtea1.c b/drivers/gpu/drm/panel/panel-tpo-td043mtea1.c
index 3b4f30c0fdae..84370562910f 100644
--- a/drivers/gpu/drm/panel/panel-tpo-td043mtea1.c
+++ b/drivers/gpu/drm/panel/panel-tpo-td043mtea1.c
@@ -116,7 +116,7 @@ static void td043mtea1_write_gamma(struct td043mtea1_panel *lcd)
td043mtea1_write(lcd, 0x13, val);
/* gamma bits [7:0] */
- for (val = i = 0; i < 12; i++)
+ for (i = 0; i < 12; i++)
td043mtea1_write(lcd, 0x14 + i, gamma[i] & 0xff);
}
--
2.20.1
https://bugs.freedesktop.org/show_bug.cgi?id=111413
Andre Klapper <a9016009(a)gmx.de> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|IGT |Two
Group| |spam
Status|NEW |RESOLVED
Version|XOrg git |unspecified
Resolution|--- |INVALID
Product|DRI |Spam
--- Comment #1 from Andre Klapper <a9016009(a)gmx.de> ---
Go away and test somewhere else.
--
You are receiving this mail because:
You are the assignee for the bug.
On Sat, Aug 17, 2019 at 01:31:28PM +0200, Christoph Hellwig wrote:
> On Fri, Aug 16, 2019 at 05:11:07PM +0000, Jason Gunthorpe wrote:
> > - if (args->cpages)
> > - migrate_vma_prepare(args);
> > - if (args->cpages)
> > - migrate_vma_unmap(args);
> > + if (!args->cpages)
> > + return 0;
> > +
> > + migrate_vma_prepare(args);
> > + migrate_vma_unmap(args);
>
> I don't think this is ok. Both migrate_vma_prepare and migrate_vma_unmap
> can reduce args->cpages, including possibly to 0.
Oh, yes, that was far too hasty on my part, I had assumed collect set
the cpages. Thank you for checking
Jason
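
The difference between the two control flows quoted above can be shown with
a small stand-alone model. Everything here is hypothetical (fake_args,
fake_prepare, fake_unmap); it only assumes, as the review says, that the
prepare step may reduce cpages to 0.

```c
#include <assert.h>

/* Hypothetical model of the migrate_vma arguments; not the kernel struct. */
struct fake_args {
	unsigned long cpages;  /* pages still candidates for migration */
	int unmap_calls;       /* counts how often the unmap step ran */
};

/* Models a prepare step that ends up dropping every candidate page. */
static void fake_prepare(struct fake_args *a)
{
	a->cpages = 0;
}

static void fake_unmap(struct fake_args *a)
{
	a->unmap_calls++;
}

/* Original flow: cpages is re-checked before each step. */
static void original_flow(struct fake_args *a)
{
	if (a->cpages)
		fake_prepare(a);
	if (a->cpages)
		fake_unmap(a);
}

/* Proposed flow: cpages is checked once, so unmap runs even at 0 pages. */
static void proposed_flow(struct fake_args *a)
{
	if (!a->cpages)
		return;

	fake_prepare(a);
	fake_unmap(a);
}
```

Starting from a non-zero cpages, original_flow() skips the unmap step once
prepare has emptied the set, while proposed_flow() still calls it. That is
the behaviour change the review objects to.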
Rearranging the couple of 32-bit atomics hidden amongst the field of
pointers, which unnecessarily caused the compiler to insert some padding,
shrinks the size of the base struct dma_fence from 80 to 72 bytes on
x86-64.
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Christian König <christian.koenig(a)amd.com>
---
include/linux/dma-fence.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 404aa748eda6..2ce4d877d33e 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -63,7 +63,7 @@ struct dma_fence_cb;
* been completed, or never called at all.
*/
struct dma_fence {
- struct kref refcount;
+ spinlock_t *lock;
const struct dma_fence_ops *ops;
/* We clear the callback list on kref_put so that by the time we
* release the fence it is unused. No one should be adding to the cb_list
@@ -73,11 +73,11 @@ struct dma_fence {
struct rcu_head rcu;
struct list_head cb_list;
};
- spinlock_t *lock;
u64 context;
u64 seqno;
- unsigned long flags;
ktime_t timestamp;
+ unsigned long flags;
+ struct kref refcount;
int error;
};
--
2.23.0.rc1
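
The padding effect behind this change can be reproduced with a reduced
layout. The two structs below are hypothetical and are not struct
dma_fence. They only mirror its shape: two 32-bit members (standing in for
refcount and error) either separated by 8-byte members or placed next to
each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit members separated by 8-byte members: each one drags padding along. */
struct scattered {
	uint32_t a;   /* on LP64, followed by 4 bytes of padding */
	void *p1;
	void *p2;
	uint64_t x;
	uint32_t b;   /* on LP64, followed by 4 bytes of tail padding */
};

/* Same members, with the two 32-bit fields sharing one 8-byte slot. */
struct grouped {
	void *p1;
	void *p2;
	uint64_t x;
	uint32_t a;
	uint32_t b;
};

/* Grouping the small members should never make the struct larger. */
static_assert(sizeof(struct grouped) <= sizeof(struct scattered),
	      "grouping the 32-bit members should not grow the struct");
```

On an LP64 x86-64 ABI the two sizes are expected to differ by 8 bytes (40
against 32), the same saving the patch reports for dma_fence. Running pahole
on the real struct is the usual way to confirm where its holes are.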
On Fri, Aug 16, 2019 at 5:11 PM Christoph Hellwig <hch(a)lst.de> wrote:
>
> On Mon, Aug 12, 2019 at 12:42:30PM -0700, Ralph Campbell wrote:
> >
> > On 8/10/19 4:13 AM, Christoph Hellwig wrote:
> >> On something vaguely related to this patch:
> >>
> >> You use the NVIF_VMM_PFNMAP_V0_V* defines from nvif/if000c.h, which are
> >> a little odd as we only ever set these bits, but they also don't seem
> >> to appear to be in values that are directly fed to the hardware.
> >>
> >> On the other hand mmu/vmm.h defines a set of NVIF_VMM_PFNMAP_V0_*
> >
> > Yes, I see NVKM_VMM_PFN_*
> >
> >> constants with similar names and identical values, and those are used
> >> in mmu/vmmgp100.c and what appears to finally do the low-level dma
> >> mapping and talking to the hardware. Are these two sets of constants
> >> supposed to be the same? Are the actual hardware values or just a
> >> driver internal interface?
> >
> > It looks a bit odd to me too.
> > I don't really know the structure/history of nouveau.
> > Perhaps Ben Skeggs can shed more light on your question.
>
> Ben, do you have any insights on these constants?
Those sets of constants are (currently) supposed to be the same value.
They don't necessarily map to the HW directly at this stage, and
something different will likely be done in the future as HW changes.
Ben.
On Fri, 2019-08-16 at 21:29 +0000, Patchwork wrote:
> == Series Details ==
>
> Series: drm/connector: Allow max possible encoders to attach to a
> connector (rev2)
> URL : https://patchwork.freedesktop.org/series/62743/
> State : warning
>
> == Summary ==
>
> $ dim sparse origin/drm-tip
> Sparse version: v0.6.0
> Commit: drm/connector: Allow max possible encoders to attach to a
> connector
> + ^
> + }
> +drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4802:1:
> warning: control reaches end of non-void function [-Wreturn-type]
> +drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In
> function ‘amdgpu_dm_connector_to_encoder’:
Missed a "return NULL;" that will not be reached.
Will fix that in the next version after I get some comments.
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx(a)lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
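
For readers unfamiliar with the warning quoted in this report, here is a
generic illustration of -Wreturn-type and the usual fix. It is not the
amdgpu_dm code: fake_encoder, fake_encoders and first_encoder are
hypothetical names, and unlike the case described above, the trailing return
here is reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoder table; not the DRM encoder type. */
struct fake_encoder {
	int id;
};

static struct fake_encoder fake_encoders[2] = { { 1 }, { 2 } };

/*
 * Returns the first encoder whose bit is set in mask. Every path through
 * a non-void function needs a return statement; dropping the final
 * "return NULL;" makes GCC report "control reaches end of non-void
 * function".
 */
static struct fake_encoder *first_encoder(unsigned int mask)
{
	for (size_t i = 0; i < 2; i++) {
		if (mask & (1u << i))
			return &fake_encoders[i];
	}

	return NULL;
}
```

A mask of 0 falls through the loop and returns NULL. Any mask with bit 0 or
bit 1 set returns the matching table entry.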