https://bugzilla.kernel.org/show_bug.cgi?id=201847
Bug ID: 201847
Summary: nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000a721000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 4 [00ff85c000 X[3819]]
Product: Drivers
Version: 2.5
Kernel Version: 4.19.6
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: marc@osknowledge.org
Regression: No
Dec 2 21:05:25 local kernel: [ 0.955901] nouveau 0000:01:00.0: NVIDIA GM107 (117300a2)
Dec 2 21:05:25 local kernel: [ 0.992024] nouveau 0000:01:00.0: bios: version 82.07.9d.00.14
Dec 2 21:05:25 local kernel: [ 0.993477] nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
Dec 2 21:05:25 local kernel: [ 0.993527] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 001228 [ IBUS ]
Dec 2 21:05:25 local kernel: [ 1.008241] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 10ac08 [ IBUS ]
Dec 2 21:05:25 local kernel: [ 1.061536] nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
Dec 2 21:05:25 local kernel: [ 1.061539] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
Dec 2 21:05:25 local kernel: [ 1.061543] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Dec 2 21:05:25 local kernel: [ 1.061546] nouveau 0000:01:00.0: DRM: DCB version 4.0
Dec 2 21:05:25 local kernel: [ 1.061549] nouveau 0000:01:00.0: DRM: DCB outp 00: 04800fb6 04420010
Dec 2 21:05:25 local kernel: [ 1.061552] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011fa6 04420010
Dec 2 21:05:25 local kernel: [ 1.061555] nouveau 0000:01:00.0: DRM: DCB outp 02: 02011f62 00020010
Dec 2 21:05:25 local kernel: [ 1.061558] nouveau 0000:01:00.0: DRM: DCB outp 03: 08022fc6 04420010
Dec 2 21:05:25 local kernel: [ 1.061561] nouveau 0000:01:00.0: DRM: DCB outp 04: 08022f82 00020010
Dec 2 21:05:25 local kernel: [ 1.061564] nouveau 0000:01:00.0: DRM: DCB outp 05: 01033fd6 04420020
Dec 2 21:05:25 local kernel: [ 1.061567] nouveau 0000:01:00.0: DRM: DCB outp 06: 01033f92 00020020
Dec 2 21:05:25 local kernel: [ 1.061570] nouveau 0000:01:00.0: DRM: DCB conn 00: 00002047
Dec 2 21:05:25 local kernel: [ 1.061573] nouveau 0000:01:00.0: DRM: DCB conn 01: 00001146
Dec 2 21:05:25 local kernel: [ 1.061575] nouveau 0000:01:00.0: DRM: DCB conn 02: 00010246
Dec 2 21:05:25 local kernel: [ 1.061578] nouveau 0000:01:00.0: DRM: DCB conn 03: 00020346
Dec 2 21:05:25 local kernel: [ 1.433020] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
Dec 2 21:05:25 local kernel: [ 1.535562] nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb: 0x80000, bo 0000000071889fdf
Dec 2 21:05:25 local kernel: [ 1.853891] nouveau 0000:01:00.0: disp: 0x00006671[0]: INIT_GENERIC_CONDITON: unknown 0x07
Dec 2 21:05:25 local kernel: [ 2.034030] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Dec 2 21:05:25 local kernel: [ 2.034061] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
Dec 2 22:35:07 local kernel: [ 5422.645466] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.645475] nouveau 0000:01:00.0: gr: GPC0/TPC3/MP trap: global 00000000 [] warp 3c000d [OOR_REG]
Dec 2 22:35:07 local kernel: [ 5422.646304] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646316] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 0, y = 0, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646334] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646346] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 384, y = 74, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646362] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646373] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 352, y = 152, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646388] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646399] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 448, y = 268, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646418] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000a721000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646425] nouveau 0000:01:00.0: fifo: channel 4: killed
Dec 2 22:35:07 local kernel: [ 5422.646427] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Dec 2 22:35:07 local kernel: [ 5422.646432] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Dec 2 22:35:07 local kernel: [ 5422.646437] nouveau 0000:01:00.0: X[3819]: channel 4 killed!
Dec 2 22:35:31 local kernel: [ 5446.744051] sysrq: SysRq : Keyboard mode set to system default
Dec 2 22:35:32 local kernel: [ 5447.080135] sysrq: SysRq : Terminate All Tasks
--- Comment #1 from Marc B. (kernel.org@marc.ngoe.de) ---
It would be soooo cool if anyone would actually read this bug report and maybe try to fix it. I will assist in testing patches until this is resolved.
And: I am willing to offer $100 for fixing this annoying bug! Keeps freezing my 4.19.39 kernel out of nowhere.
Some things I would like to bring up for discussion:
a) it might have something to do with memory pressure
_and_
b) either
   - high CPU load, _or_
   - a high number of context switches.
For the latter I'm not sure. The bug actually always occurs when I, e.g., compile two kernels at -j24 and have some other work running alongside, say a YT video. The bug is, however, definitely triggered by a graphics event, e.g. resizing/creating a window, scrolling a Web page or watching a video.
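To check whether memory pressure or the context-switch rate really correlates with the lockup, one could log both next to the kernel log and compare timestamps afterwards. The following is only a minimal sketch, not something used in this report: it assumes a Linux /proc with the `ctxt` line in /proc/stat and the `MemAvailable` field in /proc/meminfo, and the function names are made up for illustration.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line found")


def parse_mem_available_kib(meminfo_text):
    """Return MemAvailable (in kB) from /proc/meminfo content."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("no MemAvailable line found")


def sample():
    """Read both counters once from /proc (Linux only)."""
    with open("/proc/stat") as f:
        ctxt = parse_ctxt(f.read())
    with open("/proc/meminfo") as f:
        mem = parse_mem_available_kib(f.read())
    return ctxt, mem


def monitor(interval=1.0, samples=60):
    """Print context switches per second and MemAvailable, one line per interval."""
    prev, _ = sample()
    for _ in range(samples):
        time.sleep(interval)
        ctxt, mem = sample()
        rate = (ctxt - prev) / interval
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        # flush so the last lines survive if the machine freezes
        print(f"[{stamp}] ctxsw/s={rate:.0f} MemAvailable={mem} kB", flush=True)
        prev = ctxt
```

Running `monitor()` with its output redirected to a file on disk during the kernel builds would give a timeline to hold against the `fifo:` messages; whether it shows anything useful is an open question.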
[2019-05-04 15:43:24] err kern 03 kernel : [ 523.906459] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906467] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906473] nouveau 0000:01:00.0: fifo: channel 2: killed
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906479] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[2019-05-04 15:43:24] warning kern 04 kernel : [ 523.906789] nouveau 0000:01:00.0: X[8006]: channel 2 killed!
[2019-05-04 15:43:24] err kern 03 kernel : nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: channel 2: killed
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[2019-05-04 15:43:24] warning kern 04 kernel : nouveau 0000:01:00.0: X[8006]: channel 2 killed!
[2019-05-04 15:44:24] info kern 06 kernel : [ 584.121331] sysrq: SysRq : Keyboard mode set to system default
[2019-05-04 15:44:24] info kern 06 kernel : sysrq: SysRq : Keyboard mode set to system default
Harry Coin (hcoin@quietfountain.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |hcoin@quietfountain.com
--- Comment #2 from Harry Coin (hcoin@quietfountain.com) ---
Here's another freeze report, from:
$ uname -a
Linux ceo1homenx 5.2.0-8-generic #9-Ubuntu SMP Mon Jul 8 13:07:27 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
syslog just before the lock:
Jul 21 09:45:20 ceo1homenx kernel: [89849.919490] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Jul 21 09:45:20 ceo1homenx kernel: [89849.919500] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Jul 21 09:45:20 ceo1homenx kernel: [89849.919506] nouveau 0000:01:00.0: fifo: channel 8: killed
Jul 21 09:45:20 ceo1homenx kernel: [89849.919511] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Jul 21 09:45:20 ceo1homenx kernel: [89849.919815] nouveau 0000:01:00.0: Xorg[1546]: channel 8 killed!
-- hard lock --
interface@p3k.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |interface@p3k.org
--- Comment #3 from interface@p3k.org ---
i think i got this issue, too:
→ uname -a
Linux sticke 4.19.66-1-MANJARO #1 SMP PREEMPT Fri Aug 9 18:01:53 UTC 2019 x86_64 GNU/Linux
→ journalctl -b-1
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000240000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 2 [003fbec000 Xorg[634]]
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: channel 2: killed
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: Xorg[634]: channel 2 killed!
btw. i am working with exactly the same os (usb stick) on a different hardware where this problem does not occur. pls let me know if i should post more details (and which ones).
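To compare the `fifo: fault` lines from the different reports in this bug (same client and reason, different addresses and channels), a small parser can pull the fields out of a saved log. This is only a sketch: the regular expression is derived from the two fault lines quoted in this thread, not from the nouveau source, so other kernel versions may print a different layout, and the field names are chosen here for illustration.

```python
import re

# Pattern derived from the fault lines quoted in this bug report only.
FAULT_RE = re.compile(
    r"fifo: fault (?P<type>[0-9a-f]+) \[(?P<access>[^\]]*)\] "
    r"at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine>[0-9a-f]+) \[(?P<engine_name>[^\]]*)\] "
    r"client (?P<client>[0-9a-f]+) \[(?P<client_name>[^\]]*)\] "
    r"reason (?P<reason>[0-9a-f]+) \[(?P<reason_name>[^\]]*)\] "
    r"on channel (?P<channel>\d+) "
    r"\[(?P<inst>[0-9a-f]+) (?P<process>[^\[]+)\[(?P<pid>\d+)\]\]"
)


def parse_fault(line):
    """Return the fields of a nouveau fifo fault line as a dict, or None."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["addr"] = int(fields["addr"], 16)  # fault address as an integer
    fields["channel"] = int(fields["channel"])
    fields["pid"] = int(fields["pid"])
    return fields


def faults_in(lines):
    """Yield parsed faults from an iterable of log lines, skipping other lines."""
    for line in lines:
        fault = parse_fault(line)
        if fault is not None:
            yield fault
```

For example, `list(faults_in(open("journal.txt")))` on a saved `journalctl -b-1` output (the file name is hypothetical) would return one dict per fault line, which makes it easy to see that both reports here share client `0f [GPC0/PROP_0]` and reason `82`.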