https://bugs.freedesktop.org/show_bug.cgi?id=43835
Bug #: 43835
Summary: System crashes when radeon firmware blob (R520_cp.bin) is installed
Classification: Unclassified
Product: DRI
Version: XOrg CVS
Platform: Other
URL: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=651532
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Radeon
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: noelamac@gmail.com
Created attachment 54426 --> https://bugs.freedesktop.org/attachment.cgi?id=54426 Logs ("dmesg" and "xorg.0.log" for kernels 3.2-rc4 and 3.1 with and without the firmware installed)
1. Steps to reproduce the problem
On Debian Wheezy, after installing the package "firmware-linux-nonfree", which contains the firmware needed to enable 3D acceleration for the ATI card (M56P Radeon Mobility X1600), the system crashes some time after it starts and the user logs in. No specific action triggers the crash; it is only a matter of time.
2. Symptoms
The user gets a "kernel oops" (kernel 3.1) or a hang with a trace (kernel 3.2-rc4), and the system locks up.
3. Tested kernels
The user has tested kernel 3.1 (Wheezy's stock kernel) and 3.2-rc4 (from Debian's experimental branch). Both kernels show the same behavior when the firmware is installed. On the other hand, both kernels work fine as soon as the firmware package is uninstalled.
4. Additional information
The crash has been tracked in Debian BTS #651532 (full link available in the URL field).
5. Attached logs (4 files):
- "dmesg" and "Xorg.0.log" for kernels 3.2-rc4 and 3.1 when firmware is installed.
- "dmesg" and "Xorg.0.log" for kernels 3.2-rc4 and 3.1 when firmware is not installed.
6. Other considerations
Please note that I am opening this bug on behalf of another person who is experiencing the crash. For this reason I'm CC'ing him.
--- Comment #1 from Alex Deucher agd5f@yahoo.com 2011-12-14 09:19:34 PST --- In the future, please attach the dmesg and log files directly. It looks like it's a problem with acceleration (which is available without the firmware). I don't see any oops or backtraces in the logs. Can you attach the oops or get a picture of it?
Does setting: Option "NoAccel" "True" in the device section of your xorg.conf fix the problem?
--- Comment #2 from Alex Deucher agd5f@yahoo.com 2011-12-14 09:22:37 PST --- which is NOT available without the firmware
--- Comment #3 from Camaleón noelamac@gmail.com 2011-12-14 09:56:05 PST --- (In reply to comment #1)
> In the future, please attach the dmesg and log files directly.
Will do, sorry.
> It looks like it's a problem with acceleration (which is available without the firmware). I don't see any oops or backtraces in the logs. Can you attach the oops or get a picture of it?
The kernel oops and backtrace are available in the Debian bug. Direct links:
- Kernel 3.1
(syslog) http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=20111209_syslog_...
(snapshot) http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=20111209_snapsho...
- Kernel 3.2-rc4
(syslog) http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=21;filename=20111212_syslog...
(snapshot) http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=21;filename=20111212_snapsh...
I can attach the files to this bug report if you find it convenient.
> Does setting: Option "NoAccel" "True" in the device section of your xorg.conf fix the problem?
I have asked the user to try with this option while having the firmware package installed, will report back as soon as I get the results.
--- Comment #4 from Camaleón noelamac@gmail.com 2011-12-15 00:49:15 PST --- (In reply to comment #3)
> I have asked the user to try with this option while having the firmware package installed, will report back as soon as I get the results.
The user reported that both kernels do work (no crashes) with "firmware-linux-nonfree" installed and using this "/etc/X11/xorg.conf" file:
***
Section "Device"
    Identifier "ATI"
    Driver "radeon"
    Option "NoAccel" "True"
EndSection

Section "Screen"
    Identifier "Default Screen"
    DefaultDepth 24
EndSection
***
This effectively disables 3D acceleration (which means no "gnome-shell") but the user hasn't experienced any further crash since yesterday.
--- Comment #5 from Alex Deucher agd5f@yahoo.com 2011-12-15 07:29:00 PST --- What version of the 3D driver is he using? You might try a newer 3D driver package. Make sure he is using the r300 gallium driver (r300g).
--- Comment #6 from Camaleón noelamac@gmail.com 2011-12-15 07:47:32 PST --- (In reply to comment #5)
> What version of the 3D driver is he using?
How could we check this?
> You might try a newer 3D driver package. Make sure he is using the r300 gallium driver (r300g).
As he's on Debian Wheezy he has installed "libgl1-mesa-dri (7.11.1-1)" but not sure if this tells you something.
--- Comment #7 from Alex Deucher agd5f@yahoo.com 2011-12-15 08:43:49 PST --- (In reply to comment #6)
> (In reply to comment #5)
> > What version of the 3D driver is he using?
> How could we check this?
Please attach the output of glxinfo.
> > You might try a newer 3D driver package. Make sure he is using the r300 gallium driver (r300g).
> As he's on Debian Wheezy he has installed "libgl1-mesa-dri (7.11.1-1)" but not sure if this tells you something.
Just need to find out if they are using the classic or gallium driver.
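For readers following along, the classic-versus-gallium check can be sketched from the "OpenGL renderer string" line of glxinfo. This is only a rough heuristic, assuming that Gallium-based Mesa drivers put the word "Gallium" in that string and that software renderers identify themselves as "Software Rasterizer", "llvmpipe" or "softpipe"; the function name and the sample strings in the usage note are illustrative and are not taken from the attachments in this bug.

```python
def classify_renderer(glxinfo_output):
    """Guess the Mesa driver flavour from glxinfo text (heuristic only)."""
    # Assumed names of software renderers; checked first because a software
    # renderer may also mention "Gallium" in its renderer string.
    software_names = ("software rasterizer", "llvmpipe", "softpipe")
    for line in glxinfo_output.splitlines():
        line = line.strip()
        if line.startswith("OpenGL renderer string:"):
            renderer = line.split(":", 1)[1].strip().lower()
            if any(name in renderer for name in software_names):
                return "software"
            if "gallium" in renderer:
                return "gallium"
            # Anything else is assumed to be a classic (non-Gallium) driver.
            return "classic"
    return "unknown"
```

Called on the text saved from glxinfo, a hypothetical line such as "OpenGL renderer string: Gallium 0.4 on ATI RV530" would be reported as "gallium", while "OpenGL renderer string: Software Rasterizer" would be reported as "software", which is the situation expected while NoAccel is set.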
--- Comment #8 from Alex Deucher agd5f@yahoo.com 2011-12-15 08:44:55 PST --- Does the system hang if you remove the NoAccel option but don't load gnome-shell?
--- Comment #9 from Camaleón noelamac@gmail.com 2011-12-15 13:25:23 PST --- Created attachment 54475 --> https://bugs.freedesktop.org/attachment.cgi?id=54475 glxinfo
I'm attaching the full output of "glxinfo".
--- Comment #10 from Camaleón noelamac@gmail.com 2011-12-15 13:28:13 PST --- (In reply to comment #8)
> Does the system hang if you remove the NoAccel option but don't load gnome-shell?
Yes, the user has reported that after removing that option from the "xorg.conf" file and logging into gnome fallback mode (now "gnome classical"), the system hung.
Alex Deucher agd5f@yahoo.com changed:
What                        |Removed                  |Added
----------------------------------------------------------------------------
Attachment #54475 mime type |application/octet-stream |text/plain
Attachment #54475 is patch  |0                        |1
--- Comment #11 from Alex Deucher agd5f@yahoo.com 2011-12-15 13:40:27 PST --- (In reply to comment #9)
> Created attachment 54475
> glxinfo
> I'm attaching the full output of "glxinfo".
Unfortunately, you'll end up with the software 3D driver if you have acceleration disabled. You'll have to find out what debian uses on wheezy (r300c vs. r300g). However, if you still get hangs even without using 3D, there seems to be a problem with acceleration in general on his system.
--- Comment #12 from Camaleón noelamac@gmail.com 2011-12-16 00:01:15 PST --- Created attachment 54487 --> https://bugs.freedesktop.org/attachment.cgi?id=54487 glxinfo (with 3D enable)
--- Comment #13 from Camaleón noelamac@gmail.com 2011-12-16 00:17:14 PST --- (In reply to comment #11)
> Unfortunately, you'll end up with the software 3D driver if you have acceleration disabled.
I have added the full output with 3D accel enabled, hope this helps.
> You'll have to find out what debian uses on wheezy (r300c vs. r300g).
I can't tell... maybe Jonathan can shed some light here :-)
> However, if you still get hangs even without using 3D, there seems to be a problem with acceleration in general on his system.
The curious thing is that the system runs stable (no hangs or crashes) in gnome fallback mode as soon as the "firmware-linux-nonfree" package is removed, as stated in #c1.
So what we have now is that the system does not crash if:
1/ "firmware-linux-nonfree" package is not installed, or
2/ Option "NoAccel" "True" is set at xorg.conf
--- Comment #14 from Jonathan Nieder jrnieder@gmail.com 2011-12-16 01:05:48 PST --- Mesa in wheezy ships the gallium r300 driver on all Linux architectures.
--- Comment #15 from Michel Dänzer michel@daenzer.net 2011-12-16 01:51:06 PST --- (In reply to comment #13)
> 1/ "firmware-linux-nonfree" package is not installed, or
> 2/ Option "NoAccel" "True" is set at xorg.conf
These are mostly equivalent, as acceleration is not possible without the microcode with KMS.
What might be interesting would be to try GNOME fallback mode with KMS disabled (radeon.modeset=0) with and without firmware-linux-nonfree installed. Please attach dmesg and Xorg.0.log for both cases again. (The r300g driver doesn't work with KMS disabled)
BTW, does the GNOME fallback mode end up using the same window manager (Metacity?) with and without acceleration being enabled?
--- Comment #16 from Camaleón noelamac@gmail.com 2011-12-16 13:23:03 PST --- (In reply to comment #15)
> (In reply to comment #13)
> > 1/ "firmware-linux-nonfree" package is not installed, or
> > 2/ Option "NoAccel" "True" is set at xorg.conf
> These are mostly equivalent, as acceleration is not possible without the microcode with KMS.
> What might be interesting would be to try GNOME fallback mode with KMS disabled (radeon.modeset=0) with and without firmware-linux-nonfree installed. Please attach dmesg and Xorg.0.log for both cases again. (The r300g driver doesn't work with KMS disabled)
The user reports that he has tried to disable KMS by all these means:
- Appending "nomodeset" to the kernel line
- Appending "radeon.modeset=0" to the kernel line
- Appending "modeset=0" to the kernel line
- Blacklisting the radeon module to prevent it from loading
But all he gets is a system hang with the following message:
*** Could not update ICEauthority file /var/lib/gdm3/.ICEauthority Closing session ***
And that's all. He's now stuck there, forced to boot his Mac OS X partition in order to have a working system.
> BTW, does the GNOME fallback mode end up using the same window manager (Metacity?) with and without acceleration being enabled?
I don't know... Anyhow, now I have to help the user restore his Debian system to a working state.
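As a side note for anyone repeating the KMS test, a small sketch like the following can confirm which of the options above actually reached the running kernel. It assumes the command line is read from /proc/cmdline on the test machine and only recognises the two spellings "nomodeset" and "radeon.modeset=0" mentioned in this thread; the function name and the sample command line are hypothetical.

```python
# Parameters from this thread that ask for kernel modesetting to be disabled.
KMS_OFF_PARAMS = {"nomodeset", "radeon.modeset=0"}


def kms_disable_params(cmdline):
    """Return the KMS-disabling parameters found in a kernel command line."""
    return [param for param in cmdline.split() if param in KMS_OFF_PARAMS]


if __name__ == "__main__":
    # Hypothetical command line; on a real system, read /proc/cmdline instead.
    sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda3 ro quiet radeon.modeset=0"
    print(kms_disable_params(sample))
```

A bare "modeset=0" is deliberately not matched, on the assumption that a module option given on the kernel command line has to carry the module name as a prefix.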
--- Comment #17 from Lucas Stach dev@lynxeye.de 2011-12-16 14:48:58 PST ---
> But all he gets is a system hang with the following message:
> Could not update ICEauthority file /var/lib/gdm3/.ICEauthority Closing session
As far as I can tell this has nothing to do with graphics drivers. I hit this too with my home partition improperly tagged for some changes in SELinux. It seems to me this is more a problem of wrong file permissions.
--- Comment #18 from Camaleón noelamac@gmail.com 2011-12-17 01:44:01 PST --- (In reply to comment #17)
> As far as I can tell this has nothing to do with graphics drivers. I hit this too with my home partition improperly tagged for some changes in SELinux. It seems to me this is more a problem of wrong file permissions.
Yes, I know, but this problem appeared as a side effect of trying to disable KMS. Anyway, the user can access his system normally again (the ".ICEauthority" file was manually removed and automatically re-created).
We're ready to run more tests, but please remember that we're not developers but plain users; we need some guidance on the steps to take.
--- Comment #19 from Camaleón noelamac@gmail.com 2012-01-22 13:46:46 PST --- The user still reports crashes with kernel 3.2.0-rc7-686-pae. I'm attaching the involved files ("syslog" contains the kernel oops).
I can ask the user to run whatever tests you consider useful; he is very interested in solving this because he can't use GNOME3 and gnome-shell at all and has to work with no 3D acceleration.
--- Comment #20 from Camaleón noelamac@gmail.com 2012-01-22 13:47:45 PST --- Created attachment 55998 --> https://bugs.freedesktop.org/attachment.cgi?id=55998 glxinfo
--- Comment #21 from Camaleón noelamac@gmail.com 2012-01-22 13:48:12 PST --- Created attachment 55999 --> https://bugs.freedesktop.org/attachment.cgi?id=55999 dmesg
--- Comment #22 from Camaleón noelamac@gmail.com 2012-01-22 13:49:03 PST --- Created attachment 56000 --> https://bugs.freedesktop.org/attachment.cgi?id=56000 syslog
--- Comment #23 from Camaleón noelamac@gmail.com 2012-01-22 13:49:36 PST --- Created attachment 56001 --> https://bugs.freedesktop.org/attachment.cgi?id=56001 xorg.0.log
--- Comment #24 from Jonathan Nieder jrnieder@gmail.com 2012-01-22 14:24:16 PST --- bugzilla-daemon@freedesktop.org wrote:
> The user still reports crashes with kernel 3.2.0-rc7-686-pae. I'm attaching the involved files ("syslog" contains the kernel oops).
Summary of log: tests were performed on 22 January.
| 20:37:04 linux dbus[1320]: [system] Activating service name='org.freedesktop.Accounts' (using servicehelper)
| 20:37:04 linux kernel: [ 724.400329] dbus-daemon-lau: Corrupted page table at address b817800c
| 20:37:04 linux kernel: [ 724.400434] *pdpt = 0000000033666001 *pde = fb274e7ffb274e81
| 20:37:04 linux kernel: [ 724.400530] Bad pagetable: 0009 [#1] SMP
[... snipping list of modules linked in, because of line wrapping ...]
| 20:37:04 linux kernel: [ 724.401727] Pid: 1815, comm: dbus-daemon-lau Not tainted 3.2.0-rc7-686-pae #1 Apple Computer, Inc. iMac5,1/Mac-F4228EC8
The boot was at 20:36. Maybe those 10 minutes came from NTP or something. Next boot:
| 20:41:33 linux dbus[1353]: [system] Activating service name='org.freedesktop.Accounts' (using servicehelper)
| 20:41:33 linux dbus[1353]: [system] Successfully activated service 'org.freedesktop.Accounts'
| 20:41:33 linux accounts-daemon[2275]: started daemon version 0.6.15
| 20:41:34 linux kernel: [ 41.669373] ssh: Corrupted page table at address 998f31c
| 20:41:34 linux kernel: [ 41.669433] *pdpt = 000000002c80d001 *pde = 000000002f980067 *pte = ff192f4fff192f4f
| 20:41:34 linux kernel: [ 41.669502] Bad pagetable: 0009 [#1] SMP
[...]
| 20:41:34 linux kernel: [ 41.670989] Pid: 2279, comm: ssh Not tainted 3.2.0-rc7-686-pae #1 Apple Computer, Inc. iMac5,1/Mac-F4228EC8
Anyway, the page table seems to get corrupted when X starts.
Could you send a log from booting and starting X with drm.debug=6 on the kernel command line?
Thanks, Jonathan
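For anyone sifting through long syslog attachments like these, the search for the interesting lines can be sketched as below. This is only an illustration: the signature strings are copied from the log excerpts quoted in this bug, while the function name and the idea of reporting (line number, signature) pairs are assumptions of the sketch and not part of any tool used here.

```python
# Failure signatures as they appear in the kernel log excerpts of this bug.
SIGNATURES = (
    "Corrupted page table",
    "Bad pagetable",
    "GPU lockup",
    "GPU reset failed",
    "BUG: unable to handle kernel paging request",
)


def find_crash_lines(log_text):
    """Return (line_number, signature) for each log line matching a signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for signature in SIGNATURES:
            if signature in line:
                # Record only the first signature that matches this line.
                hits.append((number, signature))
                break
    return hits
```

Run over the text of a syslog file, the first reported line number gives a starting point for reading the oops, which is easier than searching by hand when the crash only shows up after a while.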
--- Comment #25 from Camaleón noelamac@gmail.com 2012-01-23 13:21:32 PST --- Created attachment 56055 --> https://bugs.freedesktop.org/attachment.cgi?id=56055 Dmesg with "drm.debug=6"
--- Comment #26 from Camaleón noelamac@gmail.com 2012-01-23 13:22:08 PST --- Created attachment 56056 --> https://bugs.freedesktop.org/attachment.cgi?id=56056 Syslog with "drm.debug=6"
--- Comment #27 from Camaleón noelamac@gmail.com 2012-01-23 13:25:10 PST --- (In reply to comment #24)
> Could you send a log from booting and starting X with drm.debug=6 on the kernel command line?
I'm attaching "syslog" and "dmesg" with the above kernel option appended. "Glxinfo" and "Xorg.0.log" seem to provide no additional information.
--- Comment #28 from Jonathan Nieder jrnieder@gmail.com 2012-01-23 13:25:16 PST --- bugzilla-daemon@freedesktop.org wrote:
> Dmesg with "drm.debug=6"
Hm, no page table corruption/crash this time?
--- Comment #29 from Camaleón noelamac@gmail.com 2012-01-23 13:40:19 PST --- (In reply to comment #28)
> bugzilla-daemon@freedesktop.org wrote:
> > Dmesg with "drm.debug=6"
> Hm, no page table corruption/crash this time?
I don't see a kernel oops in the "syslog" either. I have just asked the user whether the system crashed again this time.
--- Comment #30 from Camaleón noelamac@gmail.com 2012-01-23 23:58:35 PST --- (In reply to comment #29)
> I don't see a kernel oops in the "syslog" either. I have just asked the user whether the system crashed again this time.
The user reported that the system crashed after a while.
--- Comment #31 from Jonathan Nieder jrnieder@gmail.com 2012-01-24 00:00:35 PST --- bugzilla-daemon@freedesktop.org wrote:
> The user reported that the system crashed after a while.
Interesting --- so it sounds like there is a random element to this, too. Can we have a log of the crash, please?
--- Comment #32 from Camaleón noelamac@gmail.com 2012-01-24 01:16:03 PST --- (In reply to comment #31)
> bugzilla-daemon@freedesktop.org wrote:
> > The user reported that the system crashed after a while.
> Interesting --- so it sounds like there is a random element to this, too. Can we have a log of the crash, please?
I have asked the user for it. He will have to wait until the system locks.
--- Comment #33 from Camaleón noelamac@gmail.com 2012-01-25 09:16:12 PST --- (In reply to comment #32)
> (In reply to comment #31)
> > bugzilla-daemon@freedesktop.org wrote:
> > > The user reported that the system crashed after a while.
> > Interesting --- so it sounds like there is a random element to this, too. Can we have a log of the crash, please?
> I have asked the user for it. He will have to wait until the system locks.
Sorry for the delay (the user was a bit busy fighting against his "VAT taxes").
I'm attaching the syslog for the kernel oops (starts at line "3911"). Any hint to bypass this crash would be very welcome, the user is going nuts with this issue :-(
--- Comment #34 from Camaleón noelamac@gmail.com 2012-01-25 09:17:55 PST --- Created attachment 56153 --> https://bugs.freedesktop.org/attachment.cgi?id=56153 Syslog with "drm.debug=6" + call trace
--- Comment #35 from Jonathan Nieder jrnieder@gmail.com 2012-01-25 09:37:29 PST --- bugzilla-daemon@freedesktop.org wrote:
> Syslog with "drm.debug=6" + call trace
Summary follows. Log is from 25 January.
15:55 bootup
15:55 [after 22 seconds] drm driver loads
15:55 [after 27 seconds] consolekit loads
15:55 [after 33 or so seconds] modesetting again
15:55 [after 39 seconds] gdm startup (this is where it crashed previously)
16:34 [2344 seconds]:
| radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
| ------------[ cut here ]------------
| WARNING: at [...]/drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x22e/0x298 [radeon]()
| Hardware name: iMac5,1
| GPU lockup (waiting for 0x0003A3C8 last fence id 0x0003A3C7)
| Modules linked in: hid_magicmouse hidp michael_mic arc4 pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) acpi_cpufreq mperf cpufreq_stats cpufreq_conservative cpufreq_powersave cpufreq_userspace parport_pc ppdev lp parport bnep rfcomm binfmt_misc promethean(O) fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc uvcvideo videodev media ssb mmc_core pcmcia pcmcia_core ndiswrapper(O) loop firewire_sbp2 ir_lirc_codec rc_avermedia_m135a lirc_dev mxl5005s cryptd aes_i586 aes_generic ir_mce_kbd_decoder af9013 ecb btusb ir_sony_decoder bluetooth rfkill ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder isight_firmware dvb_usb_af9015 dvb_usb dvb_core ir_nec_decoder rc_core uas hid_apple usb_storage snd_hda_codec_idt lib80211_crypt_tkip usbhid hid wl(P) snd_hda_intel snd_hda_codec radeon snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device ttm drm_kms_helper drm snd i2c_algo_bit i2c_i801 soundcore applesmc i2c_core snd_page_alloc power_supply iTCO_wdt iTCO_vendor_support
| processor input_polldev evdev pcspkr lib80211 thermal_sys apple_bl button ext4 mbcache jbd2 crc16 sr_mod cdrom sd_mod crc_t10dif ata_generic firewire_ohci uhci_hcd firewire_core crc_itu_t ata_piix libata ehci_hcd sky2 usbcore scsi_mod [last unloaded: scsi_wait_scan]
| Pid: 1515, comm: Xorg Tainted: P O 3.1.0-1-686-pae #1
| Call Trace:
| [<c1037698>] ? warn_slowpath_common+0x68/0x79
| [<f88b3ffe>] ? radeon_fence_wait+0x22e/0x298 [radeon]
| [<c1037711>] ? warn_slowpath_fmt+0x29/0x2d
| [<f88b3ffe>] ? radeon_fence_wait+0x22e/0x298 [radeon]
| [<c104cf51>] ? add_wait_queue+0x30/0x30
| [<f872521f>] ? ttm_bo_wait+0xa6/0x153 [ttm]
| [<f88c32b6>] ? radeon_bo_wait+0x59/0x77 [radeon]
| [<f88c3719>] ? radeon_gem_wait_idle_ioctl+0x2a/0x50 [radeon]
| [<f87aedc9>] ? drm_ioctl+0x224/0x2dd [drm]
| [<f88c36ef>] ? radeon_gem_busy_ioctl+0x6b/0x6b [radeon]
| [<c10110cc>] ? restore_i387_fxsave+0x63/0x70
| [<f87aeba5>] ? drm_copy_field+0x47/0x47 [drm]
| [<c10d33d2>] ? do_vfs_ioctl+0x459/0x48f
| [<c1011a89>] ? restore_i387_xstate+0x16c/0x1a3
| [<c10adf54>] ? mmap_region+0x2ef/0x3b7
| [<c104334d>] ? recalc_sigpending+0x1f/0x2f
| [<c10d344c>] ? sys_ioctl+0x44/0x68
| [<c12b2ddf>] ? sysenter_do_call+0x12/0x28
| ---[ end trace 6b5e1f4e74986b70 ]---
| radeon: wait for empty RBBM fifo failed ! Bad things might happen.
| Failed to wait GUI idle while programming pipes. Bad things might happen.
| radeon 0000:01:00.0: (rs600_asic_reset:348) RBBM_STATUS=0xB4116100
| radeon 0000:01:00.0: (rs600_asic_reset:367) RBBM_STATUS=0x94010140
| radeon 0000:01:00.0: (rs600_asic_reset:375) RBBM_STATUS=0x94000140
| radeon 0000:01:00.0: (rs600_asic_reset:383) RBBM_STATUS=0x94000140
| radeon 0000:01:00.0: restoring config space at offset 0x1 (was 0x100403, writing 0x100407)
| radeon 0000:01:00.0: failed to reset GPU
| radeon 0000:01:00.0: GPU reset failed
| BUG: unable to handle kernel paging request at f8982990
This is Debian kernel 3.1.8-2 (close to upstream v3.1.8). I don't see any page table corruption this time.
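The "GPU lockup (waiting for ... last fence id ...)" line carries two hexadecimal sequence numbers, and the distance between them can be computed with a short sketch like the one below. The interpretation is an assumption on my side: the first number is taken to be the fence the caller was waiting on and the second the last fence the GPU reported as completed, and counter wraparound is ignored. The function name is hypothetical.

```python
import re

# Matches the message format seen in the radeon_fence_wait warnings in this bug.
FENCE_RE = re.compile(
    r"waiting for 0x([0-9A-Fa-f]+) last fence id 0x([0-9A-Fa-f]+)"
)


def fence_gap(message):
    """Return (waited fence - last fence id), or None if the line does not match."""
    match = FENCE_RE.search(message)
    if match is None:
        return None
    waiting, last = (int(value, 16) for value in match.groups())
    return waiting - last


print(fence_gap("GPU lockup (waiting for 0x0003A3C8 last fence id 0x0003A3C7)"))  # 1
```

For the lockup above the gap is 1, which under that reading would mean the GPU stalled on the very next fence after the last completed one.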
--- Comment #36 from Alex Deucher agd5f@yahoo.com 2012-01-25 09:54:47 PST --- It's a GPU lockup. Unfortunately, they tend to be really hard to track down. You might try a newer ddx or mesa.
--- Comment #37 from Camaleón noelamac@gmail.com 2012-01-25 10:10:06 PST --- (In reply to comment #36)
> It's a GPU lockup. Unfortunately, they tend to be really hard to track down. You might try a newer ddx or mesa.
We're open to testing anything... but I don't really know what kind of test to suggest to the user; I'm stuck at this point. All we know for sure is that simply removing the firmware blob makes the system run stable, but the user needs 3D acceleration, otherwise gnome-shell can't run.
How could we test a new ddx (sorry to ask but, what's that? :-?) or an updated mesa without breaking many things?
--- Comment #38 from Jonathan Nieder jrnieder@gmail.com 2012-01-25 10:17:22 PST --- bugzilla-daemon@freedesktop.org wrote:
> How could we test a new ddx (sorry to ask but, what's that? :-?) or an updated mesa without breaking many things?
http://pkg-xorg.alioth.debian.org/reference/squeeze-backports.html
--- Comment #39 from Camaleón noelamac@gmail.com 2012-01-25 10:35:48 PST --- (In reply to comment #38)
> bugzilla-daemon@freedesktop.org wrote:
> > How could we test a new ddx (sorry to ask but, what's that? :-?) or an updated mesa without breaking many things?
> http://pkg-xorg.alioth.debian.org/reference/squeeze-backports.html
Thanks! But... mmmm... the user runs "wheezy", which I guess includes an updated version of the packages (btw, what are the packages we would need to update?), so I don't see the point of getting them from backports.
--- Comment #40 from Jonathan Nieder jrnieder@gmail.com 2012-01-25 10:41:38 PST --- bugzilla-daemon@freedesktop.org wrote:
> Thanks! But... mmmm... the user runs "wheezy", which I guess includes an updated version of the packages (btw, what are the packages we would need to update?), so I don't see the point of getting them from backports.
Whoops, sorry, I should have remembered.
Mesa is libgl1-mesa-dri and libgl1-mesa-glx. The DDX driver is[1] xserver-xorg-video-radeon and libdrm-radeon1, I suppose. One can get fairly recent versions of most packages from sid or experimental.
[1] http://www.x.org/wiki/Development/Documentation/Glossary
--- Comment #41 from Camaleón noelamac@gmail.com 2012-01-25 10:56:22 PST --- (In reply to comment #40)
> Mesa is libgl1-mesa-dri and libgl1-mesa-glx. The DDX driver is[1] xserver-xorg-video-radeon and libdrm-radeon1, I suppose. One can get fairly recent versions of most packages from sid or experimental.
> [1] http://www.x.org/wiki/Development/Documentation/Glossary
Okay, thanks... wheezy and sid share the same versions of the mentioned packages:
libgl1-mesa-glx (7.11.2-1) libgl1-mesa-glx (7.11.2-1 and others)
xserver-xorg-video-radeon (1:6.14.3-2) xserver-xorg-video-radeon (1:6.14.3-2 and others)
And I can't tell the user to update from experimental, it's too dangerous. Anyway, I understand bugzilla is not the best place to discuss these support matters (though I appreciate your efforts, Jonathan and Xorg people) :-)
I finally have to give up. If anyone thinks of anything we can try, you can contact me directly or add the information here. I leave this bug's status to your (@xorg devels) consideration.
--- Comment #42 from Jonathan Nieder jrnieder@gmail.com 2012-01-25 10:57:27 PST --- Jonathan Nieder wrote:
> One can get fairly recent versions of most packages from sid or experimental.
Though at the moment they all match wheezy:
mesa 7.11.2
libdrm 2.4.30
xf86-video-ati 6.14.3
It should be possible to provide instructions to test a snapshot. Which component in particular has interesting recent changes?
--- Comment #43 from Jonathan Nieder jrnieder@gmail.com 2012-01-25 12:39:49 PST --- bugzilla-daemon@freedesktop.org wrote:
> I finally have to give up. If anyone thinks of anything we can try, you can contact me directly or add the information here.
Ok, just to fill in the blanks: was this a regression? Was there a kernel or X or GNOME upgrade before which the system worked fine and after which it broke?
X devs: it looks like there are two sets of symptoms here --- sometimes there are GPU lockups, and other times (e.g., the syslog from 2012-01-22) page table corruption with no obvious trouble before that. Questions:
- is there any simple way to figure out what exactly is causing the regression? E.g., after starting X without acceleration can we explicitly provoke whatever caused trouble?
- is it normal that after a lockup the GPU fails to reset? Even if the GPU lockup itself is not well understood, is that later failure fixable?
--- Comment #44 from Camaleón noelamac@gmail.com 2012-01-25 23:41:02 PST --- (In reply to comment #42)
> Ok, just to fill in the blanks: was this a regression? Was there a kernel or X or GNOME upgrade before which the system worked fine and after which it broke?
The problem started at some point during the migration from kernel 2.6.38/2.6.39 (in late November 2011) to 3.0.x and has continued until now (3.1.x).
I don't think this is a kernel issue but rather a package update, because every kernel he has tried since then crashes in the same way. Which package exactly started to cause trouble? I can't tell.
For example, the user reported this trace on November 27th, 2011. He kept kernel 2.6.38 because since kernel 3.0 his system became completely unstable, with crashes every day. After trying newer kernels, the crashes persisted, so while performing several system reinstalls he discovered a pattern, the source of the problem: instability set in as soon as he installed the radeon firmware and enabled acceleration, regardless of the kernel version.
Kernel failure message 1:
------------[ cut here ]------------
WARNING: at /build/buildd-linux-2.6_2.6.38-5~bpo60+1-i386-B7LqDK/linux-2.6-2.6.38/debian/build/source_i386_none/drivers/gpu/drm/radeon/radeon_fence.c:248 radeon_fence_wait+0x251/0x2d7 [radeon]()
Hardware name: iMac5,1
GPU lockup (waiting for 0x0000083B last fence id 0x00000836)
Modules linked in: hid_magicmouse nls_utf8 isofs nls_cp437 udf vfat fat hidp acpi_cpufreq mperf cpufreq_conservative cpufreq_powersave cpufreq_userspace cpufreq_stats parport_pc ppdev lp parport sco bridge stp bnep rfcomm l2cap nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs binfmt_misc fuse ssb mmc_core pcmcia pcmcia_core loop firewire_sbp2 snd_hda_codec_idt snd_hda_intel btusb bluetooth rfkill radeon snd_hda_codec snd_hwdep snd_pcm ttm snd_seq snd_timer drm_kms_helper snd_seq_device drm i2c_algo_bit applesmc power_supply snd input_polldev button processor video soundcore snd_page_alloc i2c_i801 rng_core i2c_core isight_firmware ndiswrapper(O) pcspkr tpm_tis tpm tpm_bios thermal_sys hid_apple evdev usb_storage uas usbhid hid ext4 mbcache jbd2 crc16 sg sr_mod cdrom sd_mod crc_t10dif ata_generic uhci_hcd ata_piix libata ehci_hcd scsi_mod usbcore firewire_ohci firewire_core sky2 crc_itu_t nls_base [last unloaded: scsi_wait_scan]
Pid: 1340, comm: Xorg Tainted: P W O 2.6.38-bpo.2-686 #1
Call Trace:
[<c102fa51>] ? warn_slowpath_common+0x6a/0x7b
[<f8523b4d>] ? radeon_fence_wait+0x251/0x2d7 [radeon]
[<c102fac8>] ? warn_slowpath_fmt+0x28/0x2c
[<f8523b4d>] ? radeon_fence_wait+0x251/0x2d7 [radeon]
[<c1044d66>] ? autoremove_wake_function+0x0/0x29
[<f82844bf>] ? ttm_bo_wait+0xad/0x135 [ttm]
[<f853505d>] ? radeon_bo_wait+0x5e/0x7c [radeon]
[<f85350a2>] ? radeon_gem_wait_idle_ioctl+0x27/0x50 [radeon]
[<f82affa4>] ? drm_ioctl+0x224/0x2d5 [drm]
[<f853507b>] ? radeon_gem_wait_idle_ioctl+0x0/0x50 [radeon]
[<c10a3b97>] ? handle_pte_fault+0x2c5/0x80f
[<c11458f8>] ? prio_tree_insert+0x150/0x1cc
[<f82afd80>] ? drm_ioctl+0x0/0x2d5 [drm]
[<c10c8fb4>] ? do_vfs_ioctl+0x494/0x4df
[<c10a7483>] ? mmap_region+0x328/0x3fb
[<c10c9043>] ? sys_ioctl+0x44/0x64
[<c1002f1f>] ? sysenter_do_call+0x12/0x28
---[ end trace 4d111c5bd88900e9 ]---
In brief, the last known-good configuration that worked fine with 3D acceleration enabled was kernel 2.6.38/2.6.39 and GNOME 2 (metacity).
--- Comment #45 from Alex Deucher agd5f@yahoo.com 2012-01-26 05:59:43 PST --- Can you narrow down the packages and bisect? Unfortunately, I can't reproduce this on any of the 5xx cards I have access to.
--- Comment #46 from Jonathan Nieder jrnieder@gmail.com 2012-01-26 10:18:08 PST --- bugzilla-daemon@freedesktop.org wrote:
> Can you narrow down the packages and bisect? Unfortunately, I can't reproduce this on any of the 5xx cards I have access to.
If I understand correctly, there is no known-good version of the X/kernel stack, but the regression the user experienced came with the upgrade to GNOME 3.
Please forgive my ignorance: is it possible to (temporarily) configure GNOME 3 not to take advantage of acceleration, or to start X without starting a GNOME session? That might be interesting, since then it might be possible to find some other simpler application that reproduces the same trouble and helps pinpoint what is provoking trouble from that end.
--- Comment #47 from Michel Dänzer michel@daenzer.net 2012-02-02 04:04:31 PST --- How are the LVDS and DVI displays arranged in the session? Can you attach the output of xrandr?
--- Comment #48 from Camaleón noelamac@gmail.com 2012-02-02 06:23:13 PST --- Created attachment 56518 --> https://bugs.freedesktop.org/attachment.cgi?id=56518 Output of "xrandr"
In reply to comment #47, I'm attaching the output of "xrandr".
--- Comment #49 from Michel Dänzer michel@daenzer.net 2012-02-02 06:36:30 PST --- The kernel DESKTOP_HEIGHT fix from bug 43835 might help for the GPU lockups.
--- Comment #50 from Michel Dänzer michel@daenzer.net 2012-02-02 06:36:59 PST --- Argh, I mean bug 45329.
--- Comment #51 from Camaleón noelamac@gmail.com 2012-02-02 08:34:48 PST --- (In reply to comment #50)
> Argh, I mean bug 45329.
Thank you, we can try that... What would be the "easy peasy" way to go about it? Does mainline kernel 3.3-rc2 contain the mentioned patches? Are there any other packages involved? Reading comment 9 of bug #45329, it looks like "xf86-video-ati" also needs to be patched :-?
--- Comment #52 from Michel Dänzer michel@daenzer.net 2012-02-02 08:57:40 PST --- (In reply to comment #51)
> Does mainline kernel 3.3-rc2 contain the mentioned patches?
No. You can try the drm-fixes branch of git://people.freedesktop.org/~airlied/linux.git, but it should be easy to manually apply the patch to any 3.x tree.
> Are there any other packages involved?
No.
--- Comment #53 from Jonathan Nieder jrnieder@gmail.com 2012-02-09 17:55:11 PST --- (In reply to comment #51)
> Does mainline kernel 3.3-rc2 contain the mentioned patches?
3.3-rc3 does.
--- Comment #54 from Camaleón noelamac@gmail.com 2012-02-12 03:24:44 PST --- (In reply to comment #53)
> (In reply to comment #51)
> > Does mainline kernel 3.3-rc2 contain the mentioned patches?
> 3.3-rc3 does.
Thanks for the pointer.
The user still reports crashes with the latest mainline kernel (3.3-rc3). I'm attaching the logs, though I can't see any error or kernel trace in them.
--- Comment #55 from Camaleón noelamac@gmail.com 2012-02-12 03:26:18 PST --- Created attachment 56910 --> https://bugs.freedesktop.org/attachment.cgi?id=56910 dmesg with kernel 3.3-rc3
--- Comment #56 from Camaleón noelamac@gmail.com 2012-02-12 03:27:00 PST --- Created attachment 56911 --> https://bugs.freedesktop.org/attachment.cgi?id=56911 syslog with kernel 3.3-rc3
--- Comment #57 from Camaleón noelamac@gmail.com 2012-02-12 03:27:40 PST --- Created attachment 56912 --> https://bugs.freedesktop.org/attachment.cgi?id=56912 Xorg.0.log with kernel 3.3-rc3
Martin Peres martin.peres@free.fr changed:
What       |Removed |Added
----------------------------------------------------------------------------
Resolution |---     |MOVED
Status     |NEW     |RESOLVED
--- Comment #58 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/238.