Hi guys,
this assert_drm_connector_list_read_locked() thing fires here when suspending to disk with Linus' master from around a week ago and tip/master merged on top.
After I resume, the box comes up but wedges solid. I've managed to capture that splat in all its glory too, see the end of this mail.
Let me know if you need more info.
Thanks.
...
[ 6.702579] [drm] ring test on 5 succeeded in 1 usecs
[ 6.702584] [drm] UVD initialized successfully.
[ 6.703003] kvm: Nested Virtualization enabled
[ 6.703008] kvm: Nested Paging enabled
[ 6.703236] [drm] ib test on ring 0 succeeded in 0 usecs
[ 7.350567] [drm] ib test on ring 5 succeeded
[ 7.355083] [drm] Radeon Display Connectors
[ 7.355085] [drm] Connector 0:
[ 7.355087] [drm] DVI-I-1
[ 7.355088] [drm] HPD1
[ 7.355092] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[ 7.355093] [drm] Encoders:
[ 7.355094] [drm] DFP1: INTERNAL_UNIPHY
[ 7.355096] [drm] CRT2: INTERNAL_KLDSCP_DAC2
[ 7.355097] [drm] Connector 1:
[ 7.355099] [drm] DIN-1
[ 7.355100] [drm] Encoders:
[ 7.355101] [drm] TV1: INTERNAL_KLDSCP_DAC2
[ 7.355102] [drm] Connector 2:
[ 7.355104] [drm] DVI-I-2
[ 7.355105] [drm] HPD2
[ 7.355108] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[ 7.355109] [drm] Encoders:
[ 7.355110] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 7.355111] [drm] DFP2: INTERNAL_KLDSCP_LVTMA
[ 7.433440] [drm] fb mappable at 0xC0355000
[ 7.433441] [drm] vram apper at 0xC0000000
[ 7.433442] [drm] size 9216000
[ 7.433442] [drm] fb depth is 24
[ 7.433443] [drm] pitch is 7680
[ 7.726046] fbcon: radeondrmfb (fb0) is primary device
[ 7.778688] Console: switching to colour frame buffer device 240x75
[ 7.794402] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[ 7.806218] [drm] Initialized radeon 2.43.0 20080528 for 0000:01:00.0 on minor 0
[ 8.325617] Adding 15624188k swap on /dev/sda1. Priority:-1 extents:1 across:15624188k SS
[ 8.388944] EXT4-fs (sda2): re-mounted. Opts: (null)
[ 8.463474] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[ 9.067027] device-mapper: ioctl: 4.33.0-ioctl (2015-8-18) initialised: dm-devel@redhat.com
[ 9.156228] fuse init (API version 7.23)
[ 9.423650] EXT4-fs (sdb1): recovery complete
[ 9.429639] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 9.693493] EXT4-fs (sdc1): recovery complete
[ 9.713762] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
[ 9.722179] EXT4-fs (sdd1): mounting ext3 file system using the ext4 subsystem
[ 9.823570] EXT4-fs (sdd1): recovery complete
[ 9.841202] EXT4-fs (sdd1): mounted filesystem with ordered data mode. Opts: (null)
[ 10.487124] NET: Registered protocol family 10
[ 10.577013] r8169 0000:02:00.0 eth0: link down
[ 10.577139] r8169 0000:02:00.0 eth0: link down
[ 10.588052] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 12.163524] r8169 0000:02:00.0 eth0: link up
[ 12.170637] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 31.654000] hib.sh (3258): drop_caches: 3
[ 31.658377] PM: Hibernation mode set to 'shutdown'
[ 31.669415] PM: Syncing filesystems ... done.
[ 31.677368] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 31.686887] PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
[ 31.693064] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
[ 31.699188] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
[ 31.705428] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
[ 31.711544] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
[ 31.717693] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
[ 31.724553] PM: Basic memory bitmaps created
[ 31.728833] PM: Preallocating image memory... done (allocated 129259 pages)
[ 32.040891] PM: Allocated 517036 kbytes in 0.30 seconds (1723.45 MB/s)
[ 32.047431] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 32.060464] ------------[ cut here ]------------ [ 32.065167] WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() [ 32.075255] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 32.098694] CPU: 4 PID: 863 Comm: kworker/u16:9 Not tainted 4.3.0-rc1+ #1 [ 32.105543] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 32.115538] Workqueue: events_unbound async_run_entry_fn [ 32.120927] ffffffff81959a25 ffff8800ba00fb60 ffffffff812c8c2a 0000000000000000 [ 32.128494] ffff8800ba00fb98 ffffffff81053e55 ffff880429ff1000 ffff8804280caa00 [ 32.136057] 0000000000000003 0000000000000000 ffffffff81960c81 ffff8800ba00fba8 [ 32.143624] Call Trace: [ 32.146106] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 32.151295] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 32.157359] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 32.163253] [<ffffffff813c0228>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 32.170197] [<ffffffff813c066e>] drm_helper_connector_dpms+0x4e/0x110 [ 32.176830] [<ffffffffa00204bf>] radeon_suspend_kms+0x6f/0x380 [radeon] [ 32.183597] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 32.190460] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 32.196455] [<ffffffffa001e1ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 32.203232] [<ffffffff81311b6a>] pci_pm_freeze+0x6a/0x100 [ 32.208777] [<ffffffff81311b00>] ? 
pci_pm_poweroff+0x100/0x100 [ 32.214756] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 32.220562] [<ffffffff81472498>] __device_suspend+0xf8/0x2d0 [ 32.226369] [<ffffffff8147268f>] async_suspend+0x1f/0xa0 [ 32.231818] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 32.237712] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 32.243604] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 32.249670] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 32.255207] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 32.261274] [<ffffffff81076a58>] kthread+0x108/0x120 [ 32.266382] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 32.273000] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 32.278450] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 32.285037] ---[ end trace 5941400179d67357 ]--- [ 32.289771] ------------[ cut here ]------------ [ 32.294456] WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 32.304274] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 32.327736] CPU: 4 PID: 863 Comm: kworker/u16:9 Tainted: G W 4.3.0-rc1+ #1 [ 32.335810] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 32.345803] Workqueue: events_unbound async_run_entry_fn [ 32.351186] ffffffff81959a25 ffff8800ba00fb60 ffffffff812c8c2a 0000000000000000 [ 32.358768] ffff8800ba00fb98 ffffffff81053e55 ffff880429ff1000 ffff8804298fc000 [ 32.366333] 0000000000000003 0000000000000000 0000000000000003 ffff8800ba00fba8 [ 32.373897] Call Trace: [ 32.376378] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 32.381568] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 32.387634] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 32.393528] [<ffffffff813c02c1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 32.400224] [<ffffffffa002a9b0>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 32.407361] [<ffffffff813c071d>] drm_helper_connector_dpms+0xfd/0x110 [ 32.413974] [<ffffffffa00204bf>] radeon_suspend_kms+0x6f/0x380 [radeon] [ 32.420739] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 32.427596] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 32.433591] [<ffffffffa001e1ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 32.440358] [<ffffffff81311b6a>] pci_pm_freeze+0x6a/0x100 [ 32.445925] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 32.451903] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 32.457708] [<ffffffff81472498>] __device_suspend+0xf8/0x2d0 [ 32.463518] [<ffffffff8147268f>] async_suspend+0x1f/0xa0 [ 32.468975] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 32.474868] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 32.484062] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 32.493413] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 32.502219] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 32.511569] [<ffffffff81076a58>] kthread+0x108/0x120 [ 32.519970] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 32.529874] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 32.538660] [<ffffffff81076950>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 32.548615] ---[ end trace 5941400179d67358 ]--- [ 32.559780] ------------[ cut here ]------------ [ 32.567771] WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() [ 32.581227] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 32.611715] CPU: 4 PID: 863 Comm: kworker/u16:9 Tainted: G W 4.3.0-rc1+ #1 [ 32.623351] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 32.636827] Workqueue: events_unbound async_run_entry_fn [ 32.645614] ffffffff81959a25 ffff8800ba00fb60 ffffffff812c8c2a 0000000000000000 [ 32.656576] ffff8800ba00fb98 ffffffff81053e55 ffff880429ff1000 ffff8804280c8400 [ 32.667498] 0000000000000003 0000000000000000 ffffffff81960c81 ffff8800ba00fba8 [ 32.678404] Call Trace: [ 32.684244] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 32.692771] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 32.702155] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 32.711341] [<ffffffff813c0228>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 32.721589] [<ffffffff813c066e>] drm_helper_connector_dpms+0x4e/0x110 [ 32.731479] [<ffffffffa00204bf>] radeon_suspend_kms+0x6f/0x380 [radeon] [ 32.741561] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 32.751739] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 32.761054] [<ffffffffa001e1ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 32.771150] [<ffffffff81311b6a>] pci_pm_freeze+0x6a/0x100 [ 32.780024] [<ffffffff81311b00>] ? 
pci_pm_poweroff+0x100/0x100 [ 32.789321] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 32.798439] [<ffffffff81472498>] __device_suspend+0xf8/0x2d0 [ 32.807538] [<ffffffff8147268f>] async_suspend+0x1f/0xa0 [ 32.816291] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 32.825451] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 32.834637] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 32.843983] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 32.852826] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 32.862169] [<ffffffff81076a58>] kthread+0x108/0x120 [ 32.870586] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 32.880490] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 32.889259] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 32.899184] ---[ end trace 5941400179d67359 ]--- [ 32.907238] ------------[ cut here ]------------ [ 32.915248] WARNING: CPU: 5 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 32.928439] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 32.958998] CPU: 5 PID: 863 Comm: kworker/u16:9 Tainted: G W 4.3.0-rc1+ #1 [ 32.970661] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 32.984157] Workqueue: events_unbound async_run_entry_fn [ 32.992966] ffffffff81959a25 ffff8800ba00fb60 ffffffff812c8c2a 0000000000000000 [ 33.003955] ffff8800ba00fb98 ffffffff81053e55 ffff880429ff1000 ffff8804298fa000 [ 33.014908] 0000000000000003 0000000000000000 0000000000000003 ffff8800ba00fba8 [ 33.025846] Call Trace: [ 33.031701] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 33.040227] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 33.049621] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 33.058833] [<ffffffff813c02c1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 33.068859] [<ffffffffa002a9b0>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 33.079302] [<ffffffff813c071d>] drm_helper_connector_dpms+0xfd/0x110 [ 33.089264] [<ffffffffa00204bf>] radeon_suspend_kms+0x6f/0x380 [radeon] [ 33.099381] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 33.109582] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 33.118898] [<ffffffffa001e1ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 33.129009] [<ffffffff81311b6a>] pci_pm_freeze+0x6a/0x100 [ 33.137867] [<ffffffff81311b00>] ? pci_pm_poweroff+0x100/0x100 [ 33.147183] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 33.156290] [<ffffffff81472498>] __device_suspend+0xf8/0x2d0 [ 33.165406] [<ffffffff8147268f>] async_suspend+0x1f/0xa0 [ 33.174152] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 33.183355] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 33.192568] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 33.201953] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 33.210793] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 33.220187] [<ffffffff81076a58>] kthread+0x108/0x120 [ 33.228596] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 33.238492] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 33.247260] [<ffffffff81076950>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 33.257225] ---[ end trace 5941400179d6735a ]--- [ 33.562334] PM: freeze of devices complete after 1505.243 msecs [ 33.574510] PM: late freeze of devices complete after 6.235 msecs [ 33.587264] PM: noirq freeze of devices complete after 6.647 msecs [ 33.593490] Disabling non-boot CPUs ... [ 33.618614] smpboot: CPU 1 is now offline [ 33.652741] smpboot: CPU 2 is now offline [ 33.700532] smpboot: CPU 3 is now offline [ 33.742880] smpboot: CPU 4 is now offline [ 33.788740] smpboot: CPU 5 is now offline [ 33.820800] smpboot: CPU 6 is now offline [ 33.852632] smpboot: CPU 7 is now offline [ 33.870237] PM: Creating hibernation image: [ 34.354018] PM: Need to copy 139046 pages [ 34.358033] PM: Normal pages needed: 139046 + 1024, available pages: 4029762 [ 35.110019] PM: Hibernation image created (139046 pages copied) [ 34.279213] LVT offset 0 assigned for vector 0x400 [ 34.284682] Enabling non-boot CPUs ... [ 34.288640] x86: Booting SMP configuration: [ 34.292830] smpboot: Booting Node 0 Processor 1 APIC 0x11 [ 34.320336] cache: parent cpu1 should not be sleeping [ 34.326904] CPU1 is up [ 34.329376] smpboot: Booting Node 0 Processor 2 APIC 0x12 [ 34.356093] cache: parent cpu2 should not be sleeping [ 34.361949] CPU2 is up [ 34.364416] smpboot: Booting Node 0 Processor 3 APIC 0x13 [ 34.395004] cache: parent cpu3 should not be sleeping [ 34.400844] CPU3 is up [ 34.403269] smpboot: Booting Node 0 Processor 4 APIC 0x14 [ 34.426890] cache: parent cpu4 should not be sleeping [ 34.434053] CPU4 is up [ 34.436562] smpboot: Booting Node 0 Processor 5 APIC 0x15 [ 34.463794] cache: parent cpu5 should not be sleeping [ 34.470967] CPU5 is up [ 34.473493] smpboot: Booting Node 0 Processor 6 APIC 0x16 [ 34.500304] cache: parent cpu6 should not be sleeping [ 34.507428] CPU6 is up [ 34.509945] smpboot: Booting Node 0 Processor 7 APIC 0x17 [ 34.535182] cache: parent cpu7 should not be sleeping [ 34.542453] CPU7 is up [ 34.568007] PM: noirq thaw of devices 
complete after 2.272 msecs [ 34.576209] PM: early thaw of devices complete after 2.129 msecs [ 34.583042] rtc_cmos 00:03: System wakeup disabled by ACPI [ 34.584556] [drm] PCIE gen 2 link speeds already enabled [ 34.584755] serial 00:06: activated [ 34.587880] [drm] PCIE GART of 512M enabled (table at 0x0000000000254000). [ 34.587932] radeon 0000:01:00.0: WB enabled [ 34.587937] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8804293b6c00 [ 34.588352] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0 and cpu addr 0xffffc900008121d0 [ 34.619311] [drm] ring test on 0 succeeded in 0 usecs [ 34.680684] r8169 0000:02:00.0 eth0: link down [ 34.794001] [drm] ring test on 5 succeeded in 1 usecs [ 34.794008] [drm] UVD initialized successfully. [ 34.794168] [drm] ib test on ring 0 succeeded in 0 usecs [ 34.900888] ata5: SATA link down (SStatus 0 SControl 300) [ 34.900955] ata6: SATA link down (SStatus 0 SControl 300) [ 35.072778] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 35.072843] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 35.072903] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 35.072958] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 35.074925] ata2.00: supports DRM functions and may not be fully accessible [ 35.074969] ata1.00: supports DRM functions and may not be fully accessible [ 35.075058] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 35.075097] ata2.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 35.075641] ata1.00: supports DRM functions and may not be fully accessible [ 35.075777] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 35.075793] ata2.00: supports DRM functions and may not be fully accessible [ 35.075888] ata1.00: configured for UDMA/133 [ 35.075927] ata2.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 35.076103] ata2.00: configured for UDMA/133 [ 35.085750] ata4.00: configured for UDMA/133 [ 35.096779] ata3.00: 
configured for UDMA/133 [ 35.096932] sd 2:0:0:0: [sdc] 1220942646 4096-byte logical blocks: (4.88 TB/4.54 TiB) [ 35.440724] [drm] ib test on ring 5 succeeded [ 35.541611] ------------[ cut here ]------------ [ 35.541624] WARNING: CPU: 0 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() [ 35.541654] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 35.541659] CPU: 0 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc1+ #1 [ 35.541661] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 35.541667] Workqueue: events_unbound async_run_entry_fn [ 35.541674] ffffffff81959a25 ffff88042a21bb80 ffffffff812c8c2a 0000000000000000 [ 35.541679] ffff88042a21bbb8 ffffffff81053e55 ffff880429ff1000 ffff8804280caa00 [ 35.541683] 0000000000000000 0000000000000003 0000000000000000 ffff88042a21bbc8 [ 35.541684] Call Trace: [ 35.541692] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 35.541697] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 35.541700] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 35.541704] [<ffffffff813c0228>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 35.541709] [<ffffffff813c066e>] drm_helper_connector_dpms+0x4e/0x110 [ 35.541713] [<ffffffff813c0cfd>] ? drm_helper_resume_force_mode+0xfd/0x130 [ 35.541748] [<ffffffffa0020a49>] radeon_resume_kms+0x279/0x3b0 [radeon] [ 35.541753] [<ffffffff813118c0>] ? 
pci_pm_restore+0xd0/0xd0 [ 35.541775] [<ffffffffa001e0bc>] radeon_pmops_thaw+0x1c/0x20 [radeon] [ 35.541778] [<ffffffff8131191f>] pci_pm_thaw+0x5f/0x90 [ 35.541783] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 35.541786] [<ffffffff81471de7>] device_resume+0x97/0x1b0 [ 35.541789] [<ffffffff81471f1d>] async_resume+0x1d/0x50 [ 35.541793] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.541797] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.541800] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.541804] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.541807] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.541812] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.541817] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.541823] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.541827] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.541830] ---[ end trace 5941400179d6735b ]--- [ 35.541832] ------------[ cut here ]------------ [ 35.541838] WARNING: CPU: 0 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 35.541863] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 35.541866] CPU: 0 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc1+ #1 [ 35.541868] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 35.541872] Workqueue: events_unbound async_run_entry_fn [ 35.541877] ffffffff81959a25 ffff88042a21bb80 ffffffff812c8c2a 0000000000000000 [ 35.541882] ffff88042a21bbb8 ffffffff81053e55 ffff880429ff1000 ffff8804298fc000 [ 35.541887] 0000000000000000 0000000000000003 0000000000000000 ffff88042a21bbc8 [ 35.541888] Call Trace: [ 35.541892] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 35.541896] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 35.541899] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 35.541903] [<ffffffff813c02c1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 35.541908] [<ffffffff813c0697>] drm_helper_connector_dpms+0x77/0x110 [ 35.541933] [<ffffffffa002a9b0>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 35.541955] [<ffffffffa0020a49>] radeon_resume_kms+0x279/0x3b0 [radeon] [ 35.541960] [<ffffffff813118c0>] ? pci_pm_restore+0xd0/0xd0 [ 35.541980] [<ffffffffa001e0bc>] radeon_pmops_thaw+0x1c/0x20 [radeon] [ 35.541984] [<ffffffff8131191f>] pci_pm_thaw+0x5f/0x90 [ 35.541987] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 35.541991] [<ffffffff81471de7>] device_resume+0x97/0x1b0 [ 35.541993] [<ffffffff81471f1d>] async_resume+0x1d/0x50 [ 35.541997] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.542000] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.542003] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.542007] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.542011] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.542014] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.542020] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.542024] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.542029] [<ffffffff81076950>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 35.542032] ---[ end trace 5941400179d6735c ]--- [ 35.559660] ------------[ cut here ]------------ [ 35.559670] WARNING: CPU: 0 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() [ 35.559698] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 35.559702] CPU: 0 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc1+ #1 [ 35.559704] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 35.559709] Workqueue: events_unbound async_run_entry_fn [ 35.559716] ffffffff81959a25 ffff88042a21bb80 ffffffff812c8c2a 0000000000000000 [ 35.559720] ffff88042a21bbb8 ffffffff81053e55 ffff880429ff1000 ffff8804280c8400 [ 35.559725] 0000000000000000 0000000000000003 0000000000000000 ffff88042a21bbc8 [ 35.559726] Call Trace: [ 35.559732] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 35.559737] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 35.559740] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 35.559744] [<ffffffff813c0228>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 35.559749] [<ffffffff813c066e>] drm_helper_connector_dpms+0x4e/0x110 [ 35.559778] [<ffffffffa002a9b0>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 35.559801] [<ffffffffa0020a49>] radeon_resume_kms+0x279/0x3b0 [radeon] [ 35.559806] [<ffffffff813118c0>] ? 
pci_pm_restore+0xd0/0xd0 [ 35.559827] [<ffffffffa001e0bc>] radeon_pmops_thaw+0x1c/0x20 [radeon] [ 35.559831] [<ffffffff8131191f>] pci_pm_thaw+0x5f/0x90 [ 35.559835] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 35.559838] [<ffffffff81471de7>] device_resume+0x97/0x1b0 [ 35.559841] [<ffffffff81471f1d>] async_resume+0x1d/0x50 [ 35.559844] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.559848] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.559851] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.559855] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.559859] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.559863] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.559869] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.559873] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.559878] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.559881] ---[ end trace 5941400179d6735d ]--- [ 35.559883] ------------[ cut here ]------------ [ 35.559888] WARNING: CPU: 0 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 35.559941] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 35.559944] CPU: 0 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc1+ #1 [ 35.559946] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 35.559950] Workqueue: events_unbound async_run_entry_fn [ 35.559956] ffffffff81959a25 ffff88042a21bb80 ffffffff812c8c2a 0000000000000000 [ 35.559960] ffff88042a21bbb8 ffffffff81053e55 ffff880429ff1000 ffff8804298fa000 [ 35.559965] 0000000000000000 0000000000000003 0000000000000000 ffff88042a21bbc8 [ 35.559966] Call Trace: [ 35.559971] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 35.559974] [<ffffffff81053e55>] warn_slowpath_common+0x95/0xe0 [ 35.559977] [<ffffffff81053f5a>] warn_slowpath_null+0x1a/0x20 [ 35.559981] [<ffffffff813c02c1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 35.559986] [<ffffffff813c0697>] drm_helper_connector_dpms+0x77/0x110 [ 35.560010] [<ffffffffa002a9b0>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 35.560032] [<ffffffffa0020a49>] radeon_resume_kms+0x279/0x3b0 [radeon] [ 35.560037] [<ffffffff813118c0>] ? pci_pm_restore+0xd0/0xd0 [ 35.560057] [<ffffffffa001e0bc>] radeon_pmops_thaw+0x1c/0x20 [radeon] [ 35.560061] [<ffffffff8131191f>] pci_pm_thaw+0x5f/0x90 [ 35.560064] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 35.560068] [<ffffffff81471de7>] device_resume+0x97/0x1b0 [ 35.560071] [<ffffffff81471f1d>] async_resume+0x1d/0x50 [ 35.560074] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.560077] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.560080] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.560084] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.560087] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.560091] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.560097] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.560101] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.560105] [<ffffffff81076950>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 35.560108] ---[ end trace 5941400179d6735e ]--- [ 35.985316] usb 8-2: reset low-speed USB device number 2 using ohci-pci [ 36.233387] r8169 0000:02:00.0 eth0: link up [ 36.876031] PM: thaw of devices complete after 2295.114 msecs [ 36.887191] PM: writing image. [ 36.894508] PM: Using 3 thread(s) for compression. [ 36.894508] PM: Compressing and saving image data (139318 pages)... [ 36.912145] PM: Image saving progress: 0% [ 37.115082] PM: Image saving progress: 10% [ 37.298480] PM: Image saving progress: 20% [ 37.482715] PM: Image saving progress: 30% [ 37.607467] PM: Image saving progress: 40% [ 37.684382] PM: Image saving progress: 50% [ 37.823122] PM: Image saving progress: 60% [ 37.954283] PM: Image saving progress: 70% [ 38.083761] PM: Image saving progress: 80% [ 38.213594] PM: Image saving progress: 90% [ 38.323446] PM: Image saving progress: 100% [ 38.329418] PM: Image saving done. [ 38.333916] PM: Wrote 557272 kbytes in 1.41 seconds (395.22 MB/s) [ 38.341724] PM: S| [ 38.443187] kvm: exiting hardware virtualization [ 38.462813] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0010 address=0x0000000020001000 flags=0x0000] [ 38.654130] sd 3:0:0:0: [sdd] Synchronizing SCSI cache [ 38.662313] sd 3:0:0:0: [sdd] Stopping disk [ 39.541392] sd 2:0:0:0: [sdc] Synchronizing SCSI cache [ 39.547859] sd 2:0:0:0: [sdc] Stopping disk [ 39.726979] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [ 39.734877] sd 1:0:0:0: [sdb] Stopping disk [ 40.040149] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 40.048976] sd 0:0:0:0: [sda] Stopping disk [ 40.218543] pcieport 0000:00:04.0: System wakeup enabled by ACPI [ 40.246902] ACPI: Preparing to enter system sleep state S5 [ 40.253967] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline [ 40.264950] reboot: Power down [ 40.270127] acpi_power_off called
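To see at a glance which call sites trip the warning in a log like the one above, a minimal Python sketch along these lines could tally the splat headers. It assumes the "WARNING: CPU: N PID: N at file:line func+off/len()" header format seen in this log; WARN_RE and summarize_warnings are hypothetical names chosen for illustration, not part of any kernel tooling.

```python
import re
from collections import Counter

# Header of a WARN splat as it appears in this log, e.g.
# "WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90()"
# (assumption: other kernel versions may print a different header format)
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<site>\S+:\d+) "
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_warnings(log_text):
    """Count WARN splats per (source location, function) pair.

    Works on line-wrapped or flattened dmesg text alike, because the
    regex scans the whole string rather than individual lines.
    """
    return Counter(
        (m.group("site"), m.group("func")) for m in WARN_RE.finditer(log_text)
    )


# Two headers copied from the log above, joined the way a mail client
# might flatten them.
sample = (
    "[ 32.065167] WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 "
    "drm_helper_choose_encoder_dpms+0x88/0x90() "
    "[ 32.294456] WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 "
    "drm_helper_choose_crtc_dpms+0x91/0xa0()"
)

for (site, func), count in summarize_warnings(sample).most_common():
    print(count, site, func)
```

Feeding the full dmesg text to summarize_warnings() instead of the two-header sample should show whether anything other than the two drm_helper_choose_*_dpms() callers hits the assertion.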
Resuming:
[ 5.783459] PM: Checking hibernation image partition /dev/sda1
[ 5.789385] PM: Hibernation image partition 8:1 present
[ 5.794691] PM: Looking for hibernation image.
[ 5.801591] PM: Image signature found, resuming
[ 5.808550] PM: Preparing processes for restore.
[ 5.813199] Freezing user space processes ...
[ 5.814966] hid-generic 0003:04B4:0101.0003: input,hidraw2: USB HID v1.00 Device [DATACOMP SteelS쀁̄Љ̒DATA] on usb-0000:00:12.0-2/input1
[ 5.830177] (elapsed 0.000 seconds) done.
[ 5.834597] PM: Loading hibernation image.
[ 5.839039] PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
[ 5.845170] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
[ 5.851309] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
[ 5.857569] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
[ 5.863710] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
[ 5.869861] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
[ 5.876765] PM: Basic memory bitmaps created
[ 5.892558] PM: Using 3 thread(s) for decompression.
[ 5.892558] PM: Loading and decompressing image data (139318 pages)...
[ 5.969429] PM: Image loading progress: 0%
[ 6.268254] PM: Image loading progress: 10%
[ 6.344699] random: nonblocking pool is initialized
[ 6.366397] PM: Image loading progress: 20%
[ 6.457787] PM: Image loading progress: 30%
[ 6.529543] PM: Image loading progress: 40%
[ 6.601163] PM: Image loading progress: 50%
[ 6.675339] PM: Image loading progress: 60%
[ 6.749853] PM: Image loading progress: 70%
[ 6.818361] PM: Image loading progress: 80%
[ 6.887274] PM: Image loading progress: 90%
[ 6.959711] PM: Image loading progress: 100%
[ 6.964198] PM: Image loading done.
[ 6.967814] PM: Read 557272 kbytes in 1.06 seconds (525.72 MB/s)
[ 6.977129] PM: Image successfully loaded
[ 7.204973] PM: quiesce of devices complete after 222.428 msecs
[ 7.212819] PM: late quiesce of devices complete after 1.841 msecs
[ 7.233684] PM: noirq quiesce of devices complete after 14.604 msecs
[ 7.240132] Disabling non-boot CPUs ...
[ 34.135352] LVT offset 0 assigned for vector 0x400
[ 34.140551] Enabling non-boot CPUs ...
[ 34.144418] x86: Booting SMP configuration:
[ 34.148602] smpboot: Booting Node 0 Processor 1 APIC 0x11
[ 34.178841] cache: parent cpu1 should not be sleeping
[ 34.184725] CPU1 is up
[ 34.187142] smpboot: Booting Node 0 Processor 2 APIC 0x12
[ 34.210663] cache: parent cpu2 should not be sleeping
[ 34.216481] CPU2 is up
[ 34.218908] smpboot: Booting Node 0 Processor 3 APIC 0x13
[ 34.243066] cache: parent cpu3 should not be sleeping
[ 34.248931] CPU3 is up
[ 34.251404] smpboot: Booting Node 0 Processor 4 APIC 0x14
[ 34.271128] cache: parent cpu4 should not be sleeping
[ 34.276947] CPU4 is up
[ 34.279378] smpboot: Booting Node 0 Processor 5 APIC 0x15
[ 34.299359] cache: parent cpu5 should not be sleeping
[ 34.305220] CPU5 is up
[ 34.307638] smpboot: Booting Node 0 Processor 6 APIC 0x16
[ 34.331611] cache: parent cpu6 should not be sleeping
[ 34.337455] CPU6 is up
[ 34.339872] smpboot: Booting Node 0 Processor 7 APIC 0x17
[ 34.363915] cache: parent cpu7 should not be sleeping
[ 34.369757] CPU7 is up
[ 34.382276] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
[ 34.390305] IP: [<ffffffff81323016>] pci_restore_msi_state+0x196/0x240
[ 34.396997] PGD b9ea8067 PUD ba1e1067 PMD 0
[ 34.401468] Oops: 0000 [#1] PREEMPT SMP
[ 34.405593] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[ 34.428932] CPU: 6 PID: 821 Comm: kworker/u16:7 Tainted: G W 4.3.0-rc1+ #1
[ 34.437077] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[ 34.447131] Workqueue: events_unbound async_run_entry_fn
[ 34.452606] task: ffff880429f497c0 ti: ffff880428b00000 task.ti: ffff880428b00000
[ 34.460225] RIP: 0010:[<ffffffff81323016>] [<ffffffff81323016>] pci_restore_msi_state+0x196/0x240
[ 34.469341] RSP: 0018:ffff880428b03c40 EFLAGS: 00010286
[ 34.474792] RAX: 0000000000000000 RBX: ffff880429f8f000 RCX: 0000000000000000
[ 34.482061] RDX: 0000000000003800 RSI: ffffffff813064d7 RDI: ffffffff816c3dfb
[ 34.489333] RBP: ffff880428b03c58 R08: 0000000000000000 R09: 0000000000000000
[ 34.496602] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 34.503874] R13: 0000000000000000 R14: ffff8804289dff20 R15: ffffffff81960bd2
[ 34.511146] FS: 00007f2b1e84b700(0000) GS:ffff88042d200000(0000) knlGS:0000000000000000
[ 34.519379] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 34.525271] CR2: 0000000000000034 CR3: 00000000b9cb7000 CR4: 00000000000406e0
[ 34.532550] Stack:
[ 34.534709] 0080002c29f8f000 0000000000000000 ffff880429f8f000 ffff880428b03c78
[ 34.542316] ffffffff8130ddf7 ffff880429f8f098 ffff880429f8f000 ffff880428b03c88
[ 34.549917] ffffffff8130df58 ffff880428b03cb0 ffffffff81311dac ffff880429f8f098
[ 34.557517] Call Trace:
[ 34.560111] [<ffffffff8130ddf7>] pci_restore_state.part.34+0xc7/0x210
[ 34.566782] [<ffffffff8130df58>] pci_restore_state+0x18/0x20
[ 34.572675] [<ffffffff81311dac>] pci_pm_restore_noirq+0x4c/0x100
[ 34.578914] [<ffffffff81311d60>] ? pci_pm_suspend+0x160/0x160
[ 34.584895] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280
[ 34.590787] [<ffffffff81471aa3>] device_resume_noirq+0x93/0x150
[ 34.596939] [<ffffffff81471b7d>] async_resume_noirq+0x1d/0x50
[ 34.602919] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0
[ 34.608899] [<ffffffff810700fc>] process_one_work+0x1ec/0x610
[ 34.614880] [<ffffffff81070058>] ? process_one_work+0x148/0x610
[ 34.621031] [<ffffffff81070586>] worker_thread+0x66/0x450
[ 34.626664] [<ffffffff81070520>] ? process_one_work+0x610/0x610
[ 34.632819] [<ffffffff81076a58>] kthread+0x108/0x120
[ 34.638018] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0
[ 34.644692] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70
[ 34.650236] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0
[ 34.656900] Code: 66 89 4d ee 0f b7 c9 e8 e9 41 fe ff 48 89 df e8 c1 60 ce ff 0f b6 53 4b 8b 73 38 48 8d 4d ee 48 8b 7b 10 83 c2 02 e8 2a 34 fe ff <41> 0f b6 4c 24 34 41 8b 54 24 30 be ff ff ff ff c0 e9 04 83 e1
[ 34.677058] RIP [<ffffffff81323016>] pci_restore_msi_state+0x196/0x240
[ 34.683826] RSP <ffff880428b03c40>
[ 34.687456] CR2: 0000000000000034
[ 34.690915] ---[ end trace 5941400179d6735b ]---
[ 34.695554] BUG: unable to handle kernel paging request at ffffffffffffff98
[ 34.702692] IP: [<ffffffff81077380>] kthread_data+0x10/0x20
[ 34.708419] PGD 19ed067 PUD 19ef067 PMD 0
[ 34.712709] Oops: 0000 [#2] PREEMPT SMP
[ 34.716851] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[ 34.740207] CPU: 6 PID: 821 Comm: kworker/u16:7 Tainted: G D W 4.3.0-rc1+ #1
[ 34.748354] Hardware name: To be filled by O.E.M.
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 34.758421] task: ffff880429f497c0 ti: ffff880428b00000 task.ti: ffff880428b00000 [ 34.766050] RIP: 0010:[<ffffffff81077380>] [<ffffffff81077380>] kthread_data+0x10/0x20 [ 34.774222] RSP: 0018:ffff880428b03948 EFLAGS: 00010002 [ 34.779689] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000 [ 34.786977] RDX: 000000015b3d1188 RSI: 0000000000000006 RDI: ffff880429f497c0 [ 34.794266] RBP: ffff880428b03948 R08: ffff880429f49848 R09: ffff88042d3d60b0 [ 34.801554] R10: 0000000000000001 R11: 0000000000000000 R12: 00000000001d6000 [ 34.808842] R13: ffff88042d3d6018 R14: ffff880429f497c0 R15: 0000000000000006 [ 34.816131] FS: 00007f2b1e84b700(0000) GS:ffff88042d200000(0000) knlGS:0000000000000000 [ 34.824378] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 34.830283] CR2: 0000000000000028 CR3: 00000000b9cb7000 CR4: 00000000000406e0 [ 34.837577] Stack: [ 34.839762] ffff880428b03960 ffffffff81071321 ffff88042d3d6000 ffff880428b039b8 [ 34.847380] ffffffff816be2bb ffff880037b4c010 ffff880428b039a8 ffffffff00000000 [ 34.854997] ffff880429f497c0 ffff880428b04000 ffff880428b03a10 ffff880428b03a10 [ 34.862616] Call Trace: [ 34.865225] [<ffffffff81071321>] wq_worker_sleeping+0x11/0x90 [ 34.871212] [<ffffffff816be2bb>] __schedule+0x58b/0xe90 [ 34.876682] [<ffffffff816bec4d>] schedule+0x3d/0x90 [ 34.881802] [<ffffffff810565e1>] do_exit+0x711/0xaf0 [ 34.887012] [<ffffffff8100809c>] oops_end+0x6c/0x90 [ 34.892132] [<ffffffff810462e5>] no_context+0x155/0x380 [ 34.897602] [<ffffffff810a24bd>] ? __lock_acquire+0x62d/0x19e0 [ 34.903676] [<ffffffff81046619>] __bad_area_nosemaphore+0x109/0x210 [ 34.910184] [<ffffffff812f5537>] ? debug_smp_processor_id+0x17/0x20 [ 34.916692] [<ffffffff81046733>] bad_area_nosemaphore+0x13/0x20 [ 34.922844] [<ffffffff81046b74>] __do_page_fault+0x1e4/0x360 [ 34.928740] [<ffffffff81000f60>] ? 
trace_hardirqs_off_thunk+0x17/0x19 [ 34.935411] [<ffffffff81046d2c>] do_page_fault+0xc/0x10 [ 34.940872] [<ffffffff816c676f>] page_fault+0x1f/0x30 [ 34.946158] [<ffffffff813064d7>] ? pci_bus_read_config_word+0x97/0xa0 [ 34.952831] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 34.959762] [<ffffffff81323016>] ? pci_restore_msi_state+0x196/0x240 [ 34.966350] [<ffffffff8130ddf7>] pci_restore_state.part.34+0xc7/0x210 [ 34.973021] [<ffffffff8130df58>] pci_restore_state+0x18/0x20 [ 34.978907] [<ffffffff81311dac>] pci_pm_restore_noirq+0x4c/0x100 [ 34.985147] [<ffffffff81311d60>] ? pci_pm_suspend+0x160/0x160 [ 34.991126] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 34.997010] [<ffffffff81471aa3>] device_resume_noirq+0x93/0x150 [ 35.003174] [<ffffffff81471b7d>] async_resume_noirq+0x1d/0x50 [ 35.009151] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.015131] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.021102] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.027245] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.032870] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.039013] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.044205] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.050869] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.056408] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.063071] Code: ba 02 00 00 00 e8 31 f9 ff ff 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 f8 03 00 00 55 48 89 e5 <48> 8b 40 98 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 [ 35.083253] RIP [<ffffffff81077380>] kthread_data+0x10/0x20 [ 35.089086] RSP <ffff880428b03948> [ 35.092735] CR2: ffffffffffffff98 [ 35.096203] ---[ end trace 5941400179d6735c ]--- [ 35.100819] Fixing recursive fault but reboot is needed! [ 35.106124] BUG: scheduling while atomic: kworker/u16:7/821/0x00000004 [ 35.112649] INFO: lockdep is turned off. 
[ 35.116565] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 amd64_edac_mod lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 35.139592] irq event stamp: 5022 [ 35.142902] hardirqs last enabled at (5021): [<ffffffff816c3e15>] _raw_spin_unlock_irqrestore+0x65/0x80 [ 35.152391] hardirqs last disabled at (5022): [<ffffffff816c6940>] error_entry+0x60/0xb0 [ 35.160493] softirqs last enabled at (1794): [<ffffffff81148815>] bdi_register+0xf5/0x200 [ 35.168770] softirqs last disabled at (1792): [<ffffffff811487f3>] bdi_register+0xd3/0x200 [ 35.177047] Preemption disabled at:[<ffffffff8100809c>] oops_end+0x6c/0x90 [ 35.183928] [ 35.185418] CPU: 6 PID: 821 Comm: kworker/u16:7 Tainted: G D W 4.3.0-rc1+ #1 [ 35.193417] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 35.203328] 00000000001d6000 ffff880428b03648 ffffffff812c8c2a ffff880429f497c0 [ 35.210783] ffff880428b03660 ffffffff8107e19a ffff88042d3d6000 ffff880428b036b8 [ 35.218255] ffffffff816be424 ffff880428b036d0 ffffffff8112779e 0000000000000008 [ 35.225714] Call Trace: [ 35.228189] [<ffffffff812c8c2a>] dump_stack+0x4e/0x84 [ 35.233324] [<ffffffff8107e19a>] __schedule_bug+0x6a/0xd0 [ 35.238809] [<ffffffff816be424>] __schedule+0x6f4/0xe90 [ 35.244114] [<ffffffff8112779e>] ? printk+0x48/0x50 [ 35.249079] [<ffffffff816bec4d>] schedule+0x3d/0x90 [ 35.254036] [<ffffffff81056770>] do_exit+0x8a0/0xaf0 [ 35.259090] [<ffffffff810b6e25>] ? kmsg_dump+0x135/0x180 [ 35.264487] [<ffffffff810b6d12>] ? kmsg_dump+0x22/0x180 [ 35.269801] [<ffffffff8100809c>] oops_end+0x6c/0x90 [ 35.274766] [<ffffffff810462e5>] no_context+0x155/0x380 [ 35.280079] [<ffffffff812d5e94>] ? 
delay_tsc+0x94/0xc0 [ 35.285304] [<ffffffff81046619>] __bad_area_nosemaphore+0x109/0x210 [ 35.291656] [<ffffffff81046733>] bad_area_nosemaphore+0x13/0x20 [ 35.297661] [<ffffffff81046b74>] __do_page_fault+0x1e4/0x360 [ 35.303407] [<ffffffff81000f60>] ? trace_hardirqs_off_thunk+0x17/0x19 [ 35.309934] [<ffffffff81046d2c>] do_page_fault+0xc/0x10 [ 35.315246] [<ffffffff816c676f>] page_fault+0x1f/0x30 [ 35.320385] [<ffffffff81077380>] ? kthread_data+0x10/0x20 [ 35.325869] [<ffffffff81071321>] wq_worker_sleeping+0x11/0x90 [ 35.331701] [<ffffffff816be2bb>] __schedule+0x58b/0xe90 [ 35.337006] [<ffffffff816bec4d>] schedule+0x3d/0x90 [ 35.341971] [<ffffffff810565e1>] do_exit+0x711/0xaf0 [ 35.347024] [<ffffffff8100809c>] oops_end+0x6c/0x90 [ 35.351989] [<ffffffff810462e5>] no_context+0x155/0x380 [ 35.357303] [<ffffffff810a24bd>] ? __lock_acquire+0x62d/0x19e0 [ 35.363220] [<ffffffff81046619>] __bad_area_nosemaphore+0x109/0x210 [ 35.369573] [<ffffffff812f5537>] ? debug_smp_processor_id+0x17/0x20 [ 35.375925] [<ffffffff81046733>] bad_area_nosemaphore+0x13/0x20 [ 35.381930] [<ffffffff81046b74>] __do_page_fault+0x1e4/0x360 [ 35.387677] [<ffffffff81000f60>] ? trace_hardirqs_off_thunk+0x17/0x19 [ 35.394203] [<ffffffff81046d2c>] do_page_fault+0xc/0x10 [ 35.399515] [<ffffffff816c676f>] page_fault+0x1f/0x30 [ 35.404653] [<ffffffff813064d7>] ? pci_bus_read_config_word+0x97/0xa0 [ 35.411179] [<ffffffff816c3dfb>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 35.417963] [<ffffffff81323016>] ? pci_restore_msi_state+0x196/0x240 [ 35.424394] [<ffffffff8130ddf7>] pci_restore_state.part.34+0xc7/0x210 [ 35.430919] [<ffffffff8130df58>] pci_restore_state+0x18/0x20 [ 35.436666] [<ffffffff81311dac>] pci_pm_restore_noirq+0x4c/0x100 [ 35.442757] [<ffffffff81311d60>] ? 
pci_pm_suspend+0x160/0x160 [ 35.448591] [<ffffffff8147180a>] dpm_run_callback+0x7a/0x280 [ 35.454336] [<ffffffff81471aa3>] device_resume_noirq+0x93/0x150 [ 35.460341] [<ffffffff81471b7d>] async_resume_noirq+0x1d/0x50 [ 35.466173] [<ffffffff81079666>] async_run_entry_fn+0x46/0xf0 [ 35.472005] [<ffffffff810700fc>] process_one_work+0x1ec/0x610 [ 35.477830] [<ffffffff81070058>] ? process_one_work+0x148/0x610 [ 35.483835] [<ffffffff81070586>] worker_thread+0x66/0x450 [ 35.489321] [<ffffffff81070520>] ? process_one_work+0x610/0x610 [ 35.495326] [<ffffffff81076a58>] kthread+0x108/0x120 [ 35.500380] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0 [ 35.506905] [<ffffffff816c4cff>] ret_from_fork+0x3f/0x70 [ 35.512302] [<ffffffff81076950>] ? kthread_create_on_node+0x1f0/0x1f0
On Mon, Sep 21, 2015 at 03:31:26PM +0200, Borislav Petkov wrote:
Hi guys,
this assert_drm_connector_list_read_locked() thing fires here when suspending to disk with Linus' master from around a week ago and tip/master merged ontop.
After I resume, box comes up but wedges solid. I've managed to capture that splat in its whole glory too, see the end of this mail.
Btw, this happens on rc2+tip too.
On Mon, Sep 21, 2015 at 9:31 AM, Borislav Petkov bp@alien8.de wrote:
Hi guys,
this assert_drm_connector_list_read_locked() thing fires here when suspending to disk with Linus' master from around a week ago and tip/master merged ontop.
After I resume, box comes up but wedges solid. I've managed to capture that splat in its whole glory too, see the end of this mail.
Let me know if you need more info.
Thanks.
What system is this? What GPU are you using? Can you bisect?
Alex
Hi Alex,
On Tue, Sep 22, 2015 at 03:58:03PM -0400, Alex Deucher wrote:
What system is this?
my workstation - an
"To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013"
you gotta love the "To be filled" crap. In any case, it is an ASUS M5A97 EVO R2.0. RD890 chip AFAICT.
What GPU are you using?
RV635. Here's some dmesg:
[ 6.489016] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 0x1043:0x01DA).
[ 7.509177] radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
[ 7.518010] radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
[ 7.525724] [drm] Detected VRAM RAM=512M, BAR=256M
[ 7.530608] [drm] RAM width 128bits DDR
[ 7.535168] [TTM] Zone kernel: Available graphics memory: 8132226 kiB
[ 7.541779] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 7.548420] [TTM] Initializing pool allocator
[ 7.552896] [TTM] Initializing DMA pool allocator
[ 7.558176] [drm] radeon: 512M of VRAM memory ready
[ 7.563131] [drm] radeon: 512M of GTT memory ready.
[ 7.568151] [drm] Loading RV635 Microcode
[ 7.577382] [drm] Internal thermal controller without fan control
[ 7.584349] [drm] radeon: power management initialized
[ 7.590443] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 7.597266] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 7.624386] [drm] PCIE GART of 512M enabled (table at 0x0000000000254000).
[ 7.631544] radeon 0000:01:00.0: WB enabled
[ 7.635794] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff880427ef7c00
[ 7.647039] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0 and cpu addr 0xffffc900008121d0
[ 7.657924] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 7.664601] [drm] Driver supports precise vblank timestamp query.
[ 7.670780] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[ 7.676801] radeon 0000:01:00.0: radeon: using MSI.
[ 7.681863] [drm] radeon: irq initialized.
[ 7.717757] [drm] ring test on 0 succeeded in 0 usecs
[ 7.897466] [drm] ring test on 5 succeeded in 1 usecs
[ 7.902585] [drm] UVD initialized successfully.
[ 7.908108] [drm] ib test on ring 0 succeeded in 0 usecs
[ 8.558968] [drm] ib test on ring 5 succeeded
[ 8.568734] [drm] Radeon Display Connectors
[ 8.573005] [drm] Connector 0:
[ 8.576189] [drm] DVI-I-1
[ 8.579062] [drm] HPD1
[ 8.581657] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[ 8.589172] [drm] Encoders:
[ 8.592234] [drm] DFP1: INTERNAL_UNIPHY
[ 8.596492] [drm] CRT2: INTERNAL_KLDSCP_DAC2
[ 8.601182] [drm] Connector 1:
[ 8.604302] [drm] DIN-1
[ 8.607012] [drm] Encoders:
[ 8.610043] [drm] TV1: INTERNAL_KLDSCP_DAC2
[ 8.614642] [drm] Connector 2:
[ 8.617760] [drm] DVI-I-2
[ 8.620621] [drm] HPD2
[ 8.623226] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[ 8.630719] [drm] Encoders:
[ 8.633749] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 8.638436] [drm] DFP2: INTERNAL_KLDSCP_LVTMA
[ 8.719815] [drm] fb mappable at 0xC0355000
[ 8.724089] [drm] vram apper at 0xC0000000
[ 8.728243] [drm] size 9216000
[ 8.731371] [drm] fb depth is 24
[ 8.734664] [drm] pitch is 7680
[ 8.739009] fbcon: radeondrmfb (fb0) is primary device
[ 8.802887] Console: switching to colour frame buffer device 240x75
[ 8.818487] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[ 8.824948] radeon 0000:01:00.0: registered panic notifier
[ 8.846452] [drm] Initialized radeon 2.42.0 20080528 for 0000:01:00.0 on minor 0
Can you bisect?
It is my workstation so it will take longer but I'll try.
If you can think of some particular commits I should try, let me know.
Btw, I have the following things selected in my config, maybe something's missing there:
CONFIG_DRM=y
CONFIG_DRM_KMS_HELPER=y
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
# CONFIG_DRM_LOAD_EDID_FIRMWARE is not set
CONFIG_DRM_TTM=y
# CONFIG_DRM_I2C_ADV7511 is not set
# CONFIG_DRM_I2C_CH7006 is not set
# CONFIG_DRM_I2C_SIL164 is not set
# CONFIG_DRM_I2C_NXP_TDA998X is not set
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=m
# CONFIG_DRM_RADEON_USERPTR is not set
# CONFIG_DRM_RADEON_UMS is not set
CONFIG_DRM_AMDGPU=y
# CONFIG_DRM_AMDGPU_CIK is not set
# CONFIG_DRM_AMDGPU_USERPTR is not set
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
# CONFIG_DRM_VGEM is not set
# CONFIG_DRM_VMWGFX is not set
# CONFIG_DRM_GMA500 is not set
# CONFIG_DRM_UDL is not set
# CONFIG_DRM_AST is not set
# CONFIG_DRM_MGAG200 is not set
# CONFIG_DRM_CIRRUS_QEMU is not set
# CONFIG_DRM_QXL is not set
# CONFIG_DRM_BOCHS is not set
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_BRIDGE=y
I'm attaching the whole .config too.
Thanks!
On Tue, Sep 22, 2015 at 4:21 PM, Borislav Petkov bp@alien8.de wrote:
Hi Alex,
On Tue, Sep 22, 2015 at 03:58:03PM -0400, Alex Deucher wrote:
What system is this?
my workstation - an
"To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013"
you gotta love the "To be filled" crap. In any case, it is an ASUS M5A97 EVO R2.0. RD890 chip AFAICT.
What GPU are you using?
RV635. Here's some dmesg:
[...]
Can you bisect?
It is my workstation so it will take longer but I'll try.
If you can think of some particular commits I should try, let me know.
Sorry, I can't think of anything off hand. I suspect it was some change or cleanup in the core drm code.
Alex
On Tue, Sep 22, 2015 at 04:54:54PM -0400, Alex Deucher wrote:
On Tue, Sep 22, 2015 at 4:21 PM, Borislav Petkov bp@alien8.de wrote:
Hi Alex,
On Tue, Sep 22, 2015 at 03:58:03PM -0400, Alex Deucher wrote:
What system is this?
my workstation - an
"To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013"
you gotta love the "To be filled" crap. In any case, it is an ASUS M5A97 EVO R2.0. RD890 chip AFAICT.
What GPU are you using?
RV635. Here's some dmesg:
[...]
Can you bisect?
It is my workstation so it will take longer but I'll try.
If you can think of some particular commits I should try, let me know.
Sorry, I can't think of anything off hand. I suspect it was some change or cleanup in the core drm code.
The locking check is new, but I was only adding locking checks, not yet reworking the locking itself. So the backtrace is likely (but not 100% guaranteed) a red herring.
Strange thing is that I've tested this on a radeon over here and I don't see this backtrace ... wut. Below diff should appease the backtraces at least.
-Daniel
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index d8319dae8358..9f05de73ae97 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1734,9 +1734,11 @@ int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon)
 	if (fbcon) {
 		drm_helper_resume_force_mode(dev);
 		/* turn on display hw */
+		drm_modeset_lock_all(dev);
 		list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
 			drm_helper_connector_dpms(connector, DRM_MODE_DPMS_ON);
 		}
+		drm_modeset_unlock_all(dev);
 	}
 
 	drm_kms_helper_poll_enable(dev);
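For readers unfamiliar with this kind of locking check, here is a rough userspace sketch of the pattern the diff above relies on: a helper warns (but carries on) when the connector list is walked without the lock, and the caller silences the warning by taking the lock around the loop. Everything named toy_* is a hypothetical stand-in invented for illustration, not the DRM API, and the "lock" is just a flag in a single-threaded model rather than a real mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; not the DRM structures. */
struct toy_connector {
	int dpms_on;
	struct toy_connector *next;
};

struct toy_device {
	bool mode_lock_held;              /* models "is the modeset lock held?" */
	int warnings;                     /* how often the locking check fired */
	struct toy_connector *connectors; /* singly linked connector list */
};

static void toy_lock_all(struct toy_device *dev)   { dev->mode_lock_held = true; }
static void toy_unlock_all(struct toy_device *dev) { dev->mode_lock_held = false; }

/* Warn and continue when the lock is not held, similar in spirit to WARN_ON(). */
static void toy_assert_locked(struct toy_device *dev)
{
	if (!dev->mode_lock_held) {
		dev->warnings++;
		fprintf(stderr, "WARNING: connector touched without lock\n");
	}
}

/* The helper that carries the check, called once per connector. */
static void toy_connector_dpms_on(struct toy_device *dev, struct toy_connector *c)
{
	toy_assert_locked(dev);
	c->dpms_on = 1;
}

/* Walk the list with or without the lock; return how many warnings fired. */
static int toy_resume(struct toy_device *dev, bool take_lock)
{
	struct toy_connector *c;
	int before;

	assert(dev != NULL);
	before = dev->warnings;

	if (take_lock)
		toy_lock_all(dev);
	for (c = dev->connectors; c != NULL; c = c->next)
		toy_connector_dpms_on(dev, c);
	if (take_lock)
		toy_unlock_all(dev);

	return dev->warnings - before;
}
```

Calling toy_resume(dev, false) on a device with two connectors raises one warning per connector, while toy_resume(dev, true) raises none. The work done on the connectors is the same in both cases, which is why a warning of this kind can be noise on top of a different underlying problem.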
On Wed, Sep 23, 2015 at 09:25:23AM +0200, Daniel Vetter wrote:
Strange thing is that I've tested this on a radeon over here and I don't see this backtrace ... wut. Below diff should appease the backtraces at least.
Doesn't look like it.
This is what it says when suspending:
[ 42.962275] hib.sh (3269): drop_caches: 3
[ 42.967671] PM: Hibernation mode set to 'shutdown'
[ 42.979329] PM: Syncing filesystems ... done.
[ 42.993401] Freezing user space processes ... (elapsed 0.002 seconds) done.
[ 43.003632] PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
[ 43.009840] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
[ 43.015991] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
[ 43.022241] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
[ 43.028357] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
[ 43.034500] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
[ 43.041371] PM: Basic memory bitmaps created
[ 43.045656] PM: Preallocating image memory... done (allocated 128867 pages)
[ 43.346216] PM: Allocated 515468 kbytes in 0.29 seconds (1777.47 MB/s)
[ 43.352759] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 43.366104] ------------[ cut here ]------------
[ 43.370746] WARNING: CPU: 4 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90()
[ 43.380681] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[ 43.403916] CPU: 4 PID: 55 Comm: kworker/u16:2 Not tainted 4.3.0-rc2+ #3
[ 43.410633] Hardware name: To be filled by O.E.M.
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 43.420567] Workqueue: events_unbound async_run_entry_fn [ 43.425919] ffffffff8194ff67 ffff88042a223b60 ffffffff812c758a 0000000000000000 [ 43.433424] ffff88042a223b98 ffffffff810534c1 ffff880429eca000 ffff880429fe9200 [ 43.440959] ffff880429de1000 0000000000000000 ffffffff819571c3 ffff88042a223ba8 [ 43.448461] Call Trace: [ 43.450929] [<ffffffff812c758a>] dump_stack+0x4e/0x84 [ 43.456094] [<ffffffff810534c1>] warn_slowpath_common+0x91/0xd0 [ 43.462126] [<ffffffff810535ba>] warn_slowpath_null+0x1a/0x20 [ 43.467983] [<ffffffff813bdc58>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 43.474884] [<ffffffff813be0b6>] drm_helper_connector_dpms+0x56/0x110 [ 43.481463] [<ffffffffa003346b>] radeon_suspend_kms+0x6b/0x380 [radeon] [ 43.488196] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 43.495020] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.500976] [<ffffffffa00311ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 43.507730] [<ffffffff8130feba>] pci_pm_freeze+0x6a/0x100 [ 43.513241] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.519187] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 43.524959] [<ffffffff8146f4a4>] __device_suspend+0x104/0x2c0 [ 43.530818] [<ffffffff8146f67f>] async_suspend+0x1f/0xa0 [ 43.536242] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 43.542127] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 43.547984] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 43.554019] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 43.559530] [<ffffffff8107e133>] ? preempt_count_sub+0xb3/0x110 [ 43.565562] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 43.571603] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 43.576597] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.583152] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 43.588574] [<ffffffff81075d90>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 43.595170] ---[ end trace aab225b93a6f1dcc ]--- [ 43.595235] ------------[ cut here ]------------ [ 43.595240] WARNING: CPU: 5 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 43.595267] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 43.595271] CPU: 5 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc2+ #3 [ 43.595272] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 43.595276] Workqueue: events_unbound async_run_entry_fn [ 43.595281] ffffffff8194ff67 ffff88042a223b60 ffffffff812c758a 0000000000000000 [ 43.595286] ffff88042a223b98 ffffffff810534c1 ffff880429eca000 ffff880429de1000 [ 43.595291] ffff880429de1000 0000000000000000 0000000000000003 ffff88042a223ba8 [ 43.595293] Call Trace: [ 43.595295] [<ffffffff812c758a>] dump_stack+0x4e/0x84 [ 43.595298] [<ffffffff810534c1>] warn_slowpath_common+0x91/0xd0 [ 43.595300] [<ffffffff810535ba>] warn_slowpath_null+0x1a/0x20 [ 43.595303] [<ffffffff813bdcf1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 43.595315] [<ffffffffa003d860>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 43.595322] [<ffffffff813be124>] drm_helper_connector_dpms+0xc4/0x110 [ 43.595331] [<ffffffffa003346b>] radeon_suspend_kms+0x6b/0x380 [radeon] [ 43.595334] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 43.595337] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.595346] [<ffffffffa00311ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 43.595349] [<ffffffff8130feba>] pci_pm_freeze+0x6a/0x100 [ 43.595351] [<ffffffff8130fe50>] ? 
pci_pm_poweroff+0x100/0x100 [ 43.595353] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 43.595356] [<ffffffff8146f4a4>] __device_suspend+0x104/0x2c0 [ 43.595358] [<ffffffff8146f67f>] async_suspend+0x1f/0xa0 [ 43.595361] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 43.595363] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 43.595366] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 43.595369] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 43.595373] [<ffffffff8107e133>] ? preempt_count_sub+0xb3/0x110 [ 43.595375] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 43.595378] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 43.595382] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.595385] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 43.595387] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.595390] ---[ end trace aab225b93a6f1dcd ]--- [ 43.611465] ------------[ cut here ]------------ [ 43.611469] WARNING: CPU: 5 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() [ 43.611494] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 43.611498] CPU: 5 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc2+ #3 [ 43.611499] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 43.611503] Workqueue: events_unbound async_run_entry_fn [ 43.611508] ffffffff8194ff67 ffff88042a223b60 ffffffff812c758a 0000000000000000 [ 43.611516] ffff88042a223b98 ffffffff810534c1 ffff880429eca000 ffff880429fe9e00 [ 43.611520] ffff880429de7000 0000000000000000 ffffffff819571c3 ffff88042a223ba8 [ 43.611521] Call Trace: [ 43.611525] [<ffffffff812c758a>] dump_stack+0x4e/0x84 [ 43.611527] [<ffffffff810534c1>] warn_slowpath_common+0x91/0xd0 [ 43.611530] [<ffffffff810535ba>] warn_slowpath_null+0x1a/0x20 [ 43.611534] [<ffffffff813bdc58>] drm_helper_choose_encoder_dpms+0x88/0x90 [ 43.611537] [<ffffffff813be0b6>] drm_helper_connector_dpms+0x56/0x110 [ 43.611547] [<ffffffffa003346b>] radeon_suspend_kms+0x6b/0x380 [radeon] [ 43.611551] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 43.611554] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.611563] [<ffffffffa00311ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 43.611566] [<ffffffff8130feba>] pci_pm_freeze+0x6a/0x100 [ 43.611569] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.611573] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 43.611575] [<ffffffff8146f4a4>] __device_suspend+0x104/0x2c0 [ 43.611578] [<ffffffff8146f67f>] async_suspend+0x1f/0xa0 [ 43.611580] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 43.611582] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 43.611585] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 43.611588] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 43.611592] [<ffffffff8107e133>] ? preempt_count_sub+0xb3/0x110 [ 43.611594] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 43.611599] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 43.611604] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.611607] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 43.611609] [<ffffffff81075d90>] ? 
kthread_create_on_node+0x1f0/0x1f0 [ 43.611612] ---[ end trace aab225b93a6f1dce ]--- [ 43.611638] ------------[ cut here ]------------ [ 43.611641] WARNING: CPU: 5 PID: 55 at include/drm/drm_crtc.h:1577 drm_helper_choose_crtc_dpms+0x91/0xa0() [ 43.611665] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 43.611670] CPU: 5 PID: 55 Comm: kworker/u16:2 Tainted: G W 4.3.0-rc2+ #3 [ 43.611670] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 43.611675] Workqueue: events_unbound async_run_entry_fn [ 43.611711] ffffffff8194ff67 ffff88042a223b60 ffffffff812c758a 0000000000000000 [ 43.611716] ffff88042a223b98 ffffffff810534c1 ffff880429eca000 ffff880429de7000 [ 43.611720] ffff880429de7000 0000000000000000 0000000000000003 ffff88042a223ba8 [ 43.611721] Call Trace: [ 43.611725] [<ffffffff812c758a>] dump_stack+0x4e/0x84 [ 43.611728] [<ffffffff810534c1>] warn_slowpath_common+0x91/0xd0 [ 43.611730] [<ffffffff810535ba>] warn_slowpath_null+0x1a/0x20 [ 43.611732] [<ffffffff813bdcf1>] drm_helper_choose_crtc_dpms+0x91/0xa0 [ 43.611742] [<ffffffffa003d860>] ? atombios_blank_crtc+0x140/0x140 [radeon] [ 43.611747] [<ffffffff813be124>] drm_helper_connector_dpms+0xc4/0x110 [ 43.611756] [<ffffffffa003346b>] radeon_suspend_kms+0x6b/0x380 [radeon] [ 43.611759] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 43.611761] [<ffffffff8130fe50>] ? pci_pm_poweroff+0x100/0x100 [ 43.611771] [<ffffffffa00311ac>] radeon_pmops_freeze+0x1c/0x20 [radeon] [ 43.611774] [<ffffffff8130feba>] pci_pm_freeze+0x6a/0x100 [ 43.611776] [<ffffffff8130fe50>] ? 
pci_pm_poweroff+0x100/0x100 [ 43.611778] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 43.611781] [<ffffffff8146f4a4>] __device_suspend+0x104/0x2c0 [ 43.611783] [<ffffffff8146f67f>] async_suspend+0x1f/0xa0 [ 43.611786] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 43.611788] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 43.611790] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 43.611793] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 43.611797] [<ffffffff8107e133>] ? preempt_count_sub+0xb3/0x110 [ 43.611800] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 43.611802] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 43.611807] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.611809] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 43.611812] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 43.611814] ---[ end trace aab225b93a6f1dcf ]--- [ 44.409186] PM: freeze of devices complete after 1045.976 msecs [ 44.417505] PM: late freeze of devices complete after 2.393 msecs [ 44.427721] PM: noirq freeze of devices complete after 4.126 msecs [ 44.433913] Disabling non-boot CPUs ... [ 44.451184] smpboot: CPU 1 is now offline [ 44.498597] smpboot: CPU 2 is now offline [ 44.534426] smpboot: CPU 3 is now offline [ 44.574909] smpboot: CPU 4 is now offline [ 44.614270] smpboot: CPU 5 is now offline [ 44.652566] smpboot: CPU 6 is now offline [ 44.694825] smpboot: CPU 7 is now offline [ 44.711829] PM: Creating hibernation image: [ 45.179293] PM: Need to copy 138846 pages [ 45.183311] PM: Normal pages needed: 138846 + 1024, available pages: 4029960 [ 45.898261] PM: Hibernation image created (138846 pages copied) [ 45.100717] LVT offset 0 assigned for vector 0x400 [ 45.106190] Enabling non-boot CPUs ... 
[ 45.110174] x86: Booting SMP configuration: [ 45.114369] smpboot: Booting Node 0 Processor 1 APIC 0x11 [ 45.141886] cache: parent cpu1 should not be sleeping [ 45.148488] CPU1 is up [ 45.150958] smpboot: Booting Node 0 Processor 2 APIC 0x12 [ 45.178801] cache: parent cpu2 should not be sleeping [ 45.185799] CPU2 is up [ 45.188316] smpboot: Booting Node 0 Processor 3 APIC 0x13 [ 45.215892] cache: parent cpu3 should not be sleeping [ 45.222959] CPU3 is up [ 45.225467] smpboot: Booting Node 0 Processor 4 APIC 0x14 [ 45.244730] cache: parent cpu4 should not be sleeping [ 45.250525] CPU4 is up [ 45.252954] smpboot: Booting Node 0 Processor 5 APIC 0x15 [ 45.273487] cache: parent cpu5 should not be sleeping [ 45.279302] CPU5 is up [ 45.281722] smpboot: Booting Node 0 Processor 6 APIC 0x16 [ 45.309551] cache: parent cpu6 should not be sleeping [ 45.316600] CPU6 is up [ 45.319109] smpboot: Booting Node 0 Processor 7 APIC 0x17 [ 45.344059] cache: parent cpu7 should not be sleeping [ 45.351114] CPU7 is up [ 45.376664] PM: noirq thaw of devices complete after 2.185 msecs [ 45.384992] PM: early thaw of devices complete after 2.249 msecs [ 45.391875] rtc_cmos 00:03: System wakeup disabled by ACPI [ 45.393086] [drm] PCIE gen 2 link speeds already enabled [ 45.393444] serial 00:06: activated [ 45.396643] [drm] PCIE GART of 512M enabled (table at 0x0000000000254000). [ 45.396699] radeon 0000:01:00.0: WB enabled [ 45.396704] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8804292eec00 [ 45.397110] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0 and cpu addr 0xffffc900008121d0 [ 45.428076] [drm] ring test on 0 succeeded in 0 usecs [ 45.510492] r8169 0000:02:00.0 eth0: link down [ 45.602794] [drm] ring test on 5 succeeded in 1 usecs [ 45.602803] [drm] UVD initialized successfully. 
[ 45.603013] [drm] ib test on ring 0 succeeded in 0 usecs [ 45.710859] ata5: SATA link down (SStatus 0 SControl 300) [ 45.710907] ata6: SATA link down (SStatus 0 SControl 300) [ 45.882925] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 45.882994] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 45.883059] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 45.883092] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 45.884972] ata1.00: supports DRM functions and may not be fully accessible [ 45.885071] ata2.00: supports DRM functions and may not be fully accessible [ 45.885123] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 45.885218] ata2.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 45.885907] ata1.00: supports DRM functions and may not be fully accessible [ 45.886024] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 45.886068] ata2.00: supports DRM functions and may not be fully accessible [ 45.886110] ata1.00: configured for UDMA/133 [ 45.886207] ata2.00: failed to get NCQ Send/Recv Log Emask 0x1 [ 45.886417] ata2.00: configured for UDMA/133 [ 45.896091] ata4.00: configured for UDMA/133 [ 45.906938] ata3.00: configured for UDMA/133 [ 45.907075] sd 2:0:0:0: [sdc] 1220942646 4096-byte logical blocks: (4.88 TB/4.54 TiB) [ 46.250991] [drm] ib test on ring 5 succeeded [ 46.644971] PM: thaw of devices complete after 1253.970 msecs [ 46.848000] usb 8-2: reset low-speed USB device number 2 using ohci-pci [ 47.084410] r8169 0000:02:00.0 eth0: link up [ 47.193411] PM: writing image. [ 47.203074] PM: Using 3 thread(s) for compression. [ 47.203074] PM: Compressing and saving image data (139118 pages)... 
[ 47.220107] PM: Image saving progress: 0% [ 47.356371] PM: Image saving progress: 10% [ 47.444633] PM: Image saving progress: 20% [ 47.586282] PM: Image saving progress: 30% [ 47.676366] PM: Image saving progress: 40% [ 47.751889] PM: Image saving progress: 50% [ 47.838831] PM: Image saving progress: 60% [ 47.927719] PM: Image saving progress: 70% [ 48.020115] PM: Image saving progress: 80% [ 48.107692] PM: Image saving progress: 90% [ 48.193569] PM: Image saving progress: 100% [ 48.199217] PM: Image saving done. [ 48.203784] PM: Wrote 556472 kbytes in 0.97 seconds (573.68 MB/s) [ 48.211456] PM: S| [ 48.311190] kvm: exiting hardware virtualization [ 48.321689] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0010 address=0x0000000020001000 flags=0x0000] [ 48.522749] sd 3:0:0:0: [sdd] Synchronizing SCSI cache [ 48.531044] sd 3:0:0:0: [sdd] Stopping disk [ 49.418426] sd 2:0:0:0: [sdc] Synchronizing SCSI cache [ 49.424890] sd 2:0:0:0: [sdc] Stopping disk [ 49.596437] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [ 49.605855] sd 1:0:0:0: [sdb] Stopping disk [ 49.920447] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 49.929890] sd 0:0:0:0: [sda] Stopping disk [ 50.124196] pcieport 0000:00:04.0: System wakeup enabled by ACPI [ 50.152897] ACPI: Preparing to enter system sleep state S5 [ 50.160275] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline [ 50.170923] reboot: Power down [ 50.176837] acpi_power_off called
Then the resume kernel starts and loads the suspended one, which bombs out completely:
[ 5.758121] PM: Checking hibernation image partition /dev/sda1 [ 5.764067] PM: Hibernation image partition 8:1 present [ 5.769372] PM: Looking for hibernation image. [ 5.769846] hid-generic 0003:04B4:0101.0003: input,hidraw2: USB HID v1.00 Device [DATACOMP SteelS쀁̄Љ̒DATA] on usb-0000 :00:12.0-2/input1 [ 5.788988] PM: Image signature found, resuming [ 5.795952] PM: Preparing processes for restore. [ 5.800605] Freezing user space processes ... (elapsed 0.000 seconds) done. [ 5.807811] PM: Loading hibernation image. [ 5.812211] PM: Marking nosave pages: [mem 0x00000000-0x00000fff] [ 5.818339] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff] [ 5.824462] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff] [ 5.830712] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff] [ 5.836844] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff] [ 5.843010] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff] [ 5.849924] PM: Basic memory bitmaps created [ 5.869152] PM: Using 3 thread(s) for decompression. [ 5.869152] PM: Loading and decompressing image data (139118 pages)... [ 5.946421] PM: Image loading progress: 0% [ 6.241471] PM: Image loading progress: 10% [ 6.321625] PM: Image loading progress: 20% [ 6.362579] random: nonblocking pool is initialized [ 6.414363] PM: Image loading progress: 30% [ 6.495027] PM: Image loading progress: 40% [ 6.573551] PM: Image loading progress: 50% [ 6.654276] PM: Image loading progress: 60% [ 6.735344] PM: Image loading progress: 70% [ 6.810312] PM: Image loading progress: 80% [ 6.885237] PM: Image loading progress: 90% [ 6.963869] PM: Image loading progress: 100% [ 6.968292] PM: Image loading done. 
[ 6.971907] PM: Read 556472 kbytes in 1.08 seconds (515.25 MB/s) [ 6.981245] PM: Image successfully loaded [ 7.208118] PM: quiesce of devices complete after 221.517 msecs [ 7.215932] PM: late quiesce of devices complete after 1.806 msecs [ 7.236424] PM: noirq quiesce of devices complete after 14.231 msecs [ 7.242867] Disabling non-boot CPUs ... [ 44.968602] LVT offset 0 assigned for vector 0x400 [ 44.973810] Enabling non-boot CPUs ... [ 44.977701] x86: Booting SMP configuration: [ 44.981885] smpboot: Booting Node 0 Processor 1 APIC 0x11 [ 45.004260] cache: parent cpu1 should not be sleeping [ 45.010101] CPU1 is up [ 45.012520] smpboot: Booting Node 0 Processor 2 APIC 0x12 [ 45.036083] cache: parent cpu2 should not be sleeping [ 45.041998] CPU2 is up [ 45.044429] smpboot: Booting Node 0 Processor 3 APIC 0x13 [ 45.064873] cache: parent cpu3 should not be sleeping [ 45.070712] CPU3 is up [ 45.073177] smpboot: Booting Node 0 Processor 4 APIC 0x14 [ 45.092508] cache: parent cpu4 should not be sleeping [ 45.098293] CPU4 is up [ 45.100772] smpboot: Booting Node 0 Processor 5 APIC 0x15 [ 45.120788] cache: parent cpu5 should not be sleeping [ 45.126577] CPU5 is up [ 45.128997] smpboot: Booting Node 0 Processor 6 APIC 0x16 [ 45.149043] cache: parent cpu6 should not be sleeping [ 45.154843] CPU6 is up [ 45.157261] smpboot: Booting Node 0 Processor 7 APIC 0x17 [ 45.177371] cache: parent cpu7 should not be sleeping [ 45.183182] CPU7 is up [ 45.195686] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034 [ 45.203696] IP: [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240 [ 45.210379] PGD 418009067 PUD 41800a067 PMD 0 [ 45.215023] Oops: 0000 [#1] PREEMPT SMP [ 45.219132] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 45.242447] CPU: 2 PID: 804 Comm: 
kworker/u16:5 Tainted: G W 4.3.0-rc2+ #3 [ 45.250584] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 45.260630] Workqueue: events_unbound async_run_entry_fn [ 45.266105] task: ffff88042983df00 ti: ffff880428a40000 task.ti: ffff880428a40000 [ 45.273724] RIP: 0010:[<ffffffff81321296>] [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240 [ 45.282840] RSP: 0018:ffff880428a43c28 EFLAGS: 00010286 [ 45.288292] RAX: 0000000000000000 RBX: ffff880429c2d000 RCX: 0000000000000000 [ 45.295564] RDX: 0000000000000001 RSI: ffffffff81304448 RDI: ffffffff816c0f2b [ 45.302835] RBP: ffff880428a43c40 R08: 0000000000000001 R09: 0000000000522000 [ 45.310114] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 45.317386] R13: ffff880429c2d7b0 R14: ffff880429c2d010 R15: ffff880429c2d038 [ 45.324666] FS: 00007f653bc43700(0000) GS:ffff88042ca00000(0000) knlGS:0000000000000000 [ 45.332898] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 45.338784] CR2: 0000000000000034 CR3: 0000000418053000 CR4: 00000000000406e0 [ 45.346063] Stack: [ 45.348221] 0080002c29c2d7b0 0000000000000000 ffff880429c2d000 ffff880428a43c78 [ 45.355822] ffffffff8130c141 ffff880429c2d098 ffff880429c2d000 0000000000000000 [ 45.363424] ffff88042a1216e8 ffffffff81957114 ffff880428a43c88 ffffffff8130c2b8 [ 45.371041] Call Trace: [ 45.373642] [<ffffffff8130c141>] pci_restore_state.part.18+0xf1/0x250 [ 45.380316] [<ffffffff8130c2b8>] pci_restore_state+0x18/0x20 [ 45.386209] [<ffffffff8130f7fc>] pci_pm_restore_noirq+0x4c/0xd0 [ 45.392367] [<ffffffff8130f7b0>] ? pci_pm_freeze_noirq+0xf0/0xf0 [ 45.398611] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 45.404531] [<ffffffff8146eaa3>] device_resume_noirq+0x93/0x150 [ 45.410682] [<ffffffff8146eb7d>] async_resume_noirq+0x1d/0x50 [ 45.416693] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 45.422669] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 45.428648] [<ffffffff8106f4a4>] ? 
process_one_work+0x154/0x640 [ 45.434804] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 45.440435] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 45.446590] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 45.451704] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 45.458376] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 45.463923] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 45.470587] Code: 66 89 4d ee 0f b7 c9 e8 79 41 fe ff 48 89 df e8 d1 7a ce ff 0f b6 53 4b 8b 73 38 48 8d 4d ee 48 8b 7b 10 83 c2 02 e8 1a 31 fe ff <41> 0f b6 4c 24 34 41 8b 54 24 30 be ff ff ff ff c0 e9 04 83 e1 [ 45.490744] RIP [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240 [ 45.497522] RSP <ffff880428a43c28> [ 45.501162] CR2: 0000000000000034 [ 45.504630] ---[ end trace aab225b93a6f1dd0 ]--- [ 45.509270] BUG: unable to handle kernel paging request at ffffffffffffff98 [ 45.516436] IP: [<ffffffff81076770>] kthread_data+0x10/0x20 [ 45.522180] PGD 19e5067 PUD 19e7067 PMD 0 [ 45.526512] Oops: 0000 [#2] PREEMPT SMP [ 45.530628] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 45.553976] CPU: 2 PID: 804 Comm: kworker/u16:5 Tainted: G D W 4.3.0-rc2+ #3 [ 45.562132] Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 45.572199] task: ffff88042983df00 ti: ffff880428a40000 task.ti: ffff880428a40000 [ 45.579856] RIP: 0010:[<ffffffff81076770>] [<ffffffff81076770>] kthread_data+0x10/0x20 [ 45.588037] RSP: 0018:ffff880428a43928 EFLAGS: 00010002 [ 45.593505] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 45.600794] RDX: 000000012b244210 RSI: 0000000000000002 RDI: ffff88042983df00 [ 45.608083] RBP: ffff880428a43928 R08: ffff88042983df88 R09: ffff88042cbd5cb0 [ 45.615371] R10: ffff88042983df60 R11: 0000000000000000 R12: 00000000001d5c00 [ 45.622661] R13: ffff88042cbd5c18 R14: ffff88042983df00 R15: 0000000000000002 [ 45.629949] FS: 00007f653bc43700(0000) GS:ffff88042ca00000(0000) knlGS:0000000000000000 [ 45.638191] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 45.644094] CR2: 0000000000000028 CR3: 0000000418053000 CR4: 00000000000406e0 [ 45.651389] Stack: [ 45.653565] ffff880428a43940 ffffffff810707a1 ffff88042cbd5c00 ffff880428a43998 [ 45.661185] ffffffff816bb6e6 ffffffff81055b9b 0000000000000000 0000000000000000 [ 45.668835] ffff88042983df00 ffff880428a44000 ffff880428a439f0 0000000000000000 [ 45.676453] Call Trace: [ 45.679064] [<ffffffff810707a1>] wq_worker_sleeping+0x11/0x90 [ 45.685062] [<ffffffff816bb6e6>] __schedule+0x796/0xec0 [ 45.690538] [<ffffffff81055b9b>] ? do_exit+0x63b/0xac0 [ 45.695930] [<ffffffff816bbe9d>] schedule+0x3d/0x90 [ 45.701060] [<ffffffff81055c58>] do_exit+0x6f8/0xac0 [ 45.706279] [<ffffffff81007d8c>] oops_end+0x6c/0x90 [ 45.711408] [<ffffffff81045c13>] no_context+0x153/0x360 [ 45.716875] [<ffffffff81045f2b>] __bad_area_nosemaphore+0x10b/0x210 [ 45.723385] [<ffffffff812f3b77>] ? debug_smp_processor_id+0x17/0x20 [ 45.729894] [<ffffffff81046043>] bad_area_nosemaphore+0x13/0x20 [ 45.736055] [<ffffffff81046487>] __do_page_fault+0x1e7/0x360 [ 45.741958] [<ffffffff81000f70>] ? 
trace_hardirqs_off_thunk+0x17/0x19 [ 45.748640] [<ffffffff8104663c>] do_page_fault+0xc/0x10 [ 45.754099] [<ffffffff816c38af>] page_fault+0x1f/0x30 [ 45.759386] [<ffffffff81304448>] ? pci_bus_read_config_word+0x98/0xa0 [ 45.766058] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 45.772992] [<ffffffff81321296>] ? pci_restore_msi_state+0x196/0x240 [ 45.779581] [<ffffffff81321296>] ? pci_restore_msi_state+0x196/0x240 [ 45.786165] [<ffffffff8130c141>] pci_restore_state.part.18+0xf1/0x250 [ 45.792840] [<ffffffff8130c2b8>] pci_restore_state+0x18/0x20 [ 45.798732] [<ffffffff8130f7fc>] pci_pm_restore_noirq+0x4c/0xd0 [ 45.804886] [<ffffffff8130f7b0>] ? pci_pm_freeze_noirq+0xf0/0xf0 [ 45.811127] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 45.817019] [<ffffffff8146eaa3>] device_resume_noirq+0x93/0x150 [ 45.823173] [<ffffffff8146eb7d>] async_resume_noirq+0x1d/0x50 [ 45.829153] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 45.835132] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 45.841104] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 45.847248] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 45.852873] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 45.859018] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 45.864124] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 45.870796] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 45.876333] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 45.882998] Code: 60 74 0a 48 89 df e8 10 a8 64 00 eb d3 48 8b 53 48 eb b1 e8 33 cb fd ff 0f 1f 00 0f 1f 44 00 00 48 8b 87 f8 03 00 00 55 48 89 e5 <48> 8b 40 98 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 [ 45.903148] RIP [<ffffffff81076770>] kthread_data+0x10/0x20 [ 45.908972] RSP <ffff880428a43928> [ 45.912612] CR2: ffffffffffffff98 [ 45.916079] ---[ end trace aab225b93a6f1dd1 ]--- [ 45.920697] Fixing recursive fault but reboot is needed! 
[ 45.926002] BUG: scheduling while atomic: kworker/u16:5/804/0x00000004 [ 45.932529] INFO: lockdep is turned off. [ 45.936453] Modules linked in: binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp fam15h_power edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 45.959455] irq event stamp: 2072 [ 45.962766] hardirqs last enabled at (2071): [<ffffffff816c0f45>] _raw_spin_unlock_irqrestore+0x65/0x80 [ 45.972248] hardirqs last disabled at (2072): [<ffffffff816c3a80>] error_entry+0x60/0xb0 [ 45.980342] softirqs last enabled at (1792): [<ffffffff81058547>] __do_softirq+0x3a7/0x480 [ 45.988705] softirqs last disabled at (1765): [<ffffffff81058798>] irq_exit+0x88/0xb0 [ 45.996539] Preemption disabled at:[<ffffffff81007d8c>] oops_end+0x6c/0x90 [ 46.003420] [ 46.004913] CPU: 2 PID: 804 Comm: kworker/u16:5 Tainted: G D W 4.3.0-rc2+ #3 [ 46.012911] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 46.022823] 00000000001d5c00 ffff880428a43628 ffffffff812c758a ffff88042983df00 [ 46.030280] ffff880428a43640 ffffffff8107d528 ffff88042cbd5c00 ffff880428a43698 [ 46.037740] ffffffff816bb84f ffff880428a436b0 ffffffff81126b16 0000000000000008 [ 46.045193] Call Trace: [ 46.047641] [<ffffffff812c758a>] dump_stack+0x4e/0x84 [ 46.052779] [<ffffffff8107d528>] __schedule_bug+0x68/0xc0 [ 46.058264] [<ffffffff816bb84f>] __schedule+0x8ff/0xec0 [ 46.063569] [<ffffffff81126b16>] ? printk+0x48/0x50 [ 46.068535] [<ffffffff816bbe9d>] schedule+0x3d/0x90 [ 46.073500] [<ffffffff81055e38>] do_exit+0x8d8/0xac0 [ 46.078580] [<ffffffff810b6d75>] ? kmsg_dump+0x135/0x180 [ 46.083978] [<ffffffff810b6c62>] ? kmsg_dump+0x22/0x180 [ 46.089291] [<ffffffff81007d8c>] oops_end+0x6c/0x90 [ 46.094257] [<ffffffff81045c13>] no_context+0x153/0x360 [ 46.099571] [<ffffffff812d47a4>] ? 
delay_tsc+0x94/0xc0 [ 46.104796] [<ffffffff81045f2b>] __bad_area_nosemaphore+0x10b/0x210 [ 46.111149] [<ffffffff81046043>] bad_area_nosemaphore+0x13/0x20 [ 46.117154] [<ffffffff81046487>] __do_page_fault+0x1e7/0x360 [ 46.122901] [<ffffffff81000f70>] ? trace_hardirqs_off_thunk+0x17/0x19 [ 46.129427] [<ffffffff8104663c>] do_page_fault+0xc/0x10 [ 46.134740] [<ffffffff816c38af>] page_fault+0x1f/0x30 [ 46.139879] [<ffffffff81076770>] ? kthread_data+0x10/0x20 [ 46.145364] [<ffffffff810707a1>] wq_worker_sleeping+0x11/0x90 [ 46.151197] [<ffffffff816bb6e6>] __schedule+0x796/0xec0 [ 46.156510] [<ffffffff81055b9b>] ? do_exit+0x63b/0xac0 [ 46.161736] [<ffffffff816bbe9d>] schedule+0x3d/0x90 [ 46.166702] [<ffffffff81055c58>] do_exit+0x6f8/0xac0 [ 46.171754] [<ffffffff81007d8c>] oops_end+0x6c/0x90 [ 46.176721] [<ffffffff81045c13>] no_context+0x153/0x360 [ 46.182033] [<ffffffff81045f2b>] __bad_area_nosemaphore+0x10b/0x210 [ 46.188385] [<ffffffff812f3b77>] ? debug_smp_processor_id+0x17/0x20 [ 46.194730] [<ffffffff81046043>] bad_area_nosemaphore+0x13/0x20 [ 46.200736] [<ffffffff81046487>] __do_page_fault+0x1e7/0x360 [ 46.206481] [<ffffffff81000f70>] ? trace_hardirqs_off_thunk+0x17/0x19 [ 46.213008] [<ffffffff8104663c>] do_page_fault+0xc/0x10 [ 46.218320] [<ffffffff816c38af>] page_fault+0x1f/0x30 [ 46.223461] [<ffffffff81304448>] ? pci_bus_read_config_word+0x98/0xa0 [ 46.229986] [<ffffffff816c0f2b>] ? _raw_spin_unlock_irqrestore+0x4b/0x80 [ 46.236772] [<ffffffff81321296>] ? pci_restore_msi_state+0x196/0x240 [ 46.243212] [<ffffffff81321296>] ? pci_restore_msi_state+0x196/0x240 [ 46.249649] [<ffffffff8130c141>] pci_restore_state.part.18+0xf1/0x250 [ 46.256177] [<ffffffff8130c2b8>] pci_restore_state+0x18/0x20 [ 46.261922] [<ffffffff8130f7fc>] pci_pm_restore_noirq+0x4c/0xd0 [ 46.267928] [<ffffffff8130f7b0>] ? 
pci_pm_freeze_noirq+0xf0/0xf0 [ 46.274022] [<ffffffff8146e7e7>] dpm_run_callback+0x77/0x2a0 [ 46.279768] [<ffffffff8146eaa3>] device_resume_noirq+0x93/0x150 [ 46.285773] [<ffffffff8146eb7d>] async_resume_noirq+0x1d/0x50 [ 46.291606] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0 [ 46.297439] [<ffffffff8106f548>] process_one_work+0x1f8/0x640 [ 46.303271] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640 [ 46.309278] [<ffffffff8106f9db>] worker_thread+0x4b/0x440 [ 46.314762] [<ffffffff8106f990>] ? process_one_work+0x640/0x640 [ 46.320761] [<ffffffff81075e86>] kthread+0xf6/0x110 [ 46.325726] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0 [ 46.332251] [<ffffffff816c1e3f>] ret_from_fork+0x3f/0x70 [ 46.337644] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0
Thanks.
On Wed, Sep 23, 2015 at 10:59:51AM +0200, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 09:25:23AM +0200, Daniel Vetter wrote:
Strange thing is that I've tested this on a radeon over here and I don't see this backtrace ... wut. Below diff should appease the backtraces at least.
Doesn't look like it.
sorry I sprinkled the locking stuff in the wrong places. Still confused why the resume side doesn't blow up anywhere ... Oh well. New patch below.
Thanks, Daniel
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index d8319dae8358..f3f562f6d848 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1573,10 +1573,12 @@ int radeon_suspend_kms(struct drm_device *dev, bool suspend, bool fbcon)
 
 	drm_kms_helper_poll_disable(dev);
 
+	drm_modeset_lock_all(dev);
 	/* turn off display hw */
 	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
 		drm_helper_connector_dpms(connector, DRM_MODE_DPMS_OFF);
 	}
+	drm_modeset_unlock_all(dev);
 
 	/* unpin the front buffers and cursors */
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
@@ -1734,9 +1736,11 @@ int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon)
 	if (fbcon) {
 		drm_helper_resume_force_mode(dev);
 		/* turn on display hw */
+		drm_modeset_lock_all(dev);
 		list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
 			drm_helper_connector_dpms(connector, DRM_MODE_DPMS_ON);
 		}
+		drm_modeset_unlock_all(dev);
 	}
 
 	drm_kms_helper_poll_enable(dev);
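For anyone following along, here is a toy user-space model of what the patch does. It is only a sketch under loud assumptions: every name in it (model_dev, model_lock_all(), model_connector_off(), ...) is made up for illustration and none of it is DRM API. The per-connector helper complains when it is called without the lock held, which is roughly the situation the warnings in the log point at, and taking the lock around the whole loop makes the complaints go away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	bool config_locked;	/* stands in for the modeset locks */
	int warnings;		/* how many "not locked" warnings were emitted */
	int connectors;		/* length of the pretend connector list */
};

static void model_lock_all(struct model_dev *dev)
{
	assert(!dev->config_locked);	/* the toy lock is not recursive */
	dev->config_locked = true;
}

static void model_unlock_all(struct model_dev *dev)
{
	assert(dev->config_locked);
	dev->config_locked = false;
}

/* Helper that expects the caller to hold the lock and warns otherwise. */
static void model_connector_off(struct model_dev *dev, int idx)
{
	if (!dev->config_locked) {
		dev->warnings++;
		fprintf(stderr, "WARNING: connector %d touched without lock\n", idx);
	}
}

/* Walk all connectors; returns the number of warnings this walk caused. */
static int model_suspend(struct model_dev *dev, bool take_lock)
{
	int before = dev->warnings;

	if (take_lock)
		model_lock_all(dev);
	for (int i = 0; i < dev->connectors; i++)
		model_connector_off(dev, i);
	if (take_lock)
		model_unlock_all(dev);

	return dev->warnings - before;
}
```

With three pretend connectors, model_suspend(&dev, false) warns once per connector, and model_suspend(&dev, true) stays quiet and leaves the lock released.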
On Wed, Sep 23, 2015 at 04:44:50PM +0200, Daniel Vetter wrote:
sorry I sprinkled the locking stuff in the wrong places. Still confused why the resume side doesn't blow up anywhere
But it does:
[ 69.394204] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
[ 69.402080] IP: [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240
[ 69.408624] PGD 4162b8067 PUD 416581067 PMD 0
[ 69.413122] Oops: 0000 [#1] PREEMPT SMP
[ 69.417101] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod edac_mce_amd fam15h_power k10temp amdkfd amd_iommu_v2 radeon acpi_cpufreq
[ 69.443647] CPU: 4 PID: 814 Comm: kworker/u16:5 Not tainted 4.3.0-rc2+ #3
[ 69.450430] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[ 69.460336] Workqueue: events_unbound async_run_entry_fn
[ 69.465667] task: ffff88042a255f00 ti: ffff880428a68000 task.ti: ffff880428a68000
[ 69.473145] RIP: 0010:[<ffffffff81321296>] [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240
[ 69.482131] RSP: 0018:ffff880428a6bc28 EFLAGS: 00010286
[ 69.487436] RAX: 0000000000000000 RBX: ffff88042a308000 RCX: 0000000000000000
[ 69.494568] RDX: 0000000000000001 RSI: ffffffff81304448 RDI: ffffffff816c7a1b
[ 69.501700] RBP: ffff880428a6bc40 R08: 0000000000000001 R09: 0000000000522000
[ 69.508833] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 69.515965] R13: ffff88042a3087b0 R14: ffff88042a308010 R15: ffff88042a308038
[ 69.523097] FS: 00007fc91328a700(0000) GS:ffff88042ce00000(0000) knlGS:0000000000000000
[ 69.531185] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 69.536931] CR2: 0000000000000034 CR3: 00000004164c7000 CR4: 00000000000406e0
[ 69.544061] Stack:
[ 69.546073] 0080002c2a3087b0 0000000000000000 ffff88042a308000 ffff880428a6bc78
[ 69.553525] ffffffff8130c141 ffff88042a308098 ffff88042a308000 0000000000000000
[ 69.560996] ffff8804284e77a8 ffffffff81961ef1 ffff880428a6bc88 ffffffff8130c2b8
[ 69.568450] Call Trace:
[ 69.571044] [<ffffffff8130c141>] pci_restore_state.part.18+0xf1/0x250
[ 69.577706] [<ffffffff8130c2b8>] pci_restore_state+0x18/0x20
[ 69.583591] [<ffffffff8130f7fc>] pci_pm_restore_noirq+0x4c/0xd0
[ 69.589734] [<ffffffff8130f7b0>] ? pci_pm_freeze_noirq+0xf0/0xf0
[ 69.595966] [<ffffffff8146e847>] dpm_run_callback+0x77/0x2a0
[ 69.601850] [<ffffffff8146eb03>] device_resume_noirq+0x93/0x150
[ 69.607994] [<ffffffff8146ebdd>] async_resume_noirq+0x1d/0x50
[ 69.613967] [<ffffffff81078a06>] async_run_entry_fn+0x46/0xf0
[ 69.619939] [<ffffffff8106f548>] process_one_work+0x1f8/0x640
[ 69.625910] [<ffffffff8106f4a4>] ? process_one_work+0x154/0x640
[ 69.632054] [<ffffffff8106f9db>] worker_thread+0x4b/0x440
[ 69.637677] [<ffffffff8106f990>] ? process_one_work+0x640/0x640
[ 69.643822] [<ffffffff81075e86>] kthread+0xf6/0x110
[ 69.648927] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0
[ 69.655591] [<ffffffff816c893f>] ret_from_fork+0x3f/0x70
[ 69.661128] [<ffffffff81075d90>] ? kthread_create_on_node+0x1f0/0x1f0
[ 69.667794] Code: 66 89 4d ee 0f b7 c9 e8 79 41 fe ff 48 89 df e8 d1 7a ce ff 0f b6 53 4b 8b 73 38 48 8d 4d ee 48 8b 7b 10 83 c2 02 e8 1a 31 fe ff <41> 0f b6 4c 24 34 41 8b 54 24 30 be ff ff ff ff c0 e9 04 83 e1
[ 69.687986] RIP [<ffffffff81321296>] pci_restore_msi_state+0x196/0x240
[ 69.694772] RSP <ffff880428a6bc28>
[ 69.698412] CR2: 0000000000000034
[ 69.701879] ---[ end trace 814dd8cc56e427ae ]---
This happens at resume - I caught the output over serial - screen is dead, it doesn't show anything because it simply locks up/panics.
... Oh well. New patch below.
Yep, this one took care of the warning in drm_helper_choose_encoder_dpms(). Thanks!
Now I need to go decipher that NULL ptr deref above.
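As a starting point, the numbers in the oops already line up. R12 is 0, CR2 is 0x34, and the bytes at the fault marker in the Code: line (<41> 0f b6 4c 24 34) look like a byte load from 0x34(%r12), so something put a NULL pointer into R12 and then read a field 52 bytes into it. The snippet below is only a user-space sketch of that arithmetic: struct fake_entry and fault_address() are made-up names, and the only thing taken from the oops is the 52-byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for whatever structure the kernel was reading.
 * Only the offset of the accessed byte (52 == 0x34) comes from the oops.
 */
struct fake_entry {
	unsigned char pad[52];
	unsigned char byte_at_0x34;
};

/* Effective address of a base + displacement memory operand. */
static uintptr_t fault_address(uintptr_t base, size_t disp)
{
	assert(disp <= UINTPTR_MAX - base);	/* no wraparound in this toy */
	return base + disp;
}
```

Calling fault_address(0, offsetof(struct fake_entry, byte_at_0x34)) gives 0x34, which matches CR2, so the pointer itself was NULL rather than merely pointing at a bad location.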
On Wed, Sep 23, 2015 at 06:06:21PM +0200, Borislav Petkov wrote:
Ok, after some quick staring, we're at __pci_restore_msi_state():
	pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
	msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap), entry->masked);
which is:
	.loc 1 411 0
	movq	%rbx, %rdi	# dev,
	call	arch_restore_msi_irqs	#
.LBB1921:
.LBB1922:
	.loc 2 902 0
	movzbl	75(%rbx), %edx	# dev_2(D)->msi_cap, D.31945
	movl	56(%rbx), %esi	# MEM[(const struct pci_dev *)dev_2(D)].devfn, MEM[(const struct pci_dev *)dev_2(D)].devfn
	leaq	-18(%rbp), %rcx	#, tmp266
	movq	16(%rbx), %rdi	# MEM[(const struct pci_dev *)dev_2(D)].bus, MEM[(const struct pci_dev *)dev_2(D)].bus
	addl	$2, %edx	#, D.31945
	call	pci_bus_read_config_word	#
.LBE1922:
.LBE1921:
	.loc 1 414 0
	movzbl	52(%r12), %ecx	# *_85, tmp208	<--- faulting insn
	movl	48(%r12), %edx	# _85->D.27233.D.27231.masked, D.31946
.LBB1923:
.LBB1924:
	.loc 1 176 0
	movl	$-1, %esi	#, D.31951
and that %r12 is supposed to contain struct msi_desc *entry in __pci_restore_msi_state():
entry = irq_get_msi_desc(dev->irq);
which is
	.loc 4 654 0
	movl	1340(%rdi), %edi	# dev_2(D)->irq, dev_2(D)->irq
	call	irq_get_irq_data	#
	.loc 4 655 0
	testq	%rax, %rax	# d
	je	.L405	#,
	movq	16(%rax), %rax	# d_62->common, d_62->common
	movq	16(%rax), %r12	# _63->msi_desc, D.31954
but as we see above %r12 is 0.
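The link between the zero in %r12 and the fault address in CR2 can be checked with a little arithmetic. The sketch below is only an illustration: struct mock_msi_desc and attrib_fault_address() are hypothetical names, and the layout models nothing but the two offsets visible in the disassembly (48 and 52), not the real struct msi_desc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct msi_desc. Only the two offsets seen in
 * the disassembly are modelled; the padding in front of them is made up.
 */
struct mock_msi_desc {
	unsigned char pad[48];	/* whatever precedes the fields of interest */
	uint32_t masked;	/* loaded by "movl 48(%r12), %edx" */
	uint8_t attrib;		/* loaded by "movzbl 52(%r12), %ecx" */
};

/* Address the CPU touches when it loads 'attrib' through pointer 'base'. */
uintptr_t attrib_fault_address(uintptr_t base)
{
	return base + offsetof(struct mock_msi_desc, attrib);
}
```

With a base of 0 this gives 52, i.e. 0x34, which matches the CR2 value in the oops and is consistent with a NULL entry pointer.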
For some reason that entry thing in __pci_restore_msi_state() is not checked for NULL even though irq_get_msi_desc() can return NULL.
Maybe tglx would have an idea...
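For illustration only, this is roughly what a guarded lookup would look like. It is a userspace sketch with made-up names (mock_desc_table, mock_irq_get_msi_desc(), mock_restore_msi_state()), not kernel code, and whether such a check belongs in __pci_restore_msi_state() at all is a separate question: it would hide the symptom without explaining why the descriptor is missing.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_IRQS 64

/* Hypothetical, heavily reduced stand-in for struct msi_desc. */
struct mock_msi_desc {
	unsigned int masked;
};

/* Mock lookup table: a slot stays NULL when no MSI descriptor is attached. */
static struct mock_msi_desc *mock_desc_table[MOCK_NR_IRQS];

/* Stand-in for irq_get_msi_desc(): may legitimately return NULL. */
struct mock_msi_desc *mock_irq_get_msi_desc(unsigned int irq)
{
	if (irq >= MOCK_NR_IRQS)
		return NULL;
	return mock_desc_table[irq];
}

/* Returns 0 and stores the mask on success, -1 if there is no descriptor. */
int mock_restore_msi_state(unsigned int irq, unsigned int *masked_out)
{
	struct mock_msi_desc *entry = mock_irq_get_msi_desc(irq);

	if (!entry)		/* the check the real code path lacks */
		return -1;

	*masked_out = entry->masked;
	return 0;
}
```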
Hrmmm.
On Wed, Sep 23, 2015 at 06:18:39PM +0200, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 06:06:21PM +0200, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 04:44:50PM +0200, Daniel Vetter wrote:
Ok, I bisected it.
First of all, Daniel, you didn't see the resume side blow up because of the NULL ptr deref f*cking up the box much earlier. Once I reverted the bad commit by hand (it wouldn't revert cleanly) the resume splats showed.
And in talking about the bad commit, it is this one:
991de2e59090e55c65a7f59a049142e3c480f7bd is the first bad commit
commit 991de2e59090e55c65a7f59a049142e3c480f7bd
Author: Jiang Liu <jiang.liu@linux.intel.com>
Date: Wed Jun 10 16:54:59 2015 +0800

    PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()

    To support IOAPIC hotplug, we need to allocate PCI IRQ resources on demand and free them when not used anymore.

    Implement pcibios_alloc_irq() and pcibios_free_irq() to dynamically allocate and free PCI IRQs.

    Remove mp_should_keep_irq(), which is no longer used.

    [bhelgaas: changelog]
    Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>

:040000 040000 765e2d5232d53247ec260b34b51589c3bccb36ae f680234a27685e94b1a35ae2a7218f8eafa9071a M	arch
:040000 040000 d55a682bcde72682e883365e88ad1df6186fd54d f82c470a04a6845fcf5e0aa934512c75628f798d M	drivers
Jiang, you have to stop breaking my box with your changes. This is maybe the third time I'm bisecting fallout from your patches. If you're touching all x86, you need to test on an AMD box too. Like everyone else testing on the hardware their changes affect. It is that simple.
Anyway, reverting that commit by hand fixes my resume splat.
Here's the partial revert I did by hand:
---
diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index fa1195dae425..164e3f8d3c3d 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -93,6 +93,8 @@ extern raw_spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
+extern bool mp_should_keep_irq(struct device *dev);
+
 struct pci_raw_ops {
 	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 *val);
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc0a181..3bff24438b00 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -672,20 +672,22 @@ int pcibios_add_device(struct pci_dev *dev)
 	return 0;
 }
 
-int pcibios_alloc_irq(struct pci_dev *dev)
+int pcibios_enable_device(struct pci_dev *dev, int mask)
 {
-	return pcibios_enable_irq(dev);
-}
+	int err;
 
-void pcibios_free_irq(struct pci_dev *dev)
-{
-	if (pcibios_disable_irq)
-		pcibios_disable_irq(dev);
+	if ((err = pci_enable_resources(dev, mask)) < 0)
+		return err;
+
+	if (!pci_dev_msi_enabled(dev))
+		return pcibios_enable_irq(dev);
+	return 0;
 }
 
-int pcibios_enable_device(struct pci_dev *dev, int mask)
+void pcibios_disable_device (struct pci_dev *dev)
 {
-	return pci_enable_resources(dev, mask);
+	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
+		pcibios_disable_irq(dev);
 }
 
 int pci_ext_cfg_avail(void)
diff --git a/arch/x86/pci/irq.c b/arch/x86/pci/irq.c
index 32e70343e6fd..f229834b36d4 100644
--- a/arch/x86/pci/irq.c
+++ b/arch/x86/pci/irq.c
@@ -1186,6 +1186,18 @@ void pcibios_penalize_isa_irq(int irq, int active)
 	pirq_penalize_isa_irq(irq, active);
 }
 
+bool mp_should_keep_irq(struct device *dev)
+{
+	if (dev->power.is_prepared)
+		return true;
+#ifdef CONFIG_PM
+	if (dev->power.runtime_status == RPM_SUSPENDING)
+		return true;
+#endif
+
+	return false;
+}
+
 static int pirq_enable_irq(struct pci_dev *dev)
 {
 	u8 pin = 0;
@@ -1258,7 +1270,8 @@ static int pirq_enable_irq(struct pci_dev *dev)
 
 static void pirq_disable_irq(struct pci_dev *dev)
 {
-	if (io_apic_assign_pci_irqs && pci_has_managed_irq(dev)) {
+	if (io_apic_assign_pci_irqs && !mp_should_keep_irq(&dev->dev) &&
+	    dev->irq_managed && dev->irq) {
 		mp_unmap_irq(dev->irq);
 		pci_reset_managed_irq(dev);
 	}
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index 6da0f9beab19..d8a3f49a960c 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -479,6 +479,14 @@ void acpi_pci_irq_disable(struct pci_dev *dev)
 	if (!pin || !pci_has_managed_irq(dev))
 		return;
 
+	/* Keep IOAPIC pin configuration when suspending */
+	if (dev->dev.power.is_prepared)
+		return;
+#ifdef CONFIG_PM
+	if (dev->dev.power.runtime_status == RPM_SUSPENDING)
+		return;
+#endif
+
 	entry = acpi_pci_irq_lookup(dev, pin);
 	if (!entry)
 		return;
On 2015/9/27 0:46, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 06:18:39PM +0200, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 06:06:21PM +0200, Borislav Petkov wrote:
On Wed, Sep 23, 2015 at 04:44:50PM +0200, Daniel Vetter wrote:
sorry I sprinkled the locking stuff in the wrong places. Still confused why the resume side doesn't blow up anywhere
But it does:
<snit>
Hi Boris and Daniel,
Sorry for the regression! I have tried to reproduce it by doing suspend/resume on a laptop, but failed: the PCI MSI suspend/resume code works as expected there. I have also checked msi.c and the radeon driver, but haven't found any hint about the cause.
So could you please apply the attached debug patch to gather more information about the regression?
Thanks!
Gerry
On Tue, Sep 29, 2015 at 04:50:36PM +0800, Jiang Liu wrote:
So could you please help to apply the attached debug patch to gather more information about the regression?
Sure, just did.
I'm sending you a full s/r cycle attempt caught over serial in a private message.
Thanks.
On 2015/9/29 18:51, Borislav Petkov wrote:
On Tue, Sep 29, 2015 at 04:50:36PM +0800, Jiang Liu wrote:
So could you please help to apply the attached debug patch to gather more information about the regression?
Sure, just did.
I'm sending you a full s/r cycle attempt caught over serial in a private message.
Hi Boris,
From the log file, we learned that the NULL pointer dereference was caused by the AMD IOMMU device. For normal MSI-enabled PCI devices, we get valid irq numbers such as:
[ 74.661170] ahci 0000:04:00.0: irqdomain: freeze msi 1 irq28
[ 74.661297] radeon 0000:01:00.0: irqdomain: freeze msi 1 irq47
But for the AMD IOMMU device, we got an invalid irq number (0) after enabling MSI:
[ 74.662488] pci 0000:00:00.2: irqdomain: freeze msi 1 irq0
which then caused the NULL pointer dereference when __pci_restore_msi_state() gets called by the system resume code.
So we need to figure out why we got irq number 0 after enabling MSI for the AMD IOMMU device. The only hint I have is that the iommu driver just grabs the PCI device without providing a PCI device driver for the IOMMU PCI device; we have solved a similar case for the eata driver.
So could you please apply this debug patch to gather more info and send me /proc/interrupts?
Thanks!
Gerry
Thanks.
On Wed, Sep 30, 2015 at 03:45:39PM +0800, Jiang Liu wrote:
So we need to figure out why we got irq number 0 after enabling MSI for AMD IOMMU device. The only hint I got is that iommu driver just grabbing the PCI device without providing a PCI device driver for IOMMU PCI device, we have solved a similar case for eata driver. So could you please help to apply this debug patch to gather more info and send me /proc/interrupts?
I think I have an idea on how dev->irq got 0 after pci_enable_msi(). The PCI probe code calls pcibios_alloc_irq() and after a failed probe it calls pcibios_free_irq(), which sets dev->irq to 0. The AMD IOMMU driver does not register a pci_driver for itself, it just doesn't make sense for it. But the PCI device containing the IOMMU gets probed later, which fails because there is no driver for it. So the following call to pcibios_free_irq() clears dev->irq, so that it is 0 on the next resume. Does that make sense?
Joerg
On 2015/9/30 20:44, Joerg Roedel wrote:
On Wed, Sep 30, 2015 at 03:45:39PM +0800, Jiang Liu wrote:
So we need to figure out why we got irq number 0 after enabling MSI for AMD IOMMU device. The only hint I got is that iommu driver just grabbing the PCI device without providing a PCI device driver for IOMMU PCI device, we have solved a similar case for eata driver. So could you please help to apply this debug patch to gather more info and send me /proc/interrupts?
I think I have an idea on how dev->irq got 0 after pci_enable_msi(). The PCI probe code calls pcibios_alloc_irq() and after a failed probe it calls pcibios_free_irq(), which sets dev->irq to 0. The AMD IOMMU driver does not register a pci_driver for itself, it just doesn't make sense for it. But the PCI device containing the IOMMU gets probed later, which fails because there is no driver for it. So the following call to pcibios_free_irq() clears dev->irq, so that it is 0 on the next resume. Does that make sense?
Thanks Joerg, that makes sense. If some driver tries to bind to the IOMMU device, it will trigger the scenario you described. For example, the Xen backend driver will try to probe all PCI devices if enabled. I will do more investigation tomorrow.
Thanks!
Gerry
On Thu, Oct 01, 2015 at 01:00:44AM +0800, Jiang Liu wrote:
Thanks Joerg, that makes sense. If some driver tries to bind to the IOMMU device, it will trigger the scenario you described. For example, the Xen backend driver will try to probe all PCI devices if enabled. I will do more investigation tomorrow.
Right, so this fixes the issue on my box, courtesy of Joerg. We basically don't disable the IRQ on MSI-enabled devices. The AMD IOMMU uses a barebones PCI device but not a PCI driver, which would be overkill.
---
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc..29ec2eb 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -674,12 +674,15 @@ int pcibios_add_device(struct pci_dev *dev)
 
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return 0;
+
 	return pcibios_enable_irq(dev);
 }
 
 void pcibios_free_irq(struct pci_dev *dev)
 {
-	if (pcibios_disable_irq)
+	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
 		pcibios_disable_irq(dev);
 }
--
On Wed, Sep 30, 2015 at 07:36:19PM +0200, Borislav Petkov wrote:
Right, so this fixes the issue on my box, courtesy of Joerg. We basically don't disable the IRQ on MSI-enabled devices. The AMD IOMMU uses a barebones PCI device but not a PCI driver, which would be overkill.
Well, not only overkill, but actually harmful. As I just wrote to Jiang, a device can be forcibly unbound from its driver, which is something we don't want for the IOMMU.
Joerg
On 2015/10/1 1:36, Borislav Petkov wrote:
On Thu, Oct 01, 2015 at 01:00:44AM +0800, Jiang Liu wrote:
Thanks Joerg, that makes sense. If some driver tries to bind to the IOMMU device, it will trigger the scenario you described. For example, the Xen backend driver will try to probe all PCI devices if enabled. I will do more investigation tomorrow.
Right, so this fixes the issue on my box, courtesy of Joerg. We basically don't disable the IRQ on MSI-enabled devices. The AMD IOMMU uses a barebones PCI device but not a PCI driver, which would be overkill.
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc..29ec2eb 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -674,12 +674,15 @@ int pcibios_add_device(struct pci_dev *dev)
 
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return 0;

We may return -EBUSY here to reject the probe operation. It doesn't make sense to continue the probe if MSI is already enabled, and it also helps to avoid calling pcibios_free_irq() in function pci_device_probe().

+
 	return pcibios_enable_irq(dev);
 }
 
 void pcibios_free_irq(struct pci_dev *dev)
 {
-	if (pcibios_disable_irq)
+	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)

The above change is not needed: pcibios_disable_irq() will first check !pci_has_managed_irq(dev) before actually freeing the PCI irq, and pci_has_managed_irq(dev) only returns true if pcibios_alloc_irq() succeeds.

So to summarize, I think we only need the following change to fix the regression:

 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;

What do you think?
Thanks!
Gerry

 		pcibios_disable_irq(dev);
 }
On Sat, Oct 03, 2015 at 03:36:35PM +0800, Jiang Liu wrote:
The above change is not needed, pcibios_disable_irq() will first check !pci_has_managed_irq(dev) before actually freeing PCI irq. pci_has_managed_irq(dev) only returns true if pcibios_alloc_irq() succeeds.
So to summarize, I think we only need the following change to fix the regression:
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;
What do you think?
Yap, that works too. I've got only this ontop of 4.3+tip:
---
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index dc78a4a9a466..a4687aa6c1fb 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -675,6 +675,9 @@ int pcibios_add_device(struct pci_dev *dev)
 
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;
+
 	return pcibios_enable_irq(dev);
 }
---
and it suspend+resumed fine.
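To see why the -EBUSY check is enough, here is a small userspace model of the probe path. Everything in it is an assumption made for illustration: struct mock_pci_dev and the mock_* functions are hypothetical stand-ins, the irq numbers are arbitrary, and the model only covers a device that no driver matches.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct mock_pci_dev {
	unsigned int irq;
	bool msi_enabled;
};

/* Stand-in for pci_enable_msi(): records the MSI irq in dev->irq. */
void mock_pci_enable_msi(struct mock_pci_dev *dev, unsigned int msi_irq)
{
	dev->msi_enabled = true;
	dev->irq = msi_irq;
}

/* Mirrors the tested change: refuse to allocate when MSI is already on. */
int mock_pcibios_alloc_irq(struct mock_pci_dev *dev, unsigned int legacy_irq)
{
	if (dev->msi_enabled)
		return -EBUSY;

	dev->irq = legacy_irq;
	return 0;
}

/* Stand-in for pcibios_free_irq(): assumed to reset dev->irq to 0. */
void mock_pcibios_free_irq(struct mock_pci_dev *dev)
{
	dev->irq = 0;
}

/* Probe path for a device that no driver matches. */
int mock_pci_device_probe(struct mock_pci_dev *dev, unsigned int legacy_irq)
{
	int error = mock_pcibios_alloc_irq(dev, legacy_irq);

	if (error < 0)
		return error;	/* bail out before anything gets freed */

	error = -ENODEV;	/* assumed: no driver matched the device */
	mock_pcibios_free_irq(dev);
	return error;
}
```

In this model a device that already has MSI enabled keeps its irq number across the failed probe, because the probe is rejected before the free path can run.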
I guess it is time for Joerg to write a proper patch. :-)
Thanks.
Hi Jiang,
On Sat, Oct 03, 2015 at 03:36:35PM +0800, Jiang Liu wrote:
So to summarize, I think we only need the following change to fix the regression:
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;
What do you think?
Yes, that works too and has the added benefit that no driver can attach to the iommu device and get in the way of the driver.
Will you send the patch for this change or should I do it?
Joerg
On 2015/10/5 18:03, Joerg Roedel wrote:
Hi Jiang,
On Sat, Oct 03, 2015 at 03:36:35PM +0800, Jiang Liu wrote:
So to summarize, I think we only need the following change to fix the regression:
 int pcibios_alloc_irq(struct pci_dev *dev)
 {
+	if (pci_dev_msi_enabled(dev))
+		return -EBUSY;
What do you think?
Yes, that works too and has the added benefit that no driver can attach to the iommu device and get in the way of the driver.
Will you send the patch for this change or should I do it?
Hi Joerg,
I am on leave for the Chinese National Holiday and have limited access to my working environment. It would be appreciated if you could send out a patch for it. Otherwise I will send out a patch within 2-3 days.
Thanks!
Gerry
On Tue, Oct 06, 2015 at 09:13:11PM +0800, Jiang Liu wrote:
I am on leave for the Chinese National Holiday and have limited access to my working environment. It would be appreciated if you could send out a patch for it. Otherwise I will send out a patch within 2-3 days.
Okay, I just sent the patch.
Joerg
On Thu, Oct 01, 2015 at 01:00:44AM +0800, Jiang Liu wrote:
Thanks Joerg, that makes sense. If some driver tries to bind to the IOMMU device, it will trigger the scenario you described. For example, the Xen backend driver will try to probe all PCI devices if enabled. I will do more investigation tomorrow.
Not only that, the probe code looks like this in __pci_device_probe:
	error = -ENODEV;

	id = pci_match_device(drv, pci_dev);
	if (id)
		error = pci_call_probe(drv, pci_dev, id);
	if (error >= 0)
		error = 0;
The pci_match_device() function will always return NULL for the iommu pci_dev, because no driver matches the ids of it. So the function returns -ENODEV, which will be handled in the caller (pci_device_probe):
	error = pcibios_alloc_irq(pci_dev);
	if (error < 0)
		return error;

	pci_dev_get(pci_dev);
	error = __pci_device_probe(drv, pci_dev);
	if (error) {
		pcibios_free_irq(pci_dev);
		pci_dev_put(pci_dev);
	}
For the IOMMU pci_dev a pcibios-irq will be allocated (if there is one, like on Boris' system) and because __pci_device_probe returns -ENODEV it will be freed again with pcibios_free_irq().
The pcibios_free_irq() function will set dev->irq = 0, which overwrites the value that pci_enable_msi() wrote there. So later in suspend/resume code the msi-handling part tries to fetch the irq-descriptor for the wrong irq (which is NULL) and causes the crash.
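The sequence described above can be replayed in a few lines of userspace C. This is a sketch under stated assumptions, not kernel code: struct mock_pci_dev and all mock_* functions are hypothetical, the irq numbers are arbitrary, and the allocation is simply assumed to overwrite dev->irq with a legacy irq.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct mock_pci_dev {
	unsigned int irq;
	bool msi_enabled;
	bool irq_managed;
};

/* Stand-in for pci_enable_msi(): the IOMMU code does this early. */
void mock_pci_enable_msi(struct mock_pci_dev *dev, unsigned int msi_irq)
{
	dev->msi_enabled = true;
	dev->irq = msi_irq;
}

/* Stand-in for pcibios_alloc_irq() without any MSI check. */
int mock_pcibios_alloc_irq(struct mock_pci_dev *dev, unsigned int legacy_irq)
{
	dev->irq = legacy_irq;	/* assumption: overwrites the MSI irq */
	dev->irq_managed = true;
	return 0;
}

/* Stand-in for pcibios_free_irq(): resets dev->irq to 0. */
void mock_pcibios_free_irq(struct mock_pci_dev *dev)
{
	if (dev->irq_managed) {
		dev->irq = 0;
		dev->irq_managed = false;
	}
}

/* Probe of a device for which no driver matches, as for the IOMMU. */
int mock_pci_device_probe(struct mock_pci_dev *dev, unsigned int legacy_irq)
{
	int error = mock_pcibios_alloc_irq(dev, legacy_irq);

	if (error < 0)
		return error;

	error = -ENODEV;	/* pci_match_device() found nothing */
	mock_pcibios_free_irq(dev);
	return error;
}
```

After mock_pci_enable_msi() followed by a failed mock_pci_device_probe(), dev->irq is 0 in this model, which is the state the resume code later trips over.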
The issue got introduced because with your changes pci_enable_msi() is only allowed after a pci-device was successfully probed by the driver. But this assumption is not true, as the AMD IOMMU driver does not register as a pci-driver.
Registering a pci-driver would actually be harmful, because a device can be forcibly unbound from its driver, which would be pretty bad for an IOMMU in the running system.
So the right fix is to allow pci_enable_msi() for pci-devices not registered against a driver. The fix I sent Boris has issues (I think it leaks pcibios irqs when MSI is in use), but I was thinking about fixing it in pci_device_probe by not allocating a pcibios-irq when MSI is already active. What do you think?
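One possible reading of that idea, as a userspace sketch: the probe path skips the pcibios-irq allocation when MSI is already active and only frees what it allocated itself. The names (struct mock_pci_dev, mock_pci_device_probe(), the driver_matches flag) are hypothetical, and this is not the patch that was eventually posted, which this thread does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct mock_pci_dev {
	unsigned int irq;
	bool msi_enabled;
};

/*
 * Probe that leaves dev->irq alone when MSI is already active.
 * 'driver_matches' stands in for the outcome of the driver match/probe.
 */
int mock_pci_device_probe(struct mock_pci_dev *dev, unsigned int legacy_irq,
			  bool driver_matches)
{
	bool allocated = false;
	int error;

	if (!dev->msi_enabled) {
		dev->irq = legacy_irq;	/* stands in for pcibios_alloc_irq() */
		allocated = true;
	}

	error = driver_matches ? 0 : -ENODEV;

	if (error && allocated)
		dev->irq = 0;		/* stands in for pcibios_free_irq() */

	return error;
}
```

In this variant the failed probe still returns -ENODEV for the IOMMU device, but the irq number written when MSI was enabled survives.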
Regards,
Joerg
dri-devel@lists.freedesktop.org