Hi,
We've fixed piles of those in recent kernels, but didn't backport all the fixes (since usually it's a silent failure, but it can correlate with black screens).
Not quite completely, it seems ...
I have built drm-intel-nightly (f261f82359), and I'm getting this:
| [ 15.855007] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
| [ 15.855007] [drm:intel_set_cpu_fifo_underrun_reporting [i915]] *ERROR* pipe B underrun
| [ 15.855007] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
| [ 15.863175] [drm] RC6 disabled, disabling runtime PM support
| [ 15.863543] [drm] initialized overlay support
| [ 15.933338] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
| [ 15.997130] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
| [ 16.061856] [drm:intel_set_cpu_fifo_underrun_reporting [i915]] *ERROR* pipe A underrun
| [ 16.725274] [drm] Initialized i915 1.6.0 20160330 for 0000:00:02.0 on minor 0
| [ 16.805727] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
| [ 2520.457732] WARNING: CPU: 0 PID: 3193 at drivers/gpu/drm/i915/i915_gem.c:4508 i915_gem_free_object+0x277/0x280 [i915]()
| [ 2520.457736] WARN_ON(obj->frontbuffer_bits)
Hm, this one should be fixed, and the patches should all be correctly marked for stable. Either there's a backlog somewhere, or we failed.
Would be great if you can test a drm-intel-nightly build (or 4.6-rc1) for either and confirm that they're gone. And for the latter we really should hunt down the bugfix if it's stuck.
| [ 141.999803] ------------[ cut here ]------------
| [ 141.999916] WARNING: CPU: 0 PID: 3349 at drivers/gpu/drm/i915/i915_gem.c:4568 i915_gem_free_object+0x25f/0x270 [i915]
| [ 141.999923] WARN_ON(obj->frontbuffer_bits)
| [ 141.999928] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_CT iptable_raw xt_nat xt_tcpudp xt_addrtype iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables dummy tun nfsd exportfs nfs lockd grace sunrpc ipv6 fbcon bitblit softcursor font loop mousedev i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea snd_intel8x0 drm snd_ac97_codec ac97_bus i2c_core snd_pcm_oss fb snd_mixer_oss fbdev snd_pcm ipw2200 snd_timer snd libipw soundcore lib80211 nsc_ircc thinkpad_acpi cfg80211 pcmcia psmouse sdhci_pci irda uhci_hcd ehci_pci sdhci crc_ccitt ehci_hcd serio_raw e1000 mmc_core nvram evdev usbcore parport_pc yenta_socket hwmon parport pcmcia_rsrc video usb_common pcmcia_core backlight ac battery acpi_cpufreq intel_agp processor button intel_gtt agpgart twofish_generic twofish_i586 twofish_common xts gf128mul dm_crypt dm_mod thermal
| [ 142.000114] CPU: 0 PID: 3349 Comm: Xorg Not tainted 4.6.0-rc1+ #1
| [ 142.000120] Hardware name: IBM 23716JG/23716JG, BIOS 1UETD3WW (2.08 ) 12/21/2006
| [ 142.000127] c11b8f7a c1037247 f8dea59b c0051dc4 00000d15 f8dda000 000011d8 f8d3ff2f
| [ 142.000141] f8d3ff2f 000011d8 f3ef1dcc f3ef1e30 f3ef1dcc f3ef1dc0 c1037309 00000009
| [ 142.000154] 00000000 c0051dac f8dea59b c0051dc4 f8d3ff2f f8dda000 000011d8 f8dea59b
| [ 142.000168] Call Trace:
| [ 142.000185] [<c11b8f7a>] ? dump_stack+0xa/0x20
| [ 142.000197] [<c1037247>] ? __warn+0xe7/0x100
| [ 142.000269] [<f8d3ff2f>] ? i915_gem_free_object+0x25f/0x270 [i915]
| [ 142.000337] [<f8d3ff2f>] ? i915_gem_free_object+0x25f/0x270 [i915]
| [ 142.000347] [<c1037309>] ? warn_slowpath_fmt+0x39/0x40
| [ 142.000416] [<f8d3ff2f>] ? i915_gem_free_object+0x25f/0x270 [i915]
| [ 142.000452] [<f892ed83>] ? drm_gem_object_free+0x23/0x40 [drm]
| [ 142.000478] [<f892f58f>] ? drm_gem_object_handle_unreference_unlocked+0xcf/0xe0 [drm]
| [ 142.000504] [<f892f5e7>] ? drm_gem_object_release_handle+0x47/0x90 [drm]
| [ 142.000529] [<f892f67e>] ? drm_gem_handle_delete+0x4e/0x80 [drm]
| [ 142.000554] [<f892f8d0>] ? drm_gem_handle_create+0x30/0x30 [drm]
| [ 142.000580] [<f89302c0>] ? drm_ioctl+0x230/0x570 [drm]
| [ 142.000606] [<f892f8d0>] ? drm_gem_handle_create+0x30/0x30 [drm]
| [ 142.000618] [<c10b34a3>] ? unmap_page_range+0x433/0x530
| [ 142.000627] [<c11be1c3>] ? __rb_erase_color+0xf3/0x250
| [ 142.000637] [<c10b7116>] ? unlink_file_vma+0x36/0x70
| [ 142.000645] [<c10b1db9>] ? tlb_finish_mmu+0x9/0x30
| [ 142.000671] [<f8930090>] ? drm_ioctl_permit+0x80/0x80 [drm]
| [ 142.000682] [<c10e7250>] ? do_vfs_ioctl+0x80/0x6a0
| [ 142.000690] [<c11bf570>] ? timerqueue_del+0x20/0x70
| [ 142.000699] [<c10cbde5>] ? kmem_cache_free+0x95/0xa0
| [ 142.000708] [<c10b6d0e>] ? remove_vma+0x3e/0x50
| [ 142.000717] [<c10b9019>] ? do_munmap+0x219/0x2d0
| [ 142.000726] [<c10e78b3>] ? SyS_ioctl+0x43/0x80
| [ 142.000735] [<c1001272>] ? do_fast_syscall_32+0x82/0x110
| [ 142.000745] [<c134644f>] ? sysenter_past_esp+0x40/0x6a
| [ 142.000777] ---[ end trace c0ddddf77cdb5434 ]---
This happens each time an Xv window disappears from view, sometimes with slight variations in the stack trace. Do you need full debug info or a bunch more stack traces, or is this enough to get an idea?
Also, I have occasional X server crashes (every few weeks or so), which started with 4.1.9, I think (I had 3.11.0 before that). With 4.1.9 I also had some kind of problem where Xv stopped working until reboot, which hasn't happened with 4.4.5 yet. Do you think any of those would be worth further investigation? If so, any suggestions as to how to split it all into separate issues and how to go about it?
No idea about X stuff, not my expertise ;-)
Well, I would guess that something that persists until reboot smells like a kernel/driver bug? Also, IIRC, there was no (major) X server upgrade between 3.11.0 and 4.1.9, so chances are that's a kernel/driver bug as well ;-)
Regards, Florian