On Tue, Nov 18, 2014 at 11:19 PM, Michel Dänzer michel@daenzer.net wrote:
On 19.11.2014 09:21, Andy Lutomirski wrote:
On Mon, Nov 17, 2014 at 1:51 AM, Michel Dänzer michel@daenzer.net wrote:
On 15.11.2014 07:21, Andy Lutomirski wrote:
On recent kernels (3.16 through 3.18-rc4, perhaps), doing anything graphics intensive seems to cause my system to become unusable for tens of seconds. Pointing Firefox at Google Maps is a big offender -- it can take several minutes for me to move my mouse far enough to close the tab and get my computer back.
On bootup, I get this warning: [drm:btc_dpm_set_power_state] *ERROR* rv770_restrict_performance_levels_before_switch failed
Setting radeon.dpm=0 seems to work around this problem at the cost of giving my rather slow graphics.
Are there known issues here?
Can you bisect the kernel, or at least isolate which kernel version first introduced the problem?
With whatever userspace I'm running, I'm seeing it 3.13, 3.14, 3.15, 3.16, and 3.18-rc4+. I haven't tried other versions.
With radeon.dpm=0, I can still trigger short stalls (around one second), but I seem unable to trigger long stalls easily. (I say easily because, just as I was typing this email, my system stalled for about a minute.)
I can only think of two things offhand that could cause such extremely long stalls: Swap thrashing or IRQ storms.
With a setup where you can easily trigger long stalls, can you try getting a CPU profile for a stall with sysprof or perf?
Got one with perf:
16.82% Xorg libc-2.18.so [.] __memcpy_sse2_unaligned 9.20% swapper [kernel.kallsyms] [k] intel_idle 1.00% Xorg [kernel.kallsyms] [k] evergreen_irq_set 0.83% firefox libxul.so [.] 0x0000000001d93281 0.69% firefox libxul.so [.] 0x0000000001d932ad 0.62% firefox [kernel.kallsyms] [k] copy_user_generic_string 0.55% swapper [kernel.kallsyms] [k] evergreen_irq_ack 0.54% firefox libpthread-2.18.so [.] pthread_mutex_lock 0.52% firefox libpthread-2.18.so [.] pthread_mutex_unlock 0.45% Xorg [kernel.kallsyms] [k] drm_mm_insert_node_in_range_generic 0.41% Xorg [kernel.kallsyms] [k] lock_release 0.40% Xorg [kernel.kallsyms] [k] lock_acquire 0.35% firefox firefox [.] 0x000000000001245d 0.33% Xorg [kernel.kallsyms] [k] __module_address 0.31% firefox [kernel.kallsyms] [k] clear_page_c 0.29% Xorg [kernel.kallsyms] [k] copy_user_generic_string 0.28% firefox firefox [.] 0x0000000000013159
and:
Samples: 11K of event 'irq:irq_handler_entry', Event count (approx.): 11802 87.43% swapper [kernel.kallsyms] [k] handle_irq_event_percpu 7.52% firefox [kernel.kallsyms] [k] handle_irq_event_percpu 1.84% irq/36-ahci [kernel.kallsyms] [k] handle_irq_event_percpu 1.14% Xorg [kernel.kallsyms] [k] handle_irq_event_percpu 0.75% kworker/5:0 [kernel.kallsyms] [k] handle_irq_event_percpu 0.32% gnome-shell [kernel.kallsyms] [k] handle_irq_event_percpu 0.25% kworker/5:1H [kernel.kallsyms] [k] handle_irq_event_percpu 0.25% Media D~ode #10 [kernel.kallsyms] [k] handle_irq_event_percpu 0.19% ImageDe~er #330 [kernel.kallsyms] [k] handle_irq_event_percpu 0.07% pulseaudio [kernel.kallsyms] [k] handle_irq_event_percpu
The cycles were with -e cycles:pp, so I think that iret would have shown up if there were enough IRQs to cause the problem.
I'll build a kernel with latencytop.
--Andy