Long radeon stalls on recent kernels

Andy Lutomirski

14 Nov 2014

10:21 p.m.

I have a Caicos card, like this:

[ 3.077260] [drm] radeon kernel modesetting enabled.
[ 3.077338] checking generic (e0000000 600000) vs hw (e0000000 10000000)
[ 3.077339] fb: switching to radeondrmfb from EFI VGA
[ 3.077377] Console: switching to colour dummy device 80x25
[ 3.078881] [drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 0x174B:0xE164).
[ 3.078903] [drm] register mmio base: 0xF4A20000
[ 3.078904] [drm] register mmio size: 131072
[ 3.078982] ATOM BIOS: C26401
[ 3.079572] radeon 0000:09:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[ 3.079574] radeon 0000:09:00.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[ 3.079576] [drm] Detected VRAM RAM=1024M, BAR=256M
[ 3.079577] [drm] RAM width 64bits DDR
[ 3.079755] [TTM] Zone kernel: Available graphics memory: 8186568 kiB
[ 3.079757] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 3.079757] [TTM] Initializing pool allocator
[ 3.079773] [TTM] Initializing DMA pool allocator
[ 3.080011] [drm] radeon: 1024M of VRAM memory ready
[ 3.080012] [drm] radeon: 1024M of GTT memory ready.
[ 3.080049] [drm] Loading CAICOS Microcode
[ 3.080330] [drm] Internal thermal controller without fan control
[ 3.081425] [drm] radeon: power management initialized
[ 3.081551] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 3.082589] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[ 3.085030] [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
[ 3.085221] radeon 0000:09:00.0: WB enabled
[ 3.085224] radeon 0000:09:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff88043d914c00
[ 3.085225] radeon 0000:09:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff88043d914c0c
[ 3.097438] radeon 0000:09:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc900128b2118
[ 3.097441] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 3.097442] [drm] Driver supports precise vblank timestamp query.
[ 3.097514] radeon 0000:09:00.0: irq 56 for MSI/MSI-X
[ 3.097544] radeon 0000:09:00.0: radeon: using MSI.
[ 3.097614] [drm] radeon: irq initialized.

On recent kernels (3.16 through 3.18-rc4, perhaps), doing anything graphics intensive seems to cause my system to become unusable for tens of seconds. Pointing Firefox at Google Maps is a big offender -- it can take several minutes for me to move my mouse far enough to close the tab and get my computer back.

On bootup, I get this warning:
[drm:btc_dpm_set_power_state] *ERROR* rv770_restrict_performance_levels_before_switch failed

Setting radeon.dpm=0 seems to work around this problem at the cost of giving me rather slow graphics.
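To confirm that a workaround such as radeon.dpm=0 was actually passed at boot, the kernel command line can be inspected. The following is a minimal Python sketch, not something from the original message; the helper names `module_params` and `read_cmdline` are made up for illustration, and it only looks at `module.param=value` tokens in /proc/cmdline.

```python
import os


def module_params(cmdline: str, module: str) -> dict:
    """Return {param: value} for every `module.param=value` token on a kernel command line."""
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line, or return '' if the file is unavailable."""
    if not os.path.exists(path):
        return ""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Prints e.g. {'dpm': '0'} when the kernel was booted with radeon.dpm=0.
    print(module_params(read_cmdline(), "radeon"))
```

This only reports what was requested on the command line; options set through modprobe configuration files would not show up here.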

Are there known issues here?

Thanks, Andy

Michel Dänzer

17 Nov

9:51 a.m.

On 15.11.2014 07:21, Andy Lutomirski wrote:

Are there known issues here?

Can you bisect the kernel, or at least isolate which kernel version first introduced the problem?

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer

Andy Lutomirski

19 Nov

12:21 a.m.

On Mon, Nov 17, 2014 at 1:51 AM, Michel Dänzer michel@daenzer.net wrote:

Can you bisect the kernel, or at least isolate which kernel version first introduced the problem?

With whatever userspace I'm running, I'm seeing it on 3.13, 3.14, 3.15, 3.16, and 3.18-rc4+. I haven't tried other versions.

With radeon.dpm=0, I can still trigger short stalls (around one second), but I seem unable to trigger long stalls easily. (I say easily because, just as I was typing this email, my system stalled for about a minute.)

--Andy

-- Andy Lutomirski AMA Capital Management, LLC

Andy Lutomirski

12:34 a.m.

On Tue, Nov 18, 2014 at 4:21 PM, Andy Lutomirski luto@amacapital.net wrote:

With radeon.dpm=0, I can still trigger short stalls (around one second), but I seem unable to trigger long stalls easily. (I say easily because, just as I was typing this email, my system stalled for about a minute.)

I could be wrong here, but I think that radeon.dpm=0, power_profile=default is okay, but radeon.dpm=0, power_profile=high is bad.

--Andy

-- Andy Lutomirski AMA Capital Management, LLC

Andy Lutomirski

12:50 a.m.

On Tue, Nov 18, 2014 at 4:34 PM, Andy Lutomirski luto@amacapital.net wrote:

I could be wrong here, but I think that radeon.dpm=0, power_profile=default is okay, but radeon.dpm=0, power_profile=high is bad.

I'm wrong again. power_profile=default is also bad.

Grr.

--Andy

Michel Dänzer

7:19 a.m.

On 19.11.2014 09:21, Andy Lutomirski wrote:

With whatever userspace I'm running, I'm seeing it on 3.13, 3.14, 3.15, 3.16, and 3.18-rc4+. I haven't tried other versions.

With radeon.dpm=0, I can still trigger short stalls (around one second), but I seem unable to trigger long stalls easily. (I say easily because, just as I was typing this email, my system stalled for about a minute.)

I can only think of two things offhand that could cause such extremely long stalls: Swap thrashing or IRQ storms.

With a setup where you can easily trigger long stalls, can you try getting a CPU profile for a stall with sysprof or perf?
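Before reaching for a full profiler, the two suspects named above (swap thrashing and IRQ storms) can be roughly checked by comparing kernel counters before and after a stall. This is an illustrative Python sketch under the assumption that /proc/vmstat exposes the `pswpin`/`pswpout` counters and that /proc/interrupts has its usual per-CPU column layout; the function names are hypothetical.

```python
import os
import time


def swap_counters(vmstat_text: str) -> dict:
    """Extract the cumulative swap-in/swap-out page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def irq_totals(interrupts_text: str) -> dict:
    """Sum the per-CPU columns of /proc/interrupts text into one count per IRQ label."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        # Only the first ncpus fields can be counters; the rest is a description.
        counts = [int(f) for f in rest.split()[:ncpus] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


def delta(before: dict, after: dict) -> dict:
    """Per-key increase between two counter snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}


def _read(path: str) -> str:
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    paths = ("/proc/vmstat", "/proc/interrupts")
    if all(os.path.exists(p) for p in paths):
        swap0, irq0 = swap_counters(_read(paths[0])), irq_totals(_read(paths[1]))
        time.sleep(1)  # in practice, span a stall instead of a fixed second
        swap1, irq1 = swap_counters(_read(paths[0])), irq_totals(_read(paths[1]))
        print("swap pages:", delta(swap0, swap1))
        busiest = sorted(delta(irq0, irq1).items(), key=lambda kv: -kv[1])[:5]
        print("busiest IRQs:", busiest)
```

Large swap deltas would point towards thrashing, while one IRQ line dominating the interval would point towards a storm. What counts as "large" depends on the machine.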

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer

Andy Lutomirski

20 Nov

12:07 a.m.

On Tue, Nov 18, 2014 at 11:19 PM, Michel Dänzer michel@daenzer.net wrote:

I can only think of two things offhand that could cause such extremely long stalls: Swap thrashing or IRQ storms.

With a setup where you can easily trigger long stalls, can you try getting a CPU profile for a stall with sysprof or perf?

Got one with perf:

16.82% Xorg libc-2.18.so [.] __memcpy_sse2_unaligned
9.20% swapper [kernel.kallsyms] [k] intel_idle
1.00% Xorg [kernel.kallsyms] [k] evergreen_irq_set
0.83% firefox libxul.so [.] 0x0000000001d93281
0.69% firefox libxul.so [.] 0x0000000001d932ad
0.62% firefox [kernel.kallsyms] [k] copy_user_generic_string
0.55% swapper [kernel.kallsyms] [k] evergreen_irq_ack
0.54% firefox libpthread-2.18.so [.] pthread_mutex_lock
0.52% firefox libpthread-2.18.so [.] pthread_mutex_unlock
0.45% Xorg [kernel.kallsyms] [k] drm_mm_insert_node_in_range_generic
0.41% Xorg [kernel.kallsyms] [k] lock_release
0.40% Xorg [kernel.kallsyms] [k] lock_acquire
0.35% firefox firefox [.] 0x000000000001245d
0.33% Xorg [kernel.kallsyms] [k] __module_address
0.31% firefox [kernel.kallsyms] [k] clear_page_c
0.29% Xorg [kernel.kallsyms] [k] copy_user_generic_string
0.28% firefox firefox [.] 0x0000000000013159

and:

Samples: 11K of event 'irq:irq_handler_entry', Event count (approx.): 11802
87.43% swapper [kernel.kallsyms] [k] handle_irq_event_percpu
7.52% firefox [kernel.kallsyms] [k] handle_irq_event_percpu
1.84% irq/36-ahci [kernel.kallsyms] [k] handle_irq_event_percpu
1.14% Xorg [kernel.kallsyms] [k] handle_irq_event_percpu
0.75% kworker/5:0 [kernel.kallsyms] [k] handle_irq_event_percpu
0.32% gnome-shell [kernel.kallsyms] [k] handle_irq_event_percpu
0.25% kworker/5:1H [kernel.kallsyms] [k] handle_irq_event_percpu
0.25% Media D~ode #10 [kernel.kallsyms] [k] handle_irq_event_percpu
0.19% ImageDe~er #330 [kernel.kallsyms] [k] handle_irq_event_percpu
0.07% pulseaudio [kernel.kallsyms] [k] handle_irq_event_percpu

The cycles were with -e cycles:pp, so I think that iret would have shown up if there were enough IRQs to cause the problem.
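A cycle profile like the first listing is easier to read when the rows are summed per process. The sketch below is an illustration, not part of the original exchange. It assumes the flat "percent, command, object, symbol" row layout shown in this thread and single-word command names, and `overhead_by_command` is a made-up helper. The sample rows are copied from the profile above.

```python
from collections import defaultdict

# A few rows taken from the cycle profile quoted in this thread.
SAMPLE = """\
16.82% Xorg libc-2.18.so [.] __memcpy_sse2_unaligned
9.20% swapper [kernel.kallsyms] [k] intel_idle
1.00% Xorg [kernel.kallsyms] [k] evergreen_irq_set
0.55% swapper [kernel.kallsyms] [k] evergreen_irq_ack
"""


def overhead_by_command(report_text: str) -> dict:
    """Sum the overhead percentages of flat perf-report rows per command name."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        fields = line.split()
        # Skip headers, comments and blank lines: data rows start with "NN.NN%".
        if len(fields) < 2 or not fields[0].endswith("%"):
            continue
        totals[fields[1]] += float(fields[0].rstrip("%"))
    return dict(totals)


if __name__ == "__main__":
    for command, percent in sorted(overhead_by_command(SAMPLE).items(), key=lambda kv: -kv[1]):
        print(f"{percent:6.2f}%  {command}")
```

Command names containing spaces, as in the IRQ listing, would need a stricter parser than this whitespace split.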

I'll build a kernel with latencytop.

--Andy

Andy Lutomirski

12:58 a.m.

On Wed, Nov 19, 2014 at 4:07 PM, Andy Lutomirski luto@amacapital.net wrote:

I'll build a kernel with latencytop.

I just caught call_rwsem_down_write_failed for 5379 ms in khugepaged (holy crap) and radeon_fence_default_wait for 489.2ms in Xorg.

Turning off THP gets rid of the khugepaged thing. The 489.2ms in radeon_fence_default_wait is amazingly reproducible -- I've seen that exact number three times now.
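The message does not say how THP was turned off. One common way to check the current state is the sysfs file /sys/kernel/mm/transparent_hugepage/enabled, which lists the available modes with the active one in square brackets. The Python sketch below only reads and parses that file; `active_thp_mode` is a hypothetical helper name.

```python
import os
import re

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(contents: str):
    """Return the bracketed (active) mode from text such as '[always] madvise never'."""
    match = re.search(r"\[(\w+)\]", contents)
    return match.group(1) if match else None


if __name__ == "__main__":
    if os.path.exists(THP_ENABLED):
        with open(THP_ENABLED) as f:
            print("THP mode:", active_thp_mode(f.read()))
    else:
        print("THP sysfs file not present on this system")
```

Writing `never` to the same file as root switches THP off until the next boot, which is one way to repeat the experiment described here.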

-- Andy Lutomirski AMA Capital Management, LLC

Michel Dänzer

26 Nov

6:42 a.m.

On 20.11.2014 09:58, Andy Lutomirski wrote:

I just caught call_rwsem_down_write_failed for 5379 ms in khugepaged (holy crap) and radeon_fence_default_wait for 489.2ms in Xorg.

Turning off THP gets rid of the khugepaged thing. The 489.2ms in radeon_fence_default_wait is amazingly reproducible -- I've seen that exact number three times now.

Sounds like the long stalls were THP, but the shorter ones might be radeon?

Can you get some call graphs for the profile or from latencytop? Make sure at least the kernel is built with frame pointers (CONFIG_FRAME_POINTER=y), preferably also userspace (-fno-omit-frame-pointer).
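Whether a kernel was built with frame pointers can be checked from its config file before collecting call graphs. The following Python sketch is an illustration only. It assumes the config is available as /boot/config-<release> or /proc/config.gz, which depends on the distribution and kernel options, and `config_value` and `load_kernel_config` are made-up helper names.

```python
import gzip
import os


def config_value(config_text: str, option: str):
    """Return the value of a kernel config option, 'n' if explicitly unset, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return "n"
    return None


def load_kernel_config():
    """Try the usual locations for the running kernel's config; return None if not found."""
    boot_path = f"/boot/config-{os.uname().release}"
    if os.path.exists(boot_path):
        with open(boot_path) as f:
            return f.read()
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    return None


if __name__ == "__main__":
    text = load_kernel_config()
    if text is None:
        print("kernel config not found")
    else:
        print("CONFIG_FRAME_POINTER =", config_value(text, "CONFIG_FRAME_POINTER"))
```

A result of `y` means kernel call graphs based on frame pointers should be usable; userspace still needs its own -fno-omit-frame-pointer build, as noted above.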

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer

Andy Lutomirski

3:38 p.m.

On Tue, Nov 25, 2014 at 10:42 PM, Michel Dänzer michel@daenzer.net wrote:

...

On 20.11.2014 09:58, Andy Lutomirski wrote:

...
On Wed, Nov 19, 2014 at 4:07 PM, Andy Lutomirski luto@amacapital.net wrote:

...
On Tue, Nov 18, 2014 at 11:19 PM, Michel Dänzer michel@daenzer.net wrote:

...
On 19.11.2014 09:21, Andy Lutomirski wrote:

...
On Mon, Nov 17, 2014 at 1:51 AM, Michel Dänzer michel@daenzer.net wrote:

...
On 15.11.2014 07:21, Andy Lutomirski wrote: > > > > On recent kernels (3.16 through 3.18-rc4, perhaps), doing anything > graphics intensive seems to cause my system to become unusable for > tens of seconds. Pointing Firefox at Google Maps is a big offender > -- > it can take several minutes for me to move my mouse far enough to > close the tab and get my computer back. > > On bootup, I get this warning: > [drm:btc_dpm_set_power_state] *ERROR* > rv770_restrict_performance_levels_before_switch failed > > Setting radeon.dpm=0 seems to work around this problem at the cost of > giving my rather slow graphics. > > Are there known issues here?

Can you bisect the kernel, or at least isolate which kernel version first introduced the problem?

With whatever userspace I'm running, I'm seeing it 3.13, 3.14, 3.15, 3.16, and 3.18-rc4+. I haven't tried other versions.

With radeon.dpm=0, I can still trigger short stalls (around one second), but I seem unable to trigger long stalls easily. (I say easily because, just as I was typing this email, my system stalled for about a minute.)

I can only think of two things offhand that could cause such extremely long stalls: Swap thrashing or IRQ storms.

With a setup where you can easily trigger long stalls, can you try getting a CPU profile for a stall with sysprof or perf?

Got one with perf:

16.82% Xorg libc-2.18.so [.] __memcpy_sse2_unaligned
 9.20% swapper [kernel.kallsyms] [k] intel_idle
 1.00% Xorg [kernel.kallsyms] [k] evergreen_irq_set
 0.83% firefox libxul.so [.] 0x0000000001d93281
 0.69% firefox libxul.so [.] 0x0000000001d932ad
 0.62% firefox [kernel.kallsyms] [k] copy_user_generic_string
 0.55% swapper [kernel.kallsyms] [k] evergreen_irq_ack
 0.54% firefox libpthread-2.18.so [.] pthread_mutex_lock
 0.52% firefox libpthread-2.18.so [.] pthread_mutex_unlock
 0.45% Xorg [kernel.kallsyms] [k] drm_mm_insert_node_in_range_generic
 0.41% Xorg [kernel.kallsyms] [k] lock_release
 0.40% Xorg [kernel.kallsyms] [k] lock_acquire
 0.35% firefox firefox [.] 0x000000000001245d
 0.33% Xorg [kernel.kallsyms] [k] __module_address
 0.31% firefox [kernel.kallsyms] [k] clear_page_c
 0.29% Xorg [kernel.kallsyms] [k] copy_user_generic_string
 0.28% firefox firefox [.] 0x0000000000013159

and:

Samples: 11K of event 'irq:irq_handler_entry', Event count (approx.): 11802
87.43% swapper [kernel.kallsyms] [k] handle_irq_event_percpu
 7.52% firefox [kernel.kallsyms] [k] handle_irq_event_percpu
 1.84% irq/36-ahci [kernel.kallsyms] [k] handle_irq_event_percpu
 1.14% Xorg [kernel.kallsyms] [k] handle_irq_event_percpu
 0.75% kworker/5:0 [kernel.kallsyms] [k] handle_irq_event_percpu
 0.32% gnome-shell [kernel.kallsyms] [k] handle_irq_event_percpu
 0.25% kworker/5:1H [kernel.kallsyms] [k] handle_irq_event_percpu
 0.25% Media D~ode #10 [kernel.kallsyms] [k] handle_irq_event_percpu
 0.19% ImageDe~er #330 [kernel.kallsyms] [k] handle_irq_event_percpu
 0.07% pulseaudio [kernel.kallsyms] [k] handle_irq_event_percpu

The cycles were with -e cycles:pp, so I think that iret would have shown up if there were enough IRQs to cause the problem.
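As a rough aid for reading perf listings like the two in this message, here is a minimal Python sketch that parses `perf report`-style rows and sums the percentage per command. The regular expression and the function name are illustrative assumptions based only on the row layout shown in this thread; they are not part of perf.

```python
import re
from collections import defaultdict

# Row layout assumed from the listings in this thread:
#   <percent>% <command> <object> [<.|k>] <symbol>
# The command may contain spaces (e.g. "Media D~ode #10"), so it is matched lazily.
ROW = re.compile(r"^\s*([\d.]+)%\s+(.+?)\s+(\S+)\s+\[([.k])\]\s+(\S+)\s*$")


def share_by_command(report_text):
    """Sum the sample percentage per command name."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:
            totals[match.group(2)] += float(match.group(1))
    return dict(totals)


# A few rows copied from the irq:irq_handler_entry listing in this thread.
SAMPLE = """\
87.43% swapper [kernel.kallsyms] [k] handle_irq_event_percpu
 7.52% firefox [kernel.kallsyms] [k] handle_irq_event_percpu
 1.84% irq/36-ahci [kernel.kallsyms] [k] handle_irq_event_percpu
 1.14% Xorg [kernel.kallsyms] [k] handle_irq_event_percpu
 0.25% Media D~ode #10 [kernel.kallsyms] [k] handle_irq_event_percpu
"""

if __name__ == "__main__":
    ranked = sorted(share_by_command(SAMPLE).items(), key=lambda kv: -kv[1])
    for command, percent in ranked:
        print(f"{command:20s} {percent:6.2f}%")
```

Grouping by command is mostly useful for the cycles profile, where one process (Xorg here) shows up under several symbols.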

I'll build a kernel with latencytop.

I just caught call_rwsem_down_write_failed for 5379 ms in khugepaged (holy crap) and radeon_fence_default_wait for 489.2ms in Xorg.

Turning off THP gets rid of the khugepaged thing. The 489.2ms in radeon_fence_default_wait is amazingly reproducible -- I've seen that exact number three times now.
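The thread does not say how THP was turned off. The usual ways are writing `never` to the sysfs file below as root, or booting with `transparent_hugepage=never`. A minimal Python sketch for checking which mode is active, assuming the standard sysfs path (the helper name is made up for illustration):

```python
from pathlib import Path

# Standard sysfs location of the THP mode switch on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from e.g. 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("THP mode:", active_thp_mode(THP_ENABLED.read_text()))
    else:
        print("THP sysfs file not found (kernel may be built without THP)")
```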

Sounds like the long stalls were THP, but the shorter ones might be radeon?

Can you get some call graphs for the profile or from latencytop? Make sure at least the kernel is built with frame pointers (CONFIG_FRAME_POINTER=y), preferably also userspace (-fno-omit-frame-pointer).

Will try next week. I'm out of town.

--Andy

-- Andy Lutomirski AMA Capital Management, LLC

Andy Lutomirski

9 Dec

12:24 a.m.

On Wed, Nov 26, 2014 at 7:38 AM, Andy Lutomirski luto@amacapital.net wrote:

Will try next week. I'm out of town.

The relevant line from latencytop seems to be:

154 20441402 489139 radeon_fence_default_wait [radeon] fence_wait_timeout ttm_bo_wait [ttm] ttm_bo_move_accel_cleanup [ttm] radeon_move_blit.isra.12 [radeon] radeon_bo_move [radeon] ttm_bo_handle_move_mem [ttm] ttm_bo_evict [ttm] ttm_mem_evict_first [ttm] ttm_bo_mem_space [ttm] ttm_bo_validate [ttm] radeon_bo_fault_reserve_notify [radeon]
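To make the numbers in that record easier to read, here is a small Python sketch. It assumes the record follows the `/proc/latency_stats` layout (hit count, accumulated latency in microseconds, maximum latency in microseconds, then the backtrace); that interpretation and the helper name are assumptions, not something stated in the thread.

```python
def parse_latency_record(line):
    """Split a latency record into hit count, average/max latency and backtrace.

    Assumed layout: <hits> <total_usec> <max_usec> <frames...>
    """
    hits, total_usec, max_usec, *frames = line.split()
    # Drop module tags such as "[radeon]" so only function names remain.
    functions = [frame for frame in frames if not frame.startswith("[")]
    return {
        "hits": int(hits),
        "avg_ms": int(total_usec) / int(hits) / 1000.0,
        "max_ms": int(max_usec) / 1000.0,
        "functions": functions,
    }


# The record quoted in this thread.
RECORD = (
    "154 20441402 489139 radeon_fence_default_wait [radeon] fence_wait_timeout "
    "ttm_bo_wait [ttm] ttm_bo_move_accel_cleanup [ttm] radeon_move_blit.isra.12 [radeon] "
    "radeon_bo_move [radeon] ttm_bo_handle_move_mem [ttm] ttm_bo_evict [ttm] "
    "ttm_mem_evict_first [ttm] ttm_bo_mem_space [ttm] ttm_bo_validate [ttm] "
    "radeon_bo_fault_reserve_notify [radeon]"
)

if __name__ == "__main__":
    record = parse_latency_record(RECORD)
    print(f"{record['hits']} hits, avg {record['avg_ms']:.1f} ms, max {record['max_ms']:.1f} ms")
    print(" <- ".join(record["functions"]))
```

Under that reading, the maximum is about 489 ms, in line with the figure reported earlier, and the average over 154 hits is roughly 133 ms.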

--Andy

-- Andy Lutomirski AMA Capital Management, LLC

Michel Dänzer

9:18 a.m.

On 09.12.2014 09:24, Andy Lutomirski wrote:

...

The relevant line from latencytop seems to be:

154 20441402 489139 radeon_fence_default_wait [radeon] fence_wait_timeout ttm_bo_wait [ttm] ttm_bo_move_accel_cleanup [ttm] radeon_move_blit.isra.12 [radeon] radeon_bo_move [radeon] ttm_bo_handle_move_mem [ttm] ttm_bo_evict [ttm] ttm_mem_evict_first [ttm] ttm_bo_mem_space [ttm] ttm_bo_validate [ttm] radeon_bo_fault_reserve_notify [radeon]

Which process is this?

Looks like CPU access to a BO in VRAM, but the BO is located outside of the CPU visible area of VRAM, so it has to be moved into the CPU visible area first.
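For a sense of scale, the boot log at the top of the thread reports "Detected VRAM RAM=1024M, BAR=256M". Assuming the BAR size corresponds to the CPU-visible window (an interpretation, not something the log states outright), a short Python sketch can work out how much of VRAM the CPU can reach directly. The function name is hypothetical.

```python
import re


def cpu_visible_fraction(dmesg_line):
    """Return (vram_mib, bar_mib, visible_fraction) from a radeon VRAM log line."""
    match = re.search(r"RAM=(\d+)M, BAR=(\d+)M", dmesg_line)
    if match is None:
        raise ValueError("not a 'Detected VRAM' line")
    vram, bar = int(match.group(1)), int(match.group(2))
    # The visible window cannot exceed the amount of VRAM actually present.
    return vram, bar, min(bar, vram) / vram


# Line copied from the boot log at the top of the thread.
LINE = "[ 3.079576] [drm] Detected VRAM RAM=1024M, BAR=256M"

if __name__ == "__main__":
    vram, bar, fraction = cpu_visible_fraction(LINE)
    print(f"{bar} MiB of {vram} MiB VRAM is CPU visible ({fraction:.0%})")
```

On this card that is a quarter of VRAM, so a buffer placed without regard to CPU access is more likely than not to land outside the visible window.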

Which version of Mesa are you using?

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

Andy Lutomirski

4:06 p.m.

On Tue, Dec 9, 2014 at 1:18 AM, Michel Dänzer michel@daenzer.net wrote:

...

On 09.12.2014 09:24, Andy Lutomirski wrote:

...
The relevant line from latencytop seems to be:

154 20441402 489139 radeon_fence_default_wait [radeon] fence_wait_timeout ttm_bo_wait [ttm] ttm_bo_move_accel_cleanup [ttm] radeon_move_blit.isra.12 [radeon] radeon_bo_move [radeon] ttm_bo_handle_move_mem [ttm] ttm_bo_evict [ttm] ttm_mem_evict_first [ttm] ttm_bo_mem_space [ttm] ttm_bo_validate [ttm] radeon_bo_fault_reserve_notify [radeon]

Which process is this?

Xorg

...

Looks like CPU access to a BO in VRAM, but the BO is located outside of the CPU visible area of VRAM, so it has to be moved into the CPU visible area first.

Which version of Mesa are you using?

mesa-dri-drivers-10.3.3-1.20141110.fc20.x86_64

I'm planning on upgrading to Fedora 21 fairly soon.

--Andy

...

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

-- Andy Lutomirski AMA Capital Management, LLC

Andy Lutomirski

9:39 p.m.

On Tue, Dec 9, 2014 at 8:06 AM, Andy Lutomirski luto@amacapital.net wrote:

mesa-dri-drivers-10.3.3-1.20141110.fc20.x86_64

I'm planning on upgrading to Fedora 21 fairly soon.

Upgrading to mesa-dri-drivers-10.3.3-1.20141110.fc21.x86_64 seems to have helped enough that my usual test (open a couple of Firefox tabs with graphics in them) doesn't hang anymore.

This card still isn't *fast*. Is there some way I can check that I'm actually using all 16 PCIe lanes? In my tinkering w/ power management settings, I got some odd logs suggesting that only one lane was in use.

Other than that, maybe everything works :) But I'm still waiting for the day that buggy userspace *can't* cause kernel graphics stalls.

--Andy

-- Andy Lutomirski AMA Capital Management, LLC

Michel Dänzer

10 Dec

9:44 a.m.

On 10.12.2014 06:39, Andy Lutomirski wrote:

Upgrading to mesa-dri-drivers-10.3.3-1.20141110.fc21.x86_64 seems to have helped enough that my usual test (open a couple of Firefox tabs with graphics in them) doesn't hang anymore.

Hmm, since that looks like the exact same upstream version, maybe it was actually upgrading something else that made the difference?

...

This card still isn't *fast*.

I'm afraid it wasn't exactly a high-end card even when it was new. What kind of operations are slow?

...

Is there some way I can check that I'm actually using all 16 PCIe lanes? In my tinkering w/ power management settings, I got some odd logs suggesting that only one lane was in use.

You can try forcing off ASPM with radeon.aspm=0, other than that I'm not sure.
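One possible way to check the negotiated link width is sketched below in Python. It assumes a kernel that exposes the `current_link_width` and `max_link_width` sysfs attributes; as far as I know these were added well after the kernels discussed here, so on 3.x the `LnkCap`/`LnkSta` lines of `lspci -vv` are the traditional place to look. The device address is taken from the boot log at the top of the thread, and the helper name is made up.

```python
from pathlib import Path


def read_link_widths(device_dir):
    """Return (current, maximum) PCIe lane counts, or None if not exposed."""
    device_dir = Path(device_dir)
    try:
        current = int((device_dir / "current_link_width").read_text().strip())
        maximum = int((device_dir / "max_link_width").read_text().strip())
    except (OSError, ValueError):
        # Attribute missing (older kernel, not a PCIe device) or unparsable.
        return None
    return current, maximum


if __name__ == "__main__":
    # 0000:09:00.0 is the radeon device address from the boot log.
    widths = read_link_widths("/sys/bus/pci/devices/0000:09:00.0")
    if widths is None:
        print("link width attributes not available")
    else:
        print(f"running at x{widths[0]} (device maximum x{widths[1]})")
```

Note that the link width may be reduced by power management while the card is idle, so a reading taken under load is more telling.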

...

But I'm still waiting for the day that buggy userspace *can't* cause kernel graphics stalls.

Actually, this looks more like buggy userspace stalling itself. :)

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

Andy Lutomirski

8:28 p.m.

On Wed, Dec 10, 2014 at 1:44 AM, Michel Dänzer michel@daenzer.net wrote:

Hmm, since that looks like the exact same upstream version, maybe it was actually upgrading something else that made the difference?

Maybe mutter?

...

...
This card still isn't *fast*.

I'm afraid it wasn't exactly a high-end card even when it was new. What kind of operations are slow?

Things like scrolling in Google Maps. It's not *that* bad, but older Intel IGPs still seem considerably smoother.

...

...
Is there some way I can check that I'm actually using all 16 PCIe lanes? In my tinkering w/ power management settings, I got some odd logs suggesting that only one lane was in use.

You can try forcing off ASPM with radeon.aspm=0, other than that I'm not sure.

...
But I'm still waiting for the day that buggy userspace *can't* cause kernel graphics stalls.

Actually, this looks more like buggy userspace stalling itself. :)

I thought the stall was the kernel evicting things from vram. Why does it need to wait for userspace for that? Is it that userspace is actively using whatever's being evicted?

--Andy

...

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

-- Andy Lutomirski AMA Capital Management, LLC

Michel Dänzer

11 Dec

4:24 a.m.

On 11.12.2014 05:28, Andy Lutomirski wrote:

Looks like CPU access to a BO in VRAM, but the BO is located outside of the CPU visible area of VRAM, so it has to be moved into the CPU visible area first.

[...]

...

...
...
But I'm still waiting for the day that buggy userspace *can't* cause kernel graphics stalls.

Actually, this looks more like buggy userspace stalling itself. :)

I thought the stall was the kernel evicting things from vram. Why does it need to wait for userspace for that? Is it that userspace is actively using whatever's being evicted?

As I explained above, the stall happens because userspace does CPU access to a BO which resides in the CPU-inaccessible part of VRAM. The kernel has to move the BO into the CPU accessible part of VRAM before it can let userspace proceed.

Current Mesa (10.4 or newer I think) sets a hint for BOs which will likely be accessed by the CPU, so recent kernels can prioritize putting those into the CPU accessible part of VRAM in the first place.

Or, if you're using EXA, the problem could be in the xf86-video-ati EXA code.

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

Andy Lutomirski

5:13 a.m.

On Wed, Dec 10, 2014 at 8:24 PM, Michel Dänzer michel@daenzer.net wrote:

As I explained above, the stall happens because userspace does CPU access to a BO which resides in the CPU-inaccessible part of VRAM. The kernel has to move the BO into the CPU accessible part of VRAM before it can let userspace proceed.

Sure, but why does that take nearly 500ms? Even if the object in question is the entire framebuffer, that still seems extraordinarily slow.

--Andy

...

Current Mesa (10.4 or newer I think) sets a hint for BOs which will likely be accessed by the CPU, so recent kernels can prioritize putting those into the CPU accessible part of VRAM in the first place.

Or, if you're using EXA, the problem could be in the xf86-video-ati EXA code.

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

-- Andy Lutomirski AMA Capital Management, LLC

Michel Dänzer

16 Dec

8 a.m.

On 11.12.2014 14:13, Andy Lutomirski wrote:

Sure, but why does that take nearly 500ms? Even if the object in question is the entire framebuffer, that still seems extraordinarily slow.

It has to wait for any previously queued GPU operations and the eviction of other buffers. Also, TTM buffer moves are currently synchronous, i.e. TTM waits for a buffer to become idle before starting its move, which means we don't get maximum throughput for a series of buffer moves.

-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer

Alex Deucher

10 Dec

2:56 p.m.

On Tue, Dec 9, 2014 at 4:39 PM, Andy Lutomirski luto@amacapital.net wrote:

This card still isn't *fast*. Is there some way I can check that I'm actually using all 16 PCIe lanes? In my tinkering w/ power management settings, I got some odd logs suggesting that only one lane was in use.

You should be using all the lanes available. The main issue with that card is vram memory bandwidth. Those chips have a single channel memory interface and most OEMs populate them with DDR3 memory rather than GDDR5.

from your log:
[ 3.079577] [drm] RAM width 64bits DDR
...
[ 3.082589] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
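To illustrate the bandwidth point with rough arithmetic, here is a Python sketch. The 64-bit bus width comes from the log line above; the transfer rates are hypothetical round numbers chosen only to contrast DDR3-class and GDDR5-class memory, not measured or documented values for this card.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfer rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000


if __name__ == "__main__":
    bus_width = 64  # from "[drm] RAM width 64bits DDR"
    # Hypothetical transfer rates, for comparison only.
    for label, rate in (("DDR3-class, 1600 MT/s", 1600), ("GDDR5-class, 4000 MT/s", 4000)):
        print(f"{label}: {peak_bandwidth_gb_s(bus_width, rate):.1f} GB/s")
```

Under those assumptions, the same 64-bit interface moves 12.8 GB/s with the slower memory and 32.0 GB/s with the faster one.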

Alex


dri-devel@lists.freedesktop.org
