Hi!
I've never been able to run the open source drm driver on my 7870 Tahiti card. The console KMS works, but it crashes as soon as X is started. There have been many mentions of this in bug reports, but none of the attempted fixes worked.

https://bugs.freedesktop.org/show_bug.cgi?id=71689 is currently open.
https://bugs.freedesktop.org/show_bug.cgi?id=60879 was about a few different issues, including this one, but none of the proposed fixes worked on my 7870.
I decided to debug this on my own, and though I am a total noob at driver development, I think I made some progress at understanding the issue.
The 7870 Tahiti is a "harvested" chip, which means some CUs are disabled: 25% of them in this case. The code handling this is in si.c, in the function si_setup_spi(). The idea seems to be that a bit mask telling which CUs are truly available must be set in the SPI_STATIC_THREAD_MGMT_3 register. But the algorithm that builds that mask seems fuzzy to me. It walks the bits of active_cu until it finds an active one, and stops there to build its mask.
    data = RREG32(SPI_STATIC_THREAD_MGMT_3);
    active_cu = si_get_cu_enabled(rdev, cu_per_sh);

    mask = 1;
    for (k = 0; k < 16; k++) {
        mask <<= k;
        if (active_cu & mask) {
            data &= ~mask;
            WREG32(SPI_STATIC_THREAD_MGMT_3, data);
            break;
        }
    }
However, from the little I understand, that doesn't cover all cases; it only works if the disabled CUs are in the lower bits. For my card, the active_cu results are:

    Decimal - Binary
    252     - 11111100
    252     - 11111100
    207     - 11001111
    252     - 11111100
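To make the positions of the disabled CUs explicit, here is a minimal sketch that inverts an active_cu value. It assumes 8 CUs per shader array, which is only inferred from the 8-bit values listed above; CU_PER_SH, disabled_cu_mask() and count_bits() are hypothetical names for illustration and are not part of si.c.

```c
#include <stdint.h>

/* Assumption: 8 CUs per shader array, inferred from the 8-bit values above. */
#define CU_PER_SH 8

/* Return a mask with a 1 for every *disabled* CU in one shader array. */
static uint32_t disabled_cu_mask(uint32_t active_cu)
{
    return ~active_cu & ((1u << CU_PER_SH) - 1u);
}

/* Count the set bits in a mask (clears the lowest set bit each pass). */
static unsigned count_bits(uint32_t v)
{
    unsigned n = 0;

    for (; v; v &= v - 1)
        n++;
    return n;
}
```

With these helpers, disabled_cu_mask(252) gives 0x03 (bits 0 and 1) and disabled_cu_mask(207) gives 0x30 (bits 4 and 5). Each group has two disabled CUs out of eight, which matches the 25% figure.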
As you can see, the 3rd group has its disabled CUs right in the middle, but the algorithm probably thinks they are all good, since the first bit is 1 and it stops right there. So I guess it tries to use bad CUs at runtime and fails miserably.

I tried to change the way the register data is computed, but I just can't figure out the logic of it (and I couldn't find many details in AMD's docs). Assuming it generates the right thing for active_cu == 11111100, I get data == 1111111111110111. So I have 2 disabled units, but only a single 0 in the mask? Pretending to have more disabled units, active_cu == 11110000, that generates data == 1111111110111111. Again, a single 0. Is that right? What would be the bit pattern required for active_cu == 11001111?
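The two data values above can be reproduced with a stand-alone model of the quoted loop. This is only a sketch: model_spi_loop() is a hypothetical name, it touches no hardware, and the starting value 0xffff for data is an assumption. Since `mask <<= k` shifts an already shifted mask, the loop tests bits 0, 1, 3, 6, 10 and 15 rather than every bit, which would explain why the single 0 lands on bit 3 and bit 6 in those two cases.

```c
#include <stdint.h>

/*
 * Stand-alone model of the quoted loop; no register access.
 * 'data' stands in for the value read from SPI_STATIC_THREAD_MGMT_3.
 */
static uint32_t model_spi_loop(uint32_t data, uint32_t active_cu)
{
    uint32_t mask = 1;
    int k;

    for (k = 0; k < 16; k++) {
        /* Cumulative shift: mask visits bits 0, 1, 3, 6, 10, 15, ... */
        mask <<= k;
        if (active_cu & mask) {
            data &= ~mask;  /* clear the first tested bit that is active */
            break;
        }
    }
    return data;
}
```

Under the same assumption, model_spi_loop(0xffff, 0xcf) returns 0xfffe for the 11001111 group, because bit 0 is active and the loop stops on its first pass. That is what the code as written would produce, not necessarily the pattern the hardware needs.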
I might be way off track in my investigation, so please enlighten me!

Thanks for your help,
Alexandre
On Thu, Oct 8, 2015 at 9:59 PM, Alexandre Biron bironalexandre@gmail.com wrote:
[...]
I don't think this register even needs to be programmed. Can you try skipping the call to si_setup_spi()? The hw default value of 0xffff should be fine. These registers are not for harvesting, but rather for limiting the number of CUs used by specific shader stages, so even if you program them wrong, the hw will do the right thing internally.
Alex
Hi!
Skipping si_setup_spi() doesn't help at all. Where is harvesting handled and set up, then? The only other place that seems related to me is si_setup_rb(), but from what I can see by logging, none of my backends seem to be disabled.

Anything else I can check?

Thanks!
Alexandre
On Mon, Oct 12, 2015 at 5:20 PM, Alex Deucher alexdeucher@gmail.com wrote:
[...]
dri-devel@lists.freedesktop.org