On 13.06.2014 23:31, Alex Deucher wrote:
On Fri, Jun 13, 2014 at 11:45 AM, Christian König deathsimple@vodafone.de wrote:
Hi Marek,
ah, yes! Piglit in combination with that patch can indeed crash the box.
Going to investigate now that I can reproduce it.
I wonder if it's a clockgating issue with the MC or BIF? You might try adjusting the rdev->cg_flags (try setting it to 0) in radeon_asic.c or disabling dpm.
Unfortunately that was just a false alarm.
I was just on a branch which didn't have the "stop poisoning the GART TLB" patch. After applying this patch I can again let piglit run for the whole night without a lockup.
No idea what goes wrong when Marek runs piglit, but 3.15.0+"stop poisoning the GART TLB"+"force_gtt" is rock solid here.
Christian.
Alex
Thanks, Christian.
On 13.06.2014 15:19, Marek Olšák wrote:
Hi,
With my "force_gtt" patch, Cape Verde is unstable too, so all GCN chips are affected.
I recommend applying that patch, because it will reproduce the problem faster. Without it, the hangs are very rare and it may take a while before they occur.
Marek
On Thu, Jun 12, 2014 at 1:23 PM, Christian König deathsimple@vodafone.de wrote:
Please do so, and you might want to try 3.15.0 as well.
I've tested multiple piglit runs overnight with my Bonaire and 3.15.0 and that seemed to work perfectly fine.
Going to test Alex's drm-next-3.16 a bit more as well.
Christian.
On 11.06.2014 12:56, Marek Olšák wrote:
I only tested Bonaire. I can test Cape Verde if needed.
Marek
On Wed, Jun 11, 2014 at 11:29 AM, Christian König deathsimple@vodafone.de wrote:
Crap, I already wanted to check back with you if that really fixes your problems.
Thanks for the info. This crash also only happens on CIK, doesn't it?
Christian.
On 11.06.2014 01:30, Marek Olšák wrote:
> Sorry to tell you the bad news. This patch doesn't fix the hangs on my
> machine.
>
> I tested drm-next-3.16 from Alex's tree. I also switched copying from
> SDMA to CP DMA, which hung too.
>
> I also tried this:
>
> git checkout (the problematic commit):
> 6d2f294 - drm/radeon: use normal BOs for the page tables v4
>
> git cherry-pick (fixes):
> 0e97703c - drm/radeon: add define for flags used in R600+ GTT
> 0986c1a5 - drm/radeon: stop poisoning the GART TLB
> 4906f689 - drm/radeon: fix page directory update size estimation
> 4b095566 - drm/radeon: fix buffer placement under memory pressure v2
>
> Then I tested both SDMA and CP DMA copying. Both were unstable.
>
> Testing was done with piglit / quick.tests.
>
> Marek
>
>
> On Wed, Jun 4, 2014 at 3:29 PM, Christian König deathsimple@vodafone.de wrote:
>> From: Christian König christian.koenig@amd.com
>>
>> When we set the valid bit on invalid GART entries they are
>> loaded into the TLB when an adjacent entry is loaded. This
>> poisons the TLB with invalid entries which are sometimes
>> not correctly removed on TLB flush.
>>
>> For stable inclusion the patch probably needs to be modified a bit.
>>
>> Signed-off-by: Christian König christian.koenig@amd.com
>> Cc: stable@vger.kernel.org
>> ---
>>  drivers/gpu/drm/radeon/rs600.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
>> index 0a8be63..e0465b2 100644
>> --- a/drivers/gpu/drm/radeon/rs600.c
>> +++ b/drivers/gpu/drm/radeon/rs600.c
>> @@ -634,7 +634,10 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, uint64_t addr)
>>                  return -EINVAL;
>>          }
>>          addr = addr & 0xFFFFFFFFFFFFF000ULL;
>> -        addr |= R600_PTE_GART;
>> +        if (addr == rdev->dummy_page.addr)
>> +                addr |= R600_PTE_SYSTEM | R600_PTE_SNOOPED;
>> +        else
>> +                addr |= R600_PTE_GART;
>>          writeq(addr, ptr + (i * 8));
>>          return 0;
>>  }
>> --
>> 1.9.1
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
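
For readers following the quoted patch, the flag selection it introduces can be sketched in standalone C. This is only a minimal illustration, not the driver code: the `PTE_*` bit values are assumed to mirror the `R600_PTE_*` defines in radeon.h, and `make_gart_pte()`, `pte_is_valid()` and `pte_example_checks()` are hypothetical helpers that do not exist in the kernel. The point it shows is that a slot backed by the dummy page gets the system and snooped bits but no valid bit, so a TLB fill of a neighbouring entry should not cache it as a usable translation.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits assumed to mirror the R600_PTE_* defines in radeon.h;
 * treat the exact values as illustrative only. */
#define PTE_VALID     (1ULL << 0)
#define PTE_SYSTEM    (1ULL << 1)
#define PTE_SNOOPED   (1ULL << 2)
#define PTE_READABLE  (1ULL << 5)
#define PTE_WRITEABLE (1ULL << 6)
#define PTE_GART      (PTE_VALID | PTE_SYSTEM | PTE_SNOOPED | \
                       PTE_READABLE | PTE_WRITEABLE)

/* Hypothetical helper: build the 64-bit page table entry for one GART
 * slot, following the logic of the patched rs600_gart_set_page(). */
static uint64_t make_gart_pte(uint64_t addr, uint64_t dummy_page_addr)
{
    /* Keep only the page-aligned part of the address. */
    addr &= 0xFFFFFFFFFFFFF000ULL;

    if (addr == dummy_page_addr)
        addr |= PTE_SYSTEM | PTE_SNOOPED;  /* unused slot: no VALID bit */
    else
        addr |= PTE_GART;                  /* real mapping: full flag set */

    return addr;
}

/* Hypothetical helper: report whether an entry has the VALID bit set. */
static int pte_is_valid(uint64_t pte)
{
    return (pte & PTE_VALID) != 0;
}

/* Small usage example: the dummy page is left invalid, while any other
 * page gets a valid entry. */
static void pte_example_checks(void)
{
    const uint64_t dummy = 0x1000;

    assert(!pte_is_valid(make_gart_pte(dummy, dummy)));
    assert(pte_is_valid(make_gart_pte(0x2000, dummy)));
}
```

The real function additionally bounds-checks the index and writes the entry into the GART table with `writeq()`; both steps are left out here to keep the sketch self-contained.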