From: Christian König christian.koenig@amd.com
Stop fiddling with jiffies, always wait for RADEON_FENCE_JIFFIES_TIMEOUT. Consolidate the two wait sequence implementations into just one function. Activate all waiters and remember if the reset was already done instead of trying to reset from only one thread.
v2: clear reset flag earlier to avoid timeout in IB test
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_device.c | 8 + drivers/gpu/drm/radeon/radeon_fence.c | 347 +++++++++++---------------------- 3 files changed, 127 insertions(+), 230 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 24f4960..f289308 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -327,7 +327,6 @@ struct radeon_fence_driver { /* sync_seq is protected by ring emission lock */ uint64_t sync_seq[RADEON_NUM_RINGS]; atomic64_t last_seq; - unsigned long last_activity; bool initialized; };
@@ -2170,6 +2169,7 @@ struct radeon_device { bool need_dma32; bool accel_working; bool fastfb_working; /* IGP feature*/ + bool needs_reset; struct radeon_surface_reg surface_regs[RADEON_GEM_MAX_SURFACES]; const struct firmware *me_fw; /* all family ME firmware */ const struct firmware *pfp_fw; /* r6/700 PFP firmware */ diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 841d0e0..3f35f21 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1549,6 +1549,14 @@ int radeon_gpu_reset(struct radeon_device *rdev) int resched;
down_write(&rdev->exclusive_lock); + + if (!rdev->needs_reset) { + up_write(&rdev->exclusive_lock); + return 0; + } + + rdev->needs_reset = false; + radeon_save_bios_scratch_regs(rdev); /* block TTM */ resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index ddb8f8e..b8f68b2 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -190,10 +190,8 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) } } while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq);
- if (wake) { - rdev->fence_drv[ring].last_activity = jiffies; + if (wake) wake_up_all(&rdev->fence_queue); - } }
/** @@ -212,13 +210,13 @@ static void radeon_fence_destroy(struct kref *kref) }
/** - * radeon_fence_seq_signaled - check if a fence sequeuce number has signaled + * radeon_fence_seq_signaled - check if a fence sequence number has signaled * * @rdev: radeon device pointer * @seq: sequence number * @ring: ring index the fence is associated with * - * Check if the last singled fence sequnce number is >= the requested + * Check if the last signaled fence sequnce number is >= the requested * sequence number (all asics). * Returns true if the fence has signaled (current fence value * is >= requested value) or false if it has not (current fence @@ -263,113 +261,131 @@ bool radeon_fence_signaled(struct radeon_fence *fence) }
/** - * radeon_fence_wait_seq - wait for a specific sequence number + * radeon_fence_any_seq_signaled - check if any sequence number is signaled * * @rdev: radeon device pointer - * @target_seq: sequence number we want to wait for - * @ring: ring index the fence is associated with + * @seq: sequence numbers + * + * Check if the last signaled fence sequnce number is >= the requested + * sequence number (all asics). + * Returns true if any has signaled (current value is >= requested value) + * or false if it has not. Helper function for radeon_fence_wait_seq. + */ +static bool radeon_fence_any_seq_signaled(struct radeon_device *rdev, u64 *seq) +{ + unsigned i; + + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (seq[i] && radeon_fence_seq_signaled(rdev, seq[i], i)) + return true; + } + return false; +} + +/** + * radeon_fence_wait_seq - wait for a specific sequence numbers + * + * @rdev: radeon device pointer + * @target_seq: sequence number(s) we want to wait for * @intr: use interruptable sleep * @lock_ring: whether the ring should be locked or not * - * Wait for the requested sequence number to be written (all asics). + * Wait for the requested sequence number(s) to be written by any ring + * (all asics). Sequnce number array is indexed by ring id. * @intr selects whether to use interruptable (true) or non-interruptable * (false) sleep when waiting for the sequence number. Helper function - * for radeon_fence_wait(), et al. + * for radeon_fence_wait_*(). * Returns 0 if the sequence number has passed, error for all other cases. - * -EDEADLK is returned when a GPU lockup has been detected and the ring is - * marked as not ready so no further jobs get scheduled until a successful - * reset. + * -EDEADLK is returned when a GPU lockup has been detected. */ -static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, - unsigned ring, bool intr, bool lock_ring) +static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 *target_seq, + bool intr, bool lock_ring) { - unsigned long timeout, last_activity; - uint64_t seq; - unsigned i; + uint64_t last_seq[RADEON_NUM_RINGS]; bool signaled; - int r; + int i, r; + + while (!radeon_fence_any_seq_signaled(rdev, target_seq)) { + + /* Save current sequence values, used to check for GPU lockups */ + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) + continue;
- while (target_seq > atomic64_read(&rdev->fence_drv[ring].last_seq)) { - if (!rdev->ring[ring].ready) { - return -EBUSY; + last_seq[i] = atomic64_read(&rdev->fence_drv[i].last_seq); + trace_radeon_fence_wait_begin(rdev->ddev, target_seq[i]); + radeon_irq_kms_sw_irq_get(rdev, i); }
- timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT; - if (time_after(rdev->fence_drv[ring].last_activity, timeout)) { - /* the normal case, timeout is somewhere before last_activity */ - timeout = rdev->fence_drv[ring].last_activity - timeout; + if (intr) { + r = wait_event_interruptible_timeout(rdev->fence_queue, ( + (signaled = radeon_fence_any_seq_signaled(rdev, target_seq)) + || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT); } else { - /* either jiffies wrapped around, or no fence was signaled in the last 500ms - * anyway we will just wait for the minimum amount and then check for a lockup - */ - timeout = 1; + r = wait_event_timeout(rdev->fence_queue, ( + (signaled = radeon_fence_any_seq_signaled(rdev, target_seq)) + || rdev->needs_reset), RADEON_FENCE_JIFFIES_TIMEOUT); } - seq = atomic64_read(&rdev->fence_drv[ring].last_seq); - /* Save current last activity valuee, used to check for GPU lockups */ - last_activity = rdev->fence_drv[ring].last_activity;
- trace_radeon_fence_wait_begin(rdev->ddev, seq); - radeon_irq_kms_sw_irq_get(rdev, ring); - if (intr) { - r = wait_event_interruptible_timeout(rdev->fence_queue, - (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), - timeout); - } else { - r = wait_event_timeout(rdev->fence_queue, - (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), - timeout); + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) + continue; + + radeon_irq_kms_sw_irq_put(rdev, i); + trace_radeon_fence_wait_end(rdev->ddev, target_seq[i]); } - radeon_irq_kms_sw_irq_put(rdev, ring); - if (unlikely(r < 0)) { + + if (unlikely(r < 0)) return r; - } - trace_radeon_fence_wait_end(rdev->ddev, seq);
if (unlikely(!signaled)) { + if (rdev->needs_reset) + return -EDEADLK; + /* we were interrupted for some reason and fence * isn't signaled yet, resume waiting */ - if (r) { + if (r) continue; + + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) + continue; + + if (last_seq[i] != atomic64_read(&rdev->fence_drv[i].last_seq)) + break; }
- /* check if sequence value has changed since last_activity */ - if (seq != atomic64_read(&rdev->fence_drv[ring].last_seq)) { + if (i != RADEON_NUM_RINGS) continue; - }
- if (lock_ring) { + if (lock_ring) mutex_lock(&rdev->ring_lock); - }
- /* test if somebody else has already decided that this is a lockup */ - if (last_activity != rdev->fence_drv[ring].last_activity) { - if (lock_ring) { - mutex_unlock(&rdev->ring_lock); - } - continue; + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) + continue; + + if (radeon_ring_is_lockup(rdev, i, &rdev->ring[i])) + break; }
- if (radeon_ring_is_lockup(rdev, ring, &rdev->ring[ring])) { + if (i < RADEON_NUM_RINGS) { /* good news we believe it's a lockup */ - dev_warn(rdev->dev, "GPU lockup (waiting for 0x%016llx last fence id 0x%016llx)\n", - target_seq, seq); - - /* change last activity so nobody else think there is a lockup */ - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - rdev->fence_drv[i].last_activity = jiffies; - } - - /* mark the ring as not ready any more */ - rdev->ring[ring].ready = false; - if (lock_ring) { + dev_warn(rdev->dev, "GPU lockup (waiting for " + "0x%016llx last fence id 0x%016llx on" + " ring %d)\n", + target_seq[i], last_seq[i], i); + + /* remember that we need an reset */ + rdev->needs_reset = true; + if (lock_ring) mutex_unlock(&rdev->ring_lock); - } + wake_up_all(&rdev->fence_queue); return -EDEADLK; }
- if (lock_ring) { + if (lock_ring) mutex_unlock(&rdev->ring_lock); - } } } return 0; @@ -388,6 +404,7 @@ static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, */ int radeon_fence_wait(struct radeon_fence *fence, bool intr) { + uint64_t seq[RADEON_NUM_RINGS] = {}; int r;
if (fence == NULL) { @@ -395,147 +412,15 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr) return -EINVAL; }
- r = radeon_fence_wait_seq(fence->rdev, fence->seq, - fence->ring, intr, true); - if (r) { - return r; - } - fence->seq = RADEON_FENCE_SIGNALED_SEQ; - return 0; -} - -static bool radeon_fence_any_seq_signaled(struct radeon_device *rdev, u64 *seq) -{ - unsigned i; - - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - if (seq[i] && radeon_fence_seq_signaled(rdev, seq[i], i)) { - return true; - } - } - return false; -} - -/** - * radeon_fence_wait_any_seq - wait for a sequence number on any ring - * - * @rdev: radeon device pointer - * @target_seq: sequence number(s) we want to wait for - * @intr: use interruptable sleep - * - * Wait for the requested sequence number(s) to be written by any ring - * (all asics). Sequnce number array is indexed by ring id. - * @intr selects whether to use interruptable (true) or non-interruptable - * (false) sleep when waiting for the sequence number. Helper function - * for radeon_fence_wait_any(), et al. - * Returns 0 if the sequence number has passed, error for all other cases. - */ -static int radeon_fence_wait_any_seq(struct radeon_device *rdev, - u64 *target_seq, bool intr) -{ - unsigned long timeout, last_activity, tmp; - unsigned i, ring = RADEON_NUM_RINGS; - bool signaled; - int r; - - for (i = 0, last_activity = 0; i < RADEON_NUM_RINGS; ++i) { - if (!target_seq[i]) { - continue; - } - - /* use the most recent one as indicator */ - if (time_after(rdev->fence_drv[i].last_activity, last_activity)) { - last_activity = rdev->fence_drv[i].last_activity; - } - - /* For lockup detection just pick the lowest ring we are - * actively waiting for - */ - if (i < ring) { - ring = i; - } - } - - /* nothing to wait for ? */ - if (ring == RADEON_NUM_RINGS) { - return -ENOENT; - } - - while (!radeon_fence_any_seq_signaled(rdev, target_seq)) { - timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT; - if (time_after(last_activity, timeout)) { - /* the normal case, timeout is somewhere before last_activity */ - timeout = last_activity - timeout; - } else { - /* either jiffies wrapped around, or no fence was signaled in the last 500ms - * anyway we will just wait for the minimum amount and then check for a lockup - */ - timeout = 1; - } + seq[fence->ring] = fence->seq; + if (seq[fence->ring] == RADEON_FENCE_SIGNALED_SEQ) + return 0;
- trace_radeon_fence_wait_begin(rdev->ddev, target_seq[ring]); - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - if (target_seq[i]) { - radeon_irq_kms_sw_irq_get(rdev, i); - } - } - if (intr) { - r = wait_event_interruptible_timeout(rdev->fence_queue, - (signaled = radeon_fence_any_seq_signaled(rdev, target_seq)), - timeout); - } else { - r = wait_event_timeout(rdev->fence_queue, - (signaled = radeon_fence_any_seq_signaled(rdev, target_seq)), - timeout); - } - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - if (target_seq[i]) { - radeon_irq_kms_sw_irq_put(rdev, i); - } - } - if (unlikely(r < 0)) { - return r; - } - trace_radeon_fence_wait_end(rdev->ddev, target_seq[ring]); - - if (unlikely(!signaled)) { - /* we were interrupted for some reason and fence - * isn't signaled yet, resume waiting */ - if (r) { - continue; - } - - mutex_lock(&rdev->ring_lock); - for (i = 0, tmp = 0; i < RADEON_NUM_RINGS; ++i) { - if (time_after(rdev->fence_drv[i].last_activity, tmp)) { - tmp = rdev->fence_drv[i].last_activity; - } - } - /* test if somebody else has already decided that this is a lockup */ - if (last_activity != tmp) { - last_activity = tmp; - mutex_unlock(&rdev->ring_lock); - continue; - } - - if (radeon_ring_is_lockup(rdev, ring, &rdev->ring[ring])) { - /* good news we believe it's a lockup */ - dev_warn(rdev->dev, "GPU lockup (waiting for 0x%016llx)\n", - target_seq[ring]); - - /* change last activity so nobody else think there is a lockup */ - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - rdev->fence_drv[i].last_activity = jiffies; - } + r = radeon_fence_wait_seq(fence->rdev, seq, intr, true); + if (r) + return r;
- /* mark the ring as not ready any more */ - rdev->ring[ring].ready = false; - mutex_unlock(&rdev->ring_lock); - return -EDEADLK; - } - mutex_unlock(&rdev->ring_lock); - } - } + fence->seq = RADEON_FENCE_SIGNALED_SEQ; return 0; }
@@ -557,7 +442,7 @@ int radeon_fence_wait_any(struct radeon_device *rdev, bool intr) { uint64_t seq[RADEON_NUM_RINGS]; - unsigned i; + unsigned i, num_rings = 0; int r;
for (i = 0; i < RADEON_NUM_RINGS; ++i) { @@ -567,15 +452,19 @@ int radeon_fence_wait_any(struct radeon_device *rdev, continue; }
- if (fences[i]->seq == RADEON_FENCE_SIGNALED_SEQ) { - /* something was allready signaled */ - return 0; - } - seq[i] = fences[i]->seq; + ++num_rings; + + /* test if something was allready signaled */ + if (seq[i] == RADEON_FENCE_SIGNALED_SEQ) + return 0; }
- r = radeon_fence_wait_any_seq(rdev, seq, intr); + /* nothing to wait for ? */ + if (num_rings == 0) + return -ENOENT; + + r = radeon_fence_wait_seq(rdev, seq, intr, true); if (r) { return r; } @@ -594,15 +483,15 @@ int radeon_fence_wait_any(struct radeon_device *rdev, */ int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring) { - uint64_t seq; + uint64_t seq[RADEON_NUM_RINGS] = {};
- seq = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL; - if (seq >= rdev->fence_drv[ring].sync_seq[ring]) { + seq[ring] = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL; + if (seq[ring] >= rdev->fence_drv[ring].sync_seq[ring]) { /* nothing to wait for, last_seq is already the last emited fence */ return -ENOENT; } - return radeon_fence_wait_seq(rdev, seq, ring, false, false); + return radeon_fence_wait_seq(rdev, seq, false, false); }
/** @@ -617,14 +506,15 @@ int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring) */ int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring) { - uint64_t seq = rdev->fence_drv[ring].sync_seq[ring]; + uint64_t seq[RADEON_NUM_RINGS] = {}; int r;
- r = radeon_fence_wait_seq(rdev, seq, ring, false, false); + seq[ring] = rdev->fence_drv[ring].sync_seq[ring]; + r = radeon_fence_wait_seq(rdev, seq, false, false); if (r) { - if (r == -EDEADLK) { + if (r == -EDEADLK) return -EDEADLK; - } + dev_err(rdev->dev, "error waiting for ring[%d] to become idle (%d)\n", ring, r); } @@ -826,7 +716,6 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring) for (i = 0; i < RADEON_NUM_RINGS; ++i) rdev->fence_drv[ring].sync_seq[i] = 0; atomic64_set(&rdev->fence_drv[ring].last_seq, 0); - rdev->fence_drv[ring].last_activity = jiffies; rdev->fence_drv[ring].initialized = false; }
From: Christian König christian.koenig@amd.com
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/radeon/cik_sdma.c | 3 +++ drivers/gpu/drm/radeon/ni_dma.c | 3 +++ drivers/gpu/drm/radeon/radeon_trace.h | 24 ++++++++++++++++++++++++ drivers/gpu/drm/radeon/si_dma.c | 3 +++ 4 files changed, 33 insertions(+)
diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c index b628606..ec91427 100644 --- a/drivers/gpu/drm/radeon/cik_sdma.c +++ b/drivers/gpu/drm/radeon/cik_sdma.c @@ -25,6 +25,7 @@ #include <drm/drmP.h> #include "radeon.h" #include "radeon_asic.h" +#include "radeon_trace.h" #include "cikd.h"
/* sdma */ @@ -657,6 +658,8 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, uint64_t value; unsigned ndw;
+ trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + if (flags & RADEON_VM_PAGE_SYSTEM) { while (count) { ndw = count * 2; diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c index dd6e968..e9cfe8a 100644 --- a/drivers/gpu/drm/radeon/ni_dma.c +++ b/drivers/gpu/drm/radeon/ni_dma.c @@ -24,6 +24,7 @@ #include <drm/drmP.h> #include "radeon.h" #include "radeon_asic.h" +#include "radeon_trace.h" #include "nid.h"
u32 cayman_gpu_check_soft_reset(struct radeon_device *rdev); @@ -260,6 +261,8 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, uint64_t value; unsigned ndw;
+ trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + if ((flags & RADEON_VM_PAGE_SYSTEM) || (count == 1)) { while (count) { ndw = count * 2; diff --git a/drivers/gpu/drm/radeon/radeon_trace.h b/drivers/gpu/drm/radeon/radeon_trace.h index f7e3678..811bca6 100644 --- a/drivers/gpu/drm/radeon/radeon_trace.h +++ b/drivers/gpu/drm/radeon/radeon_trace.h @@ -47,6 +47,30 @@ TRACE_EVENT(radeon_cs, __entry->fences) );
+TRACE_EVENT(radeon_vm_set_page, + TP_PROTO(uint64_t pe, uint64_t addr, unsigned count, + uint32_t incr, uint32_t flags), + TP_ARGS(pe, addr, count, incr, flags), + TP_STRUCT__entry( + __field(u64, pe) + __field(u64, addr) + __field(u32, count) + __field(u32, incr) + __field(u32, flags) + ), + + TP_fast_assign( + __entry->pe = pe; + __entry->addr = addr; + __entry->count = count; + __entry->incr = incr; + __entry->flags = flags; + ), + TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%08x, count=%u", + __entry->pe, __entry->addr, __entry->incr, + __entry->flags, __entry->count) +); + DECLARE_EVENT_CLASS(radeon_fence_request,
TP_PROTO(struct drm_device *dev, u32 seqno), diff --git a/drivers/gpu/drm/radeon/si_dma.c b/drivers/gpu/drm/radeon/si_dma.c index 49909d2..17205fd 100644 --- a/drivers/gpu/drm/radeon/si_dma.c +++ b/drivers/gpu/drm/radeon/si_dma.c @@ -24,6 +24,7 @@ #include <drm/drmP.h> #include "radeon.h" #include "radeon_asic.h" +#include "radeon_trace.h" #include "sid.h"
u32 si_gpu_check_soft_reset(struct radeon_device *rdev); @@ -79,6 +80,8 @@ void si_dma_vm_set_page(struct radeon_device *rdev, uint64_t value; unsigned ndw;
+ trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + if (flags & RADEON_VM_PAGE_SYSTEM) { while (count) { ndw = count * 2;
From: Christian König christian.koenig@amd.com
The DMA ring seems to be stable now.
v2: remove pt_ring_index as well
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/radeon/cik.c | 61 ----------------------------- drivers/gpu/drm/radeon/cik_sdma.c | 21 ++++------ drivers/gpu/drm/radeon/ni.c | 76 ------------------------------------ drivers/gpu/drm/radeon/ni_dma.c | 18 ++++----- drivers/gpu/drm/radeon/radeon.h | 8 +++- drivers/gpu/drm/radeon/radeon_asic.c | 15 +++---- drivers/gpu/drm/radeon/radeon_asic.h | 31 ++++++++------- drivers/gpu/drm/radeon/radeon_gart.c | 29 +++++++++++--- drivers/gpu/drm/radeon/si.c | 60 ---------------------------- drivers/gpu/drm/radeon/si_dma.c | 21 ++++------ 10 files changed, 73 insertions(+), 267 deletions(-)
diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 9cd2bc9..f9ed992 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -67,11 +67,6 @@ extern void si_init_uvd_internal_cg(struct radeon_device *rdev); extern int cik_sdma_resume(struct radeon_device *rdev); extern void cik_sdma_enable(struct radeon_device *rdev, bool enable); extern void cik_sdma_fini(struct radeon_device *rdev); -extern void cik_sdma_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags); static void cik_rlc_stop(struct radeon_device *rdev); static void cik_pcie_gen3_enable(struct radeon_device *rdev); static void cik_program_aspm(struct radeon_device *rdev); @@ -4834,62 +4829,6 @@ void cik_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm) } }
-/** - * cik_vm_set_page - update the page tables using sDMA - * - * @rdev: radeon_device pointer - * @ib: indirect buffer to fill with commands - * @pe: addr of the page entry - * @addr: dst addr to write into pe - * @count: number of page entries to update - * @incr: increase next addr by incr bytes - * @flags: access flags - * - * Update the page tables using CP or sDMA (CIK). - */ -void cik_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags) -{ - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); - uint64_t value; - unsigned ndw; - - if (rdev->asic->vm.pt_ring_index == RADEON_RING_TYPE_GFX_INDEX) { - /* CP */ - while (count) { - ndw = 2 + count * 2; - if (ndw > 0x3FFE) - ndw = 0x3FFE; - - ib->ptr[ib->length_dw++] = PACKET3(PACKET3_WRITE_DATA, ndw); - ib->ptr[ib->length_dw++] = (WRITE_DATA_ENGINE_SEL(0) | - WRITE_DATA_DST_SEL(1)); - ib->ptr[ib->length_dw++] = pe; - ib->ptr[ib->length_dw++] = upper_32_bits(pe); - for (; ndw > 2; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { - value = radeon_vm_map_gart(rdev, addr); - value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { - value = addr; - } else { - value = 0; - } - addr += incr; - value |= r600_flags; - ib->ptr[ib->length_dw++] = value; - ib->ptr[ib->length_dw++] = upper_32_bits(value); - } - } - } else { - /* DMA */ - cik_sdma_vm_set_page(rdev, ib, pe, addr, count, incr, flags); - } -} - /* * RLC * The RLC is a multi-purpose microengine that handles a diff --git a/drivers/gpu/drm/radeon/cik_sdma.c b/drivers/gpu/drm/radeon/cik_sdma.c index ec91427..8d84ebe 100644 --- a/drivers/gpu/drm/radeon/cik_sdma.c +++ b/drivers/gpu/drm/radeon/cik_sdma.c @@ -654,13 +654,12 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, uint64_t addr, unsigned count, uint32_t incr, uint32_t flags) { - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); uint64_t value; unsigned ndw;
- trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + trace_radeon_vm_set_page(pe, addr, count, incr, flags);
- if (flags & RADEON_VM_PAGE_SYSTEM) { + if (flags & R600_PTE_SYSTEM) { while (count) { ndw = count * 2; if (ndw > 0xFFFFE) @@ -672,16 +671,10 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = upper_32_bits(pe); ib->ptr[ib->length_dw++] = ndw; for (; ndw > 0; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { - value = radeon_vm_map_gart(rdev, addr); - value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { - value = addr; - } else { - value = 0; - } + value = radeon_vm_map_gart(rdev, addr); + value &= 0xFFFFFFFFFFFFF000ULL; addr += incr; - value |= r600_flags; + value |= flags; ib->ptr[ib->length_dw++] = value; ib->ptr[ib->length_dw++] = upper_32_bits(value); } @@ -692,7 +685,7 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, if (ndw > 0x7FFFF) ndw = 0x7FFFF;
- if (flags & RADEON_VM_PAGE_VALID) + if (flags & R600_PTE_VALID) value = addr; else value = 0; @@ -700,7 +693,7 @@ void cik_sdma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = SDMA_PACKET(SDMA_OPCODE_GENERATE_PTE_PDE, 0, 0); ib->ptr[ib->length_dw++] = pe; /* dst addr */ ib->ptr[ib->length_dw++] = upper_32_bits(pe); - ib->ptr[ib->length_dw++] = r600_flags; /* mask */ + ib->ptr[ib->length_dw++] = flags; /* mask */ ib->ptr[ib->length_dw++] = 0; ib->ptr[ib->length_dw++] = value; /* value */ ib->ptr[ib->length_dw++] = upper_32_bits(value); diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index cac2866..11aab2a 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -174,11 +174,6 @@ extern void evergreen_pcie_gen2_enable(struct radeon_device *rdev); extern void evergreen_program_aspm(struct radeon_device *rdev); extern void sumo_rlc_fini(struct radeon_device *rdev); extern int sumo_rlc_init(struct radeon_device *rdev); -extern void cayman_dma_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags);
/* Firmware Names */ MODULE_FIRMWARE("radeon/BARTS_pfp.bin"); @@ -2400,77 +2395,6 @@ void cayman_vm_decode_fault(struct radeon_device *rdev, block, mc_id); }
-#define R600_ENTRY_VALID (1 << 0) -#define R600_PTE_SYSTEM (1 << 1) -#define R600_PTE_SNOOPED (1 << 2) -#define R600_PTE_READABLE (1 << 5) -#define R600_PTE_WRITEABLE (1 << 6) - -uint32_t cayman_vm_page_flags(struct radeon_device *rdev, uint32_t flags) -{ - uint32_t r600_flags = 0; - r600_flags |= (flags & RADEON_VM_PAGE_VALID) ? R600_ENTRY_VALID : 0; - r600_flags |= (flags & RADEON_VM_PAGE_READABLE) ? R600_PTE_READABLE : 0; - r600_flags |= (flags & RADEON_VM_PAGE_WRITEABLE) ? R600_PTE_WRITEABLE : 0; - if (flags & RADEON_VM_PAGE_SYSTEM) { - r600_flags |= R600_PTE_SYSTEM; - r600_flags |= (flags & RADEON_VM_PAGE_SNOOPED) ? R600_PTE_SNOOPED : 0; - } - return r600_flags; -} - -/** - * cayman_vm_set_page - update the page tables using the CP - * - * @rdev: radeon_device pointer - * @ib: indirect buffer to fill with commands - * @pe: addr of the page entry - * @addr: dst addr to write into pe - * @count: number of page entries to update - * @incr: increase next addr by incr bytes - * @flags: access flags - * - * Update the page tables using the CP (cayman/TN). - */ -void cayman_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags) -{ - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); - uint64_t value; - unsigned ndw; - - if (rdev->asic->vm.pt_ring_index == RADEON_RING_TYPE_GFX_INDEX) { - while (count) { - ndw = 1 + count * 2; - if (ndw > 0x3FFF) - ndw = 0x3FFF; - - ib->ptr[ib->length_dw++] = PACKET3(PACKET3_ME_WRITE, ndw); - ib->ptr[ib->length_dw++] = pe; - ib->ptr[ib->length_dw++] = upper_32_bits(pe) & 0xff; - for (; ndw > 1; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { - value = radeon_vm_map_gart(rdev, addr); - value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { - value = addr; - } else { - value = 0; - } - addr += incr; - value |= r600_flags; - ib->ptr[ib->length_dw++] = value; - ib->ptr[ib->length_dw++] = upper_32_bits(value); - } - } - } else { - cayman_dma_vm_set_page(rdev, ib, pe, addr, count, incr, flags); - } -} - /** * cayman_vm_flush - vm flush using the CP * diff --git a/drivers/gpu/drm/radeon/ni_dma.c b/drivers/gpu/drm/radeon/ni_dma.c index e9cfe8a..bdeb65e 100644 --- a/drivers/gpu/drm/radeon/ni_dma.c +++ b/drivers/gpu/drm/radeon/ni_dma.c @@ -246,8 +246,7 @@ bool cayman_dma_is_lockup(struct radeon_device *rdev, struct radeon_ring *ring) * @addr: dst addr to write into pe * @count: number of page entries to update * @incr: increase next addr by incr bytes - * @flags: access flags - * @r600_flags: hw access flags + * @flags: hw access flags * * Update the page tables using the DMA (cayman/TN). */ @@ -257,13 +256,12 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, uint64_t addr, unsigned count, uint32_t incr, uint32_t flags) { - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); uint64_t value; unsigned ndw;
- trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + trace_radeon_vm_set_page(pe, addr, count, incr, flags);
- if ((flags & RADEON_VM_PAGE_SYSTEM) || (count == 1)) { + if ((flags & R600_PTE_SYSTEM) || (count == 1)) { while (count) { ndw = count * 2; if (ndw > 0xFFFFE) @@ -274,16 +272,16 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = pe; ib->ptr[ib->length_dw++] = upper_32_bits(pe) & 0xff; for (; ndw > 0; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { + if (flags & R600_PTE_SYSTEM) { value = radeon_vm_map_gart(rdev, addr); value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { + } else if (flags & R600_PTE_VALID) { value = addr; } else { value = 0; } addr += incr; - value |= r600_flags; + value |= flags; ib->ptr[ib->length_dw++] = value; ib->ptr[ib->length_dw++] = upper_32_bits(value); } @@ -294,7 +292,7 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, if (ndw > 0xFFFFE) ndw = 0xFFFFE;
- if (flags & RADEON_VM_PAGE_VALID) + if (flags & R600_PTE_VALID) value = addr; else value = 0; @@ -302,7 +300,7 @@ void cayman_dma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = DMA_PTE_PDE_PACKET(ndw); ib->ptr[ib->length_dw++] = pe; /* dst addr */ ib->ptr[ib->length_dw++] = upper_32_bits(pe) & 0xff; - ib->ptr[ib->length_dw++] = r600_flags; /* mask */ + ib->ptr[ib->length_dw++] = flags; /* mask */ ib->ptr[ib->length_dw++] = 0; ib->ptr[ib->length_dw++] = value; /* value */ ib->ptr[ib->length_dw++] = upper_32_bits(value); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index f289308..f9124b3 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -831,6 +831,12 @@ struct radeon_mec { #define RADEON_VM_PTB_ALIGN_MASK (RADEON_VM_PTB_ALIGN_SIZE - 1) #define RADEON_VM_PTB_ALIGN(a) (((a) + RADEON_VM_PTB_ALIGN_MASK) & ~RADEON_VM_PTB_ALIGN_MASK)
+#define R600_PTE_VALID (1 << 0) +#define R600_PTE_SYSTEM (1 << 1) +#define R600_PTE_SNOOPED (1 << 2) +#define R600_PTE_READABLE (1 << 5) +#define R600_PTE_WRITEABLE (1 << 6) + struct radeon_vm { struct list_head list; struct list_head va; @@ -1674,8 +1680,6 @@ struct radeon_asic { struct { int (*init)(struct radeon_device *rdev); void (*fini)(struct radeon_device *rdev); - - u32 pt_ring_index; void (*set_page)(struct radeon_device *rdev, struct radeon_ib *ib, uint64_t pe, diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c index 8f7e045..efb2fd4 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.c +++ b/drivers/gpu/drm/radeon/radeon_asic.c @@ -1622,8 +1622,7 @@ static struct radeon_asic cayman_asic = { .vm = { .init = &cayman_vm_init, .fini = &cayman_vm_fini, - .pt_ring_index = R600_RING_TYPE_DMA_INDEX, - .set_page = &cayman_vm_set_page, + .set_page = &cayman_dma_vm_set_page, }, .ring = { [RADEON_RING_TYPE_GFX_INDEX] = &cayman_gfx_ring, @@ -1723,8 +1722,7 @@ static struct radeon_asic trinity_asic = { .vm = { .init = &cayman_vm_init, .fini = &cayman_vm_fini, - .pt_ring_index = R600_RING_TYPE_DMA_INDEX, - .set_page = &cayman_vm_set_page, + .set_page = &cayman_dma_vm_set_page, }, .ring = { [RADEON_RING_TYPE_GFX_INDEX] = &cayman_gfx_ring, @@ -1854,8 +1852,7 @@ static struct radeon_asic si_asic = { .vm = { .init = &si_vm_init, .fini = &si_vm_fini, - .pt_ring_index = R600_RING_TYPE_DMA_INDEX, - .set_page = &si_vm_set_page, + .set_page = &si_dma_vm_set_page, }, .ring = { [RADEON_RING_TYPE_GFX_INDEX] = &si_gfx_ring, @@ -2000,8 +1997,7 @@ static struct radeon_asic ci_asic = { .vm = { .init = &cik_vm_init, .fini = &cik_vm_fini, - .pt_ring_index = R600_RING_TYPE_DMA_INDEX, - .set_page = &cik_vm_set_page, + .set_page = &cik_sdma_vm_set_page, }, .ring = { [RADEON_RING_TYPE_GFX_INDEX] = &ci_gfx_ring, @@ -2100,8 +2096,7 @@ static struct radeon_asic kv_asic = { .vm = { .init = &cik_vm_init, .fini = &cik_vm_fini, - .pt_ring_index = R600_RING_TYPE_DMA_INDEX, - .set_page = &cik_vm_set_page, + .set_page = &cik_sdma_vm_set_page, }, .ring = { [RADEON_RING_TYPE_GFX_INDEX] = &ci_gfx_ring, diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h index 70c29d5..5798a8d 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.h +++ b/drivers/gpu/drm/radeon/radeon_asic.h @@ -581,17 +581,18 @@ int cayman_vm_init(struct radeon_device *rdev); void cayman_vm_fini(struct radeon_device *rdev); void cayman_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); uint32_t cayman_vm_page_flags(struct radeon_device *rdev, uint32_t flags); -void cayman_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags); int evergreen_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); int evergreen_dma_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); void cayman_dma_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib); bool cayman_gfx_is_lockup(struct radeon_device *rdev, struct radeon_ring *ring); bool cayman_dma_is_lockup(struct radeon_device *rdev, struct radeon_ring *ring); +void cayman_dma_vm_set_page(struct radeon_device *rdev, + struct radeon_ib *ib, + uint64_t pe, + uint64_t addr, unsigned count, + uint32_t incr, uint32_t flags); + void cayman_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm);
int ni_dpm_init(struct radeon_device *rdev); @@ -653,17 +654,17 @@ int si_irq_set(struct radeon_device *rdev); int si_irq_process(struct radeon_device *rdev); int si_vm_init(struct radeon_device *rdev); void si_vm_fini(struct radeon_device *rdev); -void si_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags); void si_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); int si_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); int si_copy_dma(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, unsigned num_gpu_pages, struct radeon_fence **fence); +void si_dma_vm_set_page(struct radeon_device *rdev, + struct radeon_ib *ib, + uint64_t pe, + uint64_t addr, unsigned count, + uint32_t incr, uint32_t flags); void si_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); u32 si_get_xclk(struct radeon_device *rdev); uint64_t si_get_gpu_clock_counter(struct radeon_device *rdev); @@ -731,11 +732,11 @@ int cik_irq_process(struct radeon_device *rdev); int cik_vm_init(struct radeon_device *rdev); void cik_vm_fini(struct radeon_device *rdev); void cik_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); -void cik_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags); +void cik_sdma_vm_set_page(struct radeon_device *rdev, + struct radeon_ib *ib, + uint64_t pe, + uint64_t addr, unsigned count, + uint32_t incr, uint32_t flags); void cik_dma_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm); int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib); u32 cik_compute_ring_get_rptr(struct radeon_device *rdev, diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index b990b1a..f6947dd 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -914,6 +914,26 @@ uint64_t radeon_vm_map_gart(struct radeon_device *rdev, uint64_t addr) }
/** + * radeon_vm_page_flags - translate page flags to what the hw uses + * + * @flags: flags comming from userspace + * + * Translate the flags the userspace ABI uses to hw flags. + */ +static uint32_t radeon_vm_page_flags(uint32_t flags) +{ + uint32_t hw_flags = 0; + hw_flags |= (flags & RADEON_VM_PAGE_VALID) ? R600_PTE_VALID : 0; + hw_flags |= (flags & RADEON_VM_PAGE_READABLE) ? R600_PTE_READABLE : 0; + hw_flags |= (flags & RADEON_VM_PAGE_WRITEABLE) ? R600_PTE_WRITEABLE : 0; + if (flags & RADEON_VM_PAGE_SYSTEM) { + hw_flags |= R600_PTE_SYSTEM; + hw_flags |= (flags & RADEON_VM_PAGE_SNOOPED) ? R600_PTE_SNOOPED : 0; + } + return hw_flags; +} + +/** * radeon_vm_update_pdes - make sure that page directory is valid * * @rdev: radeon_device pointer @@ -974,7 +994,7 @@ retry: if (count) { radeon_asic_vm_set_page(rdev, ib, last_pde, last_pt, count, incr, - RADEON_VM_PAGE_VALID); + R600_PTE_VALID); }
count = 1; @@ -987,7 +1007,7 @@ retry:
if (count) { radeon_asic_vm_set_page(rdev, ib, last_pde, last_pt, count, - incr, RADEON_VM_PAGE_VALID); + incr, R600_PTE_VALID);
}
@@ -1082,7 +1102,6 @@ int radeon_vm_bo_update_pte(struct radeon_device *rdev, struct radeon_bo *bo, struct ttm_mem_reg *mem) { - unsigned ridx = rdev->asic->vm.pt_ring_index; struct radeon_ib ib; struct radeon_bo_va *bo_va; unsigned nptes, npdes, ndw; @@ -1155,7 +1174,7 @@ int radeon_vm_bo_update_pte(struct radeon_device *rdev, if (ndw > 0xfffff) return -ENOMEM;
- r = radeon_ib_get(rdev, ridx, &ib, NULL, ndw * 4); + r = radeon_ib_get(rdev, R600_RING_TYPE_DMA_INDEX, &ib, NULL, ndw * 4); ib.length_dw = 0;
r = radeon_vm_update_pdes(rdev, vm, &ib, bo_va->soffset, bo_va->eoffset); @@ -1165,7 +1184,7 @@ int radeon_vm_bo_update_pte(struct radeon_device *rdev, }
radeon_vm_update_ptes(rdev, vm, &ib, bo_va->soffset, bo_va->eoffset, - addr, bo_va->flags); + addr, radeon_vm_page_flags(bo_va->flags));
radeon_ib_sync_to(&ib, vm->fence); r = radeon_ib_schedule(rdev, &ib, NULL); diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c index d96f7cb..5db4e75 100644 --- a/drivers/gpu/drm/radeon/si.c +++ b/drivers/gpu/drm/radeon/si.c @@ -78,11 +78,6 @@ extern void evergreen_mc_resume(struct radeon_device *rdev, struct evergreen_mc_ extern u32 evergreen_get_number_of_dram_channels(struct radeon_device *rdev); extern void evergreen_print_gpu_status_regs(struct radeon_device *rdev); extern bool evergreen_is_display_hung(struct radeon_device *rdev); -extern void si_dma_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags); static void si_enable_gui_idle_interrupt(struct radeon_device *rdev, bool enable); static void si_fini_pg(struct radeon_device *rdev); @@ -4673,61 +4668,6 @@ static void si_vm_decode_fault(struct radeon_device *rdev, block, mc_id); }
-/** - * si_vm_set_page - update the page tables using the CP - * - * @rdev: radeon_device pointer - * @ib: indirect buffer to fill with commands - * @pe: addr of the page entry - * @addr: dst addr to write into pe - * @count: number of page entries to update - * @incr: increase next addr by incr bytes - * @flags: access flags - * - * Update the page tables using the CP (SI). - */ -void si_vm_set_page(struct radeon_device *rdev, - struct radeon_ib *ib, - uint64_t pe, - uint64_t addr, unsigned count, - uint32_t incr, uint32_t flags) -{ - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); - uint64_t value; - unsigned ndw; - - if (rdev->asic->vm.pt_ring_index == RADEON_RING_TYPE_GFX_INDEX) { - while (count) { - ndw = 2 + count * 2; - if (ndw > 0x3FFE) - ndw = 0x3FFE; - - ib->ptr[ib->length_dw++] = PACKET3(PACKET3_WRITE_DATA, ndw); - ib->ptr[ib->length_dw++] = (WRITE_DATA_ENGINE_SEL(0) | - WRITE_DATA_DST_SEL(1)); - ib->ptr[ib->length_dw++] = pe; - ib->ptr[ib->length_dw++] = upper_32_bits(pe); - for (; ndw > 2; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { - value = radeon_vm_map_gart(rdev, addr); - value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { - value = addr; - } else { - value = 0; - } - addr += incr; - value |= r600_flags; - ib->ptr[ib->length_dw++] = value; - ib->ptr[ib->length_dw++] = upper_32_bits(value); - } - } - } else { - /* DMA */ - si_dma_vm_set_page(rdev, ib, pe, addr, count, incr, flags); - } -} - void si_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm) { struct radeon_ring *ring = &rdev->ring[ridx]; diff --git a/drivers/gpu/drm/radeon/si_dma.c b/drivers/gpu/drm/radeon/si_dma.c index 17205fd..8e8f461 100644 --- a/drivers/gpu/drm/radeon/si_dma.c +++ b/drivers/gpu/drm/radeon/si_dma.c @@ -76,13 +76,12 @@ void si_dma_vm_set_page(struct radeon_device *rdev, uint64_t addr, unsigned count, uint32_t incr, uint32_t flags) { - uint32_t r600_flags = cayman_vm_page_flags(rdev, flags); uint64_t value; unsigned ndw;
- trace_radeon_vm_set_page(pe, addr, count, incr, r600_flags); + trace_radeon_vm_set_page(pe, addr, count, incr, flags);
- if (flags & RADEON_VM_PAGE_SYSTEM) { + if (flags & R600_PTE_SYSTEM) { while (count) { ndw = count * 2; if (ndw > 0xFFFFE) @@ -93,16 +92,10 @@ void si_dma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = pe; ib->ptr[ib->length_dw++] = upper_32_bits(pe) & 0xff; for (; ndw > 0; ndw -= 2, --count, pe += 8) { - if (flags & RADEON_VM_PAGE_SYSTEM) { - value = radeon_vm_map_gart(rdev, addr); - value &= 0xFFFFFFFFFFFFF000ULL; - } else if (flags & RADEON_VM_PAGE_VALID) { - value = addr; - } else { - value = 0; - } + value = radeon_vm_map_gart(rdev, addr); + value &= 0xFFFFFFFFFFFFF000ULL; addr += incr; - value |= r600_flags; + value |= flags; ib->ptr[ib->length_dw++] = value; ib->ptr[ib->length_dw++] = upper_32_bits(value); } @@ -113,7 +106,7 @@ void si_dma_vm_set_page(struct radeon_device *rdev, if (ndw > 0xFFFFE) ndw = 0xFFFFE;
- if (flags & RADEON_VM_PAGE_VALID) + if (flags & R600_PTE_VALID) value = addr; else value = 0; @@ -121,7 +114,7 @@ void si_dma_vm_set_page(struct radeon_device *rdev, ib->ptr[ib->length_dw++] = DMA_PTE_PDE_PACKET(ndw); ib->ptr[ib->length_dw++] = pe; /* dst addr */ ib->ptr[ib->length_dw++] = upper_32_bits(pe) & 0xff; - ib->ptr[ib->length_dw++] = r600_flags; /* mask */ + ib->ptr[ib->length_dw++] = flags; /* mask */ ib->ptr[ib->length_dw++] = 0; ib->ptr[ib->length_dw++] = value; /* value */ ib->ptr[ib->length_dw++] = upper_32_bits(value);
From: Christian König christian.koenig@amd.com
Clear page tables after allocating them in case we don't completely fill them later.
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/radeon/radeon_gart.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index f6947dd..fd109d3 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -995,6 +995,10 @@ retry: radeon_asic_vm_set_page(rdev, ib, last_pde, last_pt, count, incr, R600_PTE_VALID); + + count *= RADEON_VM_PTE_COUNT; + radeon_asic_vm_set_page(rdev, ib, last_pt, 0, + count, 0, 0); }
count = 1; @@ -1009,6 +1013,9 @@ retry: radeon_asic_vm_set_page(rdev, ib, last_pde, last_pt, count, incr, R600_PTE_VALID);
+ count *= RADEON_VM_PTE_COUNT; + radeon_asic_vm_set_page(rdev, ib, last_pt, 0, + count, 0, 0); }
return 0; @@ -1170,6 +1177,9 @@ int radeon_vm_bo_update_pte(struct radeon_device *rdev, /* reserve space for pde addresses */ ndw += npdes * 2;
+ /* reserve space for clearing new page tables */ + ndw += npdes * 2 * RADEON_VM_PTE_COUNT; + /* update too big for an IB */ if (ndw > 0xfffff) return -ENOMEM;
From: Christian König christian.koenig@amd.com
Signed-off-by: Christian König christian.koenig@amd.com --- drivers/gpu/drm/radeon/radeon_gart.c | 34 +++++++++++++++++++++++++++++----- 1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index fd109d3..8a83b89 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -607,8 +607,8 @@ static int radeon_vm_evict(struct radeon_device *rdev, struct radeon_vm *vm) */ int radeon_vm_alloc_pt(struct radeon_device *rdev, struct radeon_vm *vm) { - unsigned pd_size, pts_size; - u64 *pd_addr; + unsigned pd_size, pd_entries, pts_size; + struct radeon_ib ib; int r;
if (vm == NULL) { @@ -619,8 +619,10 @@ int radeon_vm_alloc_pt(struct radeon_device *rdev, struct radeon_vm *vm) return 0; }
-retry: pd_size = radeon_vm_directory_size(rdev); + pd_entries = radeon_vm_num_pdes(rdev); + +retry: r = radeon_sa_bo_new(rdev, &rdev->vm_manager.sa_manager, &vm->page_directory, pd_size, RADEON_VM_PTB_ALIGN_SIZE, false); @@ -637,9 +639,31 @@ retry: vm->pd_gpu_addr = radeon_sa_bo_gpu_addr(vm->page_directory);
/* Initially clear the page directory */ - pd_addr = radeon_sa_bo_cpu_addr(vm->page_directory); - memset(pd_addr, 0, pd_size); + r = radeon_ib_get(rdev, R600_RING_TYPE_DMA_INDEX, &ib, + NULL, pd_entries * 2 + 64); + if (r) { + radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence); + return r; + } + + ib.length_dw = 0; + + radeon_asic_vm_set_page(rdev, &ib, vm->pd_gpu_addr, + 0, pd_entries, 0, 0); + + radeon_ib_sync_to(&ib, vm->fence); + r = radeon_ib_schedule(rdev, &ib, NULL); + if (r) { + radeon_ib_free(rdev, &ib); + radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence); + return r; + } + radeon_fence_unref(&vm->fence); + vm->fence = radeon_fence_ref(ib.fence); + radeon_ib_free(rdev, &ib); + radeon_fence_unref(&vm->last_flush);
+ /* allocate page table array */ pts_size = radeon_vm_num_pdes(rdev) * sizeof(struct radeon_sa_bo *); vm->page_tables = kzalloc(pts_size, GFP_KERNEL);
2013/10/29 Christian König deathsimple@vodafone.de:
From: Christian König christian.koenig@amd.com
Stop fiddling with jiffies, always wait for RADEON_FENCE_JIFFIES_TIMEOUT. Consolidate the two wait sequence implementations into just one function. Activate all waiters and remember if the reset was already done instead of trying to reset from only one thread.
With this patch I can't suspend my Samsung with: 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Blackcomb [Radeon HD 6900M series] [1002:6720] anymore.
The backlight goes off, activity LED goes off and then nothing. Machine is still running and other lights (power, wifi, keyboard, touchpad) are still working. Seems like a lockup to me.
Am 03.11.2013 13:15, schrieb Rafał Miłecki:
2013/10/29 Christian König deathsimple@vodafone.de:
From: Christian König christian.koenig@amd.com
Stop fiddling with jiffies, always wait for RADEON_FENCE_JIFFIES_TIMEOUT. Consolidate the two wait sequence implementations into just one function. Activate all waiters and remember if the reset was already done instead of trying to reset from only one thread.
With this patch I can't suspend my Samsung with: 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Blackcomb [Radeon HD 6900M series] [1002:6720] anymore.
The backlight goes off, activity LED goes off and then nothing. Machine is still running and other lights (power, wifi, keyboard, touchpad) are still working. Seems like a lockup to me.
Does the attached patch help?
Cheers, Christian.
2013/11/5 Christian König deathsimple@vodafone.de:
Am 03.11.2013 13:15, schrieb Rafał Miłecki:
2013/10/29 Christian König deathsimple@vodafone.de:
From: Christian König christian.koenig@amd.com
Stop fiddling with jiffies, always wait for RADEON_FENCE_JIFFIES_TIMEOUT. Consolidate the two wait sequence implementations into just one function. Activate all waiters and remember if the reset was already done instead of trying to reset from only one thread.
With this patch I can't suspend my Samsung with: 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Blackcomb [Radeon HD 6900M series] [1002:6720] anymore.
The backlight goes off, activity LED goes off and then nothing. Machine is still running and other lights (power, wifi, keyboard, touchpad) are still working. Seems like a lockup to me.
Does the attached patch help?
It does, thanks!
dri-devel@lists.freedesktop.org