Hi,
These are libdrm_amdgpu patches harvested from an internal branch.
The first patch is a revert I had to make to fix the build. Yeah, sequence_mutex should be renamed to a more appropriate name. That can be done as a follow-up.
One notable change is the addition of DRM_IOCTL_AMDGPU_WAIT_FENCES. I hope the kernel contains (or will contain) the changes too, so that I don't push something that doesn't exist in the kernel.
Please let me know if these are okay to push.
Thanks,
Chunming Zhou (3):
  amdgpu: add semaphore support
  tests/amdgpu: add semaphore test
  amdgpu: validate user memory for userptr

Junwei Zhang (3):
  amdgpu: add the interface of waiting multiple fences
  amdgpu/tests: add multi-fence test in base test
  amdgpu: list each entry safely for sw semaphore when submit ib

Marek Olšák (1):
  Revert "amdgpu: remove sequence mutex"

Michel Dänzer (1):
  amdgpu: Cast pointer to uintptr_t for assignment to unsigned integer

monk.liu (2):
  amdgpu: drop address patching logics
  amdgpu: cs_wait_fences now can return the first signaled fence index

 amdgpu/amdgpu.h            |  88 ++++++++++
 amdgpu/amdgpu_bo.c         |  14 +--
 amdgpu/amdgpu_cs.c         | 253 ++++++++++++++++++++++++++-
 amdgpu/amdgpu_internal.h   |  15 ++
 include/drm/amdgpu_drm.h   |  28 +++
 tests/amdgpu/basic_tests.c | 233 +++++++++++++++++++++++++
 6 files changed, 616 insertions(+), 15 deletions(-)
Marek
From: Marek Olšák <marek.olsak@amd.com>
This reverts commit f6f25d67a9c0d26be9b8021a45f2acf3a4042ade.
Required by the new semaphore patches.
---
 amdgpu/amdgpu_cs.c       | 10 ++++++++++
 amdgpu/amdgpu_internal.h |  3 +++
 2 files changed, 13 insertions(+)
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 6747158..511d53f 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -66,6 +66,10 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
 
 	gpu_context->dev = dev;
 
+	r = pthread_mutex_init(&gpu_context->sequence_mutex, NULL);
+	if (r)
+		goto error;
+
 	/* Create the context */
 	memset(&args, 0, sizeof(args));
 	args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
@@ -79,6 +83,7 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
 	return 0;
 
 error:
+	pthread_mutex_destroy(&gpu_context->sequence_mutex);
 	free(gpu_context);
 	return r;
 }
@@ -99,6 +104,8 @@ int amdgpu_cs_ctx_free(amdgpu_context_handle context)
 	if (NULL == context)
 		return -EINVAL;
 
+	pthread_mutex_destroy(&context->sequence_mutex);
+
 	/* now deal with kernel side */
 	memset(&args, 0, sizeof(args));
 	args.in.op = AMDGPU_CTX_OP_FREE_CTX;
@@ -196,6 +203,8 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 		chunk_data[i].ib_data.flags = ib->flags;
 	}
 
+	pthread_mutex_lock(&context->sequence_mutex);
+
 	if (user_fence) {
 		i = cs.in.num_chunks++;
 
@@ -248,6 +257,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 	ibs_request->seq_no = cs.out.handle;
 
 error_unlock:
+	pthread_mutex_unlock(&context->sequence_mutex);
 	free(dependencies);
 	return r;
 }
diff --git a/amdgpu/amdgpu_internal.h b/amdgpu/amdgpu_internal.h
index 7dd5c1c..5d86603 100644
--- a/amdgpu/amdgpu_internal.h
+++ b/amdgpu/amdgpu_internal.h
@@ -111,6 +111,9 @@ struct amdgpu_bo_list {
 
 struct amdgpu_context {
 	struct amdgpu_device *dev;
+	/** Mutex for accessing fences and to maintain command submissions
+	    in good sequence. */
+	pthread_mutex_t sequence_mutex;
 	/* context id*/
 	uint32_t id;
 };
From: Junwei Zhang <Jerry.Zhang@amd.com>

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
---
 amdgpu/amdgpu.h          | 22 ++++++++++
 amdgpu/amdgpu_cs.c       | 71 ++++++++++++++++++++++++++++++++
 include/drm/amdgpu_drm.h | 27 ++++++++++++
 3 files changed, 120 insertions(+)
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index e44d802..9ae6ca3 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -902,6 +902,28 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
 				 uint64_t flags,
 				 uint32_t *expired);
 
+/**
+ * Wait for multiple fences
+ *
+ * \param   fences      - \c [in] The fence array to wait
+ * \param   fence_count - \c [in] The fence count
+ * \param   wait_all    - \c [in] If true, wait all fences to be signaled,
+ *                                otherwise, wait at least one fence
+ * \param   timeout_ns  - \c [in] The timeout to wait, in nanoseconds
+ * \param   status      - \c [out] '1' for signaled, '0' for timeout
+ *
+ * \return  0 on success
+ *          <0 - Negative POSIX Error code
+ *
+ * \note    Currently it supports only one amdgpu_device. All fences come from
+ *          the same amdgpu_device with the same fd.
+*/
+int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
+			  uint32_t fence_count,
+			  bool wait_all,
+			  uint64_t timeout_ns,
+			  uint32_t *status);
+
 /*
  * Query / Info API
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 511d53f..d5e4ea0 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -379,3 +379,74 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
 
 	return r;
 }
+
+static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
+				    uint32_t fence_count,
+				    bool wait_all,
+				    uint64_t timeout_ns,
+				    uint32_t *status)
+{
+	struct drm_amdgpu_fence *drm_fences;
+	amdgpu_device_handle dev = fences[0].context->dev;
+	union drm_amdgpu_wait_fences args;
+	int r;
+	uint32_t i;
+
+	drm_fences = alloca(sizeof(struct drm_amdgpu_fence) * fence_count);
+	for (i = 0; i < fence_count; i++) {
+		drm_fences[i].ctx_id = fences[i].context->id;
+		drm_fences[i].ip_type = fences[i].ip_type;
+		drm_fences[i].ip_instance = fences[i].ip_instance;
+		drm_fences[i].ring = fences[i].ring;
+		drm_fences[i].seq_no = fences[i].fence;
+	}
+
+	memset(&args, 0, sizeof(args));
+	args.in.fences = (uint64_t)(uintptr_t)drm_fences;
+	args.in.fence_count = fence_count;
+	args.in.wait_all = wait_all;
+	args.in.timeout_ns = amdgpu_cs_calculate_timeout(timeout_ns);
+
+	r = drmIoctl(dev->fd, DRM_IOCTL_AMDGPU_WAIT_FENCES, &args);
+	if (r)
+		return -errno;
+
+	*status = args.out.status;
+	return 0;
+}
+
+int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
+			  uint32_t fence_count,
+			  bool wait_all,
+			  uint64_t timeout_ns,
+			  uint32_t *status)
+{
+	uint32_t ioctl_status = 0;
+	uint32_t i;
+	int r;
+
+	/* Sanity check */
+	if (NULL == fences)
+		return -EINVAL;
+	if (NULL == status)
+		return -EINVAL;
+	if (fence_count <= 0)
+		return -EINVAL;
+	for (i = 0; i < fence_count; i++) {
+		if (NULL == fences[i].context)
+			return -EINVAL;
+		if (fences[i].ip_type >= AMDGPU_HW_IP_NUM)
+			return -EINVAL;
+		if (fences[i].ring >= AMDGPU_CS_MAX_RINGS)
+			return -EINVAL;
+	}
+
+	*status = 0;
+
+	r = amdgpu_ioctl_wait_fences(fences, fence_count, wait_all, timeout_ns,
+				     &ioctl_status);
+
+	if (!r)
+		*status = ioctl_status;
+
+	return r;
+}
diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index fbdd118..2cbea72 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -46,6 +46,7 @@
 #define DRM_AMDGPU_WAIT_CS		0x09
 #define DRM_AMDGPU_GEM_OP		0x10
 #define DRM_AMDGPU_GEM_USERPTR		0x11
+#define DRM_AMDGPU_WAIT_FENCES		0x12
 
 #define DRM_IOCTL_AMDGPU_GEM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
 #define DRM_IOCTL_AMDGPU_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -59,6 +60,7 @@
 #define DRM_IOCTL_AMDGPU_WAIT_CS	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_WAIT_CS, union drm_amdgpu_wait_cs)
 #define DRM_IOCTL_AMDGPU_GEM_OP		DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_OP, struct drm_amdgpu_gem_op)
 #define DRM_IOCTL_AMDGPU_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_USERPTR, struct drm_amdgpu_gem_userptr)
+#define DRM_IOCTL_AMDGPU_WAIT_FENCES	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_WAIT_FENCES, union drm_amdgpu_wait_fences)
 
 #define AMDGPU_GEM_DOMAIN_CPU		0x1
 #define AMDGPU_GEM_DOMAIN_GTT		0x2
@@ -297,6 +299,31 @@ union drm_amdgpu_wait_cs {
 	struct drm_amdgpu_wait_cs_out out;
 };
 
+struct drm_amdgpu_fence {
+	uint32_t ctx_id;
+	uint32_t ip_type;
+	uint32_t ip_instance;
+	uint32_t ring;
+	uint64_t seq_no;
+};
+
+struct drm_amdgpu_wait_fences_in {
+	/** This points to uint64_t * which points to fences */
+	uint64_t fences;
+	uint32_t fence_count;
+	uint32_t wait_all;
+	uint64_t timeout_ns;
+};
+
+struct drm_amdgpu_wait_fences_out {
+	uint64_t status;
+};
+
+union drm_amdgpu_wait_fences {
+	struct drm_amdgpu_wait_fences_in in;
+	struct drm_amdgpu_wait_fences_out out;
+};
+
 #define AMDGPU_GEM_OP_GET_GEM_CREATE_INFO	0
 #define AMDGPU_GEM_OP_SET_PLACEMENT		1
From: "monk.liu" monk.liu@amd.com
we don't support non-page-aligned cpu pointer anymore
Signed-off-by: monk.liu monk.liu@amd.com Reviewed-by: Christian König christian.koenig@amd.com --- amdgpu/amdgpu_bo.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 1a5a401..61db58c 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -537,17 +537,8 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
 	int r;
 	struct amdgpu_bo *bo;
 	struct drm_amdgpu_gem_userptr args;
-	uintptr_t cpu0;
-	uint32_t ps, off;
 
-	memset(&args, 0, sizeof(args));
-	ps = getpagesize();
-
-	cpu0 = ROUND_DOWN((uintptr_t)cpu, ps);
-	off = (uintptr_t)cpu - cpu0;
-	size = ROUND_UP(size + off, ps);
-
-	args.addr = cpu0;
+	args.addr = cpu;
 	args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
 	args.size = size;
 	r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
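With the rounding gone, callers have to hand in a page-aligned pointer and size
themselves. A rough sketch of what that looks like on the caller side (user_size
and dev are hypothetical; assumes the existing amdgpu_create_bo_from_user_mem(dev,
cpu, size, &bo) signature; error handling trimmed):

	/* The library no longer rounds the address/size, so allocate the
	 * buffer page-aligned and round the size up to a page multiple. */
	size_t page_size = getpagesize();
	size_t size = (user_size + page_size - 1) & ~(page_size - 1);
	void *cpu = NULL;
	amdgpu_bo_handle bo;
	int r;

	r = posix_memalign(&cpu, page_size, size);
	if (r == 0)
		r = amdgpu_create_bo_from_user_mem(dev, cpu, size, &bo);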
From: Junwei Zhang <Jerry.Zhang@amd.com>

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
---
 tests/amdgpu/basic_tests.c | 100 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)
diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index e489e6e..a666d32 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -46,6 +46,7 @@ static void amdgpu_memory_alloc(void);
 static void amdgpu_command_submission_gfx(void);
 static void amdgpu_command_submission_compute(void);
 static void amdgpu_command_submission_sdma(void);
+static void amdgpu_command_submission_multi_fence(void);
 static void amdgpu_userptr_test(void);
 
 CU_TestInfo basic_tests[] = {
@@ -55,6 +56,7 @@ CU_TestInfo basic_tests[] = {
 	{ "Command submission Test (GFX)", amdgpu_command_submission_gfx },
 	{ "Command submission Test (Compute)", amdgpu_command_submission_compute },
 	{ "Command submission Test (SDMA)", amdgpu_command_submission_sdma },
+	{ "Command submission Test (Multi-fence)", amdgpu_command_submission_multi_fence },
 	CU_TEST_INFO_NULL,
 };
 #define BUFFER_SIZE (8 * 1024)
@@ -765,6 +767,104 @@ static void amdgpu_command_submission_sdma(void)
 	amdgpu_command_submission_sdma_copy_linear();
 }
 
+static void amdgpu_command_submission_multi_fence_wait_all(bool wait_all)
+{
+	amdgpu_context_handle context_handle;
+	amdgpu_bo_handle ib_result_handle, ib_result_ce_handle;
+	void *ib_result_cpu, *ib_result_ce_cpu;
+	uint64_t ib_result_mc_address, ib_result_ce_mc_address;
+	struct amdgpu_cs_request ibs_request[2] = {0};
+	struct amdgpu_cs_ib_info ib_info[2];
+	struct amdgpu_cs_fence fence_status[2] = {0};
+	uint32_t *ptr;
+	uint32_t expired;
+	amdgpu_bo_list_handle bo_list;
+	amdgpu_va_handle va_handle, va_handle_ce;
+	int r;
+	int i, ib_cs_num = 2;
+
+	r = amdgpu_cs_ctx_create(device_handle, &context_handle);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+				    AMDGPU_GEM_DOMAIN_GTT, 0,
+				    &ib_result_handle, &ib_result_cpu,
+				    &ib_result_mc_address, &va_handle);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+				    AMDGPU_GEM_DOMAIN_GTT, 0,
+				    &ib_result_ce_handle, &ib_result_ce_cpu,
+				    &ib_result_ce_mc_address, &va_handle_ce);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_get_bo_list(device_handle, ib_result_handle,
+			       ib_result_ce_handle, &bo_list);
+	CU_ASSERT_EQUAL(r, 0);
+
+	memset(ib_info, 0, 2 * sizeof(struct amdgpu_cs_ib_info));
+
+	/* IT_SET_CE_DE_COUNTERS */
+	ptr = ib_result_ce_cpu;
+	ptr[0] = 0xc0008900;
+	ptr[1] = 0;
+	ptr[2] = 0xc0008400;
+	ptr[3] = 1;
+	ib_info[0].ib_mc_address = ib_result_ce_mc_address;
+	ib_info[0].size = 4;
+	ib_info[0].flags = AMDGPU_IB_FLAG_CE;
+
+	/* IT_WAIT_ON_CE_COUNTER */
+	ptr = ib_result_cpu;
+	ptr[0] = 0xc0008600;
+	ptr[1] = 0x00000001;
+	ib_info[1].ib_mc_address = ib_result_mc_address;
+	ib_info[1].size = 2;
+
+	for (i = 0; i < ib_cs_num; i++) {
+		ibs_request[i].ip_type = AMDGPU_HW_IP_GFX;
+		ibs_request[i].number_of_ibs = 2;
+		ibs_request[i].ibs = ib_info;
+		ibs_request[i].resources = bo_list;
+		ibs_request[i].fence_info.handle = NULL;
+	}
+
+	r = amdgpu_cs_submit(context_handle, 0, ibs_request, ib_cs_num);
+
+	CU_ASSERT_EQUAL(r, 0);
+
+	for (i = 0; i < ib_cs_num; i++) {
+		fence_status[i].context = context_handle;
+		fence_status[i].ip_type = AMDGPU_HW_IP_GFX;
+		fence_status[i].fence = ibs_request[i].seq_no;
+	}
+
+	r = amdgpu_cs_wait_fences(fence_status, ib_cs_num, wait_all,
+				  AMDGPU_TIMEOUT_INFINITE,
+				  &expired);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_bo_unmap_and_free(ib_result_handle, va_handle,
+				     ib_result_mc_address, 4096);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_bo_unmap_and_free(ib_result_ce_handle, va_handle_ce,
+				     ib_result_ce_mc_address, 4096);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_bo_list_destroy(bo_list);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_cs_ctx_free(context_handle);
+	CU_ASSERT_EQUAL(r, 0);
+}
+
+static void amdgpu_command_submission_multi_fence(void)
+{
+	amdgpu_command_submission_multi_fence_wait_all(true);
+	amdgpu_command_submission_multi_fence_wait_all(false);
+}
+
 static void amdgpu_userptr_test(void)
 {
 	int i, r, j;
From: Michel Dänzer <michel.daenzer@amd.com>

  CC       amdgpu_bo.lo
../../amdgpu/amdgpu_bo.c: In function 'amdgpu_create_bo_from_user_mem':
../../amdgpu/amdgpu_bo.c:539:12: warning: assignment makes integer from pointer without a cast [-Wint-conversion]
  args.addr = cpu;
            ^

Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
---
 amdgpu/amdgpu_bo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 61db58c..2ae1c18 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -538,7 +538,7 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
 	struct amdgpu_bo *bo;
 	struct drm_amdgpu_gem_userptr args;
 
-	args.addr = cpu;
+	args.addr = (uintptr_t)cpu;
 	args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
 	args.size = size;
 	r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
On 13.01.2016 06:23, Marek Olšák wrote:
> From: Michel Dänzer <michel.daenzer@amd.com>
>
>   CC       amdgpu_bo.lo
> ../../amdgpu/amdgpu_bo.c: In function 'amdgpu_create_bo_from_user_mem':
> ../../amdgpu/amdgpu_bo.c:539:12: warning: assignment makes integer from pointer without a cast [-Wint-conversion]
>   args.addr = cpu;
>             ^
>
> [...]
>
> -	args.addr = cpu;
> +	args.addr = (uintptr_t)cpu;
This patch should be squashed into patch 3.
From: Chunming Zhou <david1.zhou@amd.com>

The semaphore is a binary semaphore. The workflow is:
1. create sem
2. signal sem
3. wait sem, reset sem after signaled
4. destroy sem

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 amdgpu/amdgpu.h          |  65 +++++++++++++
 amdgpu/amdgpu_cs.c       | 166 +++++++++++++++++++++++++++++++++--
 amdgpu/amdgpu_internal.h |  12 +++
 3 files changed, 239 insertions(+), 4 deletions(-)
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 9ae6ca3..8822a0c 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -124,6 +124,11 @@ typedef struct amdgpu_bo_list *amdgpu_bo_list_handle;
  */
 typedef struct amdgpu_va *amdgpu_va_handle;
 
+/**
+ * Define handle for semaphore
+ */
+typedef struct amdgpu_semaphore *amdgpu_semaphore_handle;
+
 /*--------------------------------------------------------------------------*/
 /* -------------------------- Structures ---------------------------------- */
 /*--------------------------------------------------------------------------*/
@@ -1202,4 +1207,64 @@ int amdgpu_bo_va_op(amdgpu_bo_handle bo,
 		    uint64_t flags,
 		    uint32_t ops);
 
+/**
+ * create semaphore
+ *
+ * \param   sem - \c [out] semaphore handle
+ *
+ * \return  0 on success\n
+ *          <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_create_semaphore(amdgpu_semaphore_handle *sem);
+
+/**
+ * signal semaphore
+ *
+ * \param   ctx         - \c [in] GPU Context
+ * \param   ip_type     - \c [in] Hardware IP block type = AMDGPU_HW_IP_*
+ * \param   ip_instance - \c [in] Index of the IP block of the same type
+ * \param   ring        - \c [in] Specify ring index of the IP
+ * \param   sem         - \c [in] semaphore handle
+ *
+ * \return  0 on success\n
+ *          <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_signal_semaphore(amdgpu_context_handle ctx,
+			       uint32_t ip_type,
+			       uint32_t ip_instance,
+			       uint32_t ring,
+			       amdgpu_semaphore_handle sem);
+
+/**
+ * wait semaphore
+ *
+ * \param   ctx         - \c [in] GPU Context
+ * \param   ip_type     - \c [in] Hardware IP block type = AMDGPU_HW_IP_*
+ * \param   ip_instance - \c [in] Index of the IP block of the same type
+ * \param   ring        - \c [in] Specify ring index of the IP
+ * \param   sem         - \c [in] semaphore handle
+ *
+ * \return  0 on success\n
+ *          <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_wait_semaphore(amdgpu_context_handle ctx,
+			     uint32_t ip_type,
+			     uint32_t ip_instance,
+			     uint32_t ring,
+			     amdgpu_semaphore_handle sem);
+
+/**
+ * destroy semaphore
+ *
+ * \param   sem - \c [in] semaphore handle
+ *
+ * \return  0 on success\n
+ *          <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_destroy_semaphore(amdgpu_semaphore_handle sem);
+
 #endif /* #ifdef _AMDGPU_H_ */
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index d5e4ea0..d033f8e 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -40,6 +40,9 @@
 #include "amdgpu_drm.h"
 #include "amdgpu_internal.h"
 
+static int amdgpu_cs_unreference_sem(amdgpu_semaphore_handle sem);
+static int amdgpu_cs_reset_sem(amdgpu_semaphore_handle sem);
+
 /**
  * Create command submission context
  *
@@ -53,6 +56,7 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
 {
 	struct amdgpu_context *gpu_context;
 	union drm_amdgpu_ctx args;
+	int i, j, k;
 	int r;
 
 	if (NULL == dev)
@@ -78,6 +82,10 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
 		goto error;
 
 	gpu_context->id = args.out.alloc.ctx_id;
+	for (i = 0; i < AMDGPU_HW_IP_NUM; i++)
+		for (j = 0; j < AMDGPU_HW_IP_INSTANCE_MAX_COUNT; j++)
+			for (k = 0; k < AMDGPU_CS_MAX_RINGS; k++)
+				list_inithead(&gpu_context->sem_list[i][j][k]);
 	*context = (amdgpu_context_handle)gpu_context;
 
 	return 0;
@@ -99,6 +107,7 @@ error:
 int amdgpu_cs_ctx_free(amdgpu_context_handle context)
 {
 	union drm_amdgpu_ctx args;
+	int i, j, k;
 	int r;
 
 	if (NULL == context)
@@ -112,7 +121,18 @@ int amdgpu_cs_ctx_free(amdgpu_context_handle context)
 	args.in.ctx_id = context->id;
 	r = drmCommandWriteRead(context->dev->fd, DRM_AMDGPU_CTX,
 				&args, sizeof(args));
-
+	for (i = 0; i < AMDGPU_HW_IP_NUM; i++) {
+		for (j = 0; j < AMDGPU_HW_IP_INSTANCE_MAX_COUNT; j++) {
+			for (k = 0; k < AMDGPU_CS_MAX_RINGS; k++) {
+				amdgpu_semaphore_handle sem;
+				LIST_FOR_EACH_ENTRY(sem, &context->sem_list[i][j][k], list) {
+					list_del(&sem->list);
+					amdgpu_cs_reset_sem(sem);
+					amdgpu_cs_unreference_sem(sem);
+				}
+			}
+		}
+	}
 	free(context);
 
 	return r;
@@ -157,7 +177,10 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 	struct drm_amdgpu_cs_chunk *chunks;
 	struct drm_amdgpu_cs_chunk_data *chunk_data;
 	struct drm_amdgpu_cs_chunk_dep *dependencies = NULL;
-	uint32_t i, size;
+	struct drm_amdgpu_cs_chunk_dep *sem_dependencies = NULL;
+	struct list_head *sem_list;
+	amdgpu_semaphore_handle sem;
+	uint32_t i, size, sem_count = 0;
 	bool user_fence;
 	int r = 0;
 
@@ -169,7 +192,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 		return -EINVAL;
 	user_fence = (ibs_request->fence_info.handle != NULL);
 
-	size = ibs_request->number_of_ibs + (user_fence ? 2 : 1);
+	size = ibs_request->number_of_ibs + (user_fence ? 2 : 1) + 1;
 
 	chunk_array = alloca(sizeof(uint64_t) * size);
 	chunks = alloca(sizeof(struct drm_amdgpu_cs_chunk) * size);
@@ -249,16 +272,49 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 		chunks[i].chunk_data = (uint64_t)(uintptr_t)dependencies;
 	}
 
+	sem_list = &context->sem_list[ibs_request->ip_type][ibs_request->ip_instance][ibs_request->ring];
+	LIST_FOR_EACH_ENTRY(sem, sem_list, list)
+		sem_count++;
+	if (sem_count) {
+		sem_dependencies = malloc(sizeof(struct drm_amdgpu_cs_chunk_dep) * sem_count);
+		if (!sem_dependencies) {
+			r = -ENOMEM;
+			goto error_unlock;
+		}
+		sem_count = 0;
+		LIST_FOR_EACH_ENTRY(sem, sem_list, list) {
+			struct amdgpu_cs_fence *info = &sem->signal_fence;
+			struct drm_amdgpu_cs_chunk_dep *dep = &sem_dependencies[sem_count++];
+			dep->ip_type = info->ip_type;
+			dep->ip_instance = info->ip_instance;
+			dep->ring = info->ring;
+			dep->ctx_id = info->context->id;
+			dep->handle = info->fence;
+
+			list_del(&sem->list);
+			amdgpu_cs_reset_sem(sem);
+			amdgpu_cs_unreference_sem(sem);
+		}
+		i = cs.in.num_chunks++;
+
+		/* dependencies chunk */
+		chunk_array[i] = (uint64_t)(uintptr_t)&chunks[i];
+		chunks[i].chunk_id = AMDGPU_CHUNK_ID_DEPENDENCIES;
+		chunks[i].length_dw = sizeof(struct drm_amdgpu_cs_chunk_dep) / 4 * sem_count;
+		chunks[i].chunk_data = (uint64_t)(uintptr_t)sem_dependencies;
+	}
+
 	r = drmCommandWriteRead(context->dev->fd, DRM_AMDGPU_CS,
 				&cs, sizeof(cs));
 	if (r)
 		goto error_unlock;
 
 	ibs_request->seq_no = cs.out.handle;
-
+	context->last_seq[ibs_request->ip_type][ibs_request->ip_instance][ibs_request->ring] = ibs_request->seq_no;
 error_unlock:
 	pthread_mutex_unlock(&context->sequence_mutex);
 	free(dependencies);
+	free(sem_dependencies);
 	return r;
 }
@@ -450,3 +506,105 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
 
 	return r;
 }
+
+int amdgpu_cs_create_semaphore(amdgpu_semaphore_handle *sem)
+{
+	struct amdgpu_semaphore *gpu_semaphore;
+
+	if (NULL == sem)
+		return -EINVAL;
+
+	gpu_semaphore = calloc(1, sizeof(struct amdgpu_semaphore));
+	if (NULL == gpu_semaphore)
+		return -ENOMEM;
+
+	atomic_set(&gpu_semaphore->refcount, 1);
+	*sem = gpu_semaphore;
+
+	return 0;
+}
+
+int amdgpu_cs_signal_semaphore(amdgpu_context_handle ctx,
+			       uint32_t ip_type,
+			       uint32_t ip_instance,
+			       uint32_t ring,
+			       amdgpu_semaphore_handle sem)
+{
+	if (NULL == ctx)
+		return -EINVAL;
+	if (ip_type >= AMDGPU_HW_IP_NUM)
+		return -EINVAL;
+	if (ring >= AMDGPU_CS_MAX_RINGS)
+		return -EINVAL;
+	if (NULL == sem)
+		return -EINVAL;
+	/* sem has been signaled */
+	if (sem->signal_fence.context)
+		return -EINVAL;
+	pthread_mutex_lock(&ctx->sequence_mutex);
+	sem->signal_fence.context = ctx;
+	sem->signal_fence.ip_type = ip_type;
+	sem->signal_fence.ip_instance = ip_instance;
+	sem->signal_fence.ring = ring;
+	sem->signal_fence.fence = ctx->last_seq[ip_type][ip_instance][ring];
+	update_references(NULL, &sem->refcount);
+	pthread_mutex_unlock(&ctx->sequence_mutex);
+	return 0;
+}
+
+int amdgpu_cs_wait_semaphore(amdgpu_context_handle ctx,
+			     uint32_t ip_type,
+			     uint32_t ip_instance,
+			     uint32_t ring,
+			     amdgpu_semaphore_handle sem)
+{
+	if (NULL == ctx)
+		return -EINVAL;
+	if (ip_type >= AMDGPU_HW_IP_NUM)
+		return -EINVAL;
+	if (ring >= AMDGPU_CS_MAX_RINGS)
+		return -EINVAL;
+	if (NULL == sem)
+		return -EINVAL;
+	/* must signal first */
+	if (NULL == sem->signal_fence.context)
+		return -EINVAL;
+
+	pthread_mutex_lock(&ctx->sequence_mutex);
+	list_add(&sem->list, &ctx->sem_list[ip_type][ip_instance][ring]);
+	pthread_mutex_unlock(&ctx->sequence_mutex);
+	return 0;
+}
+
+static int amdgpu_cs_reset_sem(amdgpu_semaphore_handle sem)
+{
+	if (NULL == sem)
+		return -EINVAL;
+	if (NULL == sem->signal_fence.context)
+		return -EINVAL;
+
+	sem->signal_fence.context = NULL;
+	sem->signal_fence.ip_type = 0;
+	sem->signal_fence.ip_instance = 0;
+	sem->signal_fence.ring = 0;
+	sem->signal_fence.fence = 0;
+
+	return 0;
+}
+
+static int amdgpu_cs_unreference_sem(amdgpu_semaphore_handle sem)
+{
+	if (NULL == sem)
+		return -EINVAL;
+
+	if (update_references(&sem->refcount, NULL))
+		free(sem);
+	return 0;
+}
+
+int amdgpu_cs_destroy_semaphore(amdgpu_semaphore_handle sem)
+{
+	return amdgpu_cs_unreference_sem(sem);
+}
diff --git a/amdgpu/amdgpu_internal.h b/amdgpu/amdgpu_internal.h
index 5d86603..557ba1f 100644
--- a/amdgpu/amdgpu_internal.h
+++ b/amdgpu/amdgpu_internal.h
@@ -116,6 +116,18 @@ struct amdgpu_context {
 	pthread_mutex_t sequence_mutex;
 	/* context id*/
 	uint32_t id;
+	uint64_t last_seq[AMDGPU_HW_IP_NUM][AMDGPU_HW_IP_INSTANCE_MAX_COUNT][AMDGPU_CS_MAX_RINGS];
+	struct list_head sem_list[AMDGPU_HW_IP_NUM][AMDGPU_HW_IP_INSTANCE_MAX_COUNT][AMDGPU_CS_MAX_RINGS];
+};
+
+/**
+ * Structure describing sw semaphore based on scheduler
+ *
+ */
+struct amdgpu_semaphore {
+	atomic_t refcount;
+	struct list_head list;
+	struct amdgpu_cs_fence signal_fence;
 };
 
 /**
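To make the create/signal/wait/destroy flow from the commit message concrete,
here is a rough usage sketch. ctx and ctx2 are hypothetical context handles, and
the submits that produce and consume the dependency are elided:

	amdgpu_semaphore_handle sem;

	amdgpu_cs_create_semaphore(&sem);                  /* 1. create */

	/* ...submit work on the SDMA ring of ctx... */

	/* 2. signal: bind the semaphore to the last submission
	 * on that ring of ctx */
	amdgpu_cs_signal_semaphore(ctx, AMDGPU_HW_IP_DMA, 0, 0, sem);

	/* 3. wait: the next GFX submission on ctx2 picks up that fence
	 * as a dependency; the semaphore is reset internally once the
	 * dependency is consumed at submit time */
	amdgpu_cs_wait_semaphore(ctx2, AMDGPU_HW_IP_GFX, 0, 0, sem);

	/* ...submit work on the GFX ring of ctx2... */

	amdgpu_cs_destroy_semaphore(sem);                  /* 4. destroy */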
From: Chunming Zhou <david1.zhou@amd.com>

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 tests/amdgpu/basic_tests.c | 133 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)
diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index a666d32..56db935 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -48,6 +48,7 @@ static void amdgpu_command_submission_compute(void);
 static void amdgpu_command_submission_sdma(void);
 static void amdgpu_command_submission_multi_fence(void);
 static void amdgpu_userptr_test(void);
+static void amdgpu_semaphore_test(void);
 
 CU_TestInfo basic_tests[] = {
 	{ "Query Info Test", amdgpu_query_info_test },
@@ -57,6 +58,7 @@ CU_TestInfo basic_tests[] = {
 	{ "Command submission Test (Compute)", amdgpu_command_submission_compute },
 	{ "Command submission Test (SDMA)", amdgpu_command_submission_sdma },
 	{ "Command submission Test (Multi-fence)", amdgpu_command_submission_multi_fence },
+	{ "SW semaphore Test", amdgpu_semaphore_test },
 	CU_TEST_INFO_NULL,
 };
 #define BUFFER_SIZE (8 * 1024)
@@ -79,6 +81,9 @@ CU_TestInfo basic_tests[] = {
 #define SDMA_OPCODE_COPY	1
 #	define SDMA_COPY_SUB_OPCODE_LINEAR	0
 
+#define GFX_COMPUTE_NOP	0xffff1000
+#define SDMA_NOP	0x0
+
 int suite_basic_tests_init(void)
 {
 	int r;
@@ -335,6 +340,134 @@ static void amdgpu_command_submission_gfx(void)
 	amdgpu_command_submission_gfx_shared_ib();
 }
 
+static void amdgpu_semaphore_test(void)
+{
+	amdgpu_context_handle context_handle[2];
+	amdgpu_semaphore_handle sem;
+	amdgpu_bo_handle ib_result_handle[2];
+	void *ib_result_cpu[2];
+	uint64_t ib_result_mc_address[2];
+	struct amdgpu_cs_request ibs_request[2] = {0};
+	struct amdgpu_cs_ib_info ib_info[2] = {0};
+	struct amdgpu_cs_fence fence_status = {0};
+	uint32_t *ptr;
+	uint32_t expired;
+	amdgpu_bo_list_handle bo_list[2];
+	amdgpu_va_handle va_handle[2];
+	int r, i;
+
+	r = amdgpu_cs_create_semaphore(&sem);
+	CU_ASSERT_EQUAL(r, 0);
+	for (i = 0; i < 2; i++) {
+		r = amdgpu_cs_ctx_create(device_handle, &context_handle[i]);
+		CU_ASSERT_EQUAL(r, 0);
+
+		r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+					    AMDGPU_GEM_DOMAIN_GTT, 0,
+					    &ib_result_handle[i], &ib_result_cpu[i],
+					    &ib_result_mc_address[i], &va_handle[i]);
+		CU_ASSERT_EQUAL(r, 0);
+
+		r = amdgpu_get_bo_list(device_handle, ib_result_handle[i],
+				       NULL, &bo_list[i]);
+		CU_ASSERT_EQUAL(r, 0);
+	}
+
+	/* 1. same context different engine */
+	ptr = ib_result_cpu[0];
+	ptr[0] = SDMA_NOP;
+	ib_info[0].ib_mc_address = ib_result_mc_address[0];
+	ib_info[0].size = 1;
+
+	ibs_request[0].ip_type = AMDGPU_HW_IP_DMA;
+	ibs_request[0].number_of_ibs = 1;
+	ibs_request[0].ibs = &ib_info[0];
+	ibs_request[0].resources = bo_list[0];
+	ibs_request[0].fence_info.handle = NULL;
+	r = amdgpu_cs_submit(context_handle[0], 0, &ibs_request[0], 1);
+	CU_ASSERT_EQUAL(r, 0);
+	r = amdgpu_cs_signal_semaphore(context_handle[0], AMDGPU_HW_IP_DMA, 0, 0, sem);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_cs_wait_semaphore(context_handle[0], AMDGPU_HW_IP_GFX, 0, 0, sem);
+	CU_ASSERT_EQUAL(r, 0);
+	ptr = ib_result_cpu[1];
+	ptr[0] = GFX_COMPUTE_NOP;
+	ib_info[1].ib_mc_address = ib_result_mc_address[1];
+	ib_info[1].size = 1;
+
+	ibs_request[1].ip_type = AMDGPU_HW_IP_GFX;
+	ibs_request[1].number_of_ibs = 1;
+	ibs_request[1].ibs = &ib_info[1];
+	ibs_request[1].resources = bo_list[1];
+	ibs_request[1].fence_info.handle = NULL;
+
+	r = amdgpu_cs_submit(context_handle[0], 0, &ibs_request[1], 1);
+	CU_ASSERT_EQUAL(r, 0);
+
+	fence_status.context = context_handle[0];
+	fence_status.ip_type = AMDGPU_HW_IP_GFX;
+	fence_status.fence = ibs_request[1].seq_no;
+	r = amdgpu_cs_query_fence_status(&fence_status,
+					 500000000, 0, &expired);
+	CU_ASSERT_EQUAL(r, 0);
+	CU_ASSERT_EQUAL(expired, true);
+
+	/* 2. same engine different context */
+	ptr = ib_result_cpu[0];
+	ptr[0] = GFX_COMPUTE_NOP;
+	ib_info[0].ib_mc_address = ib_result_mc_address[0];
+	ib_info[0].size = 1;
+
+	ibs_request[0].ip_type = AMDGPU_HW_IP_GFX;
+	ibs_request[0].number_of_ibs = 1;
+	ibs_request[0].ibs = &ib_info[0];
+	ibs_request[0].resources = bo_list[0];
+	ibs_request[0].fence_info.handle = NULL;
+	r = amdgpu_cs_submit(context_handle[0], 0, &ibs_request[0], 1);
+	CU_ASSERT_EQUAL(r, 0);
+	r = amdgpu_cs_signal_semaphore(context_handle[0], AMDGPU_HW_IP_GFX, 0, 0, sem);
+	CU_ASSERT_EQUAL(r, 0);
+
+	r = amdgpu_cs_wait_semaphore(context_handle[1], AMDGPU_HW_IP_GFX, 0, 0, sem);
+	CU_ASSERT_EQUAL(r, 0);
+	ptr = ib_result_cpu[1];
+	ptr[0] = GFX_COMPUTE_NOP;
+	ib_info[1].ib_mc_address = ib_result_mc_address[1];
+	ib_info[1].size = 1;
+
+	ibs_request[1].ip_type = AMDGPU_HW_IP_GFX;
+	ibs_request[1].number_of_ibs = 1;
+	ibs_request[1].ibs = &ib_info[1];
+	ibs_request[1].resources = bo_list[1];
+	ibs_request[1].fence_info.handle = NULL;
+	r = amdgpu_cs_submit(context_handle[1], 0, &ibs_request[1], 1);
+
+	CU_ASSERT_EQUAL(r, 0);
+
+	fence_status.context = context_handle[1];
+	fence_status.ip_type = AMDGPU_HW_IP_GFX;
+	fence_status.fence = ibs_request[1].seq_no;
+	r = amdgpu_cs_query_fence_status(&fence_status,
+					 500000000, 0, &expired);
+	CU_ASSERT_EQUAL(r, 0);
+	CU_ASSERT_EQUAL(expired, true);
+	for (i = 0; i < 2; i++) {
+		r = amdgpu_bo_unmap_and_free(ib_result_handle[i], va_handle[i],
+					     ib_result_mc_address[i], 4096);
+		CU_ASSERT_EQUAL(r, 0);
+
+		r = amdgpu_bo_list_destroy(bo_list[i]);
+		CU_ASSERT_EQUAL(r, 0);
+
+		r = amdgpu_cs_ctx_free(context_handle[i]);
+		CU_ASSERT_EQUAL(r, 0);
+	}
+
+	r = amdgpu_cs_destroy_semaphore(sem);
+	CU_ASSERT_EQUAL(r, 0);
+}
+
 static void amdgpu_command_submission_compute(void)
 {
 	amdgpu_context_handle context_handle;
From: Chunming Zhou <David1.Zhou@amd.com>

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 amdgpu/amdgpu_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 2ae1c18..d30fd1e 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -539,7 +539,8 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
 	struct drm_amdgpu_gem_userptr args;
 
 	args.addr = (uintptr_t)cpu;
-	args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
+	args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER |
+		AMDGPU_GEM_USERPTR_VALIDATE;
 	args.size = size;
 	r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
 				&args, sizeof(args));
From: "monk.liu" Monk.Liu@amd.com
Signed-off-by: monk.liu Monk.Liu@amd.com --- amdgpu/amdgpu.h | 3 ++- amdgpu/amdgpu_cs.c | 12 +++++++++--- include/drm/amdgpu_drm.h | 3 ++- tests/amdgpu/basic_tests.c | 2 +- 4 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 8822a0c..d4be7fc 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -916,6 +916,7 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
  *                                otherwise, wait at least one fence
  * \param   timeout_ns  - \c [in] The timeout to wait, in nanoseconds
  * \param   status      - \c [out] '1' for signaled, '0' for timeout
+ * \param   first       - \c [out] the index of the first signaled fence from @fences
  *
  * \return  0 on success
  *          <0 - Negative POSIX Error code
@@ -927,7 +928,7 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
 			  uint32_t fence_count,
 			  bool wait_all,
 			  uint64_t timeout_ns,
-			  uint32_t *status);
+			  uint32_t *status, uint32_t *first);
 
 /*
  * Query / Info API
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index d033f8e..5c7a3a3 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -439,7 +439,8 @@ static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
 				    uint32_t fence_count,
 				    bool wait_all,
 				    uint64_t timeout_ns,
-				    uint32_t *status)
+				    uint32_t *status,
+				    uint32_t *first)
 {
 	struct drm_amdgpu_fence *drm_fences;
 	amdgpu_device_handle dev = fences[0].context->dev;
@@ -467,6 +468,10 @@ static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
 		return -errno;
 
 	*status = args.out.status;
+
+	if (first)
+		*first = args.out.first_signaled;
+
 	return 0;
 }
 
@@ -474,7 +479,8 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
 			  uint32_t fence_count,
 			  bool wait_all,
 			  uint64_t timeout_ns,
-			  uint32_t *status)
+			  uint32_t *status,
+			  uint32_t *first)
 {
 	uint32_t ioctl_status = 0;
 	uint32_t i;
@@ -499,7 +505,7 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
 	*status = 0;
 
 	r = amdgpu_ioctl_wait_fences(fences, fence_count, wait_all, timeout_ns,
-				     &ioctl_status);
+				     &ioctl_status, first);
 
 	if (!r)
 		*status = ioctl_status;
diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index 2cbea72..194e1f9 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -316,7 +316,8 @@ struct drm_amdgpu_wait_fences_in {
 };
 
 struct drm_amdgpu_wait_fences_out {
-	uint64_t status;
+	uint32_t status;
+	uint32_t first_signaled;
 };
 
 union drm_amdgpu_wait_fences {
diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index 56db935..47cd1db 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -974,7 +974,7 @@ static void amdgpu_command_submission_multi_fence_wait_all(bool wait_all)
 
 	r = amdgpu_cs_wait_fences(fence_status, ib_cs_num, wait_all,
 				  AMDGPU_TIMEOUT_INFINITE,
-				  &expired);
+				  &expired, NULL);
 	CU_ASSERT_EQUAL(r, 0);
 
 	r = amdgpu_bo_unmap_and_free(ib_result_handle, va_handle,
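A short sketch of the extended call, reusing the hypothetical fences array from
the earlier example (passing NULL for first keeps the old behavior, as the test
above does):

	uint32_t signaled = 0, first = 0;
	int r;

	/* wait_all = false: return as soon as any fence signals and
	 * report which one it was. */
	r = amdgpu_cs_wait_fences(fences, 2, false, AMDGPU_TIMEOUT_INFINITE,
				  &signaled, &first);
	if (r == 0 && signaled)
		printf("fence %u signaled first\n", first);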
From: Junwei Zhang <Jerry.Zhang@amd.com>

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: David Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 amdgpu/amdgpu_cs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 5c7a3a3..82fa805 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -179,7 +179,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 	struct drm_amdgpu_cs_chunk_dep *dependencies = NULL;
 	struct drm_amdgpu_cs_chunk_dep *sem_dependencies = NULL;
 	struct list_head *sem_list;
-	amdgpu_semaphore_handle sem;
+	amdgpu_semaphore_handle sem, tmp;
 	uint32_t i, size, sem_count = 0;
 	bool user_fence;
 	int r = 0;
@@ -282,7 +282,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
 			goto error_unlock;
 		}
 		sem_count = 0;
-		LIST_FOR_EACH_ENTRY(sem, sem_list, list) {
+		LIST_FOR_EACH_ENTRY_SAFE(sem, tmp, sem_list, list) {
 			struct amdgpu_cs_fence *info = &sem->signal_fence;
 			struct drm_amdgpu_cs_chunk_dep *dep = &sem_dependencies[sem_count++];
 			dep->ip_type = info->ip_type;
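The reason for the _SAFE variant, in brief: the plain macro reads the next
pointer from the current entry after the loop body has run, so calling
list_del() on that entry inside the body leaves the walk following a pointer
that no longer belongs to the list. The safe variant caches the successor up
front; schematically:

	/* tmp holds the next entry before the body runs, so list_del()
	 * on the current entry cannot derail the iteration. */
	LIST_FOR_EACH_ENTRY_SAFE(sem, tmp, sem_list, list) {
		list_del(&sem->list);
		amdgpu_cs_reset_sem(sem);
		amdgpu_cs_unreference_sem(sem);
	}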
On Tue, Jan 12, 2016 at 4:23 PM, Marek Olšák <maraeo@gmail.com> wrote:
> [...]
>
> One notable change is the addition of DRM_IOCTL_AMDGPU_WAIT_FENCES. I
> hope the kernel contains (or will contain) the changes too, so that I
> don't push something that doesn't exist in the kernel.
We haven't pushed DRM_IOCTL_AMDGPU_WAIT_FENCES upstream yet so I would hold off on any changes that depend on that.
Alex
On 12.01.2016 22:30, Alex Deucher wrote:
> On Tue, Jan 12, 2016 at 4:23 PM, Marek Olšák <maraeo@gmail.com> wrote:
>> [...]
>> One notable change is the addition of DRM_IOCTL_AMDGPU_WAIT_FENCES. I
>> hope the kernel contains (or will contain) the changes too, so that I
>> don't push something that doesn't exist in the kernel.
>
> We haven't pushed DRM_IOCTL_AMDGPU_WAIT_FENCES upstream yet so I would
> hold off on any changes that depend on that.
Yeah, and do we really have patch #9 in our internal branch without a review? Because that one breaks the API.
Christian.
On Wed, Jan 13, 2016 at 11:43 AM, Christian König <deathsimple@vodafone.de> wrote:
> [...]
>
> Yeah, and do we really have patch #9 in our internal branch without a
> review? Because that one breaks the API.
It doesn't break the API, because the API is added by this series in an earlier patch. The API would be broken if it was changed between two libdrm versions.
Marek
On 13.01.2016 12:15, Marek Olšák wrote:
> [...]
>
> It doesn't break the API, because the API is added by this series in an
> earlier patch. The API would be broken if it was changed between two
> libdrm versions.
Ah, in this case please squash the two patches together for upstreaming.
Regards,
Christian.