On 2021/1/28 5:50, Felix Kuehling wrote:
On 2021-01-27 at 7:33 a.m., Qu Huang wrote:
The amdgpu driver uses a 4-byte data type for the DQM fence memory and passes the GPU address of that memory to the microcode through the query status PM4 message. However, both the query status PM4 message definition and the microcode treat the fence as 8 bytes. Only 4 bytes of fence memory are allocated, but the microcode writes 8 bytes, which corrupts memory.
Thank you for pointing out that discrepancy. That's a good catch!
I'd prefer to fix this properly by making dqm->fence_addr a u64 pointer. We should probably also fix up the query_status and amdkfd_fence_wait_timeout function interfaces to use 64-bit fence values everywhere, to be consistent.
Regards, Felix
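For readers following the thread, the size mismatch discussed above can be modelled in user space. This is only a minimal sketch and not driver code: the struct names, model_microcode_write() and neighbour_clobbered() are hypothetical stand-ins, and uint32_t is assumed here to represent the 4-byte fence type mentioned in the report.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real amdkfd structures. */
struct dqm_4byte { uint32_t *fence_addr; };	/* 4-byte fence, as in the report */
struct dqm_8byte { uint64_t *fence_addr; };	/* 64-bit fence, as suggested in review */

/* Size that sizeof(*dqm->fence_addr) yields with a 4-byte fence type. */
static size_t alloc_size_4byte(void)
{
	struct dqm_4byte dqm = { NULL };

	return sizeof(*dqm.fence_addr);	/* sizeof does not evaluate its operand */
}

/* Size that the same expression yields once the pointer is 64-bit. */
static size_t alloc_size_8byte(void)
{
	struct dqm_8byte dqm = { NULL };

	return sizeof(*dqm.fence_addr);
}

/* Models the microcode: it always stores a full 64-bit fence value. */
static void model_microcode_write(void *addr, uint64_t value)
{
	memcpy(addr, &value, sizeof(value));
}

/*
 * Returns 1 if an 8-byte write into a slot of alloc_size bytes changes
 * the bytes that follow the slot, 0 otherwise. The 16-byte pool stands
 * in for the fence slot plus whatever happens to be allocated after it.
 */
static int neighbour_clobbered(size_t alloc_size)
{
	uint8_t pool[16];
	uint8_t before[16];

	memset(pool, 0xAA, sizeof(pool));
	memcpy(before, pool, sizeof(pool));
	model_microcode_write(pool, 0);

	return memcmp(pool + alloc_size, before + alloc_size,
		      sizeof(pool) - alloc_size) != 0;
}
```

With the 4-byte layout, neighbour_clobbered(alloc_size_4byte()) reports that the write spills into the following bytes. With the 64-bit layout, sizeof(*dqm->fence_addr) already covers the whole write, so the allocation size follows the type and no separate constant is needed.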
Hi Felix,

Thanks for your advice. Please check v2 at https://lore.kernel.org/patchwork/patch/1372584/

Thanks,
Qu.
Signed-off-by: Qu Huang <jinsdb@126.com>
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e686ce2..8b38d0c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1161,7 +1161,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	pr_debug("Allocating fence memory\n");
 	/* allocate fence memory on the gart */
-	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
+	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(uint64_t),
 					&dqm->fence_mem);
 	if (retval)