On 01/02/2013 05:42 PM, Terje Bergström wrote:
On 28.12.2012 11:14, Mark Zhang wrote:
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index a936902..c3ded60 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -131,6 +131,14 @@ static int gr2d_submit(struct tegra_drm_context *context, if (err) goto fail;
/* We define CMA as an temporary solution in host1x driver.
That's also why we have a CMA kernel config in it.
But seems here, in tegradrm, we hardcode the CMA here.
So what do you do when host1x change to IOMMU?
Do we also need to define a CMA config in tegradrm
driver,
then after host1x changes to IOMMU, we add another IOMMU
config in tegradrm? Or we should invent another more
generic wrapper layer here? */ cmdbuf.mem = handle_cma_to_host1x(drm, file_priv,
cmdbuf.mem);
Lucas is working on host1x allocator, so let's defer this comment until we have Lucas' code.
OK.
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cc8ca2f..0867b72 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -82,6 +82,26 @@ struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, mem += num_unpins * sizeof(dma_addr_t); job->pin_ids = num_unpins ? mem : NULL;
/* I think this is a somewhat ugly design.
Actually "addr_phys" is consisted by "reloc_addr_phys" and
"gather_addr_phys".
Why don't we just declare "reloc_addr_phys" and
"gather_addr_phys"?
In current design, let's say if one nvhost newbie changes the
order
of these 2 blocks of code in function "pin_job_mem":
for (i = 0; i < job->num_relocs; i++) {
struct host1x_reloc *reloc = &job->relocarray[i];
job->pin_ids[count] = reloc->target;
count++;
}
for (i = 0; i < job->num_gathers; i++) {
struct host1x_job_gather *g = &job->gathers[i];
job->pin_ids[count] = g->mem_id;
count++;
}
Then likely something weird occurs... */
We do this because this way we can implement batch pinning of memory handles. This way we can decrease memory handle management a lot as we need to do locking only once per submit.
Sorry I didn't get it. Yes, in current design, you can pin all mem handles in one time but I haven't found anything related with "locking only once per submit".
My idea is: - remove "job->addr_phys" - assign "job->reloc_addr_phys" & "job->gather_addr_phys" separately - In "pin_job_mem", just call "host1x_memmgr_pin_array_ids" twice to fill the "reloc_addr_phy" & "gather_addr_phys".
Anything I misunderstood?
Decreasing memory management overhead is really important, because in complex graphics cases, there are might be a hundreds of buffer references per submit, and several submits per frame. Any extra overhead relates directly to reduced performance.
job->reloc_addr_phys = job->addr_phys; job->gather_addr_phys = &job->addr_phys[num_relocs];
@@ -252,6 +272,10 @@ static int do_relocs(struct host1x_job *job, } }
/* I have question here. Does this mean the address info
which need to be relocated(according to the libdrm codes,
seems this address is "0xDEADBEEF") always staying at the
beginning of a page? */ __raw_writel( (job->reloc_addr_phys[i] + reloc->target_offset) >> reloc->shift,
No - the slot can be anywhere. That's why we have cmdbuf_offset in the reloc struct.
Yes. Sorry I must misread the code before.
@@ -565,6 +589,7 @@ int host1x_job_pin(struct host1x_job *job, struct platform_device *pdev) } }
/* if (host1x_firewall && !err) { */ if (host1x_firewall) { err = copy_gathers(job, pdev); if (err) {
Will add.
@@ -573,6 +598,9 @@ int host1x_job_pin(struct host1x_job *job, struct platform_device *pdev) } }
+/* Rename this label to "out" or something else.
- Because if everything goes right, the codes under this label also
- get executed. */
fail: wmb();
Will do.
diff --git a/drivers/gpu/host1x/memmgr.c b/drivers/gpu/host1x/memmgr.c index f3954f7..bb5763d 100644 --- a/drivers/gpu/host1x/memmgr.c +++ b/drivers/gpu/host1x/memmgr.c @@ -156,6 +156,9 @@ int host1x_memmgr_pin_array_ids(struct platform_device *dev, count, &unpin_data[pin_count], phys_addr);
/* I don't think the current "host1x_cma_pin_array_ids"
is able to return a negative value. So this "if" doesn't
make sense...*/ if (cma_count < 0) { /* clean up previous handles */ while (pin_count) {
It should return negative in case of error.
"host1x_cma_pin_array_ids" doesn't return negative value right now, so maybe you need to take a look at it.
Terje