On Wed, Feb 06, 2013 at 01:23:17PM -0800, Terje Bergström wrote:
On 05.02.2013 01:54, Thierry Reding wrote:
On Mon, Feb 04, 2013 at 09:17:45PM -0800, Terje Bergström wrote:
Yeah, it's actually working around the host1x duplicate naming. host1x_syncpt_get takes struct host1x as parameter, but that's different host1x than in this code.
So maybe a better way would be to rename the DRM host1x after all. If it avoids the need for workarounds such as this I think it justifies the additional churn.
Ok, I'll include that. Do you have a preference for the name? Something like "host1x_drm" might work?
Yes, that sounds good.
Also, how useful is it to create a context? Looking at the gr2d implementation for .open_channel(), it will return the same channel to whichever userspace process requests them. Can you explain why it is necessary at all? From the name I would have expected some kind of context switching to take place when different applications submit requests to the same client, but that doesn't seem to be the case.
Hardware context switching will be a later submit, and it'll actually create a new structure. Hardware context might live longer than the process that created it, so they'll need to be separate.
Why would it live longer than the process? Isn't the whole purpose of the context to keep per-process state? What use is that state if the process dies?
Hardware context has to be kept alive for as long as there's a job running from that process. If an app sends 10 jobs to 2D channel, and dies immediately, there's no sane way for host1x to remove the jobs from queue. The jobs will keep on running and kernel will need to track them.
Okay, I understand now. There was one additional thing that I wanted to point out, but the context is gone now. I'll go through the patch again and reply there.
I fail to see how dma_buf would require a separate mem_mgr_type. Can we perhaps postpone this to a later point and just go with CMA as the only alternative for now until we have an actual working implementation that we can use this for?
Each submit refers to a number of buffers. Some of them are the streams, some are textures or other input/output buffers. Each of these buffers might be passed as a GEM handle, or (when implemented) as a dma_buf fd. Thus we need a field to tell host1x which API to call to handle that handle.
Understood.
I think we can leave out the code for managing the type until we actually have separate memory managers. That'd make GEM handles effectively of type 0, as we don't set it.
I think that's a good idea. Let's start simple for now and who knows what else will have changed by the time we get to implement dma_buf. Maybe Lucas will have finished his work on the allocator and we will need to synchronize with that anyway.
+static int gr2d_submit(struct host1x_drm_context *context,
struct tegra_drm_submit_args *args,
struct drm_device *drm,
struct drm_file *file_priv)
+{
struct host1x_job *job;
int num_cmdbufs = args->num_cmdbufs;
int num_relocs = args->num_relocs;
int num_waitchks = args->num_waitchks;
struct tegra_drm_cmdbuf __user *cmdbufs =
(void * __user)(uintptr_t)args->cmdbufs;
struct tegra_drm_reloc __user *relocs =
(void * __user)(uintptr_t)args->relocs;
struct tegra_drm_waitchk __user *waitchks =
(void * __user)(uintptr_t)args->waitchks;
No need for all the uintptr_t casts.
Will try to remove - but I do remember getting compiler warnings without them.
I think you shouldn't even have to cast to void * first. Just cast to the target type directly. I don't see why the compiler should complain.
This is what I get without them:
drivers/gpu/host1x/drm/gr2d.c:108:3: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] drivers/gpu/host1x/drm/gr2d.c:110:3: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] drivers/gpu/host1x/drm/gr2d.c:112:3: warning: cast to pointer from integer of different size [-Wint-to-pointer-c
The problem is that the fields are __u64's and can't be cast directly into 32-bit pointers.
Alright.
That's the security firewall. It walks through each submit, and ensures that each register write that writes an address, goes through the host1x reloc mechanism. This way user space cannot ask 2D to write to arbitrary memory locations.
I see. Can this be made more generic? Perhaps adding a table of valid registers to the device and use a generic function to iterate over that instead of having to provide the same function for each client.
For which one does gcc generate more efficient code? I've thought a switch-case statement might get compiled into something more efficient than a table lookup.
But the rest of the code is generic - just the one function which compares against known address registers is specific to 2D.
Table lookup should be pretty fast. I wouldn't worry too much about performance at this stage, though. Readability is more important in my opinion. A lookup table is a lot more readable and reusable I think. If it turns out that using a function is actually faster we can always optimize later.
Thierry