On Thu, Feb 4, 2021 at 10:09 AM Chema Casanova jmcasanova@igalia.com wrote:
On 3/9/20 18:48, Yukimasa Sugizaki wrote:
From: Yukimasa Sugizaki ysugi@idein.jp
The default timeout is 500 ms which is too short for some workloads including Piglit. Adding this parameter will help users to run heavier tasks.
Signed-off-by: Yukimasa Sugizaki ysugi@idein.jp
drivers/gpu/drm/v3d/v3d_sched.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index feef0c749e68..983efb018560 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -19,11 +19,17 @@ */
#include <linux/kthread.h> +#include <linux/moduleparam.h>
#include "v3d_drv.h" #include "v3d_regs.h" #include "v3d_trace.h"
+static uint timeout = 500; +module_param(timeout, uint, 0444); +MODULE_PARM_DESC(timeout,
"Timeout for a job in ms (0 means infinity and default is 500 ms)");
I think that parameter definition could be included at v3d_drv.c as other drivers do. Then we would have all future parameters together. In that case we would need also to include the extern definition at v3d_drv.h for the driver variable.
The param name could be more descriptive like "sched_timeout_ms" in the lima driver.
I wonder if it would make sense to have different timeouts parameters for different type of jobs we have. At least this series come from the need additional time to complete compute jobs. This is what amdgpu does, but we probably don't need something such complex.
I think it would also make sense to increase our default value for the compute jobs.
What do you think?
In any case I think that having this general scheduling timeout parameter as other drivers do is a good idea.
Having a param for being able to test if the job completes eventually is a good idea, but if tests are triggering timeouts, then you probably need to investigate what's going on in the driver -- it's not like you want .5second unpreemptible jobs to be easy to produce.
Also, for CS, I wonder if we have another reg besides CSD_CURRENT_CFG4 we could be looking at for "we're making progress on the job". Half a second with no discernible progress sounds like a bug.