Den 15.06.2021 11.17, skrev Peter Stuge:
Hi Noralf,
Noralf Trønnes wrote:
+static int gud_usb_bulk(struct gud_device *gdrm, size_t len)
..
timer_setup_on_stack(&ctx.timer, gud_usb_bulk_timeout, 0);
mod_timer(&ctx.timer, jiffies + msecs_to_jiffies(3000));
usb_sg_wait(&ctx.sgr);
if (!del_timer_sync(&ctx.timer))
ret = -ETIMEDOUT;
..
Mention in the commit message that sending USB bulk transfers with an sglist could be unstable
Can you explain a bit about /how/ it is unstable?
As you write, usb_bulk_msg() (as used before) has a timeout which is passed to the host controller hardware and implemented there.
I haven't used SG with kernel USB but I would expect such a timeout to still be available with SG?
I have taken a closer look and usb_bulk_msg() calls usb_start_wait_urb() which uses wait_for_completion_timeout() so the timeout isn't handled by the hardware.
usb_sg_wait() on the other hand uses plain wait_for_completion() without the timeout. So ideally usb_sg_wait() should have had a timeout parameter and used wait_for_completion_timeout().
usb_bulk_msg() has builtin timeout handling and during development of a microcontroller gadget implementation I've triggered this timeout several times when the uC usb interrupts stopped firing.
The device not responding to bulk packets scheduled and sent by the host is a real error /in the device/ and thus not neccessarily something the kernel must handle gracefully.. I think it's quite nice to do so, but one can argue that it's not strictly required.
But more importantly: Remember that bulk transfer has no delivery time guarantee. It can take indefinitely long until a bulk transfer is scheduled by the host on a busy bus which is starved with more important things (control, interrupt, iso transfers) - that's not an error at all, and may be indistinguishable from the device not responding to packets actually sent by the host.
Having a timeout is important, I just expect the USB SG interface to support it since it is the hardware that times out in the non-SG case.
And since this is essentially real time data maybe a shorter timeout is better? 3 seconds seems really long.
I have looked at what the others are using: - rtsx_usb uses 10 seconds in one place - vub300 uses 2 seconds plus a length based addition - usbtest uses 10 seconds.
The other USB DRM driver gm12u320 uses a 1 second timeout per block, so a worst case timeout of 20 seconds per frame.
3 seconds is a "long time", but compared to the default control request timeout USB_CTRL_GET_TIMEOUT which is 5 seconds, and which gud uses, it's not that long. I don't want to put too much limitation on the device, but ofc can't allow it to hang the driver.
And a timeout is an exception so hitting that probably means something is seriously wrong. I though of adding some kind of usb bus reset handling to the driver that kicks in after the device has been unresponsive for some time, but dropped that since I have so limited understanding of things USB.
Noralf.
The timeout must include all latency for a frame, so e.g. 16ms (60 Hz) is too short for sure. But maybe something like 500ms?
//Peter