Re: Report 2 in ext4 and journal based on v5.17-rc1

3 Mar 2022


      Ted wrote:
...
On Thu, Mar 03, 2022 at 10:00:33AM +0900, Byungchul Park wrote:
...
Unfortunately, it's neither perfect nor safe without another wakeup
source, i.e. a rescue wakeup source.
consumer                producer

                        lock L
                        (too much work queued == true)
                        unlock L
                        --- preempted
lock L
unlock L
do work
lock L
unlock L
do work
...
(no work == true)
sleep
                        --- scheduled in
                        sleep
That's not how things work in ext4.  It's **way** more complicated
You seem to have misunderstood. This example is the one Jan Kara gave
me. I just tried to explain things based on his example, so I left all
the statements that he wrote as they were. Plus, the example was so
helpful. Thanks, Jan Kara.
...
than that.  We have multiple wait channels, one which wakes up the
consumer (read: the commit thread), and one which wakes up any processes
waiting for the commit thread to have made forward progress.  We also
have two spin-lock protected sequence numbers, one which indicates the
current committed transaction #, and one indicating the transaction #
that needs to be committed.
On the commit thread, it will sleep on j_wait_commit, and when it is
woken up, it will check to see if there is work to be done
(j_commit_sequence != j_commit_request), and if so, do the work, and
then wake up processes waiting on the wait_queue j_wait_done_commit.
(Again, all of this uses the pattern: "prepare to wait", then check to
see if we should sleep, and if we do need to sleep, unlock j_state_lock,
then sleep.  So this prevents any races leading to lost wakeups.)
On the start_this_handle() thread, if the current transaction is too
full, we set j_commit_request to its transaction id to indicate that
we want the current transaction to be committed, and then we wake up
the j_wait_commit wait queue and then we enter a loop where we do a
prepare_to_wait on j_wait_done_commit, check to see if
j_commit_sequence == the transaction id that we want to be completed,
and if it's not done yet, we unlock the j_state_lock spinlock, and go
to sleep.  Again, because of the prepare_to_wait, there is no chance
of a lost wakeup.
The above explanation gives me a clear view of how the journal is
synchronized. I appreciate it.
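
To make the two-wait-queue scheme described above concrete, here is a
minimal userspace sketch using pthreads. It is not the jbd2 code: the
struct, the function names and the stop flag are hypothetical, a mutex
stands in for the j_state_lock spinlock, and pthread_cond_wait() (which
drops the lock and sleeps atomically) stands in for the "prepare to
wait, check, unlock, sleep" pattern. The waiter also uses "<" rather
than "==" on the sequence number and ignores tid wraparound, which is a
simplification.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the journal state; not the jbd2 struct. */
struct journal_sketch {
	pthread_mutex_t state_lock;      /* plays the role of j_state_lock */
	pthread_cond_t wait_commit;      /* plays the role of j_wait_commit */
	pthread_cond_t wait_done_commit; /* plays the role of j_wait_done_commit */
	unsigned int commit_sequence;    /* last committed transaction # */
	unsigned int commit_request;     /* transaction # that needs committing */
	bool stop;                       /* demo-only shutdown flag (assumption) */
};

/* Commit thread: waiter on wait_commit, waker of wait_done_commit. */
static void *commit_thread(void *arg)
{
	struct journal_sketch *j = arg;

	pthread_mutex_lock(&j->state_lock);
	for (;;) {
		/* The predicate is checked under the lock, so a wakeup
		 * sent before we sleep cannot be lost. */
		while (j->commit_sequence == j->commit_request && !j->stop)
			pthread_cond_wait(&j->wait_commit, &j->state_lock);
		if (j->commit_sequence == j->commit_request)
			break; /* stop requested and nothing pending */

		unsigned int tid = j->commit_request;
		pthread_mutex_unlock(&j->state_lock);
		/* The real commit work would run here with the lock dropped. */
		pthread_mutex_lock(&j->state_lock);

		j->commit_sequence = tid;
		pthread_cond_broadcast(&j->wait_done_commit);
	}
	pthread_mutex_unlock(&j->state_lock);
	return NULL;
}

/* Requesting side: waker of wait_commit, waiter on wait_done_commit. */
static void request_commit_and_wait(struct journal_sketch *j, unsigned int tid)
{
	pthread_mutex_lock(&j->state_lock);
	if (tid > j->commit_request) /* wraparound ignored in this sketch */
		j->commit_request = tid;
	pthread_cond_signal(&j->wait_commit);
	while (j->commit_sequence < tid)
		pthread_cond_wait(&j->wait_done_commit, &j->state_lock);
	pthread_mutex_unlock(&j->state_lock);
}

/* Requests commits 1..n in order and returns the final committed tid. */
static unsigned int run_commit_demo(unsigned int n)
{
	struct journal_sketch j = { .commit_sequence = 0, .commit_request = 0,
				    .stop = false };
	pthread_t thr;
	unsigned int tid, done;
	int rc;

	pthread_mutex_init(&j.state_lock, NULL);
	pthread_cond_init(&j.wait_commit, NULL);
	pthread_cond_init(&j.wait_done_commit, NULL);

	rc = pthread_create(&thr, NULL, commit_thread, &j);
	assert(rc == 0);

	for (tid = 1; tid <= n; tid++)
		request_commit_and_wait(&j, tid);

	pthread_mutex_lock(&j.state_lock);
	j.stop = true;
	pthread_cond_signal(&j.wait_commit);
	pthread_mutex_unlock(&j.state_lock);
	pthread_join(thr, NULL);

	done = j.commit_sequence;
	pthread_cond_destroy(&j.wait_done_commit);
	pthread_cond_destroy(&j.wait_commit);
	pthread_mutex_destroy(&j.state_lock);
	return done;
}
```

Each thread is a waiter on one condition variable and a waker on the
other, which mirrors the role reversal described for the two wait
queues. Build with -pthread; run_commit_demo(n) is meant to be called
from a small main().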
...
So there really is no "consumer" and "producer" here.  If you really
insist on using this model, which really doesn't apply, for one
Dept does not assume a "consumer" and "producer" model at all; Dept
works with general waits and events. *That model is just one case of them.*
...
thread, it's the consumer with respect to one wait queue, and the
producer with respect to the *other* wait queue.  For the other
thread, the consumer and producer roles are reversed.
And of course, this is a highly simplified model, since we also have a
wait queue used by the commit thread to wait for the number of active
handles on a particular transaction to go to zero, and
stop_this_handle() will wake up commit thread via this wait queue when
the last active handle on a particular transaction is retired.  (And
yes, that parameter is also protected by a different spin lock which
is per-transaction).
This one also gives me a clear view. Thanks a lot.
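
The handle-count wait described above can be sketched in userspace in
the same spirit. Again this is only an illustration under assumptions:
the struct and function names are hypothetical, a per-transaction mutex
stands in for the per-transaction spin lock, and each "handle" is
modelled as a thread that does nothing but retire itself.

```c
#include <assert.h>
#include <pthread.h>

#define NR_HANDLES 4 /* arbitrary number of active handles for the demo */

/* Hypothetical per-transaction state; not the jbd2 transaction struct. */
struct txn_sketch {
	pthread_mutex_t handle_lock; /* per-transaction lock */
	pthread_cond_t wait_updates; /* the commit side sleeps here */
	int updates;                 /* number of active handles */
};

/* Retire one handle; the last one wakes the commit side. */
static void *stop_handle_sketch(void *arg)
{
	struct txn_sketch *t = arg;

	pthread_mutex_lock(&t->handle_lock);
	if (--t->updates == 0)
		pthread_cond_signal(&t->wait_updates);
	pthread_mutex_unlock(&t->handle_lock);
	return NULL;
}

/* Commit side: sleep until no handle is active, return the count seen. */
static int wait_for_handles(struct txn_sketch *t)
{
	int left;

	pthread_mutex_lock(&t->handle_lock);
	while (t->updates > 0)
		pthread_cond_wait(&t->wait_updates, &t->handle_lock);
	left = t->updates;
	pthread_mutex_unlock(&t->handle_lock);
	return left;
}

/* Starts NR_HANDLES handles, waits for all of them, returns what is left. */
static int run_handle_demo(void)
{
	struct txn_sketch t = { .updates = NR_HANDLES };
	pthread_t thr[NR_HANDLES];
	int i, rc, left;

	pthread_mutex_init(&t.handle_lock, NULL);
	pthread_cond_init(&t.wait_updates, NULL);

	for (i = 0; i < NR_HANDLES; i++) {
		rc = pthread_create(&thr[i], NULL, stop_handle_sketch, &t);
		assert(rc == 0);
	}

	left = wait_for_handles(&t);

	for (i = 0; i < NR_HANDLES; i++)
		pthread_join(thr[i], NULL);
	pthread_cond_destroy(&t.wait_updates);
	pthread_mutex_destroy(&t.handle_lock);
	return left;
}
```

Because the counter is only read and written under the lock, the commit
side either sees zero and never sleeps, or is already waiting when the
last handle signals.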
...
So it seems to me that a fundamental flaw in DEPT's model is assuming
that the only waiting paradigm that can be used is consumer/producer,
No, Dept does not.
...
and that's simply not true.  The fact that you use the term "lock" is
also going to lead to a misleading line of reasoning, because properly
"lock/unlock L" comes from Jan Kara's example. It has almost nothing
to do with the explanation. I just left "lock/unlock L" as a statement
that comes from Jan Kara's example.
...
speaking, they aren't really locks.  We are simply using wait channels
I totally agree with you. *They aren't really locks, just waits and
wakeups.* That's exactly why I decided to develop Dept. Unlike Lockdep,
Dept is not interested in locks but focuses on waits and wakeup
sources themselves. I think you misunderstand Dept in several ways.
Please ask me more if anything about Dept is unclear.
Thanks,
Byungchul
