From: Kent Overstreet <kent.overstreet@linux.dev>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>,
Miklos Szeredi <miklos@szeredi.hu>,
lsf-pc@lists.linux-foundation.org,
Matthew Wilcox <willy@infradead.org>,
dwmw2@infradead.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, Dave Chinner <dchinner@redhat.com>
Subject: Re: [LSF/MM/BPF TOPIC] Replacing TASK_(UN)INTERRUPTIBLE with regions of uninterruptibility
Date: Sat, 3 Feb 2024 12:27:26 -0500
Message-ID: <xnbhx2wbnsso2vzexs2fzit7xxzal2qriphent3pojexvwquni@gkho3q7eho6n>
In-Reply-To: <20240202162346.GB2087318@ZenIV>

On Fri, Feb 02, 2024 at 04:23:46PM +0000, Al Viro wrote:
> On Fri, Feb 02, 2024 at 11:22:15AM +0000, David Howells wrote:
> > Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > > Just making inode_lock() interruptible would break everything.
> >
> > Why? Obviously, you'd need to check the result of the inode_lock(), which I
> > didn't put in my very rough example code, but why would taking the lock at the
> > front of a vfs op like mkdir be a problem?
>
> Plenty of new failure exits to maintain?

I don't currently see a reason to go around converting existing
uninterruptible sleeps; the main benefit of the proposal as I see it
would be that we could mark sleeps as either interruptible or killable
correctly, since that really depends on what syscall we're in and what
userspace is expecting. If kernel code can correctly do one it can do
both, so this is a pretty straightforward change.

But it is an interesting idea; I'd be curious to see what comes out of
playing around with some refactorings.

There are some other wait_event()-related ideas kicking around too...
Willy and Dave and I were talking about the "asynchronous waits" that
io_uring is wanting to do - I believe this is currently just done in an
ad-hoc way for waiting on a folio lock.

It seemed like it might be possible to do this in a more generic way by
simply dynamically allocating the waitlist entry, and signalling via
task_struct that the wait/wakeup should be delivered to a kiocb instead
of to a thread.
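
Very roughly, and only as a userspace toy, the shape I have in mind is
something like the following. None of these names exist in the kernel:
struct async_req is a stand-in for a kiocb, current_async_req is a
stand-in for the task_struct field, and the list handling has no
locking at all. It's just meant to show a heap-allocated wait entry
whose wakeup goes to a request callback instead of to a sleeping thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kiocb: an async request with a completion hook. */
struct async_req {
	void (*complete)(struct async_req *req);
	int completions;
};

/* Dynamically allocated wait entry; it outlives the call that queued it. */
struct wait_entry {
	struct wait_entry *next;
	struct async_req *req;
};

struct waitq {
	struct wait_entry *head;
};

/*
 * Hypothetical stand-in for a task_struct field: non-NULL means "deliver
 * wakeups to this request rather than putting the thread to sleep".
 */
static struct async_req *current_async_req;

/*
 * Returns true if an async wait was queued on behalf of current_async_req.
 * Returns false if nothing was queued (no async context, or allocation
 * failed); the caller would then fall back to an ordinary blocking wait.
 */
static bool waitq_queue_async(struct waitq *wq)
{
	struct wait_entry *e;

	if (!current_async_req)
		return false;

	e = malloc(sizeof(*e));
	if (!e)
		return false;

	e->req = current_async_req;
	e->next = wq->head;
	wq->head = e;
	return true;
}

/* Deliver the wakeup to every queued request and free its entry. */
static int waitq_wake_all(struct waitq *wq)
{
	int woken = 0;

	while (wq->head) {
		struct wait_entry *e = wq->head;

		wq->head = e->next;
		assert(e->req->complete);
		e->req->complete(e->req);
		free(e);
		woken++;
	}
	return woken;
}

/* Example completion hook: just counts how often it was called. */
static void count_completion(struct async_req *req)
{
	req->completions++;
}
```

The submitter would set current_async_req before calling into code that
might wait, get "queued" back instead of sleeping, and the request's
complete() hook runs later from whoever calls waitq_wake_all().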

Another thing I've been wanting to do is embed a sequence number in
wait_queue_head_t, which would be incremented on wakeup. This would
change prepare_to_wait() to "read current sequence number", then later
we sleep until the sequence number has changed from what we initially
read.

This would let us fix double expansion of the wait condition in the
wait_event() macros, and it would also mean we're not flipping task
state before running the cond expression...
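
To make the sequence number idea a bit more concrete, here's a rough
userspace sketch using pthreads instead of the real waitqueue code.
struct seq_waitq and the seq_waitq_*() helpers are made-up names, and a
mutex plus condvar stands in for the scheduler, so treat it as an
illustration of the ordering only, not as how the kernel side would
actually be written.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for wait_queue_head_t plus a sequence number. */
struct seq_waitq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned long seq;	/* incremented on every wakeup */
};

static void seq_waitq_init(struct seq_waitq *wq)
{
	pthread_mutex_init(&wq->lock, NULL);
	pthread_cond_init(&wq->cond, NULL);
	wq->seq = 0;
}

/* Analogue of prepare_to_wait(): only reads the current sequence number. */
static unsigned long seq_waitq_prepare(struct seq_waitq *wq)
{
	unsigned long seq;

	pthread_mutex_lock(&wq->lock);
	seq = wq->seq;
	pthread_mutex_unlock(&wq->lock);
	return seq;
}

/* Sleep until the sequence number differs from the one read earlier. */
static void seq_waitq_wait(struct seq_waitq *wq, unsigned long seq)
{
	pthread_mutex_lock(&wq->lock);
	while (wq->seq == seq)
		pthread_cond_wait(&wq->cond, &wq->lock);
	assert(wq->seq != seq);
	pthread_mutex_unlock(&wq->lock);
}

/* Wakeup: bump the sequence number, then wake all sleepers. */
static void seq_waitq_wake_all(struct seq_waitq *wq)
{
	pthread_mutex_lock(&wq->lock);
	wq->seq++;
	pthread_cond_broadcast(&wq->cond);
	pthread_mutex_unlock(&wq->lock);
}

/*
 * Wait for *flag to become nonzero. The condition appears once in the
 * loop body and is evaluated without any "about to sleep" state set.
 * The waker must set the flag before calling seq_waitq_wake_all().
 */
static void seq_waitq_wait_flag(struct seq_waitq *wq, atomic_int *flag)
{
	for (;;) {
		unsigned long seq = seq_waitq_prepare(wq);

		if (atomic_load(flag))
			return;
		seq_waitq_wait(wq, seq);
	}
}
```

Since the sequence number is read before the condition is checked, a
wakeup that lands between the check and the sleep still changes the
number, so seq_waitq_wait() returns immediately instead of missing it.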