From: Andreas Gruenbacher <agruenba@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: Andy Lutomirski <luto@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
cluster-devel <cluster-devel@redhat.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 6/6] gfs2: Fix mmap + page fault deadlocks (part 2)
Date: Fri, 21 May 2021 17:46:04 +0200 [thread overview]
Message-ID: <CAHc6FU6df7cBbjmYOZE35v_FALWRO62cYjg2Y9rY+Hd6x5yeyw@mail.gmail.com> (raw)
In-Reply-To: <20210521152352.GQ18952@quack2.suse.cz>
On Fri, May 21, 2021 at 5:23 PM Jan Kara <jack@suse.cz> wrote:
> On Thu 20-05-21 16:07:56, Andreas Gruenbacher wrote:
> > On Thu, May 20, 2021 at 3:30 PM Jan Kara <jack@suse.cz> wrote:
> > > On Thu 20-05-21 14:25:36, Andreas Gruenbacher wrote:
> > > > Now that we handle self-recursion on the inode glock in gfs2_fault and
> > > > gfs2_page_mkwrite, we need to take care of more complex deadlock
> > > > scenarios like the following (example by Jan Kara):
> > > >
> > > > Two independent processes P1, P2. Two files F1, F2, and two mappings M1,
> > > > M2 where M1 is a mapping of F1, M2 is a mapping of F2. Now P1 does DIO
> > > > to F1 with M2 as a buffer, P2 does DIO to F2 with M1 as a buffer. They
> > > > can race like:
> > > >
> > > > P1 P2
> > > > read() read()
> > > > gfs2_file_read_iter() gfs2_file_read_iter()
> > > > gfs2_file_direct_read() gfs2_file_direct_read()
> > > > locks glock of F1 locks glock of F2
> > > > iomap_dio_rw() iomap_dio_rw()
> > > > bio_iov_iter_get_pages() bio_iov_iter_get_pages()
> > > > <fault in M2> <fault in M1>
> > > > gfs2_fault() gfs2_fault()
> > > > tries to grab glock of F2 tries to grab glock of F1
> > > >
> > > > Those kinds of scenarios are much harder to reproduce than
> > > > self-recursion.
> > > >
> > > > We deal with such situations by using the LM_FLAG_OUTER flag to mark
> > > > "outer" glock taking. Then, when taking an "inner" glock, we use the
> > > > LM_FLAG_TRY flag so that locking attempts that don't immediately succeed
> > > > will be aborted. In case of a failed locking attempt, we "unroll" to
> > > > where the "outer" glock was taken, drop the "outer" glock, and fault in
> > > > the first offending user page. This will re-trigger the "inner" locking
> > > > attempt but without the LM_FLAG_TRY flag. Once that has happened, we
> > > > re-acquire the "outer" glock and retry the original operation.
> > > >
> > > > Reported-by: Jan Kara <jack@suse.cz>
> > > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > >
> > > ...
> > >
> > > > diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
> > > > index 7d88abb4629b..8b26893f8dc6 100644
> > > > --- a/fs/gfs2/file.c
> > > > +++ b/fs/gfs2/file.c
> > > > @@ -431,21 +431,30 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
> > > > vm_fault_t ret = VM_FAULT_LOCKED;
> > > > struct gfs2_holder gh;
> > > > unsigned int length;
> > > > + u16 flags = 0;
> > > > loff_t size;
> > > > int err;
> > > >
> > > > sb_start_pagefault(inode->i_sb);
> > > >
> > > > - gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
> > > > + if (current_holds_glock())
> > > > + flags |= LM_FLAG_TRY;
> > > > +
> > > > + gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, flags, &gh);
> > > > if (likely(!outer_gh)) {
> > > > err = gfs2_glock_nq(&gh);
> > > > if (err) {
> > > > ret = block_page_mkwrite_return(err);
> > > > + if (err == GLR_TRYFAILED) {
> > > > + set_current_needs_retry(true);
> > > > + ret = VM_FAULT_SIGBUS;
> > > > + }
> > >
> > > I've checked to make sure but do_user_addr_fault() indeed calls do_sigbus()
> > > which raises the SIGBUS signal. So if the application does not ignore
> > > SIGBUS, your retry will be visible to the application and can cause all
> > > sorts of interesting results...
> >
> > I would have noticed that, but no SIGBUS signals were actually
> > delivered. So we probably end up in kernelmode_fixup_or_oops() when in
> > kernel mode, which just does nothing in that case.
>
> Hum, but how would we get there? I don't think fatal_signal_pending() would
> return true yet...
Hmm, right ...
> > > So you probably need to add a new VM_FAULT_
> > > return code that will behave like VM_FAULT_SIGBUS except it will not raise
> > > the signal.
> >
> > A new VM_FAULT_* flag might make the code easier to read, but I don't
> > know if we can have one.
>
> Well, this is kernel-internal API and there's still plenty of space in
> vm_fault_reason.
That's in the context of the page fault. The other issue is how to
propagate that out through iov_iter_fault_in_readable ->
fault_in_pages_readable -> __get_user, for example. I don't think
there's much of a chance to get an additional error code out of
__get_user and __put_user.
Thanks,
Andreas
next prev parent reply other threads:[~2021-05-21 15:46 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-20 12:25 [PATCH 0/6] gfs2: Handle page faults during read and write Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 1/6] gfs2: Fix mmap + page fault deadlocks (part 1) Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 2/6] iov_iter: Add iov_iter_fault_in_writeable() Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 3/6] gfs2: Add wrappers for accessing journal_info Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 4/6] gfs2: Encode glock holding and retry flags in journal_info Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 5/6] gfs2: Add LM_FLAG_OUTER glock holder flag Andreas Gruenbacher
2021-05-20 12:25 ` [PATCH 6/6] gfs2: Fix mmap + page fault deadlocks (part 2) Andreas Gruenbacher
2021-05-20 13:30 ` Jan Kara
2021-05-20 14:07 ` Andreas Gruenbacher
2021-05-21 15:23 ` Jan Kara
2021-05-21 15:46 ` Andreas Gruenbacher [this message]
2021-05-24 8:39 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHc6FU6df7cBbjmYOZE35v_FALWRO62cYjg2Y9rY+Hd6x5yeyw@mail.gmail.com \
--to=agruenba@redhat.com \
--cc=cluster-devel@redhat.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox