linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	 syzbot <syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com>,
	 Andreas Dilger <adilger.kernel@dilger.ca>,
	 Ext4 Developers List <linux-ext4@vger.kernel.org>,
	 Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	 syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	"Theodore Ts'o" <tytso@mit.edu>,  Linux-MM <linux-mm@kvack.org>,
	Oleg Nesterov <oleg@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	 Nicholas Piggin <npiggin@gmail.com>,
	Alex Shi <alex.shi@linux.alibaba.com>, Qian Cai <cai@lca.pw>,
	 Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	 William Kucharski <william.kucharski@oracle.com>,
	Jens Axboe <axboe@kernel.dk>,
	 linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>
Subject: Re: kernel BUG at fs/ext4/inode.c:LINE!
Date: Wed, 25 Nov 2020 13:30:20 -0800	[thread overview]
Message-ID: <CAHk-=wix0YNq1U8iroRLpx+fCUGE8RG3asY8Zm4vyH-g4UhbPg@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wjiVtroOvNkuptH0GofVUvOMw4wmmaXdnGPPT8y8+MbyQ@mail.gmail.com>

On Tue, Nov 24, 2020 at 3:24 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I've applied your second patch (the smaller one that just takes a ref
> around the critical section). If somebody comes up with some great
> alternative, we can always revisit this.

Hmm.

I'm not sure about "great alternative", but it strikes me that we
*could* move the clearing of the PG_writeback bit _into_
wake_up_page_bit(), under the page waitqueue lock.

IOW, we could make the rule be that the bit isn't actually cleared
before calling wake_up_page() at all, and we'd clear it with something
like

    unsigned long flags = READ_ONCE(page->flags);

    // We can clear PG_writeback directly if PG_waiters isn't set
    while (!(flags & (1ul << PG_waiters))) {
        unsigned long new = flags & ~(1ul << PG_writeback);
        // PG_writeback was already clear??!!?
        if (WARN_ON_ONCE(new == flags))
            return;
        new = cmpxchg(&page->flags, flags, new);
        if (likely(flags == new))
            return;
        flags = new;
    }

    // Otherwise, clear the bit at the end - but under the
    // page waitqueue lock - inside wake_up_page_bit()
    return wake_up_page_bit(..);

instead.

That would basically make the bit clearing atomic wrt the PG_waiters
flags - either using that atomic cmpxchg, or by doing it under the
page queue lock so that it's atomic wrt any new waiters.

This seems conceptually like the right thing to do - and if would also
make the (fair) exclusive lock hand-off case atomic too, because the
bit we're waking up on would never be cleared if it gets handed off
directly.

The above is entirely untested crap written in my MUA, and obviously
requires that all callers of wake_up_page() be moved to that new world
order, but I think we only have two cases: unlock_page() and
end_page_writeback().

And unlock_page() already has that
"clear_bit_unlock_is_negative_byte()" special case that is an ugly
special case of PG_waiters atomicity. So we'd get rid of that, because
the cmpxchg loop would be the better model.

I'm not sure I'm willing to write and test the real patch, but it
doesn't look _too_ nasty from just looking at the code. The bookmark
thing makes it important to only actually clear the bit at the end (as
does the handoff case anyway), but the way wake_up_page_bit() is
written, that's actually very straightforward - just after the
while-loop. That's when we've woken up everybody.

So I'm sending this idea out to see if somebody can shoot it down, or
even wants to possibly even try to do it..

                Linus


  reply	other threads:[~2020-11-25 21:30 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <000000000000d3a33205add2f7b2@google.com>
2020-08-28 10:07 ` Jan Kara
2020-08-31 10:03   ` Jan Kara
2020-08-31 18:21     ` Linus Torvalds
2020-11-24  4:07       ` Hugh Dickins
2020-11-24  4:26         ` Linus Torvalds
2020-11-24  4:53         ` Linus Torvalds
2020-11-24  6:34           ` Hugh Dickins
2020-11-24 16:46             ` Hugh Dickins
2020-11-24 12:19         ` Matthew Wilcox
2020-11-24 16:28           ` Hugh Dickins
2020-11-24 18:33             ` Matthew Wilcox
2020-11-24 19:00               ` Linus Torvalds
2020-11-24 20:15                 ` Matthew Wilcox
2020-11-24 20:34                   ` Linus Torvalds
2020-11-24 21:46                     ` Hugh Dickins
2020-11-24 23:24                       ` Linus Torvalds
2020-11-25 21:30                         ` Linus Torvalds [this message]
2020-11-25 22:01                           ` Linus Torvalds
2020-11-25  9:20           ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wix0YNq1U8iroRLpx+fCUGE8RG3asY8Zm4vyH-g4UhbPg@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=axboe@kernel.dk \
    --cc=cai@lca.pw \
    --cc=darrick.wong@oracle.com \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kirill@shutemov.name \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=npiggin@gmail.com \
    --cc=oleg@redhat.com \
    --cc=syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tytso@mit.edu \
    --cc=william.kucharski@oracle.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox