From: David Stevens <stevensd@chromium.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Yang Shi <shy828301@gmail.com>,
David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/khugepaged: skip shmem with userfaultfd
Date: Tue, 7 Feb 2023 10:37:06 +0900
Message-ID: <CAD=HUj4yhMLnBNpumxC4urSY2Js5XuekzGP+UOXJmUV=k5nx=A@mail.gmail.com>
In-Reply-To: <Y+F2IdXhqc5187s+@casper.infradead.org>

On Tue, Feb 7, 2023 at 6:50 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Feb 06, 2023 at 03:52:19PM -0500, Peter Xu wrote:
> > On Mon, Feb 06, 2023 at 07:01:39PM +0000, Matthew Wilcox wrote:
> > > On Mon, Feb 06, 2023 at 08:28:56PM +0900, David Stevens wrote:
> > > > This change first makes sure that the intermediate page cache state
> > > > during collapse is not visible by moving when gaps are filled to after
> > > > the page cache lock is acquired for the final time. This is necessary
> > > > because the synchronization provided by locking hpage is insufficient
> > > > for functions which operate on the page cache without actually locking
> > > > individual pages to examine their content (e.g. shmem_mfill_atomic_pte).
> > >
> > > I've been a little scared of touching khugepaged because, well, look at
> > > that function. But if we are going to touch it, how about this patch
> > > first? It does _part_ of what you need by not filling in the holes,
> > > but obviously not the part that looks at uffd.
> > >
> > > It leaves the old pages in-place and frozen. I think this should be
> > > safe, but I haven't booted it (not entirely sure what test I'd run
> > > to prove that it's not broken)
> >
> > That logic existed since Kirill's original commit to add shmem thp support
> > on khugepaged, so Kirill should be the best to tell.. but so far it seems
> > reasonable to me to have that extra operation.
> >
> > The problem is khugepaged will release pgtable lock during collapsing, so
> > AFAICT there can be a race where some other thread tries to insert pages
> > into page cache in parallel with khugepaged right after khugepaged released
> > the page cache lock.
> >
> > For example, it seems to me new page cache can be inserted when khugepaged
> > is copying small page content to the new hpage.

This particular race can't happen with either patch, since the missing
page cache entries are filled when we create the multi-index entry for
hpage.

> Mmm, yes, we need to have _something_ in the page cache to block new
> pages from being added. It can be either the new or the old pages,
> but it can't be NULL. It could even be a RETRY entry, since that'll
> have the same effect as a frozen page.
>
> But both David's patch and mine are wrong. Not sure what to do for
> David's problem -- maybe it's OK to have the holes temporarily filled
> with frozen / RETRY entries until we get to the point where we check
> for an uffd marker?

My patch re-counts the holes after acquiring the page cache lock for
the final time, right before creating the final hpage multi-index
entry. Since we lock present pages while iterating over the target
range, they can't have been truncated before our re-validation of
nr_none. So if the number of missing pages is still equal to nr_none,
then we know that nothing has come along and filled in a missing page.
Compared to adding some sort of marker for missing pages, this does
add another failure path for collapse, but I don't think there is any
race.

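To make the argument concrete, here is a rough userspace toy model of
the re-validation step. It is only a sketch, not the kernel code: the
names toy_cache, toy_count_holes, toy_collapse_finish and RANGE_NR are
made up for illustration, the "lock" is implicit, and the model simply
assumes that present pages cannot disappear, so the hole count can only
go down between the first scan and the final check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RANGE_NR 8	/* toy stand-in for the number of slots in the range */

/* Toy page cache: NULL is a hole, non-NULL is a present page. */
struct toy_cache {
	const void *slot[RANGE_NR];
};

/* Count the holes in the range; the caller is assumed to hold the lock. */
static int toy_count_holes(const struct toy_cache *c)
{
	int holes = 0;

	for (size_t i = 0; i < RANGE_NR; i++)
		if (!c->slot[i])
			holes++;
	return holes;
}

/*
 * Final step of the toy collapse. Re-count the holes and install hpage
 * over the whole range only if the count still equals nr_none from the
 * first scan. Present pages are assumed pinned, so a changed count can
 * only mean that a hole was filled in the meantime.
 * Returns true on success, false if the collapse must be abandoned.
 */
static bool toy_collapse_finish(struct toy_cache *c, int nr_none,
				const void *hpage)
{
	assert(hpage != NULL);

	if (toy_count_holes(c) != nr_none)
		return false;	/* extra failure path: the range changed */

	for (size_t i = 0; i < RANGE_NR; i++)
		c->slot[i] = hpage;	/* models the multi-index entry */
	return true;
}
```

In this model, filling any hole between the first toy_count_holes() and
toy_collapse_finish() makes the finish step fail and leaves the range
untouched, which is the additional failure path mentioned above.
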
-David