From: Andrea Arcangeli <aarcange@redhat.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: mm: fix BUG in __split_huge_page_pmd
Date: Tue, 15 Oct 2013 17:58:15 +0200 [thread overview]
Message-ID: <20131015155815.GG3479@redhat.com> (raw)
In-Reply-To: <20131015144827.C45DDE0090@blue.fi.intel.com>
On Tue, Oct 15, 2013 at 05:48:27PM +0300, Kirill A. Shutemov wrote:
> Andrea Arcangeli wrote:
> > Hi Hugh,
> >
> > On Tue, Oct 15, 2013 at 04:08:28AM -0700, Hugh Dickins wrote:
> > > Occasionally we hit the BUG_ON(pmd_trans_huge(*pmd)) at the end of
> > > __split_huge_page_pmd(): seen when doing madvise(,,MADV_DONTNEED).
> > >
> > > It's invalid: we don't always have down_write of mmap_sem there:
> > > a racing do_huge_pmd_wp_page() might have copied-on-write to another
> > > huge page before our split_huge_page() got the anon_vma lock.
> > >
> >
> > I don't get exactly the scenario with do_huge_pmd_wp_page(), could you
> > elaborate?
>
> I think the scenario is follow:
>
> CPU0: CPU1
>
> __split_huge_page_pmd()
> page = pmd_page(*pmd);
> do_huge_pmd_wp_page() copy the
> page and changes pmd (the same as on CPU0)
> to point to newly copied page.
> split_huge_page(page)
> where page is original page,
> not allocated on COW.
> pmd still points on huge page.
>
>
> Hugh, have I got it correctly?
So MADV_DONTNEED runs with with a "end" not 2m aligned (requiring 4k
subpage zapping) on a wrprotected trans-huge page that is hitting a
COW. So this scenario would be deterministic (the thread may write
beyond the "end" of the MADV_DONTNEED) and it only requires two
specific events.
With my other scenario with two concurrent MADV_DONTNEED plus a page
fault, you could still lead to split_huge_page_pmd returning with
pmd_trans_huge(*pmd) == true, despite of the loop introduced.
But for the above case, the loop makes a meaningful difference. So I
see the good reason for looping now.
It wouldn't be ok to miss a partial MADV_DONTNEED zapping because of a
concurrent COW, while it would be ok in my other scenario (and the
loop in fact cannot do anything to prevent split_huge_page_pmd return
with the pmd still huge).
My other scenario with two concurrent MADV_DONTNEED and a page fault
is non deterministic so looping was meaningless.
In both scenario, the kernel wouldn't run into stability issues, even
if we only removed the BUG_ON. But the COW scenario, without the loop,
we'd silently miss a partial MADV_DONTNEED on the 4k subpages before
the "end" (or after the "start").
And we still need pmd_none_or_trans_huge_or_clear_bad in
zap_pmd_range, to deal with the non deterministic cases that the loop
won't help (the two MADV_DONTNEED + page fault), in addition to the
loop to deal with the deterministic COW scenario above.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-10-15 15:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-15 11:08 Hugh Dickins
2013-10-15 11:32 ` Kirill A. Shutemov
2013-10-15 14:41 ` Andrea Arcangeli
2013-10-15 14:34 ` Andrea Arcangeli
2013-10-15 14:48 ` Kirill A. Shutemov
2013-10-15 15:58 ` Andrea Arcangeli [this message]
2013-10-15 17:53 ` Hugh Dickins
2013-10-15 18:55 ` Andrea Arcangeli
2013-10-15 19:28 ` Naoya Horiguchi
2013-10-15 19:44 ` Andrea Arcangeli
2013-10-15 20:16 ` Naoya Horiguchi
2013-10-15 20:30 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131015155815.GG3479@redhat.com \
--to=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox