linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <aarcange@redhat.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: mm: fix BUG in __split_huge_page_pmd
Date: Tue, 15 Oct 2013 16:41:28 +0200
Message-ID: <20131015144128.GF3479@redhat.com>
In-Reply-To: <20131015113254.14E88E0090@blue.fi.intel.com>

On Tue, Oct 15, 2013 at 02:32:54PM +0300, Kirill A. Shutemov wrote:
> Hugh Dickins wrote:
> > Occasionally we hit the BUG_ON(pmd_trans_huge(*pmd)) at the end of
> > __split_huge_page_pmd(): seen when doing madvise(,,MADV_DONTNEED).
> > 
> > It's invalid: we don't always have down_write of mmap_sem there:
> > a racing do_huge_pmd_wp_page() might have copied-on-write to another
> > huge page before our split_huge_page() got the anon_vma lock.
> > 
> > Forget the BUG_ON, just go back and try again if this happens.
> >     
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > Cc: stable@vger.kernel.org
> 
> Looks reasonable to me.
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> madvise(MADV_DONTNEED) was problematic with THP before. Is it a big win to
> have mmap_sem taken for read rather than for write for it?

Yeah, it caused all those pmd_trans_unstable,
pmd_none_or_trans_huge_or_clear_bad and pmd_read_atomic calls in common
code. But I didn't want to regress the scalability of
MADV_DONTNEED... I think various apps use MADV_DONTNEED to free memory
(including KVM in the balloon driver, and probably the JVM and other JITs).

None or huge pmds are unstable unless we hold mmap_sem for writing or
the page_table_lock (or in general pmd_trans_huge_lock).
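
To make the "read it once, then decide" idea concrete, here is a small
userspace sketch in C11. It is not kernel code: the model_ names and the
bit encoding are made up for illustration and do not match any real pmd
layout. It only shows why a walker holding mmap_sem for reading should
take a single snapshot of the pmd and refuse to descend if that snapshot
is none or huge.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding for this sketch only; not a real pmd layout. */
#define MODEL_PMD_NONE UINT64_C(0)
#define MODEL_PMD_HUGE UINT64_C(1) /* low bit set: entry maps a huge page */

typedef _Atomic uint64_t model_pmd_t;

/*
 * Take one snapshot of the entry and test only that snapshot, so a
 * concurrent update cannot make the two checks disagree with each other.
 * Returns true when the caller must not walk ptes below this entry.
 */
static bool model_pmd_none_or_huge(model_pmd_t *pmd)
{
    uint64_t val = atomic_load_explicit(pmd, memory_order_acquire);

    return val == MODEL_PMD_NONE || (val & MODEL_PMD_HUGE) != 0;
}

/* Small self-check covering the three possible kinds of entry. */
void model_pmd_self_check(void)
{
    model_pmd_t pmd = MODEL_PMD_NONE;

    assert(model_pmd_none_or_huge(&pmd));   /* none: do not descend */

    atomic_store(&pmd, UINT64_C(0x1000) | MODEL_PMD_HUGE);
    assert(model_pmd_none_or_huge(&pmd));   /* huge: do not descend */

    atomic_store(&pmd, UINT64_C(0x2000));
    assert(!model_pmd_none_or_huge(&pmd));  /* pte table: safe to walk */
}
```

A caller in this model would skip or retry the range whenever
model_pmd_none_or_huge() returns true, rather than assume the entry
still looks the way it did a moment ago.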

It's identical to the pte being unstable if mmap_sem is held for
reading and we don't hold the PT lock, except the pte can only have
two states and they're both unstable.

Huge pmds have three states (none, huge, or pointing to a regular pte
page), and the only stable state of the tree is when the pmd points to a
regular pte page (the third state, which 4k ptes cannot have).
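
The same rule written down as a tiny userspace model in C, in case it
helps. Everything here is hypothetical naming for illustration (the
model_ enums and functions are not kernel APIs), and it only covers the
one context discussed above: mmap_sem held for reading and no page table
lock held.

```c
#include <assert.h>
#include <stdbool.h>

/* The two states a 4k pte can be in (model only). */
enum model_pte_state {
    MODEL_PTE_NONE,
    MODEL_PTE_PRESENT,
};

/* The three states a pmd can be in once THP is involved (model only). */
enum model_pmd_state {
    MODEL_PMD_STATE_NONE,
    MODEL_PMD_STATE_TRANS_HUGE,
    MODEL_PMD_STATE_PTE_TABLE,
};

/*
 * Context assumed by both functions: mmap_sem held for reading,
 * no page table lock held.
 */

/* Both pte states can change under us in that context. */
bool model_pte_stable_under_read(enum model_pte_state s)
{
    assert(s == MODEL_PTE_NONE || s == MODEL_PTE_PRESENT);
    return false;
}

/* Only the "points to a regular pte page" state stays put. */
bool model_pmd_stable_under_read(enum model_pmd_state s)
{
    assert(s >= MODEL_PMD_STATE_NONE && s <= MODEL_PMD_STATE_PTE_TABLE);
    return s == MODEL_PMD_STATE_PTE_TABLE;
}
```

So in this model a check that found MODEL_PMD_STATE_PTE_TABLE can be
relied upon, while none or huge has to be treated as "may have changed
already".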
