From: Linus Torvalds <torvalds@linux-foundation.org>
To: Sasha Levin <sasha.levin@oracle.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
shli@kernel.org, Hugh Dickins <hughd@google.com>,
Rik van Riel <riel@redhat.com>, Shaohua Li <shli@fusionio.com>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [patch 019/154] mm: make madvise(MADV_WILLNEED) support swap file prefetch
Date: Sun, 15 Dec 2013 14:58:32 -0800 [thread overview]
Message-ID: <CA+55aFw+-EB0J5v-1LMg1aiDZQJ-Mm0fzdbN312_nyBCVs+Fvw@mail.gmail.com> (raw)
In-Reply-To: <52AE271C.4040805@oracle.com>
On Sun, Dec 15, 2013 at 2:03 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> On 12/15/2013 02:16 PM, Linus Torvalds wrote:
>>
>> Can anybody see what's wrong with that code? It all seems to happen
>> with mmap_sem held for reading, so there is no mmap activity going on,
>> but what about concurrent pmd splitting due to page faults etc?
>
> There's one thing that seems odd to me: the only place that allocated
> the ptl is in pgtable_page_ctor() called from pte_alloc_one().
>
> However, I don't see how ptl is allocated through all the
> mk_pmd()/mk_huge_pmd()
> calls in mm/huge_memory.c .
>
> I've added some debug output, and it seems that indeed the results of
> mk_pmd() are
> with ptl == NULL and one of them ends up getting to swapin_walk_pmd_entry
> where it NULL
> ptr derefs.
Hmm. I don't see that in my tree either, so that doesn't seem to be a
linux-next issue.
How are we not hitting this left and right? Sure, you need spinlock
debugging or something like that to trigger the BLOATED_SPINLOCKS
case, and you'd need the USE_SPLIT_PTE_PTLOCKS case to have this at
all, but that shouldn't be *that* unusual. And afaik, we should hit
this on just about any page table traversal.
So I *think* the rule is that largepages don't have ptl entries (since
they don't have page tables associated with them), and they need to be
handled specially.
But it's also possibly just that maybe nothing really uses
large-pages. And afaik, we used to disable USE_SPLIT_PTE_PTLOCKS
entirely with big spinlocks until Kirill added that indirection
pointer, so that would explain why we just never noticed this issue
before (although I'd have expected that the spinlock still needs to be
initialized, even if it doesn't need allocating - otherwise we'd
possibly just hang on a "spin_lock()" that never succeeds).
Adding Kirill to the participants, since he did the
pgtable_pmd_page_ctor/dtor stuff and enabled split PTE locks even with
BLOATED_SPINLOCKS. And Andrea, since largepages are involved. And
linux-mm just to have *some* list cc'd.
Kirill? Sasha seems to trigger this problem with
madvise(MADV_WILLNEED), possibly on a hugepage mapping (but see
below..) The
orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);
in swapin_walk_pmd_entry() ends up taking a NULL ptr fault because the
pmd doesn't have a ptl pointer..
But why would we trigger this bug then, since we have:
if (pmd_none_or_trans_huge_or_clear_bad(pmd))
return 0;
in swapin_walk_pmd_entry(). Possibly racing with a page-in? Should we
check the "vma->vm_flags" for VM_HUGETLB?
Let's hope the new people have more answers than questions ;)
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next parent reply other threads:[~2013-12-15 22:58 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20130223003232.4CDDB5A41B6@corp2gmr1-2.hot.corp.google.com>
[not found] ` <52AA0613.2000908@oracle.com>
[not found] ` <CA+55aFw3_0_Et9bbfWgGLXEUaGQW1HE8j=oGBqFG_8j+h6jmvQ@mail.gmail.com>
[not found] ` <CA+55aFyRZW=Uy9w+bZR0vMOFNPqV-yW2Xs9N42qEwTQ3AY0fDw@mail.gmail.com>
[not found] ` <52AE271C.4040805@oracle.com>
2013-12-15 22:58 ` Linus Torvalds [this message]
2013-12-16 12:47 ` Kirill A. Shutemov
2013-12-16 15:18 ` Sasha Levin
2013-12-16 20:52 ` Andrea Arcangeli
2013-12-17 0:18 ` Sasha Levin
2013-12-17 12:44 ` Kirill A. Shutemov
2013-12-17 14:09 ` Sasha Levin
2013-12-20 13:10 ` Kirill A. Shutemov
2013-12-20 13:31 ` Kirill A. Shutemov
2013-12-20 13:36 ` Kirill A. Shutemov
2013-12-20 17:42 ` Andrea Arcangeli
2013-12-23 10:25 ` Mel Gorman
2013-12-23 10:54 ` Kirill A. Shutemov
2013-12-23 11:15 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CA+55aFw+-EB0J5v-1LMg1aiDZQJ-Mm0fzdbN312_nyBCVs+Fvw@mail.gmail.com \
--to=torvalds@linux-foundation.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=riel@redhat.com \
--cc=sasha.levin@oracle.com \
--cc=shli@fusionio.com \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox