From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: torvalds@linux-foundation.org, kirill.shutemov@linux.intel.com,
akpm@linux-foundation.org, hannes@cmpxchg.org,
iamjoonsoo.kim@lge.com, mgorman@techsingularity.net,
tony.luck@intel.com, vbabka@suse.cz, mhocko@kernel.org,
aarcange@redhat.com, hillf.zj@alibaba-inc.com, hughd@google.com,
oleg@redhat.com, peterz@infradead.org, riel@redhat.com,
srikar@linux.vnet.ibm.com, vdavydov.dev@gmail.com,
dave.hansen@linux.intel.com, mingo@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
Date: Thu, 18 Jan 2018 17:34:10 +0300
Message-ID: <20180118143410.sozfsbmb3liumn3x@node.shutemov.name>
In-Reply-To: <20180118131210.456oyh6fw4scwv53@node.shutemov.name>

On Thu, Jan 18, 2018 at 04:12:10PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 03:25:50PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Jan 18, 2018 at 05:12:45PM +0900, Tetsuo Handa wrote:
> > > Tetsuo Handa wrote:
> > > > OK. I missed the mark. I overlooked that 4.11 already has this problem.
> > > >
> > > > I needed to bisect between 4.10 and 4.11, and I got plausible culprit.
> > > >
> > > > I haven't completed bisecting between b4fb8f66f1ae2e16 and c470abd4fde40ea6, but
> > > > b4fb8f66f1ae2e16 ("mm, page_alloc: Add missing check for memory holes") and
> > > > 13ad59df67f19788 ("mm, page_alloc: avoid page_to_pfn() when merging buddies")
> > > > are talking about memory holes, which matches the situation that I'm trivially
> > > > hitting the bug if CONFIG_SPARSEMEM=y .
> > > >
> > > > Thus, I call for an attention by speculative execution. ;-)
> > >
> > > Speculative execution failed. I was confused by jiffies precision bug.
> > > The final culprit is c7ab0d2fdc840266 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()").
> >
> > I think I've tracked it down. check_pte() in mm/page_vma_mapped.c doesn't
> > work as intended.
> >
> > I've added instrumentation below to prove it.
> >
> > The BUG() triggers with the following output:
> >
> > [ 10.084024] diff: -858690919
> > [ 10.084258] hpage_nr_pages: 1
> > [ 10.084386] check1: 0
> > [ 10.084478] check2: 0
> >
> > Basically, pte_page(*pvmw->pte) is below pvmw->page, but
> > (pte_page(*pvmw->pte) < pvmw->page) doesn't catch it.
> >
> > Well, I can see how a C lawyer can argue that you can only compare
> > pointers into the same memory object, which is not the case here. But
> > this is kinda insane.
> >
> > Any suggestions how to rewrite it in a way that compiler would
> > understand?
>
> The patch below makes the crash go away for me.
>
> But this situation is scary. So we cannot compare arbitrary pointers in
> the kernel?
>
> Don't we rely on this for lock ordering in some cases? Like in
> mutex_lock_double()?
>
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index d22b84310f6d..1f0f512fd127 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -51,6 +51,8 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  		WARN_ON_ONCE(1);
>  #endif
>  	} else {
> +		unsigned long ptr1, ptr2;
> +
>  		if (is_swap_pte(*pvmw->pte)) {
>  			swp_entry_t entry;
>
> @@ -63,12 +65,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  		if (!pte_present(*pvmw->pte))
>  			return false;
>
> -		/* THP can be referenced by any subpage */
> -		if (pte_page(*pvmw->pte) - pvmw->page >=
> -				hpage_nr_pages(pvmw->page)) {
> +		ptr1 = (unsigned long)pte_page(*pvmw->pte);
> +		ptr2 = (unsigned long)pvmw->page;
> +
> +		if (ptr1 < ptr2)
>  			return false;
> -		}
> -		if (pte_page(*pvmw->pte) < pvmw->page)
> +
> +		/* THP can be referenced by any subpage */
> +		if (ptr1 - ptr2 >= hpage_nr_pages(pvmw->page))

Arghhh.. It has to be

		if (ptr1 - ptr2 >= hpage_nr_pages(pvmw->page) * sizeof(*pvmw->page))
--
Kirill A. Shutemov