From: Hugh Dickins <hughd@google.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>,
Hugh Dickins <hughd@google.com>,
Ingo Korb <ingo.korb@tu-dortmund.de>, Ning Qu <quning@google.com>,
Dave Jones <davej@redhat.com>,
Sasha Levin <sasha.levin@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
Date: Tue, 15 Jul 2014 13:46:26 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.11.1407151346000.3571@eggly.anvils> (raw)
In-Reply-To: <20140715115456.32886E00A3@blue.fi.intel.com>
On Tue, 15 Jul 2014, Kirill A. Shutemov wrote:
> Konstantin Khlebnikov wrote:
> > On Tue, Jul 15, 2014 at 2:55 PM, Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > > Konstantin Khlebnikov wrote:
> > >> It seems boundng logic in do_fault_around is wrong:
> > >>
> > >> start_addr = max(address & fault_around_mask(), vma->vm_start);
> > >> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> > >> pte -= off;
> > >> pgoff -= off;
> > >>
> > >> Ok, off <= 511, but it might be bigger than pte offset in pte table.
> > >
> > > I don't see how it possible: fault_around_mask() cannot be more than 0x1ff000
> > > (x86-64, fault_around_bytes == 2M). It means start_addr will be aligned to 2M
> > > boundary in this case which is start of the page table pte belong to.
> > >
> > > Do I miss something?
> >
> > Nope, you're right. This fixes kernel crash but not the original problem.
> >
> > Problem is caused by calling do_fault_around for _non-linear_ faiult.
> > In this case pgoff is shifted and might become negative during calculation.
> > I'll send another patch.
>
> I've got to the same conclusion. My patch is below.
Many thanks to Ingo and Konstantin and Kirill for nailing this.
So now we have two not-quite-identical patches to fix it.
I feel I have to judge a beauty contest.
I think my slight preference is for Kirill's below, because it has
a better description (mentions "kernel BUG at mm/filemap.c:202!" and
Ccs stable) and uses the familiar VM_NONLINEAR flag rather than the
never-heard-of-before-and-otherwise-unused FAULT_FLAG_NONLINEAR.
But please please add a credit to Ingo, who made the breakthrough for
us, and to Konstantin who analysed what was going on. Ingo, this is
not quite the version you tested...
... ah, forget it, Andrew has just now gone for Konstantin's,
adding in more info from Kirill's: that's fine.
Thanks all,
Hugh
>
> From dd761b693cd06c649499e913713ae5bc7c029f6e Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 15 Jul 2014 14:40:02 +0300
> Subject: [PATCH] mm: avoid do_fault_around() on non-linear mappings
>
> Originally, I've wrongly assumed that non-linear mapping are always
> populated at least with pte_file() entries there, so !pte_none() check
> will catch them. It's not always the case: we can get there from
> __mm_populte in remap_file_pages() and pte will be clear.
__mm_populate
>
> Let's put explicit check for non-linear mapping.
>
> This is a root cause of recent "kernel BUG at mm/filemap.c:202!".
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 3.15+
> ---
> mm/memory.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index d67fd9fcf1f2..440ad48266d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> * if page by the offset is not ready to be mapped (cold cache or
> * something).
> */
> - if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
> + if (vma->vm_ops->map_pages && fault_around_pages() > 1 &&
> + !(vma->vm_flags & VM_NONLINEAR)) {
> pte = pte_offset_map_lock(mm, pmd, address, &ptl);
> do_fault_around(vma, address, pte, pgoff, flags);
> if (!pte_same(*pte, orig_pte))
> --
> 2.0.1
>
> --
> Kirill A. Shutemov
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-07-15 20:48 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-14 12:58 Ingo Korb
2014-07-14 19:22 ` Hugh Dickins
2014-07-14 20:13 ` Konstantin Khlebnikov
2014-07-15 9:55 ` [PATCH] mm: fix faulting range in do_fault_around Konstantin Khlebnikov
2014-07-15 10:55 ` PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit Kirill A. Shutemov
2014-07-15 11:33 ` Konstantin Khlebnikov
2014-07-15 11:54 ` Kirill A. Shutemov
2014-07-15 20:46 ` Hugh Dickins [this message]
2014-07-15 11:58 ` [PATCH] mm: do not call do_fault_around for non-linear fault Konstantin Khlebnikov
2014-07-15 15:29 ` Ingo Korb
2014-07-15 20:42 ` Andrew Morton
2014-07-15 21:07 ` Konstantin Khlebnikov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1407151346000.3571@eggly.anvils \
--to=hughd@google.com \
--cc=akpm@linux-foundation.org \
--cc=davej@redhat.com \
--cc=ingo.korb@tu-dortmund.de \
--cc=kirill.shutemov@linux.intel.com \
--cc=koct9i@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=quning@google.com \
--cc=sasha.levin@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox