From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Matthew Wilcox <willy@infradead.org>,
	Rik van Riel <riel@surriel.com>, Chris Mason <clm@fb.com>
Subject: Re: [PATCH] mm: fix page leak with multiple threads mapping the same page
Date: Thu, 7 Jul 2022 01:46:57 +0300
Message-ID: <20220706224657.3xbhbkflernezlxy@black.fi.intel.com>
In-Reply-To: <2b798acfd95c9ab9395fe85e8d5a835e2e10a920.1657051137.git.josef@toxicpanda.com>

On Tue, Jul 05, 2022 at 04:00:36PM -0400, Josef Bacik wrote:
> We have an application with a lot of threads that use a shared mmap
> backed by tmpfs mounted with -o huge=within_size.  This application
> started leaking loads of huge pages when we upgraded to a recent kernel.
> 
> Using the page ref tracepoints and a BPF program written by Tejun Heo we
> were able to determine that these pages would have multiple refcounts
> from the page fault path, but when it came to unmap time we wouldn't
> drop the number of refs we had added from the faults.
> 
> I wrote a reproducer that mmap'ed a file backed by tmpfs with -o
> huge=always, and then spawned 20 threads all looping faulting random
> offsets in this map, while using madvise(MADV_DONTNEED) randomly for
> huge page aligned ranges.  This very quickly reproduced the problem.
> 
> The problem here is that we check for the case that we have multiple
> threads faulting in a range that was previously unmapped.  One thread
> maps the PMD, the other thread loses the race and then returns 0.
> However at this point we already have the page, and we are no longer
> putting this page into the process's address space, and so we leak the
> page.  We actually did the correct thing prior to f9ce0be71d1f, however
> it looks like Kirill copied what we do in the anonymous page case.  In
> the anonymous page case we don't yet have a page, so we don't have to
> drop a reference on anything.  Previously we did the correct thing for
> file based faults by returning VM_FAULT_NOPAGE so we correctly drop the
> reference on the page we faulted in.
> 
> Fix this by returning VM_FAULT_NOPAGE in the pmd_devmap_trans_unstable()
> case.  This makes us drop the ref on the page properly, and now my
> reproducer no longer leaks the huge pages.
> 
> Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Chris Mason <clm@fb.com>

Cc: stable@ ?

> ---
>  mm/memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7a089145cad4..f10724d7dca3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4371,7 +4371,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
>  
>  	/* See comment in handle_pte_fault() */
>  	if (pmd_devmap_trans_unstable(vmf->pmd))
> -		return 0;
> +		return VM_FAULT_NOPAGE;

Comment update would be nice.

Other instances of pmd_devmap_trans_unstable() return 0 in the fault path.
Explanation would be helpful.

Otherwise,

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov
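
The refcount imbalance described in the quoted patch can be illustrated
with a small user-space model.  This is only a sketch, not kernel code:
every toy_* name, the struct layout, and the fault-code values are
hypothetical, and it assumes (as the patch description implies) that the
caller of finish_fault() drops its page reference only when a non-zero
code such as VM_FAULT_NOPAGE comes back.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's fault codes; the values are illustrative. */
enum { TOY_FAULT_OK = 0, TOY_FAULT_NOPAGE = 1 };

struct toy_page {
	int refcount;	/* references currently held on the page */
	int mapcount;	/* mappings, each of which owns one reference */
};

void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Models finish_fault(): if another thread already installed the PMD
 * (pmd_unstable), nothing is mapped and race_ret is returned.  Otherwise
 * the page is mapped and the new mapping takes over the caller's reference.
 */
int toy_finish_fault(struct toy_page *p, bool pmd_unstable, int race_ret)
{
	if (pmd_unstable)
		return race_ret;
	p->mapcount++;
	return TOY_FAULT_OK;
}

/*
 * Models the file-fault caller: take a reference, try to map the page,
 * and drop the reference only if the fault did not install a mapping
 * according to the return code.
 */
int toy_read_fault(struct toy_page *p, bool pmd_unstable, int race_ret)
{
	int ret;

	toy_get_page(p);
	ret = toy_finish_fault(p, pmd_unstable, race_ret);
	if (ret & TOY_FAULT_NOPAGE)
		toy_put_page(p);
	return ret;
}

/* Number of references not backed by a mapping after one lost race. */
int toy_leaked_refs(int race_ret)
{
	struct toy_page page = { 0, 0 };

	toy_read_fault(&page, false, race_ret);	/* winner maps the page */
	toy_read_fault(&page, true, race_ret);	/* loser finds the PMD set */
	return page.refcount - page.mapcount;
}
```

In this model, toy_leaked_refs(TOY_FAULT_OK) leaves one reference with
no owner, which corresponds to the leak, while
toy_leaked_refs(TOY_FAULT_NOPAGE) leaves none because the losing thread
puts the page it looked up.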

