Re: [RFC PATCH] mm/thp: check and bail out if page in deferred queue already

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: David Rientjes <rientjes@google.com>
To: Yin Fengwei <fengwei.yin@intel.com>
Cc: riel@surriel.com, nathan@kernel.org,
	 Andrew Morton <akpm@linux-foundation.org>,
	shy828301@gmail.com,  Matthew Wilcox <willy@infradead.org>,
	kirill.shutemov@intel.com,  ying.huang@intel.com,
	feng.tang@intel.com, zhengjun.xing@linux.intel.com,
	 linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm/thp: check and bail out if page in deferred queue already
Date: Tue, 27 Dec 2022 17:24:44 -0800 (PST)	[thread overview]
Message-ID: <677d82d8-f8c9-9a60-3af0-e7931c7728c6@google.com> (raw)
In-Reply-To: <20221223135207.2275317-1-fengwei.yin@intel.com>

On Fri, 23 Dec 2022, Yin Fengwei wrote:

> Kernel build regression with LLVM was reported here:
> https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/
> with commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
> boundaries"). And the commit f35b5d7d676e was reverted.
> 
> It turned out the regression is related with madvise(MADV_DONTNEED)
> was used by ld.lld. But with none PMD_SIZE aligned parameter len.
> trace-bpfcc captured:
> 531607  531732  ld.lld          do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
> 531607  531793  ld.lld          do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
> 
> If the underneath physical page is THP, the madvise(MADV_DONTNNED) can
> trigger split_queue_lock contention raised significantly. perf showed
> following data:
>     14.85%     0.00%  ld.lld           [kernel.kallsyms]           [k]
>        entry_SYSCALL_64_after_hwframe
>            11.52%
>                 entry_SYSCALL_64_after_hwframe
>                 do_syscall_64
>                 __x64_sys_madvise
>                 do_madvise.part.0
>                 zap_page_range
>                 unmap_single_vma
>                 unmap_page_range
>                 page_remove_rmap
>                 deferred_split_huge_page
>                 __lock_text_start
>                 native_queued_spin_lock_slowpath
> 
> If THP can't be removed from rmap as whole THP, partial THP will be
> removed from rmap by removing sub-pages from rmap. Even the THP
> head page is added to deferred queue already, the split_queue_lock
> will be acquired and check whether the THP head page is in the queue
> already. Thus, the contention of split_queue_lock is raised.
> 
> Before acquire split_queue_lock, check and bail out early if the THP
> head page is in the queue already. The checking without holding
> split_queue_lock could race with deferred_split_scan, but it doesn't
> impact the correctness here.
> 
> Test result of building kernel with ld.lld:
> commit 7b5a0b664ebe (parent commit of f35b5d7d676e):
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
>         6:07.99 real,   26367.77 user,  5063.35 sys
> 
> commit f35b5d7d676e:
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
>         7:22.15 real,   26235.03 user,  12504.55 sys
> 
> commit f35b5d7d676e with the fixing patch:
> time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
>         6:08.49 real,   26520.15 user,  5047.91 sys
> 
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>

Acked-by: David Rientjes <rientjes@google.com>

next prev parent reply	other threads:[~2022-12-28  1:24 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-23 13:52 Yin Fengwei
2022-12-27 20:15 ` Nathan Chancellor
2022-12-29  1:14   ` Yin, Fengwei
2022-12-28  1:24 ` David Rientjes [this message]
2022-12-29  1:15   ` Yin, Fengwei
2022-12-28  2:06 ` Huang, Ying
2022-12-29  1:15   ` Yin, Fengwei
2022-12-28 23:33 ` Andrew Morton
2022-12-29  1:29   ` Yin, Fengwei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=677d82d8-f8c9-9a60-3af0-e7931c7728c6@google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=feng.tang@intel.com \
    --cc=fengwei.yin@intel.com \
    --cc=kirill.shutemov@intel.com \
    --cc=linux-mm@kvack.org \
    --cc=nathan@kernel.org \
    --cc=riel@surriel.com \
    --cc=shy828301@gmail.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=zhengjun.xing@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox