From: "Christoph Lameter (Ampere)" <cl@linux.com>
To: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yang Shi <shy828301@gmail.com>,
kernel test robot <oliver.sang@intel.com>,
Rik van Riel <riel@surriel.com>,
oe-lkp@lists.linux.dev, lkp@intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
ying.huang@intel.com, feng.tang@intel.com
Subject: Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression
Date: Wed, 20 Dec 2023 07:42:17 -0800 (PST)
Message-ID: <edb35574-e8be-adc8-a756-96bcbab2f0af@linux.com>
In-Reply-To: <ec48d168-284b-4376-97a7-090273a3ae5e@intel.com>
On Wed, 20 Dec 2023, Yin Fengwei wrote:
>> Interesting, wasn't the same regression seen last time? And I'm a
>> little bit confused about how pthread got regressed. I didn't see the
>> pthread benchmark do any intensive memory alloc/free operations. Do
>> the pthread APIs do any intensive memory operations? I saw the
>> benchmark does allocate memory for thread stack, but it should be just
>> 8K per thread, so it should not trigger what this patch does. With
>> 1024 threads, the thread stacks may get merged into one single VMA (8M
>> total), but it may do so even though the patch is not applied.
> stress-ng.pthread test code is strange here:
>
> https://github.com/ColinIanKing/stress-ng/blob/master/stress-pthread.c#L573
>
> Even though it allocates its own stack, that attr is not passed
> to pthread_create(), so it is still glibc that allocates the stack
> for the pthread, and that stack is 8M in size. This is why this
> patch can impact the stress-ng.pthread test.
Hmmm... The use of calloc() for 8M triggers an mmap, I guess.
Why is that memory slower if we align the address to a 2M boundary? Because
THP can kick in faster and creates more overhead?
> While this time the hotspot is in pmd_lock (taken from do_madvise, I suppose):
> - 55.02% zap_pmd_range.isra.0
>    - 53.42% __split_huge_pmd
>       - 51.74% _raw_spin_lock
>          - 51.73% native_queued_spin_lock_slowpath
>             + 3.03% asm_sysvec_call_function
>       - 1.67% __split_huge_pmd_locked
>          - 0.87% pmdp_invalidate
>             + 0.86% flush_tlb_mm_range
>    - 1.60% zap_pte_range
>       - 1.04% page_remove_rmap
>            0.55% __mod_lruvec_page_state
OK, so we have 2M mappings and they are split because of some action on 4K
segments? I guess because of the guard pages?
>> More time is spent in madvise and munmap, but I'm not sure whether this
>> is caused by tearing down the address space when the test exits. If
>> so, it should not count toward the regression.
> It's not the whole address space being torn down; it's the pthread
> stack being torn down when a pthread exits (which can be treated as
> address space teardown, I suppose).
>
> https://github.com/lattera/glibc/blob/master/nptl/allocatestack.c#L384
> https://github.com/lattera/glibc/blob/master/nptl/pthread_create.c#L576
>
> Another question is whether it's worthwhile to let stacks use THP.
> It may be useful for some apps that need a large stack size?
No can do, since calloc() is used to allocate the stack. How would the kernel
distinguish that allocation?
Thread overview: 24+ messages
2023-12-19 15:41 kernel test robot
2023-12-20 5:27 ` Yang Shi
2023-12-20 8:29 ` Yin Fengwei
2023-12-20 15:42 ` Christoph Lameter (Ampere) [this message]
2023-12-20 20:14 ` Yang Shi
2023-12-20 20:09 ` Yang Shi
2023-12-21 0:26 ` Yang Shi
2023-12-21 0:58 ` Yin Fengwei
2023-12-21 1:02 ` Yin Fengwei
2023-12-21 4:49 ` Matthew Wilcox
2023-12-21 4:58 ` Yin Fengwei
2023-12-21 18:07 ` Yang Shi
2023-12-21 18:14 ` Matthew Wilcox
2023-12-22 1:06 ` Yin, Fengwei
2023-12-22 2:23 ` Huang, Ying
2023-12-21 13:39 ` Yin, Fengwei
2023-12-21 18:11 ` Yang Shi
2023-12-22 1:13 ` Yin, Fengwei
2024-01-04 1:32 ` Yang Shi
2024-01-04 8:18 ` Yin Fengwei
2024-01-04 8:39 ` Oliver Sang
2024-01-05 9:29 ` Oliver Sang
2024-01-05 14:52 ` Yin, Fengwei
2024-01-05 18:49 ` Yang Shi