From: Yang Shi <shy828301@gmail.com>
To: Helge Deller <deller@gmx.de>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
Helge Deller <deller@kernel.org>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-parisc@vger.kernel.org
Subject: Re: [PATCH] [RFC] mm: mmap: Allow mmap(MAP_STACK) to map growable stack
Date: Wed, 11 Sep 2024 18:08:23 -0700
Message-ID: <CAHbLzkotBTOf0OrPSN4o=UEvRXjT=L=NSZn_=FBA6nG51ppjYg@mail.gmail.com>
In-Reply-To: <95c4efe9-e92a-46fe-bf41-9141e125332d@gmx.de>
On Wed, Sep 11, 2024 at 5:50 PM Helge Deller <deller@gmx.de> wrote:
>
> On 9/12/24 01:05, Liam R. Howlett wrote:
> > * Yang Shi <shy828301@gmail.com> [240911 18:16]:
> >> On Wed, Sep 11, 2024 at 12:49 PM Liam R. Howlett
> >> <Liam.Howlett@oracle.com> wrote:
> >>>
> >>> * Helge Deller <deller@kernel.org> [240911 15:20]:
> >>>> This is an RFC to change the behaviour of mmap(MAP_STACK) to be
> >>>> sufficient to map memory for usage as stack on all architectures.
> >>>> Currently MAP_STACK is a no-op on Linux, and instead MAP_GROWSDOWN
> >>>> has to be used.
> >>>> To clarify, here is the relevant info from the mmap() man page:
> >>>>
> >>>> MAP_GROWSDOWN
> >>>> This flag is used for stacks. It indicates to the kernel virtual
> >>>> memory system that the mapping should extend downward in memory. The
> >>>> return address is one page lower than the memory area that is
> >>>> actually created in the process's virtual address space. Touching an
> >>>> address in the "guard" page below the mapping will cause the mapping
> >>>> to grow by a page. This growth can be repeated until the mapping
> >>>> grows to within a page of the high end of the next lower mapping,
> >>>> at which point touching the "guard" page will result in a SIGSEGV
> >>>> signal.
> >>>>
> >>>> MAP_STACK (since Linux 2.6.27)
> >>>> Allocate the mapping at an address suitable for a process or thread
> >>>> stack.
> >>>>
> >>>> This flag is currently a no-op on Linux. However, by employing this
> >>>> flag, applications can ensure that they transparently obtain support
> >>>> if the flag is implemented in the future. Thus, it is used in the
> >>>> glibc threading implementation to allow for the fact that
> >>>> some architectures may (later) require special treatment for
> >>>> stack allocations. A further reason to employ this flag is
> >>>> portability: MAP_STACK exists (and has an effect) on some
> >>>> other systems (e.g., some of the BSDs).
> >>>>
> >>>> The reason for suggesting this change is that on the parisc architecture the
> >>>> stack grows upwards. As such, using solely the MAP_GROWSDOWN flag will not
> >>>> work. Note that there exists no MAP_GROWSUP flag.
> >>>> By changing the behaviour of MAP_STACK to mark the memory area with the
> >>>> VM_STACK bit (which is VM_GROWSUP or VM_GROWSDOWN depending on the
> >>>> architecture) the MAP_STACK flag does exactly what people would expect on
> >>>> all platforms.
> >>>>
> >>>> This change should have no negative side effects, as all code which
> >>>> uses mmap(MAP_GROWSDOWN | MAP_STACK) still works as before.
> >>>>
> >>>> Signed-off-by: Helge Deller <deller@gmx.de>
> >>>>
> >>>> diff --git a/include/linux/mman.h b/include/linux/mman.h
> >>>> index bcb201ab7a41..66bc72a0cb19 100644
> >>>> --- a/include/linux/mman.h
> >>>> +++ b/include/linux/mman.h
> >>>> @@ -156,6 +156,7 @@ calc_vm_flag_bits(unsigned long flags)
> >>>> return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
> >>>> _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
> >>>> _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
> >>>> + _calc_vm_trans(flags, MAP_STACK, VM_STACK ) |
> >>>
> >>> Right now MAP_STACK can be used to set VM_NOHUGEPAGE, but this will
> >>> change the user interface to create a vma that will grow. I'm not
> >>> entirely sure this is okay?
> >>
> >> AFAICT, I don't see this is a problem. Currently huge page also skips
> >> the VMAs with VM_GROWS* flags set. See vma_is_temporary_stack().
> >> __thp_vma_allowable_orders() returns 0 if the vma is a temporary
> >> stack.
> >
> > If someone is using MAP_STACK to avoid having a huge page, they will
> > also get a mapping that grows - which is different than what happens
> > today.
> >
> > I'm not saying that's right, but someone could be abusing the existing
> > flag and this will change the behaviour.
>
> Wouldn't a plain mmap() followed by madvise(MADV_NOHUGEPAGE) do exactly that?
> Why abuse MAP_STACK for that?
Various sources and reports have shown that backing stack mappings
with huge pages hurts performance. Many applications, the pthread
library for example, allocate their stacks with MAP_STACK and do not
call madvise(MADV_NOHUGEPAGE) on the stack mapping.
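
To make the two patterns concrete, here is a minimal userspace sketch.
The helper names are hypothetical and this is not the actual glibc
code; it only illustrates the flag combinations under discussion. The
first helper relies on MAP_STACK alone, which today yields
VM_NOHUGEPAGE and would additionally become a growable VMA with this
RFC. The second is the plain mmap() plus madvise() variant suggested
above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: allocate a fixed-size thread stack using
 * MAP_STACK only, with no follow-up madvise() call. */
void *alloc_stack_map_stack(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: plain anonymous mapping, then explicitly opt
 * out of huge pages with madvise(MADV_NOHUGEPAGE). */
void *alloc_stack_madvise(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;

	/* Assumption: a failure here (e.g. a kernel built without THP)
	 * is treated as non-fatal, since the mapping is still usable. */
	(void)madvise(p, size, MADV_NOHUGEPAGE);
	return p;
}
```

Existing callers look like the first helper, so they would have to be
changed to the second form to keep today's behaviour if MAP_STACK
starts implying a growable mapping.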
>
> Helge
>
> >>> That is mmap(MAP_STACK) would set VM_NOHUGEPAGE right now, with this
> >>> change you'd get VM_NOHUGEPAGE | VM_GROWS<something>
> >>>
> >>>> _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
> >>>> arch_calc_vm_flag_bits(flags);
> >>>> }
>