* [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled
@ 2025-05-02 9:31 Ignacio Moreno Gonzalez via B4 Relay
2025-05-02 12:46 ` Lorenzo Stoakes
2025-05-02 13:03 ` Matthew Wilcox
0 siblings, 2 replies; 10+ messages in thread
From: Ignacio Moreno Gonzalez via B4 Relay @ 2025-05-02 9:31 UTC (permalink / raw)
To: Liam.Howlett, lorenzo.stoakes
Cc: linux-mm, linux-kernel, Ignacio Moreno Gonzalez
From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps
the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if
CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the
VM_NOHUGEPAGE does not make sense. For instance, when calling madvise()
with MADV_NOHUGEPAGE, an error is always returned.
Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
---
I discovered this issue when trying to use the tool CRIU to checkpoint
and restore a container. Our running kernel is compiled without
CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of
/proc/<pid>/smaps and saves the "nh" flag. When trying to restore the
container, CRIU fails to restore the "nh" mappings, since madvise()
MADV_NOHUGEPAGE always returns an error because
CONFIG_TRANSPARENT_HUGETABLES is not defined.
---
include/linux/mman.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/mman.h b/include/linux/mman.h
index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags)
return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
_calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
_calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
_calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
+#endif
arch_calc_vm_flag_bits(file, flags);
}
---
base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2
change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d
Best regards,
--
Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 9:31 [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled Ignacio Moreno Gonzalez via B4 Relay @ 2025-05-02 12:46 ` Lorenzo Stoakes 2025-05-02 20:53 ` Yang Shi 2025-05-02 13:03 ` Matthew Wilcox 1 sibling, 1 reply; 10+ messages in thread From: Lorenzo Stoakes @ 2025-05-02 12:46 UTC (permalink / raw) To: Ignacio.MorenoGonzalez Cc: Liam.Howlett, linux-mm, linux-kernel, Yang Shi, Andrew Morton +cc Andrew. Ignacio, you should always include Andrew in patch submissions to mm :) +cc Yang Shi who added this in the first place in commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE"). On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > with MADV_NOHUGEPAGE, an error is always returned. > > Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> I don't see how can this cause a problem, and it fixes one in practice, so LGTM. Though see note below about CRIU :) I also added a nit below, if you address this you can re-use my tag. Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Thanks! Do we want to back port this to stable kernels? If so we should have a: Fixes: c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") cc: stable@vger.kernel.org Appended here, and Greg's scripts should automagically backport, assuming no conflicts or such (I don't _think_ there would be...) > --- > I discovered this issue when trying to use the tool CRIU to checkpoint > and restore a container. Our running kernel is compiled without > CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of > /proc/<pid>/smaps and saves the "nh" flag. When trying to restore the > container, CRIU fails to restore the "nh" mappings, since madvise() > MADV_NOHUGEPAGE always returns an error because > CONFIG_TRANSPARENT_HUGETABLES is not defined. Yeah this is really not a stable or valid use of the /proc/$pid/[s]maps interface :P CRIU is sort of a blurry line of relying on internal implementation details so we're kinda not obligated to prevent breakages. CRIU is kinda relying on internal implementation details so debatable as to whether we should be bending over backwards to support. BUT, we also don't want to cause unwanted issues if there's a simple fix and this seems reasonable to me. > --- > include/linux/mman.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/include/linux/mman.h b/include/linux/mman.h > index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644 > --- a/include/linux/mman.h > +++ b/include/linux/mman.h > @@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags) > return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | > _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | > _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | > +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) NIT, but can we use ifdef here for consistency? Thanks. > _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) | > +#endif > arch_calc_vm_flag_bits(file, flags); > } > > > --- > base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2 > change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d > > Best regards, > -- > Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 12:46 ` Lorenzo Stoakes @ 2025-05-02 20:53 ` Yang Shi 2025-05-03 9:50 ` Lorenzo Stoakes 0 siblings, 1 reply; 10+ messages in thread From: Yang Shi @ 2025-05-02 20:53 UTC (permalink / raw) To: Lorenzo Stoakes, Ignacio.MorenoGonzalez Cc: Liam.Howlett, linux-mm, linux-kernel, Andrew Morton On 5/2/25 5:46 AM, Lorenzo Stoakes wrote: > +cc Andrew. > > Ignacio, you should always include Andrew in patch submissions to mm :) > > +cc Yang Shi who added this in the first place in commit c4608d1bf7c6 ("mm: > mmap: map MAP_STACK to VM_NOHUGEPAGE"). Thanks for cc'ing me. > > On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: >> From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> >> >> commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps >> the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if >> CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the >> VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() >> with MADV_NOHUGEPAGE, an error is always returned. >> >> Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > I don't see how can this cause a problem, and it fixes one in practice, so > LGTM. Though see note below about CRIU :) > > I also added a nit below, if you address this you can re-use my tag. > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Yeah, I agree. Looks good to me too. Reviewed-by: Yang Shi <yang@os.amperecomputing.com> > > Thanks! > > Do we want to back port this to stable kernels? If so we should have a: > > Fixes: c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") > cc: stable@vger.kernel.org > > Appended here, and Greg's scripts should automagically backport, assuming > no conflicts or such (I don't _think_ there would be...) > >> --- >> I discovered this issue when trying to use the tool CRIU to checkpoint >> and restore a container. Our running kernel is compiled without >> CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of >> /proc/<pid>/smaps and saves the "nh" flag. When trying to restore the >> container, CRIU fails to restore the "nh" mappings, since madvise() >> MADV_NOHUGEPAGE always returns an error because >> CONFIG_TRANSPARENT_HUGETABLES is not defined. > Yeah this is really not a stable or valid use of the /proc/$pid/[s]maps > interface :P CRIU is sort of a blurry line of relying on internal > implementation details so we're kinda not obligated to prevent breakages. > > CRIU is kinda relying on internal implementation details so debatable as to > whether we should be bending over backwards to support. > > BUT, we also don't want to cause unwanted issues if there's a simple fix > and this seems reasonable to me. > >> --- >> include/linux/mman.h | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/include/linux/mman.h b/include/linux/mman.h >> index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644 >> --- a/include/linux/mman.h >> +++ b/include/linux/mman.h >> @@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags) >> return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | >> _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | >> _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | >> +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) > NIT, but can we use ifdef here for consistency? Thanks. > >> _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) | >> +#endif >> arch_calc_vm_flag_bits(file, flags); >> } >> >> >> --- >> base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2 >> change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d >> >> Best regards, >> -- >> Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> >> >> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 20:53 ` Yang Shi @ 2025-05-03 9:50 ` Lorenzo Stoakes 2025-05-05 17:53 ` Yang Shi 2025-05-06 12:26 ` Ignacio Moreno Gonzalez 0 siblings, 2 replies; 10+ messages in thread From: Lorenzo Stoakes @ 2025-05-03 9:50 UTC (permalink / raw) To: Yang Shi Cc: Ignacio.MorenoGonzalez, Liam.Howlett, linux-mm, linux-kernel, Andrew Morton, Matthew Wilcox +cc Matthew On Fri, May 02, 2025 at 01:53:15PM -0700, Yang Shi wrote: > > > On 5/2/25 5:46 AM, Lorenzo Stoakes wrote: > > +cc Andrew. > > > > Ignacio, you should always include Andrew in patch submissions to mm :) > > > > +cc Yang Shi who added this in the first place in commit c4608d1bf7c6 ("mm: > > mmap: map MAP_STACK to VM_NOHUGEPAGE"). > > Thanks for cc'ing me. No problem! > > > > > On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > > > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > > > > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > > > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > > > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > > > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > > > with MADV_NOHUGEPAGE, an error is always returned. > > > > > > Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > I don't see how can this cause a problem, and it fixes one in practice, so > > LGTM. Though see note below about CRIU :) > > > > I also added a nit below, if you address this you can re-use my tag. > > > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > Yeah, I agree. Looks good to me too. Reviewed-by: Yang Shi > <yang@os.amperecomputing.com> Thanks! I also believe Matthew's suggestion (see [0]) is very much worth doing here, in my view doing both is a sensible approach, would your R-b tag extend to that? In which case, Ignacio - could you apply both Matthew's change and address the nit below and send a v2? Thanks! [0]: https://lore.kernel.org/linux-mm/aBTCtOXBhUK_FLU6@casper.infradead.org/ > > > > > Thanks! > > > > Do we want to back port this to stable kernels? If so we should have a: > > > > Fixes: c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") > > cc: stable@vger.kernel.org > > > > Appended here, and Greg's scripts should automagically backport, assuming > > no conflicts or such (I don't _think_ there would be...) > > > > > --- > > > I discovered this issue when trying to use the tool CRIU to checkpoint > > > and restore a container. Our running kernel is compiled without > > > CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of > > > /proc/<pid>/smaps and saves the "nh" flag. When trying to restore the > > > container, CRIU fails to restore the "nh" mappings, since madvise() > > > MADV_NOHUGEPAGE always returns an error because > > > CONFIG_TRANSPARENT_HUGETABLES is not defined. > > Yeah this is really not a stable or valid use of the /proc/$pid/[s]maps > > interface :P CRIU is sort of a blurry line of relying on internal > > implementation details so we're kinda not obligated to prevent breakages. > > > > CRIU is kinda relying on internal implementation details so debatable as to > > whether we should be bending over backwards to support. > > > > BUT, we also don't want to cause unwanted issues if there's a simple fix > > and this seems reasonable to me. > > > > > --- > > > include/linux/mman.h | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/include/linux/mman.h b/include/linux/mman.h > > > index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644 > > > --- a/include/linux/mman.h > > > +++ b/include/linux/mman.h > > > @@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags) > > > return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | > > > _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | > > > _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | > > > +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) > > NIT, but can we use ifdef here for consistency? Thanks. > > > > > _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) | > > > +#endif > > > arch_calc_vm_flag_bits(file, flags); > > > } > > > > > > > > > --- > > > base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2 > > > change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d > > > > > > Best regards, > > > -- > > > Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > > > > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-03 9:50 ` Lorenzo Stoakes @ 2025-05-05 17:53 ` Yang Shi 2025-05-06 12:26 ` Ignacio Moreno Gonzalez 1 sibling, 0 replies; 10+ messages in thread From: Yang Shi @ 2025-05-05 17:53 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Ignacio.MorenoGonzalez, Liam.Howlett, linux-mm, linux-kernel, Andrew Morton, Matthew Wilcox On 5/3/25 2:50 AM, Lorenzo Stoakes wrote: > +cc Matthew > > On Fri, May 02, 2025 at 01:53:15PM -0700, Yang Shi wrote: >> >> On 5/2/25 5:46 AM, Lorenzo Stoakes wrote: >>> +cc Andrew. >>> >>> Ignacio, you should always include Andrew in patch submissions to mm :) >>> >>> +cc Yang Shi who added this in the first place in commit c4608d1bf7c6 ("mm: >>> mmap: map MAP_STACK to VM_NOHUGEPAGE"). >> Thanks for cc'ing me. > No problem! > >>> On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: >>>> From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> >>>> >>>> commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps >>>> the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if >>>> CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the >>>> VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() >>>> with MADV_NOHUGEPAGE, an error is always returned. >>>> >>>> Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> >>> I don't see how can this cause a problem, and it fixes one in practice, so >>> LGTM. Though see note below about CRIU :) >>> >>> I also added a nit below, if you address this you can re-use my tag. >>> >>> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> >> Yeah, I agree. Looks good to me too. Reviewed-by: Yang Shi >> <yang@os.amperecomputing.com> > Thanks! > > I also believe Matthew's suggestion (see [0]) is very much worth doing > here, in my view doing both is a sensible approach, would your R-b tag > extend to that? Yes. VM_NOHUGEPAGE is a no-op on !THP kernel anyway. It is pointless to fail it. It should be unlikely any real life application rely on this behavior (returning -EINVAL). > > In which case, Ignacio - could you apply both Matthew's change and address > the nit below and send a v2? > > Thanks! > > [0]: https://lore.kernel.org/linux-mm/aBTCtOXBhUK_FLU6@casper.infradead.org/ > >>> Thanks! >>> >>> Do we want to back port this to stable kernels? If so we should have a: >>> >>> Fixes: c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") >>> cc: stable@vger.kernel.org >>> >>> Appended here, and Greg's scripts should automagically backport, assuming >>> no conflicts or such (I don't _think_ there would be...) >>> >>>> --- >>>> I discovered this issue when trying to use the tool CRIU to checkpoint >>>> and restore a container. Our running kernel is compiled without >>>> CONFIG_TRANSPARENT_HUGETABLES. CRIU parses the output of >>>> /proc/<pid>/smaps and saves the "nh" flag. When trying to restore the >>>> container, CRIU fails to restore the "nh" mappings, since madvise() >>>> MADV_NOHUGEPAGE always returns an error because >>>> CONFIG_TRANSPARENT_HUGETABLES is not defined. >>> Yeah this is really not a stable or valid use of the /proc/$pid/[s]maps >>> interface :P CRIU is sort of a blurry line of relying on internal >>> implementation details so we're kinda not obligated to prevent breakages. >>> >>> CRIU is kinda relying on internal implementation details so debatable as to >>> whether we should be bending over backwards to support. >>> >>> BUT, we also don't want to cause unwanted issues if there's a simple fix >>> and this seems reasonable to me. >>> >>>> --- >>>> include/linux/mman.h | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/include/linux/mman.h b/include/linux/mman.h >>>> index bce214fece16b9af3791a2baaecd6063d0481938..1e83bc0e3db670b04743f5208826e87455a05325 100644 >>>> --- a/include/linux/mman.h >>>> +++ b/include/linux/mman.h >>>> @@ -155,7 +155,9 @@ calc_vm_flag_bits(struct file *file, unsigned long flags) >>>> return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | >>>> _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | >>>> _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | >>>> +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) >>> NIT, but can we use ifdef here for consistency? Thanks. >>> >>>> _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) | >>>> +#endif >>>> arch_calc_vm_flag_bits(file, flags); >>>> } >>>> >>>> >>>> --- >>>> base-commit: fc96b232f8e7c0a6c282f47726b2ff6a5fb341d2 >>>> change-id: 20250428-map-map_stack-to-vm_nohugepage-only-if-thp-is-enabled-ce40a1de095d >>>> >>>> Best regards, >>>> -- >>>> Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> >>>> >>>> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-03 9:50 ` Lorenzo Stoakes 2025-05-05 17:53 ` Yang Shi @ 2025-05-06 12:26 ` Ignacio Moreno Gonzalez 1 sibling, 0 replies; 10+ messages in thread From: Ignacio Moreno Gonzalez @ 2025-05-06 12:26 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Ignacio.MorenoGonzalez, Liam.Howlett, akpm, linux-kernel, linux-mm, willy, yang Hi all, Thanks for your comments and sorry for the late reply. On 5/3/2025 11:50 AM, Lorenzo Stoakes wrote: > In which case, Ignacio - could you apply both Matthew's change and address > the nit below and send a v2? > > Thanks! > > [0]: https://lore.kernel.org/linux-mm/aBTCtOXBhUK_FLU6@casper.infradead.org/ Sure! I will separate the changes into two patches, since the first one will include the 'Fixes' tag (as discussed here). I also think it is a good approach to keep the patches separated since they address different issues: - The mman.h (calc_vm_flag_bits) patch addresses mapping MAP_STACK to VM_NOHUGEPAGE on !THP kernels. - The huge_mm.h (madvice) patch addresses the issue that VM_NOHUGEPAGE should be a no-op on !THP kernels. Regards, Ignacio ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 9:31 [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled Ignacio Moreno Gonzalez via B4 Relay 2025-05-02 12:46 ` Lorenzo Stoakes @ 2025-05-02 13:03 ` Matthew Wilcox 2025-05-02 13:12 ` Lorenzo Stoakes 1 sibling, 1 reply; 10+ messages in thread From: Matthew Wilcox @ 2025-05-02 13:03 UTC (permalink / raw) To: Ignacio.MorenoGonzalez Cc: Liam.Howlett, lorenzo.stoakes, linux-mm, linux-kernel On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > with MADV_NOHUGEPAGE, an error is always returned. Isn't that the real problem though? +++ b/include/linux/huge_mm.h @@ -596,6 +596,8 @@ static inline bool unmap_huge_pmd_locked(struct vm_area_struct *vma, static inline int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice) { + if (advice == MADV_NOHUGEPAGE) + return 0; return -EINVAL; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 13:03 ` Matthew Wilcox @ 2025-05-02 13:12 ` Lorenzo Stoakes 2025-05-02 14:16 ` Matthew Wilcox 0 siblings, 1 reply; 10+ messages in thread From: Lorenzo Stoakes @ 2025-05-02 13:12 UTC (permalink / raw) To: Matthew Wilcox Cc: Ignacio.MorenoGonzalez, Liam.Howlett, linux-mm, linux-kernel On Fri, May 02, 2025 at 02:03:48PM +0100, Matthew Wilcox wrote: > On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > > with MADV_NOHUGEPAGE, an error is always returned. > > Isn't that the real problem though? Hmm, but wouldn't we want users who are trying to set MADV_[NO]HUGEPAGE to be made aware that it isn't going to do anything? And wouldn't changing this be a possibly 'breaking userspace' thing if somebody somewhere relies on this? Also this will make this inconsistent with e.g. MADV_COLLAPSE also? I guess you could argue MADV_NOHUGEPAGE with !THP is just a no-op... But definitely not MADV_HUGEPAGE. Another angle to this as well is - we are causing these VMAs to have a pointless VMA flag set that presumably can't be set any other way. So I do think the proposed solution is right, and is the least impactful one. > > +++ b/include/linux/huge_mm.h > @@ -596,6 +596,8 @@ static inline bool unmap_huge_pmd_locked(struct vm_area_struct *vma, > static inline int hugepage_madvise(struct vm_area_struct *vma, > unsigned long *vm_flags, int advice) > { > + if (advice == MADV_NOHUGEPAGE) > + return 0; > return -EINVAL; > } > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 13:12 ` Lorenzo Stoakes @ 2025-05-02 14:16 ` Matthew Wilcox 2025-05-02 14:24 ` Lorenzo Stoakes 0 siblings, 1 reply; 10+ messages in thread From: Matthew Wilcox @ 2025-05-02 14:16 UTC (permalink / raw) To: Lorenzo Stoakes Cc: Ignacio.MorenoGonzalez, Liam.Howlett, linux-mm, linux-kernel On Fri, May 02, 2025 at 02:12:16PM +0100, Lorenzo Stoakes wrote: > On Fri, May 02, 2025 at 02:03:48PM +0100, Matthew Wilcox wrote: > > On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > > > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > > > > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > > > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > > > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > > > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > > > with MADV_NOHUGEPAGE, an error is always returned. > > > > Isn't that the real problem though? > > Hmm, but wouldn't we want users who are trying to set MADV_[NO]HUGEPAGE to > be made aware that it isn't going to do anything? ... I thought the patch was clear. Only setting NOHUGEPAGE becomes a no-op. Setting HUGEPAGE remains EINVAL. > And wouldn't changing this be a possibly 'breaking userspace' thing if > somebody somewhere relies on this? I don't see what userspace could rely on it returning EINVAL, since it won't on a kernel which has THP enabled. > Also this will make this inconsistent with e.g. MADV_COLLAPSE also? Not sure how ... ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled 2025-05-02 14:16 ` Matthew Wilcox @ 2025-05-02 14:24 ` Lorenzo Stoakes 0 siblings, 0 replies; 10+ messages in thread From: Lorenzo Stoakes @ 2025-05-02 14:24 UTC (permalink / raw) To: Matthew Wilcox Cc: Ignacio.MorenoGonzalez, Liam.Howlett, linux-mm, linux-kernel On Fri, May 02, 2025 at 03:16:52PM +0100, Matthew Wilcox wrote: > On Fri, May 02, 2025 at 02:12:16PM +0100, Lorenzo Stoakes wrote: > > On Fri, May 02, 2025 at 02:03:48PM +0100, Matthew Wilcox wrote: > > > On Fri, May 02, 2025 at 11:31:41AM +0200, Ignacio Moreno Gonzalez via B4 Relay wrote: > > > > From: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> > > > > > > > > commit c4608d1bf7c6 ("mm: mmap: map MAP_STACK to VM_NOHUGEPAGE") maps > > > > the mmap option MAP_STACK to VM_NOHUGEPAGE. This is also done if > > > > CONFIG_TRANSPARENT_HUGETABLES is not defined. But in that case, the > > > > VM_NOHUGEPAGE does not make sense. For instance, when calling madvise() > > > > with MADV_NOHUGEPAGE, an error is always returned. > > > > > > Isn't that the real problem though? > > > > Hmm, but wouldn't we want users who are trying to set MADV_[NO]HUGEPAGE to > > be made aware that it isn't going to do anything? > > ... I thought the patch was clear. Only setting NOHUGEPAGE becomes a > no-op. Setting HUGEPAGE remains EINVAL. Sorry I"m an idiot, glossed over the 'if (advice == MADV_NOHUGEPAGE)' bit... > > > And wouldn't changing this be a possibly 'breaking userspace' thing if > > somebody somewhere relies on this? > > I don't see what userspace could rely on it returning EINVAL, since it > won't on a kernel which has THP enabled. I mean for the purposes of detecting THP being disabled, but I mean it's far-fetched. > > > Also this will make this inconsistent with e.g. MADV_COLLAPSE also? > > Not sure how ... > Yup this was based on my misreading... But I'm still not a fan of us setting the VM_NOHUGEPAGE flag like this when it's meaningless and this is (probably) the only way you can do it on a non-THP kernel. It's user-visible in /proc/$pid/smaps (I mean, that's the whole problem here right?) and impacts merging (though MAP_STACK you'd think it'd probably not be a big deal for...) Or ifdef the VM_[NO]HUGEPAGE on CONFIG_TRANSPARENT_HUGEPAGE? So maybe, let's do both things? ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-05-06 12:26 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-05-02 9:31 [PATCH] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled Ignacio Moreno Gonzalez via B4 Relay 2025-05-02 12:46 ` Lorenzo Stoakes 2025-05-02 20:53 ` Yang Shi 2025-05-03 9:50 ` Lorenzo Stoakes 2025-05-05 17:53 ` Yang Shi 2025-05-06 12:26 ` Ignacio Moreno Gonzalez 2025-05-02 13:03 ` Matthew Wilcox 2025-05-02 13:12 ` Lorenzo Stoakes 2025-05-02 14:16 ` Matthew Wilcox 2025-05-02 14:24 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox