* [PATCH] mm: avoid poison consumption when splitting THP @ 2025-09-11 2:14 Andrew Zaborowski 2025-09-11 3:19 ` Zi Yan 2025-09-11 8:12 ` David Hildenbrand 0 siblings, 2 replies; 4+ messages in thread From: Andrew Zaborowski @ 2025-09-11 2:14 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Miaohe Lin Handling a memory failure pointing inside a huge page requires splitting the page. The splitting logic uses a mechanism, implemented in migrate.c:try_to_map_unused_to_zeropage(), that inspects contents of individual pages to find zero-filled pages. The read access to the contents may cause a new, synchronous exception like an x86 Machine Check, delivered before the initial memory_failure() finishes, ending in a crash. Luckily memory_failure() already sets the has_hwpoisoned flag on the folio right before try_to_split_thp_page(). Don't enable the shared zeropage mechanism (RMP_USE_SHARED_ZEROPAGE flag) down in __split_unmapped_folio() when the original folio has has_hwpoisoned. Note: we're disabling a potentially useful feature, some of the individual pages that aren't poisoned might be zero-filled. One argument for not trying to add a mechanism to maybe re-scan them later, apart from code cost, is that the owning process is likely being killed and the memory released. Signed-off-by: Andrew Zaborowski <balrogg+code@gmail.com> --- mm/huge_memory.c | 3 ++- mm/memory-failure.c | 6 ++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9c38a95e9f0..1568f0308b9 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3588,6 +3588,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, struct list_head *list, bool uniform_split) { struct deferred_split *ds_queue = get_deferred_split_queue(folio); + bool has_hwpoisoned = folio_test_has_hwpoisoned(folio); XA_STATE(xas, &folio->mapping->i_pages, folio->index); struct folio *end_folio = folio_next(folio); bool is_anon = folio_test_anon(folio); @@ -3858,7 +3859,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, if (nr_shmem_dropped) shmem_uncharge(mapping->host, nr_shmem_dropped); - if (!ret && is_anon) + if (!ret && is_anon && !has_hwpoisoned) remap_flags = RMP_USE_SHARED_ZEROPAGE; remap_page(folio, 1 << order, remap_flags); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index fc30ca4804b..2d755493de9 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2352,8 +2352,10 @@ int memory_failure(unsigned long pfn, int flags) * otherwise it may race with THP split. * And the flag can't be set in get_hwpoison_page() since * it is called by soft offline too and it is just called - * for !MF_COUNT_INCREASED. So here seems to be the best - * place. + * for !MF_COUNT_INCREASED. + * It also tells __split_unmapped_folio() to not bother + * using the shared zeropage -- the all-zeros check would + * consume the poison. So here seems to be the best place. * * Don't need care about the above error handling paths for * get_hwpoison_page() since they handle either free page -- 2.45.2 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid poison consumption when splitting THP 2025-09-11 2:14 [PATCH] mm: avoid poison consumption when splitting THP Andrew Zaborowski @ 2025-09-11 3:19 ` Zi Yan 2025-09-11 6:19 ` Lance Yang 2025-09-11 8:12 ` David Hildenbrand 1 sibling, 1 reply; 4+ messages in thread From: Zi Yan @ 2025-09-11 3:19 UTC (permalink / raw) To: Andrew Zaborowski Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Miaohe Lin On 10 Sep 2025, at 22:14, Andrew Zaborowski wrote: > Handling a memory failure pointing inside a huge page requires splitting > the page. The splitting logic uses a mechanism, implemented in > migrate.c:try_to_map_unused_to_zeropage(), that inspects contents of > individual pages to find zero-filled pages. The read access to the > contents may cause a new, synchronous exception like an x86 Machine > Check, delivered before the initial memory_failure() finishes, ending > in a crash. > > Luckily memory_failure() already sets the has_hwpoisoned flag on the > folio right before try_to_split_thp_page(). Don't enable the shared > zeropage mechanism (RMP_USE_SHARED_ZEROPAGE flag) down in > __split_unmapped_folio() when the original folio has has_hwpoisoned. > > Note: we're disabling a potentially useful feature, some of the > individual pages that aren't poisoned might be zero-filled. One > argument for not trying to add a mechanism to maybe re-scan them later, > apart from code cost, is that the owning process is likely being > killed and the memory released. Sounds reasonable to me. > > Signed-off-by: Andrew Zaborowski <balrogg+code@gmail.com> > --- > mm/huge_memory.c | 3 ++- > mm/memory-failure.c | 6 ++++-- > 2 files changed, 6 insertions(+), 3 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 9c38a95e9f0..1568f0308b9 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -3588,6 +3588,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, > struct list_head *list, bool uniform_split) > { > struct deferred_split *ds_queue = get_deferred_split_queue(folio); > + bool has_hwpoisoned = folio_test_has_hwpoisoned(folio); The state needs to be stored here because __split_unmapped_folio() clears the flag. Maybe add a comment here to prevent people from “optimizing” it by calling folio_test_has_hwpoisoned(folio) in the code below. (I wanted to until I checked the definition of folio_test_has_hwpoisoned()) > XA_STATE(xas, &folio->mapping->i_pages, folio->index); > struct folio *end_folio = folio_next(folio); > bool is_anon = folio_test_anon(folio); > @@ -3858,7 +3859,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, > if (nr_shmem_dropped) > shmem_uncharge(mapping->host, nr_shmem_dropped); > > - if (!ret && is_anon) > + if (!ret && is_anon && !has_hwpoisoned) > remap_flags = RMP_USE_SHARED_ZEROPAGE; > remap_page(folio, 1 << order, remap_flags); > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index fc30ca4804b..2d755493de9 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -2352,8 +2352,10 @@ int memory_failure(unsigned long pfn, int flags) > * otherwise it may race with THP split. > * And the flag can't be set in get_hwpoison_page() since > * it is called by soft offline too and it is just called > - * for !MF_COUNT_INCREASED. So here seems to be the best > - * place. > + * for !MF_COUNT_INCREASED. > + * It also tells __split_unmapped_folio() to not bother s/__split_unmapped_folio/__folio_split/, since remap_page() is called in __folio_split(). > + * using the shared zeropage -- the all-zeros check would > + * consume the poison. So here seems to be the best place. > * > * Don't need care about the above error handling paths for > * get_hwpoison_page() since they handle either free page > -- > 2.45.2 Otherwise, Acked-by: Zi Yan <ziy@nvidia.com> Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid poison consumption when splitting THP 2025-09-11 3:19 ` Zi Yan @ 2025-09-11 6:19 ` Lance Yang 0 siblings, 0 replies; 4+ messages in thread From: Lance Yang @ 2025-09-11 6:19 UTC (permalink / raw) To: Zi Yan, Andrew Zaborowski Cc: linux-mm, linux-kernel, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Miaohe Lin On Thu, Sep 11, 2025 at 12:11 PM Zi Yan <ziy@nvidia.com> wrote: > > On 10 Sep 2025, at 22:14, Andrew Zaborowski wrote: > > > Handling a memory failure pointing inside a huge page requires splitting > > the page. The splitting logic uses a mechanism, implemented in > > migrate.c:try_to_map_unused_to_zeropage(), that inspects contents of > > individual pages to find zero-filled pages. The read access to the > > contents may cause a new, synchronous exception like an x86 Machine > > Check, delivered before the initial memory_failure() finishes, ending > > in a crash. > > > > Luckily memory_failure() already sets the has_hwpoisoned flag on the > > folio right before try_to_split_thp_page(). Don't enable the shared > > zeropage mechanism (RMP_USE_SHARED_ZEROPAGE flag) down in > > __split_unmapped_folio() when the original folio has has_hwpoisoned. Nit: s/__split_unmapped_folio/__folio_split/ As Zi mentioned, remap_page() is called in __folio_split() ;) > > > > Note: we're disabling a potentially useful feature, some of the > > individual pages that aren't poisoned might be zero-filled. One > > argument for not trying to add a mechanism to maybe re-scan them later, > > apart from code cost, is that the owning process is likely being > > killed and the memory released. > > Sounds reasonable to me. Makes sense to me as well! > > > > > Signed-off-by: Andrew Zaborowski <balrogg+code@gmail.com> > > --- > > mm/huge_memory.c | 3 ++- > > mm/memory-failure.c | 6 ++++-- > > 2 files changed, 6 insertions(+), 3 deletions(-) > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index 9c38a95e9f0..1568f0308b9 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -3588,6 +3588,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, > > struct list_head *list, bool uniform_split) > > { > > struct deferred_split *ds_queue = get_deferred_split_queue(folio); > > + bool has_hwpoisoned = folio_test_has_hwpoisoned(folio); > > The state needs to be stored here because __split_unmapped_folio() > clears the flag. Maybe add a comment here to prevent people > from “optimizing” it by calling folio_test_has_hwpoisoned(folio) > in the code below. > > (I wanted to until I checked the definition of folio_test_has_hwpoisoned()) folio_test_has_hwpoisoned() requires a large folio. That is safe in this context, since this path is only ever called for large folios. Cheers, Lance > > > XA_STATE(xas, &folio->mapping->i_pages, folio->index); > > struct folio *end_folio = folio_next(folio); > > bool is_anon = folio_test_anon(folio); > > @@ -3858,7 +3859,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order, > > if (nr_shmem_dropped) > > shmem_uncharge(mapping->host, nr_shmem_dropped); > > > > - if (!ret && is_anon) > > + if (!ret && is_anon && !has_hwpoisoned) > > remap_flags = RMP_USE_SHARED_ZEROPAGE; > > remap_page(folio, 1 << order, remap_flags); > > > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > > index fc30ca4804b..2d755493de9 100644 > > --- a/mm/memory-failure.c > > +++ b/mm/memory-failure.c > > @@ -2352,8 +2352,10 @@ int memory_failure(unsigned long pfn, int flags) > > * otherwise it may race with THP split. > > * And the flag can't be set in get_hwpoison_page() since > > * it is called by soft offline too and it is just called > > - * for !MF_COUNT_INCREASED. So here seems to be the best > > - * place. > > + * for !MF_COUNT_INCREASED. > > + * It also tells __split_unmapped_folio() to not bother > > s/__split_unmapped_folio/__folio_split/, since remap_page() is > called in __folio_split(). > > > + * using the shared zeropage -- the all-zeros check would > > + * consume the poison. So here seems to be the best place. > > * > > * Don't need care about the above error handling paths for > > * get_hwpoison_page() since they handle either free page > > -- > > 2.45.2 > > Otherwise, Acked-by: Zi Yan <ziy@nvidia.com> > > Best Regards, > Yan, Zi > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid poison consumption when splitting THP 2025-09-11 2:14 [PATCH] mm: avoid poison consumption when splitting THP Andrew Zaborowski 2025-09-11 3:19 ` Zi Yan @ 2025-09-11 8:12 ` David Hildenbrand 1 sibling, 0 replies; 4+ messages in thread From: David Hildenbrand @ 2025-09-11 8:12 UTC (permalink / raw) To: Andrew Zaborowski, linux-mm, linux-kernel Cc: Andrew Morton, Lorenzo Stoakes, Miaohe Lin On 11.09.25 04:14, Andrew Zaborowski wrote: > Handling a memory failure pointing inside a huge page requires splitting > the page. The splitting logic uses a mechanism, implemented in > migrate.c:try_to_map_unused_to_zeropage(), that inspects contents of > individual pages to find zero-filled pages. The read access to the > contents may cause a new, synchronous exception like an x86 Machine > Check, delivered before the initial memory_failure() finishes, ending > in a crash. > > Luckily memory_failure() already sets the has_hwpoisoned flag on the > folio right before try_to_split_thp_page(). Don't enable the shared > zeropage mechanism (RMP_USE_SHARED_ZEROPAGE flag) down in > __split_unmapped_folio() when the original folio has has_hwpoisoned. > > Note: we're disabling a potentially useful feature, some of the > individual pages that aren't poisoned might be zero-filled. One > argument for not trying to add a mechanism to maybe re-scan them later, > apart from code cost, is that the owning process is likely being > killed and the memory released. > > Signed-off-by: Andrew Zaborowski <balrogg+code@gmail.com> > --- I would suggest just checking whether the page (PageHWPoison()) is poisoned before doing the check for zero. If set, just treat it as non-zero. No need to stop the split. You'll have to do that in two locations. No need to mess with RMP_USE_SHARED_ZEROPAGE -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-09-11 8:12 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-09-11 2:14 [PATCH] mm: avoid poison consumption when splitting THP Andrew Zaborowski 2025-09-11 3:19 ` Zi Yan 2025-09-11 6:19 ` Lance Yang 2025-09-11 8:12 ` David Hildenbrand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox