From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Wei Yang <richard.weiyang@gmail.com>,
	akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
	ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev
Cc: linux-mm@kvack.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm/huge_memory: fix NULL pointer dereference when splitting shmem folio in swap cache
Date: Wed, 19 Nov 2025 09:57:58 +0100	[thread overview]
Message-ID: <a5437eb1-0d5f-48eb-ba20-70ef9d02396b@kernel.org> (raw)
In-Reply-To: <20251119012630.14701-1-richard.weiyang@gmail.com>

On 19.11.25 02:26, Wei Yang wrote:
> Commit c010d47f107f ("mm: thp: split huge page to any lower order
> pages") introduced an early check on the folio's order via
> mapping->flags before proceeding with the split work.
> 
> This check introduced a bug: for shmem folios in the swap cache, the
> mapping pointer can be NULL. Accessing mapping->flags in this state
> leads directly to a NULL pointer dereference.

Under which circumstances would that be the case? Only for large shmem 
folios in the swapcache, or also for truncated folios? I'd assume this 
also affects truncated folios, in which case we should spell that out 
here.

> 
> This commit fixes the issue by moving the check for mapping != NULL
> before any attempt to access mapping->flags.
> 
> This fix necessarily changes the return value from -EBUSY to -EINVAL
> when mapping is NULL. A review of the current callers shows that they
> do not differentiate between these two error codes, so this change is
> safe.

The doc of __split_huge_page_to_list_to_order() would now be outdated 
and would have to be updated.

Also, take a look at s390_wiggle_split_folio(): returning -EINVAL 
instead of -EBUSY will make a difference on concurrent truncation. 
-EINVAL will be propagated and make the operation fail, while -EBUSY 
will be translated to -EAGAIN and the caller will simply lookup the 
folio again and retry.
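
For reference, the error handling there has roughly this shape 
(paraphrasing from memory to illustrate the point, not the exact 
upstream code):

	ret = split_folio(folio);
	...
	/*
	 * -EBUSY (e.g., concurrent truncation) is translated so that
	 * the caller looks the folio up again and retries ...
	 */
	if (ret == -EBUSY)
		ret = -EAGAIN;
	/* ... while -EINVAL is propagated and fails the operation. */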

So I think we should try to keep truncation returning -EBUSY. For the 
shmem case, I think it's ok to return -EINVAL. I guess we can identify 
such folios by checking for folio_test_swapcache().
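
IOW, something like this (completely untested, and where exactly the 
check would live in __folio_split() is my assumption):

	if (!folio->mapping) {
		/*
		 * Large shmem folios in the swapcache have no mapping,
		 * but folio_test_swapcache() is true for them.
		 */
		if (folio_test_swapcache(folio))
			return -EINVAL;
		/* Truncated: keep returning -EBUSY so callers retry. */
		return -EBUSY;
	}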


Probably worth mentioning that this was identified by code inspection?

> 
> Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: <stable@vger.kernel.org>

Hmm, what would this patch look like when based on current upstream? 
We'd likely want to get that upstream asap.

> 
> ---
> 
> This patch is based on current mm-new, latest commit:
> 
>      056b93566a35 mm/vmalloc: warn only once when vmalloc detect invalid gfp flags
> 
> Backport note:
> 
> The current code evolved from the original commit through the following
> four changes. We should adjust each of them accordingly when backporting.
> 
> commit c010d47f107f609b9f4d6a103b6dfc53889049e9
> Author: Zi Yan <ziy@nvidia.com>
> Date:   Mon Feb 26 15:55:33 2024 -0500
> 
>      mm: thp: split huge page to any lower order pages
> 
> commit 6a50c9b512f7734bc356f4bd47885a6f7c98491a
> Author: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> Date:   Fri Jun 7 17:40:48 2024 +0800
> 
>      mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
> 
> commit 9b2f764933eb5e3ac9ebba26e3341529219c4401
> Author: Zi Yan <ziy@nvidia.com>
> Date:   Wed Jan 22 11:19:27 2025 -0500
> 
>      mm/huge_memory: allow split shmem large folio to any lower order
> 
> commit 58729c04cf1092b87aeef0bf0998c9e2e4771133
> Author: Zi Yan <ziy@nvidia.com>
> Date:   Fri Mar 7 12:39:57 2025 -0500
> 
>      mm/huge_memory: add buddy allocator like (non-uniform) folio_split()
> ---
>   mm/huge_memory.c | 68 +++++++++++++++++++++++++-----------------------
>   1 file changed, 35 insertions(+), 33 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 7c69572b6c3f..8701c3eef05f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3696,29 +3696,42 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>   				"Cannot split to order-1 folio");
>   		if (new_order == 1)
>   			return false;
> -	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
> -		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
> -		    !mapping_large_folio_support(folio->mapping)) {
> -			/*
> -			 * We can always split a folio down to a single page
> -			 * (new_order == 0) uniformly.
> -			 *
> -			 * For any other scenario
> -			 *   a) uniform split targeting a large folio
> -			 *      (new_order > 0)
> -			 *   b) any non-uniform split
> -			 * we must confirm that the file system supports large
> -			 * folios.
> -			 *
> -			 * Note that we might still have THPs in such
> -			 * mappings, which is created from khugepaged when
> -			 * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
> -			 * case, the mapping does not actually support large
> -			 * folios properly.
> -			 */
> -			VM_WARN_ONCE(warns,
> -				"Cannot split file folio to non-0 order");
> +	} else {


Why not simply

} else if (!folio->mapping) {
	/*
	 * Truncated?
	...
	return false;
} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
...

> +		const struct address_space *mapping = folio->mapping;
> +
> +		/* Truncated ? */
> +		/*
> +		 * TODO: add support for large shmem folio in swap cache.
> +		 * When shmem is in swap cache, mapping is NULL and
> +		 * folio_test_swapcache() is true.
> +		 */

While at it, can we merge the two comments into something like:

/*
 * If there is no mapping, the folio was truncated and we cannot
 * split.
 *
 * TODO: large shmem folios in the swap cache also don't currently
 * have a mapping, but folio_test_swapcache() is true for them.
 */

-- 
Cheers

David

