linux-mm.kvack.org archive mirror
* [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
@ 2026-04-15  8:22 Baolin Wang
  2026-04-15  8:47 ` David Hildenbrand (Arm)
  2026-04-15 13:45 ` Matthew Wilcox
  0 siblings, 2 replies; 17+ messages in thread
From: Baolin Wang @ 2026-04-15  8:22 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, ziy, david, ljs, lance.yang, baolin.wang, linux-mm, linux-kernel

Anonymous shmem large order allocations are dynamically controlled via the
global THP sysfs knob (/sys/kernel/mm/transparent_hugepage/shmem_enabled)
and the per-size mTHP knobs (/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled).

Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
which large orders are allowed, rather than relying on mapping_max_folio_order().
Moreover, mapping_max_folio_order() is intended to control large order
allocations only for tmpfs mounts. Clarify this by not setting a large-order
range for internal shmem mount (e.g. anonymous shmem), to avoid confusion,
as discussed in the previous thread[1].

[1] https://lore.kernel.org/all/ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Changes from v1:
 - Update the comments and commit message, per Lance.
---
 mm/shmem.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 4ecefe02881d..568e1baee90d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 	if (sbinfo->noswap)
 		mapping_set_unevictable(inode->i_mapping);
 
-	/* Don't consider 'deny' for emergencies and 'force' for testing */
-	if (sbinfo->huge)
+	/*
+	 * Only set the large order range for tmpfs mounts. The large order
+	 * selection for the internal shmem mount is configured dynamically
+	 * via the 'shmem_enabled' interfaces, so there is no need to set a
+	 * large order range for the internal shmem mount's mapping.
+	 *
+	 * Note: Don't consider 'deny' for emergencies and 'force' for
+	 * testing.
+	 */
+	if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
 		mapping_set_large_folios(inode->i_mapping);
 
 	switch (mode & S_IFMT) {
-- 
2.47.3
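As a rough model of the check this patch introduces in __shmem_get_inode() (a sketch only: the SB_KERNMOUNT bit value and the function name below are simplified stand-ins, not the real kernel definitions):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified stand-in for the kernel's superblock flag; the real
 * SB_KERNMOUNT value lives in include/linux/fs.h and differs.
 */
#define SB_KERNMOUNT (1UL << 0)

/*
 * Toy model of the patched condition: only user-visible tmpfs mounts
 * with a 'huge=' policy get a large-order range set on the mapping;
 * internal (kernel) shmem mounts are governed by the shmem_enabled
 * sysfs knobs instead.
 */
static bool sets_large_order_range(int sbinfo_huge, unsigned long s_flags)
{
	return sbinfo_huge && !(s_flags & SB_KERNMOUNT);
}
```

With this shape, a tmpfs mount with huge=always sets the range, while the internal mount used for anonymous shmem never does, regardless of its huge setting.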



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  8:22 [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount Baolin Wang
@ 2026-04-15  8:47 ` David Hildenbrand (Arm)
  2026-04-15  9:04   ` Baolin Wang
  2026-04-15 13:45 ` Matthew Wilcox
  1 sibling, 1 reply; 17+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-15  8:47 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel

On 4/15/26 10:22, Baolin Wang wrote:
> Anonymous shmem large order allocations are dynamically controlled via the
> global THP sysfs knob (/sys/kernel/mm/transparent_hugepage/shmem_enabled)
> and the per-size mTHP knobs (/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled).
> 
> Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
> which large orders are allowed, rather than relying on mapping_max_folio_order().
> Moreover, mapping_max_folio_order() is intended to control large order
> allocations only for tmpfs mounts. Clarify this by not setting a large-order
> range for internal shmem mount (e.g. anonymous shmem), to avoid confusion,
> as discussed in the previous thread[1].
> 
> [1] https://lore.kernel.org/all/ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> Changes from v1:
>  - Update the comments and commit message, per Lance.
> ---
>  mm/shmem.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 4ecefe02881d..568e1baee90d 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>  	if (sbinfo->noswap)
>  		mapping_set_unevictable(inode->i_mapping);
>  
> -	/* Don't consider 'deny' for emergencies and 'force' for testing */
> -	if (sbinfo->huge)
> +	/*
> +	 * Only set the large order range for tmpfs mounts. The large order
> +	 * selection for the internal shmem mount is configured dynamically
> +	 * via the 'shmem_enabled' interfaces, so there is no need to set a
> +	 * large order range for the internal shmem mount's mapping.
> +	 *
> +	 * Note: Don't consider 'deny' for emergencies and 'force' for
> +	 * testing.
> +	 */
> +	if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>  		mapping_set_large_folios(inode->i_mapping);

I don't like that special casing. In an ideal world, any mapping that
supports large folios would indicate that.

Now, which large folios to allocate is a different question.

What's the problem with indicating that all shmem mappings support large
folios, but handling *which* folio sizes to allocate elsewhere?

-- 
Cheers,

David



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  8:47 ` David Hildenbrand (Arm)
@ 2026-04-15  9:04   ` Baolin Wang
  2026-04-15  9:19     ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-15  9:04 UTC (permalink / raw)
  To: David Hildenbrand (Arm), akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel



On 4/15/26 4:47 PM, David Hildenbrand (Arm) wrote:
> On 4/15/26 10:22, Baolin Wang wrote:
>> Anonymous shmem large order allocations are dynamically controlled via the
>> global THP sysfs knob (/sys/kernel/mm/transparent_hugepage/shmem_enabled)
>> and the per-size mTHP knobs (/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled).
>>
>> Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
>> which large orders are allowed, rather than relying on mapping_max_folio_order().
>> Moreover, mapping_max_folio_order() is intended to control large order
>> allocations only for tmpfs mounts. Clarify this by not setting a large-order
>> range for internal shmem mount (e.g. anonymous shmem), to avoid confusion,
>> as discussed in the previous thread[1].
>>
>> [1] https://lore.kernel.org/all/ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>> Changes from v1:
>>   - Update the comments and commit message, per Lance.
>> ---
>>   mm/shmem.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 4ecefe02881d..568e1baee90d 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>>   	if (sbinfo->noswap)
>>   		mapping_set_unevictable(inode->i_mapping);
>>   
>> -	/* Don't consider 'deny' for emergencies and 'force' for testing */
>> -	if (sbinfo->huge)
>> +	/*
>> +	 * Only set the large order range for tmpfs mounts. The large order
>> +	 * selection for the internal shmem mount is configured dynamically
>> +	 * via the 'shmem_enabled' interfaces, so there is no need to set a
>> +	 * large order range for the internal shmem mount's mapping.
>> +	 *
>> +	 * Note: Don't consider 'deny' for emergencies and 'force' for
>> +	 * testing.
>> +	 */
>> +	if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>>   		mapping_set_large_folios(inode->i_mapping);
> 
> I don't like that special casing. In an ideal world, any mapping that
> supports large folios would indicate that.
> 
> Now, which large folios to allocate is a different question.
> 
> What's the problem with indicating that all shmem mappings support
> large folios, but handling *which* folio sizes to allocate
> elsewhere?

Thanks for taking a look.

As I mentioned, the original logic has several issues for anonymous shmem:

1. Whether anonymous shmem supports large folios can be dynamically 
configured via sysfs interfaces, so mapping_set_large_folios() set 
during initialization cannot accurately reflect whether anonymous shmem 
actually supports large folios.

2. Calling mapping_set_large_folios() here by default makes anonymous 
shmem support 'MAX_PAGECACHE_ORDER' by default. However, the range of 
large orders supported by anonymous shmem is also dynamically 
configurable via sysfs interfaces, which could cause more confusion.

3. Currently, no users will call mapping_large_folio_support() related 
functions to determine whether large folios are supported for anonymous 
shmem.

Therefore, rather than having anonymous shmem call 
mapping_set_large_folios() and introduce so much confusion, I'd prefer 
to exclude anonymous shmem from calling mapping_set_large_folios().
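Points 1 and 2 above can be illustrated with a toy model (the name and the bitmask encoding are invented for illustration; this is not the real shmem_allowable_huge_orders()): the mapping's large-folio state is set once at inode creation, while the set of allowed orders is recomputed from sysfs state on every fault, so it can change at runtime.

```c
#include <assert.h>

/*
 * Toy model: bit N of 'sysfs_mask' means order-N allocations are
 * currently enabled via /sys/.../hugepages-<size>kB/shmem_enabled.
 * The allowable set is derived from the mask each time, unlike a
 * mapping flag set once at inode creation.
 */
static unsigned int toy_allowable_orders(unsigned int sysfs_mask,
					 unsigned int max_order)
{
	/* Clamp to the orders the mapping could ever hold. */
	return sysfs_mask & ((1u << (max_order + 1)) - 1);
}
```

The same mapping can therefore go from "no large orders allowed" to "several allowed" without any change to the mapping itself, which is the mismatch Baolin is describing.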



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  9:04   ` Baolin Wang
@ 2026-04-15  9:19     ` David Hildenbrand (Arm)
  2026-04-15  9:41       ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-15  9:19 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel

On 4/15/26 11:04, Baolin Wang wrote:
> 
> 
> On 4/15/26 4:47 PM, David Hildenbrand (Arm) wrote:
>> On 4/15/26 10:22, Baolin Wang wrote:
>>> Anonymous shmem large order allocations are dynamically controlled
>>> via the
>>> global THP sysfs knob (/sys/kernel/mm/transparent_hugepage/
>>> shmem_enabled)
>>> and the per-size mTHP knobs (/sys/kernel/mm/transparent_hugepage/
>>> hugepages-<size>kB/shmem_enabled).
>>>
>>> Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
>>> which large orders are allowed, rather than relying on
>>> mapping_max_folio_order().
>>> Moreover, mapping_max_folio_order() is intended to control large order
>>> allocations only for tmpfs mounts. Clarify this by not setting a
>>> large-order
>>> range for internal shmem mount (e.g. anonymous shmem), to avoid
>>> confusion,
>>> as discussed in the previous thread[1].
>>>
>>> [1] https://lore.kernel.org/all/
>>> ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> ---
>>> Changes from v1:
>>>   - Update the comments and commit message, per Lance.
>>> ---
>>>   mm/shmem.c | 12 ++++++++++--
>>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 4ecefe02881d..568e1baee90d 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct
>>> mnt_idmap *idmap,
>>>       if (sbinfo->noswap)
>>>           mapping_set_unevictable(inode->i_mapping);
>>>   -    /* Don't consider 'deny' for emergencies and 'force' for
>>> testing */
>>> -    if (sbinfo->huge)
>>> +    /*
>>> +     * Only set the large order range for tmpfs mounts. The large order
>>> +     * selection for the internal shmem mount is configured dynamically
>>> +     * via the 'shmem_enabled' interfaces, so there is no need to set a
>>> +     * large order range for the internal shmem mount's mapping.
>>> +     *
>>> +     * Note: Don't consider 'deny' for emergencies and 'force' for
>>> +     * testing.
>>> +     */
>>> +    if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>>>           mapping_set_large_folios(inode->i_mapping);
>>
>> I don't like that special casing. In an ideal world, any mapping that
>> supports large folios would indicate that.
>>
>> Now, which large folios to allocate is a different question.
>>
>> What's the problem with indicating that all shmem mappings support
>> large folios, but handling *which* folio sizes to allocate
>> elsewhere?
> 
> Thanks for taking a look.

Sorry for the late feedback.

> 
> As I mentioned, the original logic has several issues for anonymous shmem:
> 
> 1. Whether anonymous shmem supports large folios can be dynamically
> configured via sysfs interfaces, so mapping_set_large_folios() set
> during initialization cannot accurately reflect whether anonymous shmem
> actually supports large folios.

Well, the mapping does support large folios, just the folio allocations
are currently disabled.

It feels cleaner to say "there might be large folios in this mapping"
than saying "there are no large folios in the mapping as the mapping
does not support it", no?

> 
> 2. Calling mapping_set_large_folios() here by default makes anonymous
> shmem support 'MAX_PAGECACHE_ORDER' by default. However, the range of
> large orders supported by anonymous shmem is also dynamically
> configurable via sysfs interfaces, which could cause more confusion.

Fair enough. The mapping supports it, we just don't want to allocate
some orders (right now).

> 
> 3. Currently, no users will call mapping_large_folio_support() related
> functions to determine whether large folios are supported for anonymous
> shmem.

Right, we special-case shmem all over the place :) For example, in
khugepaged. I wonder if that could help with Zi's changes to get rid of
some shmem checks.

What if we say:

shmem that *will never have*/*does never allow* large folios never sets
mapping_set_large_folios().

shmem that *might* have large folios (in the past, now, or in the
future) sets mapping_set_large_folios().

> 
> Therefore, rather than having anonymous shmem call
> mapping_set_large_folios() and introduce so much confusion, I'd prefer
> to exclude anonymous shmem from calling mapping_set_large_folios().

I think it's more confusing to end up with large folios in a mapping
that claims to not support large folios?

-- 
Cheers,

David



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  9:19     ` David Hildenbrand (Arm)
@ 2026-04-15  9:41       ` Baolin Wang
  2026-04-15  9:54         ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-15  9:41 UTC (permalink / raw)
  To: David Hildenbrand (Arm), akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel



On 4/15/26 5:19 PM, David Hildenbrand (Arm) wrote:
> On 4/15/26 11:04, Baolin Wang wrote:
>>
>>
>> On 4/15/26 4:47 PM, David Hildenbrand (Arm) wrote:
>>> On 4/15/26 10:22, Baolin Wang wrote:
>>>> Anonymous shmem large order allocations are dynamically controlled
>>>> via the
>>>> global THP sysfs knob (/sys/kernel/mm/transparent_hugepage/
>>>> shmem_enabled)
>>>> and the per-size mTHP knobs (/sys/kernel/mm/transparent_hugepage/
>>>> hugepages-<size>kB/shmem_enabled).
>>>>
>>>> Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
>>>> which large orders are allowed, rather than relying on
>>>> mapping_max_folio_order().
>>>> Moreover, mapping_max_folio_order() is intended to control large order
>>>> allocations only for tmpfs mounts. Clarify this by not setting a
>>>> large-order
>>>> range for internal shmem mount (e.g. anonymous shmem), to avoid
>>>> confusion,
>>>> as discussed in the previous thread[1].
>>>>
>>>> [1] https://lore.kernel.org/all/
>>>> ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---
>>>> Changes from v1:
>>>>    - Update the comments and commit message, per Lance.
>>>> ---
>>>>    mm/shmem.c | 12 ++++++++++--
>>>>    1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>> index 4ecefe02881d..568e1baee90d 100644
>>>> --- a/mm/shmem.c
>>>> +++ b/mm/shmem.c
>>>> @@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct
>>>> mnt_idmap *idmap,
>>>>        if (sbinfo->noswap)
>>>>            mapping_set_unevictable(inode->i_mapping);
>>>>    -    /* Don't consider 'deny' for emergencies and 'force' for
>>>> testing */
>>>> -    if (sbinfo->huge)
>>>> +    /*
>>>> +     * Only set the large order range for tmpfs mounts. The large order
>>>> +     * selection for the internal shmem mount is configured dynamically
>>>> +     * via the 'shmem_enabled' interfaces, so there is no need to set a
>>>> +     * large order range for the internal shmem mount's mapping.
>>>> +     *
>>>> +     * Note: Don't consider 'deny' for emergencies and 'force' for
>>>> +     * testing.
>>>> +     */
>>>> +    if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>>>>            mapping_set_large_folios(inode->i_mapping);
>>>
>>> I don't like that special casing. In an ideal world, any mapping that
>>> supports large folios would indicate that.
>>>
>>> Now, which large folios to allocate is a different question.
>>>
>>> What's the problem with indicating that all shmem mappings support
>>> large folios, but handling *which* folio sizes to allocate
>>> elsewhere?
>>
>> Thanks for taking a look.
> 
> Sorry for the late feedback.

No worries:)

> 
>>
>> As I mentioned, the original logic has several issues for anonymous shmem:
>>
>> 1. Whether anonymous shmem supports large folios can be dynamically
>> configured via sysfs interfaces, so mapping_set_large_folios() set
>> during initialization cannot accurately reflect whether anonymous shmem
>> actually supports large folios.
> 
> Well, the mapping does support large folios, just the folio allocations
> are currently disabled.
> 
> It feels cleaner to say "there might be large folios in this mapping"
> than saying "there are no large folios in the mapping as the mapping
> does not support it", no?

Yes, that makes sense.

However, it’s also possible that the mapping does not support large 
folios, yet anonymous shmem can still allocate large folios via the 
sysfs interfaces. That doesn't make sense, right?


>> 2. Calling mapping_set_large_folios() here by default makes anonymous
>> shmem support 'MAX_PAGECACHE_ORDER' by default. However, the range of
>> large orders supported by anonymous shmem is also dynamically
>> configurable via sysfs interfaces, which could cause more confusion.
> 
> Fair enough. The mapping supports it, we just don't want to allocate
> some orders (right now).

OK. Make sense.

>> 3. Currently, no users will call mapping_large_folio_support() related
>> functions to determine whether large folios are supported for anonymous
>> shmem.
> 
> Right, we special-case shmem all over the place :) For example, in
> khugepaged. I wonder if that could help with Zi's changes to get rid of
> some shmem checks.

Sure. I'm also reviewing Zi's series.

> What if we say:
> 
> shmem that *will never have*/*does never allow* large folios never sets
> mapping_set_large_folios().
> 
> shmem that *might* have large folios (in the past, now, or in the
> future) sets mapping_set_large_folios().

For the current anonymous shmem (tmpfs is already clear, no questions), 
I don’t think there will be any "will never have/does never allow" 
cases, because it can be changed dynamically via the sysfs interfaces.

If we still want that logic, then for anonymous shmem we can treat it as 
always "might have large folios".

>> Therefore, rather than having anonymous shmem call
>> mapping_set_large_folios() and introduce so much confusion, I'd prefer
>> to exclude anonymous shmem from calling mapping_set_large_folios().
> 
> I think it's more confusing to end up with large folios in a mapping
> that claims to not support large folios?

As for 1, it still doesn’t make sense to me.



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  9:41       ` Baolin Wang
@ 2026-04-15  9:54         ` David Hildenbrand (Arm)
  2026-04-15 10:05           ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-15  9:54 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel

>>> As I mentioned, the original logic has several issues for anonymous
>>> shmem:
>>>
>>> 1. Whether anonymous shmem supports large folios can be dynamically
>>> configured via sysfs interfaces, so mapping_set_large_folios() set
>>> during initialization cannot accurately reflect whether anonymous shmem
>>> actually supports large folios.
>>
>> Well, the mapping does support large folios, just the folio allocations
>> are currently disabled.
>>
>> It feels cleaner to say "there might be large folios in this mapping"
>> than saying "there are no large folios in the mapping as the mapping
>> does not support it", no?
> 
> Yes, that makes sense.
> 
> However, it’s also possible that the mapping does not support large
> folios, yet anonymous shmem can still allocate large folios via the
> sysfs interfaces. That doesn't make sense, right?

That's what I am saying: if there could be large folios in there, then
let's tell the world.

Getting in a scenario where the mapping claims to not support large
folios, but then we have large folios in there is inconsistent, not?

[...]

>> What if we say:
>>
>> shmem that *will never have*/*does never allow* large folios never sets
>> mapping_set_large_folios().
>>
>> shmem that *might* have large folios (in the past, now, or in the
>> future) sets mapping_set_large_folios().
> 
> For the current anonymous shmem (tmpfs is already clear, no questions),
> I don’t think there will be any "will never have/does never allow"
> cases, because it can be changed dynamically via the sysfs interfaces.

Right. It's about non-anon shmem with huge=off.

> 
> If we still want that logic, then for anonymous shmem we can treat it as
> always "might have large folios".

Exactly.

-- 
Cheers,

David



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  9:54         ` David Hildenbrand (Arm)
@ 2026-04-15 10:05           ` Baolin Wang
  2026-04-15 14:36             ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-15 10:05 UTC (permalink / raw)
  To: David Hildenbrand (Arm), akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel



On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>> As I mentioned, the original logic has several issues for anonymous
>>>> shmem:
>>>>
>>>> 1. Whether anonymous shmem supports large folios can be dynamically
>>>> configured via sysfs interfaces, so mapping_set_large_folios() set
>>>> during initialization cannot accurately reflect whether anonymous shmem
>>>> actually supports large folios.
>>>
>>> Well, the mapping does support large folios, just the folio allocations
>>> are currently disabled.
>>>
>>> It feels cleaner to say "there might be large folios in this mapping"
>>> than saying "there are no large folios in the mapping as the mapping
>>> does not support it", no?
>>
>> Yes, that makes sense.
>>
>> However, it’s also possible that the mapping does not support large
>> folios, yet anonymous shmem can still allocate large folios via the
>> sysfs interfaces. That doesn't make sense, right?
> 
> That's what I am saying: if there could be large folios in there, then
> let's tell the world.
> 
> Getting in a scenario where the mapping claims to not support large
> folios, but then we have large folios in there is inconsistent, not?
> 
> [...]
> 
>>> What if we say:
>>>
>>> shmem that *will never have*/*does never allow* large folios never sets
>>> mapping_set_large_folios().
>>>
>>> shmem that *might* have large folios (in the past, now, or in the
>>> future) sets mapping_set_large_folios().
>>
>> For the current anonymous shmem (tmpfs is already clear, no questions),
>> I don’t think there will be any "will never have/does never allow"
>> cases, because it can be changed dynamically via the sysfs interfaces.
> 
> Right. It's about non-anon shmem with huge=off.
> 
>>
>> If we still want that logic, then for anonymous shmem we can treat it as
>> always "might have large folios".

OK. To resolve the confusion about 1, the logic should be changed as 
follows. Does that make sense to you?

if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
	mapping_set_large_folios(inode->i_mapping);
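As a sanity check of that revised condition, here is a toy model (the SB_KERNMOUNT bit value and the function name are stand-ins, not the real kernel code):

```c
#include <assert.h>
#include <stdbool.h>

#define SB_KERNMOUNT (1UL << 0)   /* stand-in bit, not the real value */

/*
 * Revised proposal: internal shmem mounts always advertise large folio
 * support (which orders to allocate is decided later via the sysfs
 * knobs); tmpfs mounts advertise it only when a 'huge=' policy is set.
 */
static bool sets_large_folios_revised(int sbinfo_huge, unsigned long s_flags)
{
	return sbinfo_huge || (s_flags & SB_KERNMOUNT);
}
```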



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15  8:22 [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount Baolin Wang
  2026-04-15  8:47 ` David Hildenbrand (Arm)
@ 2026-04-15 13:45 ` Matthew Wilcox
  2026-04-16  1:02   ` Baolin Wang
  1 sibling, 1 reply; 17+ messages in thread
From: Matthew Wilcox @ 2026-04-15 13:45 UTC (permalink / raw)
  To: Baolin Wang
  Cc: akpm, hughd, ziy, david, ljs, lance.yang, linux-mm, linux-kernel

On Wed, Apr 15, 2026 at 04:22:53PM +0800, Baolin Wang wrote:
> +	/*
> +	 * Only set the large order range for tmpfs mounts. The large order
> +	 * selection for the internal shmem mount is configured dynamically
> +	 * via the 'shmem_enabled' interfaces, so there is no need to set a
> +	 * large order range for the internal shmem mount's mapping.
> +	 *
> +	 * Note: Don't consider 'deny' for emergencies and 'force' for
> +	 * testing.
> +	 */
> +	if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>  		mapping_set_large_folios(inode->i_mapping);

This isn't how mapping_set_large_folios() is supposed to be used.
It's supposed to indicate "does the filesystem support large folios".
shmem should be setting it unconditionally and if there needs to be some
other way to prevent large folios from being created, we should do that
instead.

The current code is wrong too.



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15 10:05           ` Baolin Wang
@ 2026-04-15 14:36             ` David Hildenbrand (Arm)
  2026-04-16  1:05               ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-15 14:36 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel

On 4/15/26 12:05, Baolin Wang wrote:
> 
> 
> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>
>>> Yes, that makes sense.
>>>
>>> However, it’s also possible that the mapping does not support large
>>> folios, yet anonymous shmem can still allocate large folios via the
>>> sysfs interfaces. That doesn't make sense, right?
>>
>> That's what I am saying: if there could be large folios in there, then
>> let's tell the world.
>>
>> Getting in a scenario where the mapping claims to not support large
>> folios, but then we have large folios in there is inconsistent, not?
>>
>> [...]
>>
>>>
>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>> I don’t think there will be any "will never have/does never allow"
>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>
>> Right. It's about non-anon shmem with huge=off.
>>
>>>
>>> If we still want that logic, then for anonymous shmem we can treat it as
>>> always "might have large folios".
> 
> OK. To resolve the confusion about 1, the logic should be changed as
> follows. Does that make sense to you?
> 
> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>     mapping_set_large_folios(inode->i_mapping);

I think that's better. But as Willy says, maybe we can just
unconditionally set it and have it even simpler.

-- 
Cheers,

David



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15 13:45 ` Matthew Wilcox
@ 2026-04-16  1:02   ` Baolin Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Baolin Wang @ 2026-04-16  1:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, hughd, ziy, david, ljs, lance.yang, linux-mm, linux-kernel



On 4/15/26 9:45 PM, Matthew Wilcox wrote:
> On Wed, Apr 15, 2026 at 04:22:53PM +0800, Baolin Wang wrote:
>> +	/*
>> +	 * Only set the large order range for tmpfs mounts. The large order
>> +	 * selection for the internal shmem mount is configured dynamically
>> +	 * via the 'shmem_enabled' interfaces, so there is no need to set a
>> +	 * large order range for the internal shmem mount's mapping.
>> +	 *
>> +	 * Note: Don't consider 'deny' for emergencies and 'force' for
>> +	 * testing.
>> +	 */
>> +	if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>>   		mapping_set_large_folios(inode->i_mapping);
> 
> This isn't how mapping_set_large_folios() is supposed to be used.
> It's supposed to indicate "does the filesystem support large folios".
> shmem should be setting it unconditionally and if there needs to be some
> other way to prevent large folios from being created, we should do that
> instead.

As discussed with David, we’ve agreed that for anonymous shmem we should 
set mapping_set_large_folios() unconditionally.

However, for tmpfs mounts, we should still respect the 'huge=' mount 
option. This was a previous performance fix, see commit 5a90c155defa 
("tmpfs: don't enable large folios if not supported").



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-15 14:36             ` David Hildenbrand (Arm)
@ 2026-04-16  1:05               ` Baolin Wang
  2026-04-16  1:11                 ` Zi Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-16  1:05 UTC (permalink / raw)
  To: David Hildenbrand (Arm), akpm, hughd
  Cc: willy, ziy, ljs, lance.yang, linux-mm, linux-kernel



On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
> On 4/15/26 12:05, Baolin Wang wrote:
>>
>>
>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>
>>>> Yes, that makes sense.
>>>>
>>>> However, it’s also possible that the mapping does not support large
>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>> sysfs interfaces. That doesn't make sense, right?
>>>
>>> That's what I am saying: if there could be large folios in there, then
>>> let's tell the world.
>>>
>>> Getting in a scenario where the mapping claims to not support large
>>> folios, but then we have large folios in there is inconsistent, not?
>>>
>>> [...]
>>>
>>>>
>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>> I don’t think there will be any "will never have/does never allow"
>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>
>>> Right. It's about non-anon shmem with huge=off.
>>>
>>>>
>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>> always "might have large folios".
>>
>> OK. To resolve the confusion about 1, the logic should be changed as
>> follows. Does that make sense to you?
>>
>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>      mapping_set_large_folios(inode->i_mapping);
> 
> I think that's better.

Thanks for your valuable input.

> But as Willy says, maybe we can just
> unconditionally set it and have it even simpler.

However, for tmpfs mounts, we should still respect the 'huge=' mount 
option. See commit 5a90c155defa ("tmpfs: don't enable large folios if 
not supported").



* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:05               ` Baolin Wang
@ 2026-04-16  1:11                 ` Zi Yan
  2026-04-16  1:22                   ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Zi Yan @ 2026-04-16  1:11 UTC (permalink / raw)
  To: Baolin Wang
  Cc: David Hildenbrand (Arm),
	akpm, hughd, willy, ljs, lance.yang, linux-mm, linux-kernel

On 15 Apr 2026, at 21:05, Baolin Wang wrote:

> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>> On 4/15/26 12:05, Baolin Wang wrote:
>>>
>>>
>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>
>>>>> Yes, that makes sense.
>>>>>
>>>>> However, it’s also possible that the mapping does not support large
>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>
>>>> That's what I am saying: if there could be large folios in there, then
>>>> let's tell the world.
>>>>
>>>> Getting in a scenario where the mapping claims to not support large
>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>
>>>> [...]
>>>>
>>>>>
>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>> I don’t think there will be any "will never have/does never allow"
>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>
>>>> Right. It's about non-anon shmem with huge=off.
>>>>
>>>>>
>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>> always "might have large folios".
>>>
>>> OK. To resolve the confusion about 1, the logic should be changed as
>>> follows. Does that make sense to you?
>>>
>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>      mapping_set_large_folios(inode->i_mapping);
>>
>> I think that's better.
>
> Thanks for your valuable input.
>
> But as Willy says, maybe we can just
>> unconditionally set it and have it even simpler.
>
> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").

Is it possible to check sbinfo->huge at tmpfs's folio allocation time, so that
even if every tmpfs mapping has mapping_set_large_folios() set, sbinfo->huge
can still decide whether a huge page gets allocated for a tmpfs file?

Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:11                 ` Zi Yan
@ 2026-04-16  1:22                   ` Baolin Wang
  2026-04-16  1:36                     ` Zi Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-16  1:22 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand (Arm),
	akpm, hughd, willy, ljs, lance.yang, linux-mm, linux-kernel



On 4/16/26 9:11 AM, Zi Yan wrote:
> On 15 Apr 2026, at 21:05, Baolin Wang wrote:
> 
>> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>>> On 4/15/26 12:05, Baolin Wang wrote:
>>>>
>>>>
>>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>>
>>>>>> Yes, that makes sense.
>>>>>>
>>>>>> However, it’s also possible that the mapping does not support large
>>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>>
>>>>> That's what I am saying: if there could be large folios in there, then
>>>>> let's tell the world.
>>>>>
>>>>> Getting in a scenario where the mapping claims to not support large
>>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>>
>>>>> [...]
>>>>>
>>>>>>
>>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>>> I don’t think there will be any "will never have/does never allow"
>>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>>
>>>>> Right. It's about non-anon shmem with huge=off.
>>>>>
>>>>>>
>>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>>> always "might have large folios".
>>>>
>>>> OK. To resolve the confusion about 1, the logic should be changed as
>>>> follows. Does that make sense to you?
>>>>
>>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>>       mapping_set_large_folios(inode->i_mapping);
>>>
>>> I think that's better.
>>
>> Thanks for your valuable input.
>>
>> But as Willy says, maybe we can just
>>> unconditionally set it and have it even simpler.
>>
>> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").
> 
> Is it possible to get sbinfo->huge during tmpfs’s folio allocation time, so that
> even if all tmpfs has mapping_set_large_folios() but sbinfo->huge can still
> decide whether huge page will be allocated for a tmpfs?

Yes, of course. However, the issue isn’t whether tmpfs allows allocating 
large folios.

The problem commit 5a90c155defa tries to fix is that when tmpfs is 
mounted with the 'huge=never' option, we will not allocate large folios 
for it. But when writing tmpfs files, generic_perform_write() calls 
mapping_max_folio_size() to get the copy chunk size and ends up with an 
order-9 chunk, while the file is populated only with small folios, 
resulting in a performance regression.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:22                   ` Baolin Wang
@ 2026-04-16  1:36                     ` Zi Yan
  2026-04-16  1:45                       ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Zi Yan @ 2026-04-16  1:36 UTC (permalink / raw)
  To: Baolin Wang, David Hildenbrand (Arm), willy
  Cc: akpm, hughd, ljs, lance.yang, linux-mm, linux-kernel

On 15 Apr 2026, at 21:22, Baolin Wang wrote:

> On 4/16/26 9:11 AM, Zi Yan wrote:
>> On 15 Apr 2026, at 21:05, Baolin Wang wrote:
>>
>>> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>>>> On 4/15/26 12:05, Baolin Wang wrote:
>>>>>
>>>>>
>>>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>>>
>>>>>>> Yes, that makes sense.
>>>>>>>
>>>>>>> However, it’s also possible that the mapping does not support large
>>>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>>>
>>>>>> That's what I am saying: if there could be large folios in there, then
>>>>>> let's tell the world.
>>>>>>
>>>>>> Getting in a scenario where the mapping claims to not support large
>>>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>
>>>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>>>> I don’t think there will be any "will never have/does never allow"
>>>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>>>
>>>>>> Right. It's about non-anon shmem with huge=off.
>>>>>>
>>>>>>>
>>>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>>>> always "might have large folios".
>>>>>
>>>>> OK. To resolve the confusion about 1, the logic should be changed as
>>>>> follows. Does that make sense to you?
>>>>>
>>>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>>>       mapping_set_large_folios(inode->i_mapping);
>>>>
>>>> I think that's better.
>>>
>>> Thanks for your valuable input.
>>>
>>> But as Willy says, maybe we can just
>>>> unconditionally set it and have it even simpler.
>>>
>>> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").
>>
>> Is it possible to get sbinfo->huge during tmpfs’s folio allocation time, so that
>> even if all tmpfs has mapping_set_large_folios() but sbinfo->huge can still
>> decide whether huge page will be allocated for a tmpfs?
>
> Yes, of course. However, the issue isn’t whether tmpfs allows allocating large folios.
>
> The problem commit 5a90c155defa tries to fix is that when tmpfs is mounted with the 'huge=never' option, we will not allocate large folios for it. Then when writing tmpfs files, generic_perform_write() will call mapping_max_folio_size() to get the chunk size and ends up with an order-9 size for writing tmpfs files. However, this tmpfs file is populated only with small folios, resulting in a performance regression.

IIUC, generic_perform_write() needs to use a small chunk if tmpfs denies huge.
It seems that Kefeng did that in the first try[1]. But willy suggested
the current fix.

I wonder if we should revisit Kefeng’s first version.

[1] https://lore.kernel.org/all/20240914140613.2334139-1-wangkefeng.wang@huawei.com/

Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:36                     ` Zi Yan
@ 2026-04-16  1:45                       ` Baolin Wang
  2026-04-16  1:52                         ` Zi Yan
  0 siblings, 1 reply; 17+ messages in thread
From: Baolin Wang @ 2026-04-16  1:45 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand (Arm), willy
  Cc: akpm, hughd, ljs, lance.yang, linux-mm, linux-kernel



On 4/16/26 9:36 AM, Zi Yan wrote:
> On 15 Apr 2026, at 21:22, Baolin Wang wrote:
> 
>> On 4/16/26 9:11 AM, Zi Yan wrote:
>>> On 15 Apr 2026, at 21:05, Baolin Wang wrote:
>>>
>>>> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>>>>> On 4/15/26 12:05, Baolin Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>>>>
>>>>>>>> Yes, that makes sense.
>>>>>>>>
>>>>>>>> However, it’s also possible that the mapping does not support large
>>>>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>>>>
>>>>>>> That's what I am saying: if there could be large folios in there, then
>>>>>>> let's tell the world.
>>>>>>>
>>>>>>> Getting in a scenario where the mapping claims to not support large
>>>>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>
>>>>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>>>>> I don’t think there will be any "will never have/does never allow"
>>>>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>>>>
>>>>>>> Right. It's about non-anon shmem with huge=off.
>>>>>>>
>>>>>>>>
>>>>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>>>>> always "might have large folios".
>>>>>>
>>>>>> OK. To resolve the confusion about 1, the logic should be changed as
>>>>>> follows. Does that make sense to you?
>>>>>>
>>>>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>>>>        mapping_set_large_folios(inode->i_mapping);
>>>>>
>>>>> I think that's better.
>>>>
>>>> Thanks for your valuable input.
>>>>
>>>> But as Willy says, maybe we can just
>>>>> unconditionally set it and have it even simpler.
>>>>
>>>> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").
>>>
>>> Is it possible to get sbinfo->huge during tmpfs’s folio allocation time, so that
>>> even if all tmpfs has mapping_set_large_folios() but sbinfo->huge can still
>>> decide whether huge page will be allocated for a tmpfs?
>>
>> Yes, of course. However, the issue isn’t whether tmpfs allows allocating large folios.
>>
>> The problem commit 5a90c155defa tries to fix is that when tmpfs is mounted with the 'huge=never' option, we will not allocate large folios for it. Then when writing tmpfs files, generic_perform_write() will call mapping_max_folio_size() to get the chunk size and ends up with an order-9 size for writing tmpfs files. However, this tmpfs file is populated only with small folios, resulting in a performance regression.
> 
> IIUC, generic_perform_write() needs to use a small chunk if tmpfs denies huge.
> It seems that Kefeng did that in the first try[1]. But willy suggested
> the current fix.
> 
> I wonder if we should revisit Kefeng’s first version.
> 
> [1] https://lore.kernel.org/all/20240914140613.2334139-1-wangkefeng.wang@huawei.com/

Personally, I still prefer the current fix (commit 5a90c155defa). We 
should honor the tmpfs mount option. If it explicitly says no large 
folios, we shouldn’t call mapping_set_large_folios(). Isn’t that more 
consistent with its semantics?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:45                       ` Baolin Wang
@ 2026-04-16  1:52                         ` Zi Yan
  2026-04-16  2:08                           ` Baolin Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Zi Yan @ 2026-04-16  1:52 UTC (permalink / raw)
  To: Baolin Wang
  Cc: David Hildenbrand (Arm),
	willy, akpm, hughd, ljs, lance.yang, linux-mm, linux-kernel

On 15 Apr 2026, at 21:45, Baolin Wang wrote:

> On 4/16/26 9:36 AM, Zi Yan wrote:
>> On 15 Apr 2026, at 21:22, Baolin Wang wrote:
>>
>>> On 4/16/26 9:11 AM, Zi Yan wrote:
>>>> On 15 Apr 2026, at 21:05, Baolin Wang wrote:
>>>>
>>>>> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>>>>>> On 4/15/26 12:05, Baolin Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>>>>>
>>>>>>>>> Yes, that makes sense.
>>>>>>>>>
>>>>>>>>> However, it’s also possible that the mapping does not support large
>>>>>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>>>>>
>>>>>>>> That's what I am saying: if there could be large folios in there, then
>>>>>>>> let's tell the world.
>>>>>>>>
>>>>>>>> Getting in a scenario where the mapping claims to not support large
>>>>>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>>
>>>>>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>>>>>> I don’t think there will be any "will never have/does never allow"
>>>>>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>>>>>
>>>>>>>> Right. It's about non-anon shmem with huge=off.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>>>>>> always "might have large folios".
>>>>>>>
>>>>>>> OK. To resolve the confusion about 1, the logic should be changed as
>>>>>>> follows. Does that make sense to you?
>>>>>>>
>>>>>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>>>>>        mapping_set_large_folios(inode->i_mapping);
>>>>>>
>>>>>> I think that's better.
>>>>>
>>>>> Thanks for your valuable input.
>>>>>
>>>>> But as Willy says, maybe we can just
>>>>>> unconditionally set it and have it even simpler.
>>>>>
>>>>> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").
>>>>
>>>> Is it possible to get sbinfo->huge during tmpfs’s folio allocation time, so that
>>>> even if all tmpfs has mapping_set_large_folios() but sbinfo->huge can still
>>>> decide whether huge page will be allocated for a tmpfs?
>>>
>>> Yes, of course. However, the issue isn’t whether tmpfs allows allocating large folios.
>>>
>>> The problem commit 5a90c155defa tries to fix is that when tmpfs is mounted with the 'huge=never' option, we will not allocate large folios for it. Then when writing tmpfs files, generic_perform_write() will call mapping_max_folio_size() to get the chunk size and ends up with an order-9 size for writing tmpfs files. However, this tmpfs file is populated only with small folios, resulting in a performance regression.
>>
>> IIUC, generic_perform_write() needs to use a small chunk if tmpfs denies huge.
>> It seems that Kefeng did that in the first try[1]. But willy suggested
>> the current fix.
>>
>> I wonder if we should revisit Kefeng’s first version.
>>
>> [1] https://lore.kernel.org/all/20240914140613.2334139-1-wangkefeng.wang@huawei.com/
>
> Personally, I still prefer the current fix (commit 5a90c155defa). We should honor the tmpfs mount option. If it explicitly says no large folios, we shouldn’t call mapping_set_large_folios(). Isn’t that more consistent with its semantics?

Filesystems wishing to turn on large folios in the pagecache should call
``mapping_set_large_folios`` when initializing the incore inode.

You mean tmpfs with the huge option set is an FS wishing to turn on large
folios in the pagecache; otherwise it is an FS wishing not to have large
folios in the pagecache. tmpfs with different options is seen as different
FSes.

Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
  2026-04-16  1:52                         ` Zi Yan
@ 2026-04-16  2:08                           ` Baolin Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Baolin Wang @ 2026-04-16  2:08 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand (Arm),
	willy, akpm, hughd, ljs, lance.yang, linux-mm, linux-kernel



On 4/16/26 9:52 AM, Zi Yan wrote:
> On 15 Apr 2026, at 21:45, Baolin Wang wrote:
> 
>> On 4/16/26 9:36 AM, Zi Yan wrote:
>>> On 15 Apr 2026, at 21:22, Baolin Wang wrote:
>>>
>>>> On 4/16/26 9:11 AM, Zi Yan wrote:
>>>>> On 15 Apr 2026, at 21:05, Baolin Wang wrote:
>>>>>
>>>>>> On 4/15/26 10:36 PM, David Hildenbrand (Arm) wrote:
>>>>>>> On 4/15/26 12:05, Baolin Wang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/15/26 5:54 PM, David Hildenbrand (Arm) wrote:
>>>>>>>>>>
>>>>>>>>>> Yes, that makes sense.
>>>>>>>>>>
>>>>>>>>>> However, it’s also possible that the mapping does not support large
>>>>>>>>>> folios, yet anonymous shmem can still allocate large folios via the
>>>>>>>>>> sysfs interfaces. That doesn't make sense, right?
>>>>>>>>>
>>>>>>>>> That's what I am saying: if there could be large folios in there, then
>>>>>>>>> let's tell the world.
>>>>>>>>>
>>>>>>>>> Getting in a scenario where the mapping claims to not support large
>>>>>>>>> folios, but then we have large folios in there is inconsistent, not?
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For the current anonymous shmem (tmpfs is already clear, no questions),
>>>>>>>>>> I don’t think there will be any "will never have/does never allow"
>>>>>>>>>> cases, because it can be changed dynamically via the sysfs interfaces.
>>>>>>>>>
>>>>>>>>> Right. It's about non-anon shmem with huge=off.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If we still want that logic, then for anonymous shmem we can treat it as
>>>>>>>>>> always "might have large folios".
>>>>>>>>
>>>>>>>> OK. To resolve the confusion about 1, the logic should be changed as
>>>>>>>> follows. Does that make sense to you?
>>>>>>>>
>>>>>>>> if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
>>>>>>>>         mapping_set_large_folios(inode->i_mapping);
>>>>>>>
>>>>>>> I think that's better.
>>>>>>
>>>>>> Thanks for your valuable input.
>>>>>>
>>>>>> But as Willy says, maybe we can just
>>>>>>> unconditionally set it and have it even simpler.
>>>>>>
>>>>>> However, for tmpfs mounts, we should still respect the 'huge=' mount option. See commit 5a90c155defa ("tmpfs: don't enable large folios if not supported").
>>>>>
>>>>> Is it possible to get sbinfo->huge during tmpfs’s folio allocation time, so that
>>>>> even if all tmpfs has mapping_set_large_folios() but sbinfo->huge can still
>>>>> decide whether huge page will be allocated for a tmpfs?
>>>>
>>>> Yes, of course. However, the issue isn’t whether tmpfs allows allocating large folios.
>>>>
>>>> The problem commit 5a90c155defa tries to fix is that when tmpfs is mounted with the 'huge=never' option, we will not allocate large folios for it. Then when writing tmpfs files, generic_perform_write() will call mapping_max_folio_size() to get the chunk size and ends up with an order-9 size for writing tmpfs files. However, this tmpfs file is populated only with small folios, resulting in a performance regression.
>>>
>>> IIUC, generic_perform_write() needs to use a small chunk if tmpfs denies huge.
>>> It seems that Kefeng did that in the first try[1]. But willy suggested
>>> the current fix.
>>>
>>> I wonder if we should revisit Kefeng’s first version.
>>>
>>> [1] https://lore.kernel.org/all/20240914140613.2334139-1-wangkefeng.wang@huawei.com/
>>
>> Personally, I still prefer the current fix (commit 5a90c155defa). We should honor the tmpfs mount option. If it explicitly says no large folios, we shouldn’t call mapping_set_large_folios(). Isn’t that more consistent with its semantics?
> 
> Filesystems wishing to turn on large folios in the pagecache should call
> ``mapping_set_large_folios`` when initializing the incore inode.
> 
> You mean tmpfs with huge option set is a FS wishing to turn on large
> folios in the pagecache, otherwise it is a FS wishing not to have large folio
> in the pagecache. tmpfs with different options is seen as different FSes.

What I mean is that tmpfs is somewhat different from other filesystems. 
We have tried to make tmpfs behave like other FSes, but differences 
remain. For example, the previous fix to tmpfs’s large folio allocation 
policy, see commit 69e0a3b49003 ("mm: shmem: fix the strategy for the 
tmpfs 'huge=' options").

So the tmpfs-specific 'huge=' mount option is another way in which it 
differs from other filesystems.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2026-04-16  2:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-15  8:22 [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount Baolin Wang
2026-04-15  8:47 ` David Hildenbrand (Arm)
2026-04-15  9:04   ` Baolin Wang
2026-04-15  9:19     ` David Hildenbrand (Arm)
2026-04-15  9:41       ` Baolin Wang
2026-04-15  9:54         ` David Hildenbrand (Arm)
2026-04-15 10:05           ` Baolin Wang
2026-04-15 14:36             ` David Hildenbrand (Arm)
2026-04-16  1:05               ` Baolin Wang
2026-04-16  1:11                 ` Zi Yan
2026-04-16  1:22                   ` Baolin Wang
2026-04-16  1:36                     ` Zi Yan
2026-04-16  1:45                       ` Baolin Wang
2026-04-16  1:52                         ` Zi Yan
2026-04-16  2:08                           ` Baolin Wang
2026-04-15 13:45 ` Matthew Wilcox
2026-04-16  1:02   ` Baolin Wang
