From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
	akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, ziy@nvidia.com, ljs@kernel.org,
	lance.yang@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: shmem: don't set large-order range for internal shmem mount
Date: Wed, 15 Apr 2026 17:41:20 +0800	[thread overview]
Message-ID: <2d138a3f-0006-4a01-852a-4570d7ba781d@linux.alibaba.com> (raw)
In-Reply-To: <c88ea567-6b03-4540-8173-7fceeac23f7d@kernel.org>



On 4/15/26 5:19 PM, David Hildenbrand (Arm) wrote:
> On 4/15/26 11:04, Baolin Wang wrote:
>>
>>
>> On 4/15/26 4:47 PM, David Hildenbrand (Arm) wrote:
>>> On 4/15/26 10:22, Baolin Wang wrote:
>>>> Anonymous shmem large order allocations are dynamically controlled via
>>>> the global THP sysfs knob
>>>> (/sys/kernel/mm/transparent_hugepage/shmem_enabled) and the per-size
>>>> mTHP knobs
>>>> (/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled).
>>>>
>>>> Therefore, anonymous shmem uses shmem_allowable_huge_orders() to check
>>>> which large orders are allowed, rather than relying on
>>>> mapping_max_folio_order().
>>>> Moreover, mapping_max_folio_order() is intended to control large order
>>>> allocations only for tmpfs mounts. Clarify this by not setting a
>>>> large-order range for the internal shmem mount (e.g. anonymous shmem),
>>>> to avoid confusion, as discussed in the previous thread[1].
>>>>
>>>> [1] https://lore.kernel.org/all/ec927492-4577-4192-8fad-85eb1bb43121@linux.alibaba.com/
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---
>>>> Changes from v1:
>>>>    - Update the comments and commit message, per Lance.
>>>> ---
>>>>    mm/shmem.c | 12 ++++++++++--
>>>>    1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>> index 4ecefe02881d..568e1baee90d 100644
>>>> --- a/mm/shmem.c
>>>> +++ b/mm/shmem.c
>>>> @@ -3088,8 +3088,16 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>>>>        if (sbinfo->noswap)
>>>>            mapping_set_unevictable(inode->i_mapping);
>>>>
>>>> -    /* Don't consider 'deny' for emergencies and 'force' for testing */
>>>> -    if (sbinfo->huge)
>>>> +    /*
>>>> +     * Only set the large order range for tmpfs mounts. The large order
>>>> +     * selection for the internal shmem mount is configured dynamically
>>>> +     * via the 'shmem_enabled' interfaces, so there is no need to set a
>>>> +     * large order range for the internal shmem mount's mapping.
>>>> +     *
>>>> +     * Note: Don't consider 'deny' for emergencies and 'force' for
>>>> +     * testing.
>>>> +     */
>>>> +    if (sbinfo->huge && !(sb->s_flags & SB_KERNMOUNT))
>>>>            mapping_set_large_folios(inode->i_mapping);
>>>
>>> I don't like that special casing. In an ideal world, any mapping that
>>> supports large folios would indicate that.
>>>
>>> Now, which large folios to allocate is a different question.
>>>
>>> What's the problem with indicating that support for all shmem mappings
>>> that support large folios, but handling *which* folio sizes to allocate
>>> elsewhere?
>>
>> Thanks for taking a look.
> 
> Sorry for the late feedback.

No worries :)

> 
>>
>> As I mentioned, the original logic has several issues for anonymous shmem:
>>
>> 1. Whether anonymous shmem supports large folios can be dynamically
>> configured via sysfs interfaces, so the flag set by
>> mapping_set_large_folios() during initialization cannot accurately
>> reflect whether anonymous shmem actually supports large folios.
> 
> Well, the mapping does support large folios, just the folio allocations
> are currently disabled.
> 
> It feels cleaner to say "there might be large folios in this mapping"
> than saying "there are no large folios in the mapping as the mapping
> does not support it", no?

Yes, that makes sense.

However, the reverse is also possible: the mapping is not marked as 
supporting large folios, yet anonymous shmem can still allocate large 
folios via the sysfs interfaces. That doesn't make sense, right?
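
To make the mismatch concrete, here is a rough sketch (the helper
anon_shmem_may_alloc_large() is made up for illustration, and the
shmem_allowable_huge_orders() signature is simplified):

    /* Order selection is driven purely by the shmem_enabled sysfs knobs */
    static bool anon_shmem_may_alloc_large(struct inode *inode,
                                           struct vm_area_struct *vma,
                                           pgoff_t index)
    {
            unsigned long orders = shmem_allowable_huge_orders(inode, vma,
                                                               index, 0, false);

            /*
             * The per-mapping flag set (or not) at inode creation is never
             * consulted here, so the two can disagree for anonymous shmem.
             */
            return orders != 0;
    }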


>> 2. Calling mapping_set_large_folios() here makes anonymous shmem
>> support orders up to MAX_PAGECACHE_ORDER by default. However, the
>> range of large orders supported by anonymous shmem is also dynamically
>> configurable via sysfs interfaces, which could cause more confusion.
> 
> Fair enough. The mapping supports it, we just don't want to allocate
> some orders (right now).

OK. Makes sense.
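
(To spell out "dynamically configurable": each per-size knob effectively
contributes one bit to an order mask. A rough sketch, where
shmem_enabled_for_order() is a made-up helper standing in for the real
per-order sysfs check:)

    unsigned long allowed = 0;
    int order;

    /* Each hugepages-<size>kB/shmem_enabled knob toggles one order bit */
    for (order = 1; order <= HPAGE_PMD_ORDER; order++) {
            if (shmem_enabled_for_order(order))
                    allowed |= BIT(order);
    }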

>> 3. Currently, no users call mapping_large_folio_support() or related
>> functions to determine whether large folios are supported for
>> anonymous shmem.
> 
> Right, we special-case shmem all over the place :) For example, in
> khugepaged. I wonder if that could help with Zi's changes to get rid of
> some shmem checks.

Sure. I'm also reviewing Zi's series.

> What if we say:
> 
> shmem that *will never have*/*does never allow* large folios never sets
> mapping_set_large_folios().
> 
> shmem that *might* have large folios (in the past, now, or in the
> future) sets mapping_set_large_folios().

For the current anonymous shmem (tmpfs is already clear, no questions 
there), I don't think there will be any "will never have/does never 
allow" cases, because the policy can be changed dynamically via the 
sysfs interfaces.

If we still want that logic, then for anonymous shmem we can treat it 
as always "might have large folios".
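
At the call site in __shmem_get_inode(), that alternative would look
roughly like this (a hypothetical sketch, not what this patch does):

    /*
     * Hypothetical alternative: mark the internal mount as "might have
     * large folios" unconditionally and leave order selection to the
     * sysfs-driven policy.
     */
    if (sbinfo->huge || (sb->s_flags & SB_KERNMOUNT))
            mapping_set_large_folios(inode->i_mapping);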

>> Therefore, rather than having anonymous shmem call
>> mapping_set_large_folios() and introduce so much confusion, I'd prefer
>> to exclude anonymous shmem from calling mapping_set_large_folios().
> 
> I think it's more confusing to end up with large folios in a mapping
> that claims to not support large folios?

As for point 1, it still doesn't make sense to me.


Thread overview: 9+ messages
2026-04-15  8:22 Baolin Wang
2026-04-15  8:47 ` David Hildenbrand (Arm)
2026-04-15  9:04   ` Baolin Wang
2026-04-15  9:19     ` David Hildenbrand (Arm)
2026-04-15  9:41       ` Baolin Wang [this message]
2026-04-15  9:54         ` David Hildenbrand (Arm)
2026-04-15 10:05           ` Baolin Wang
2026-04-15 14:36             ` David Hildenbrand (Arm)
2026-04-15 13:45 ` Matthew Wilcox
