From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>,
Matthew Wilcox <willy@infradead.org>,
"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Anna Schumaker <Anna.Schumaker@netapp.com>,
<linux-fsdevel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH v2] tmpfs: fault in smaller chunks if large folio allocation not allowed
Date: Wed, 9 Oct 2024 15:09:50 +0800
Message-ID: <7d76fe98-4f7f-4f3d-9e8e-79d836f945cb@huawei.com>
In-Reply-To: <72170ff2-f23d-4246-abe8-15270ad1bb39@linux.alibaba.com>
On 2024/9/30 14:48, Baolin Wang wrote:
>
>
> On 2024/9/30 11:15, Kefeng Wang wrote:
>>
>>
>> On 2024/9/30 10:52, Baolin Wang wrote:
>>>
>>>
>>> On 2024/9/30 10:30, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/9/30 10:02, Baolin Wang wrote:
>>>>>
>>>>>
>>>>> On 2024/9/26 21:52, Matthew Wilcox wrote:
>>>>>> On Thu, Sep 26, 2024 at 10:38:34AM +0200, Pankaj Raghav (Samsung) wrote:
>>>>>>>> So this is why I don't use mapping_set_folio_order_range() here, but
>>>>>>>> correct me if I am wrong.
>>>>>>>
>>>>>>> Yeah, the inode is active here as the max folio size is decided based on
>>>>>>> the write size, so probably mapping_set_folio_order_range() will not be
>>>>>>> a safe option.
>>>>>>
>>>>>> You really are all making too much of this. Here's the patch I think we
>>>>>> need:
>>>>>>
>>>>>> +++ b/mm/shmem.c
>>>>>> @@ -2831,7 +2831,8 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>>>>>>  	cache_no_acl(inode);
>>>>>>  	if (sbinfo->noswap)
>>>>>>  		mapping_set_unevictable(inode->i_mapping);
>>>>>> -	mapping_set_large_folios(inode->i_mapping);
>>>>>> +	if (sbinfo->huge)
>>>>>> +		mapping_set_large_folios(inode->i_mapping);
>>>>>>
>>>>>>  	switch (mode & S_IFMT) {
>>>>>>  	default:
>>>>>
>>>>> IMHO, we no longer need the 'sbinfo->huge' validation after
>>>>> adding support for large folios in the tmpfs write and fallocate
>>>>> paths [1].
>>
>> Forgot to mention: we still need to check sbinfo->huge. If we mount with
>> huge=never but fault in large chunks, write is slower than without
>> 9aac777aaf94; the above changes or my patch could fix it.
>
> My patch will allow allocating large folios in the tmpfs write and
> fallocate paths though the 'huge' option is 'never'.
Yes, indeed, I see that after checking your patch.

Here are the "Writing intelligently" results from
'Bonnie -d /mnt/tmpfs/ -s 1024', based on next-20241008:

1) huge=never
   the base:                                 2016438 K/Sec
   my v1/v2 or Matthew's patch:              2874504 K/Sec
   your patch with filemap_get_order() fix:  6330604 K/Sec
2) huge=always
   the write performance:                    7168917 K/Sec

Since large folios are now supported in the tmpfs write path, we do get
the better performance shown above, which is great.
>
> My initial thought for supporting large folio is that, if the 'huge'
> option is enabled, to maintain backward compatibility, we only allow 2M
> PMD-sized order allocations. If the 'huge' option is
> disabled(huge=never), we still allow large folio allocations based on
> the write length.
>
> Another choice is to allow the different sized large folio allocation
> based on the write length when the 'huge' option is enabled, rather than
> just the 2M PMD sized. But will force the huge orders off if 'huge'
> option is disabled.
>
"huge=never Do not allocate huge pages. This is the default."
Going by the documentation, it would be better not to allocate large
folios here, but we would need some special handling for huge=never or
the runtime deny/force settings.
> Still need some discussions to determine which method is preferable.
Personally, I like your current implementation, but it does not match
the documentation.
Thread overview: 23+ messages
2024-09-14 14:06 [PATCH -next] " Kefeng Wang
2024-09-15 10:40 ` Matthew Wilcox
2024-09-18 3:55 ` Kefeng Wang
2024-09-20 14:36 ` [PATCH v2] " Kefeng Wang
2024-09-22 0:35 ` Matthew Wilcox
2024-09-23 1:39 ` Kefeng Wang
2024-09-26 8:38 ` Pankaj Raghav (Samsung)
2024-09-26 13:52 ` Matthew Wilcox
2024-09-26 14:20 ` Kefeng Wang
2024-09-26 14:58 ` Matthew Wilcox
2024-09-30 1:27 ` Kefeng Wang
2024-09-30 2:02 ` Baolin Wang
2024-09-30 2:30 ` Kefeng Wang
2024-09-30 2:52 ` Baolin Wang
2024-09-30 3:15 ` Kefeng Wang
2024-09-30 6:48 ` Baolin Wang
2024-10-09 7:09 ` Kefeng Wang [this message]
2024-10-09 8:52 ` Baolin Wang
2024-10-11 6:59 ` [PATCH v3] tmpfs: don't enable large folios if not supported Kefeng Wang
2024-10-12 3:59 ` Baolin Wang
2024-10-14 2:36 ` Kefeng Wang
2024-10-17 14:17 ` [PATCH v4] " Kefeng Wang
2024-10-18 1:48 ` Baolin Wang