linux-mm.kvack.org archive mirror
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Anna Schumaker <Anna.Schumaker@netapp.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] tmpfs: fault in smaller chunks if large folio allocation not allowed
Date: Wed, 9 Oct 2024 16:52:48 +0800
Message-ID: <796d33c3-f97d-41ad-9ba7-99ade5dcfcee@linux.alibaba.com>
In-Reply-To: <7d76fe98-4f7f-4f3d-9e8e-79d836f945cb@huawei.com>



On 2024/10/9 15:09, Kefeng Wang wrote:
> 
> 
> On 2024/9/30 14:48, Baolin Wang wrote:
>>
>>
>> On 2024/9/30 11:15, Kefeng Wang wrote:
>>>
>>>
>>> On 2024/9/30 10:52, Baolin Wang wrote:
>>>>
>>>>
>>>> On 2024/9/30 10:30, Kefeng Wang wrote:
>>>>>
>>>>>
>>>>> On 2024/9/30 10:02, Baolin Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 2024/9/26 21:52, Matthew Wilcox wrote:
>>>>>>> On Thu, Sep 26, 2024 at 10:38:34AM +0200, Pankaj Raghav (Samsung) wrote:
>>>>>>>>> So this is why I don't use mapping_set_folio_order_range() here, but
>>>>>>>>> correct me if I am wrong.
>>>>>>>>
>>>>>>>> Yeah, the inode is active here as the max folio size is decided based on
>>>>>>>> the write size, so probably mapping_set_folio_order_range() will not be
>>>>>>>> a safe option.
>>>>>>>
>>>>>>> You really are all making too much of this.  Here's the patch I think we
>>>>>>> need:
>>>>>>>
>>>>>>> +++ b/mm/shmem.c
>>>>>>> @@ -2831,7 +2831,8 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>>>>>>>          cache_no_acl(inode);
>>>>>>>          if (sbinfo->noswap)
>>>>>>>                  mapping_set_unevictable(inode->i_mapping);
>>>>>>> -       mapping_set_large_folios(inode->i_mapping);
>>>>>>> +       if (sbinfo->huge)
>>>>>>> +               mapping_set_large_folios(inode->i_mapping);
>>>>>>>
>>>>>>>          switch (mode & S_IFMT) {
>>>>>>>          default:
>>>>>>
>>>>>> IMHO, we no longer need the 'sbinfo->huge' validation after 
>>>>>> adding support for large folios in the tmpfs write and fallocate 
>>>>>> paths [1].
>>>
>>> Forgot to mention: we still need to check sbinfo->huge. If we mount with
>>> huge=never but still fault in large chunks, writes are slower than without
>>> 9aac777aaf94; the above change or my patch could fix it.
>>
>> My patch will allow allocating large folios in the tmpfs write and 
>> fallocate paths even when the 'huge' option is 'never'.
> 
> Yes, indeed after checking your patch,
> 
> The "Writing intelligently" results from 'Bonnie -d /mnt/tmpfs/ -s 1024', 
> based on next-20241008:
> 
> 1) huge=never
>     the base:                                    2016438 K/Sec
>     my v1/v2 or Matthew's patch :                2874504 K/Sec
>     your patch with filemap_get_order() fix:     6330604 K/Sec
> 
> 2) huge=always
>     the write performance:                       7168917 K/Sec
> 
> Since large folios are now supported in the tmpfs write path, we do get the 
> better performance shown above; that's great.

Great. Thanks for testing.

>> My initial thought for supporting large folios is that, if the 'huge' 
>> option is enabled, to maintain backward compatibility, we only allow 
>> 2M PMD-sized order allocations. If the 'huge' option is 
>> disabled (huge=never), we still allow large folio allocations based on 
>> the write length.
>>
>> Another choice is to allow different-sized large folio allocations 
>> based on the write length when the 'huge' option is enabled, rather 
>> than just the 2M PMD size, but to force the huge orders off if the 
>> 'huge' option is disabled.
>>
> 
> "huge=never  Do not allocate huge pages. This is the default."
> Going by the documentation, it's better not to allocate large folios, but we
> need some special handling for huge=never or runtime deny/force.

Yes. I'm thinking of adding a new option (something like 'huge=mTHP') to 
allocate large folios based on the write size.
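
To make the idea concrete, here is a rough userspace sketch of how the three 
modes could pick an allocation order. It is not kernel code and not the 
patchset itself: the HUGE_MTHP mode, order_for_write() and pick_order() are 
hypothetical names, and it assumes 4K base pages with a 2M PMD (order 9). It 
also ignores index alignment, which a real implementation would have to 
respect.

```c
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */
#define PMD_ORDER   9	/* assumption: 2 MiB PMD-sized folios */

/* HUGE_MTHP is hypothetical; it stands in for the proposed new option. */
enum huge_mode { HUGE_NEVER, HUGE_ALWAYS, HUGE_MTHP };

/*
 * Largest order whose folio size does not exceed the write length,
 * capped at PMD_ORDER. Writes shorter than one page get order 0.
 */
static unsigned int order_for_write(size_t len)
{
	size_t pages = len >> PAGE_SHIFT;
	unsigned int order = 0;

	while (order < PMD_ORDER && (pages >> (order + 1)) != 0)
		order++;
	return order;
}

/* Map the mount mode and write length to an allocation order. */
static unsigned int pick_order(enum huge_mode mode, size_t write_len)
{
	switch (mode) {
	case HUGE_ALWAYS:
		return PMD_ORDER;			/* PMD-sized only */
	case HUGE_MTHP:
		return order_for_write(write_len);	/* sized by the write */
	case HUGE_NEVER:
	default:
		return 0;				/* base pages only */
	}
}
```

With these assumptions a 64K write under the hypothetical mode would get an 
order-4 folio, while huge=never would stay at order 0 regardless of length.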

I will resend the patchset, and we can discuss it there.



