Message-ID: <7d76fe98-4f7f-4f3d-9e8e-79d836f945cb@huawei.com>
Date: Wed, 9 Oct 2024 15:09:50 +0800
From: Kefeng Wang
Subject: Re: [PATCH v2] tmpfs: fault in smaller chunks if large folio allocation not allowed
To: Baolin Wang, Matthew Wilcox, "Pankaj Raghav (Samsung)"
CC: Andrew Morton, Hugh Dickins, Alexander Viro, Christian Brauner, Jan Kara, Anna Schumaker
References: <20240914140613.2334139-1-wangkefeng.wang@huawei.com>
 <20240920143654.1008756-1-wangkefeng.wang@huawei.com>
 <1d4f98aa-f57d-4801-8510-5c44e027c4e4@huawei.com>
 <1e5357de-3356-4ae7-bc69-b50edca3852b@linux.alibaba.com>
 <8c5d01b2-f070-4395-aa72-5ad56d6423e5@huawei.com>
 <314f1320-43fd-45d5-a80c-b8ea90ae4b1b@linux.alibaba.com>
 <2769e603-d35e-4f3e-83cf-509127b1797e@huawei.com>
 <72170ff2-f23d-4246-abe8-15270ad1bb39@linux.alibaba.com>
In-Reply-To: <72170ff2-f23d-4246-abe8-15270ad1bb39@linux.alibaba.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Sender: owner-linux-mm@kvack.org

On 2024/9/30 14:48, Baolin Wang wrote:
>
>
> On 2024/9/30 11:15, Kefeng Wang wrote:
>>
>>
>> On 2024/9/30 10:52, Baolin Wang wrote:
>>>
>>>
>>> On 2024/9/30 10:30, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/9/30 10:02, Baolin Wang wrote:
>>>>>
>>>>>
>>>>> On 2024/9/26 21:52, Matthew Wilcox wrote:
>>>>>> On Thu, Sep 26, 2024 at 10:38:34AM +0200, Pankaj Raghav (Samsung) wrote:
>>>>>>>> So this is why I don't use mapping_set_folio_order_range() here, but
>>>>>>>> correct me if I am wrong.
>>>>>>>
>>>>>>> Yeah, the inode is active here as the max folio size is decided based on
>>>>>>> the write size, so probably mapping_set_folio_order_range() will not be
>>>>>>> a safe option.
>>>>>>
>>>>>> You really are all making too much of this.  Here's the patch I think we
>>>>>> need:
>>>>>>
>>>>>> +++ b/mm/shmem.c
>>>>>> @@ -2831,7 +2831,8 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
>>>>>>          cache_no_acl(inode);
>>>>>>          if (sbinfo->noswap)
>>>>>>                  mapping_set_unevictable(inode->i_mapping);
>>>>>> -       mapping_set_large_folios(inode->i_mapping);
>>>>>> +       if (sbinfo->huge)
>>>>>> +               mapping_set_large_folios(inode->i_mapping);
>>>>>>
>>>>>>          switch (mode & S_IFMT) {
>>>>>>          default:
>>>>>
>>>>> IMHO, we no longer need the the 'sbinfo->huge' validation after
>>>>> adding support for large folios in the tmpfs write and fallocate
>>>>> paths [1].
>>
>> Forget to mention, we still need to check sbinfo->huge, if mount with
>> huge=never, but we fault in large chunk, write is slower than without
>> 9aac777aaf94, the above changes or my patch could fix it.
>
> My patch will allow allocating large folios in the tmpfs write and
> fallocate paths though the 'huge' option is 'never'.

Yes, indeed. After checking your patch, here are the "Writing
intelligently" results from 'Bonnie -d /mnt/tmpfs/ -s 1024', based on
next-20241008:

1) huge=never
   the base:                                 2016438 K/Sec
   my v1/v2 or Matthew's patch:              2874504 K/Sec
   your patch with filemap_get_order() fix:  6330604 K/Sec

2) huge=always
   the write performance:                    7168917 K/Sec

Since large folios are now supported in the tmpfs write path, we do get
the better performance shown above, which is great.

>
> My initial thought for supporting large folio is that, if the 'huge'
> option is enabled, to maintain backward compatibility, we only allow 2M
> PMD-sized order allocations. If the 'huge' option is
> disabled(huge=never), we still allow large folio allocations based on
> the write length.
>
> Another choice is to allow the different sized large folio allocation
> based on the write length when the 'huge' option is enabled, rather than
> just the 2M PMD sized. But will force the huge orders off if 'huge'
> option is disabled.
>

"huge=never  Do not allocate huge pages.  This is the default."

Going by the documentation, it is better not to allocate large folios
when huge=never, but we need some special handling for huge=never or
the runtime deny/force settings.

> Still need some discussions to determine which method is preferable.

Personally, I like your current implementation, but it does not match
the documentation.
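
To make the two choices quoted above easier to compare, here is a rough
userspace model of them. It is only a sketch and not kernel code: the
sketch_* names are hypothetical, and it assumes 4 KiB base pages with a
2 MiB PMD (order 9). The order computed from the write length is
floor(log2(len)) minus the page shift, clamped to the PMD order.

```c
#include <stddef.h>

/* Assumptions for illustration only: 4 KiB pages, 2 MiB PMD (order 9). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_ORDER  9

enum sketch_policy {
	/* huge enabled: PMD order only; huge=never: order from write length */
	SKETCH_OPTION_A,
	/* huge enabled: order from write length; huge=never: order 0 only */
	SKETCH_OPTION_B,
};

/* Largest order whose size fits in len, clamped to [0, PMD order]. */
static unsigned int sketch_order_for_len(size_t len)
{
	unsigned int shift = 0;

	while (len >>= 1)	/* shift = floor(log2(len)), 0 for len <= 1 */
		shift++;
	if (shift <= SKETCH_PAGE_SHIFT)
		return 0;
	shift -= SKETCH_PAGE_SHIFT;
	return shift > SKETCH_PMD_ORDER ? SKETCH_PMD_ORDER : shift;
}

/* Maximum folio order a write of write_len bytes may use under a policy. */
static unsigned int sketch_max_order(int huge_enabled, size_t write_len,
				     enum sketch_policy policy)
{
	if (policy == SKETCH_OPTION_A)
		return huge_enabled ? SKETCH_PMD_ORDER
				    : sketch_order_for_len(write_len);

	/* SKETCH_OPTION_B: huge=never forces small pages. */
	return huge_enabled ? sketch_order_for_len(write_len) : 0;
}
```

With a 64 KiB write, option A gives order 4 for huge=never and order 9
when huge is enabled, while option B gives order 0 for huge=never and
order 4 when huge is enabled. Only option B matches the documented
"Do not allocate huge pages" wording for huge=never.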