From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2910ddc3-b05f-4394-8288-6e3c321fffee@linux.alibaba.com>
Date: Mon, 1 Jul 2024 19:12:32 +0800
Subject: Re: [PATCH] support "THPeligible" semantics for mTHP with anonymous shmem
From: Baolin Wang
To: Bang Li, Ryan Roberts, hughd@google.com, akpm@linux-foundation.org
Cc: david@redhat.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240628104926.34209-1-libang.li@antgroup.com> <4b38db15-0716-4ffb-a38b-bd6250eb93da@arm.com> <4d54880e-03f4-460a-94b9-e21b8ad13119@linux.alibaba.com>

On 2024/7/1 17:43, Bang Li wrote:
> Hi, Baolin
>
> On 2024/7/1 16:33, Baolin Wang wrote:
>>
>>
>> On 2024/7/1 15:55, Ryan Roberts wrote:
>>> On 28/06/2024 11:49, Bang Li wrote:
>>>> After the commit 7fb1b252afb5 ("mm: shmem: add mTHP support for
>>>> anonymous shmem"), we can configure different policies through
>>>> the multi-size THP sysfs interface for anonymous shmem. But
>>>> currently "THPeligible" indicates only whether the mapping is
>>>> eligible for allocating THP-pages as well as the THP is PMD
>>>> mappable or not for anonymous shmem, we need to support semantics
>>>> for mTHP with anonymous shmem similar to those for mTHP with
>>>> anonymous memory.
>>>>
>>>> Signed-off-by: Bang Li
>>>> ---
>>>>   fs/proc/task_mmu.c      | 10 +++++++---
>>>>   include/linux/huge_mm.h | 11 +++++++++++
>>>>   mm/shmem.c              |  9 +--------
>>>>   3 files changed, 19 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>>> index 93fb2c61b154..09b5db356886 100644
>>>> --- a/fs/proc/task_mmu.c
>>>> +++ b/fs/proc/task_mmu.c
>>>> @@ -870,6 +870,7 @@ static int show_smap(struct seq_file *m, void *v)
>>>>   {
>>>>       struct vm_area_struct *vma = v;
>>>>       struct mem_size_stats mss = {};
>>>> +    bool thp_eligible;
>>>>       smap_gather_stats(vma, &mss, 0);
>>>> @@ -882,9 +883,12 @@ static int show_smap(struct seq_file *m, void *v)
>>>>       __show_smap(m, &mss, false);
>>>> -    seq_printf(m, "THPeligible:    %8u\n",
>>>> -           !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>> -               TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL));
>>>> +    thp_eligible = !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>> +                        TVA_SMAPS | TVA_ENFORCE_SYSFS,
>>>> THP_ORDERS_ALL);
>>>> +    if (vma_is_anon_shmem(vma))
>>>> +        thp_eligible =
>>>> !!shmem_allowable_huge_orders(file_inode(vma->vm_file),
>>>> +                            vma, vma->vm_pgoff, thp_eligible);
>>>
>>> Afraid I haven't been following the shmem mTHP support work as much
>>> as I would
>>> have liked, but is there a reason why we need a separate function for
>>> shmem?
>>
>> Since shmem_allowable_huge_orders() only uses shmem specific logic to
>> determine if huge orders are allowable, there is no need to complicate
>> the thp_vma_allowable_orders() function by adding more shmem related
>> logic, making it more bloated. In my view, providing a dedicated
>> helper shmem_allowable_huge_orders(), specifically for shmem,
>> simplifies the logic.
>>
>> IIUC, I agree with David's suggestion that the
>> shmem_allowable_huge_orders() helper function could be used in
>> thp_vma_allowable_orders() to support shmem mTHP. Something like:
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index c7ce28f6b7f3..9677fe6cf478 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -151,10 +151,13 @@ unsigned long __thp_vma_allowable_orders(struct
>> vm_area_struct *vma,
>>           * Must be done before hugepage flags check since shmem has its
>>           * own flags.
>>           */
>> -       if (!in_pf && shmem_file(vma->vm_file))
>> -               return shmem_is_huge(file_inode(vma->vm_file),
>> vma->vm_pgoff,
>> -                                    !enforce_sysfs, vma->vm_mm,
>> vm_flags)
>> -                       ? orders : 0;
>> +       if (!in_pf && shmem_file(vma->vm_file)) {
>> +               bool global_huge =
>> shmem_is_huge(file_inode(vma->vm_file), vma->vm_pgoff,
>> +                                    !enforce_sysfs, vma->vm_mm,
>> vm_flags);
>> +
>> +               return
>> shmem_allowable_huge_orders(file_inode(vma->vm_file),
>> +                                       vma, vma->vm_pgoff, global_huge);
>> +       }
>>
>>          if (!vma_is_anonymous(vma)) {
>>                  /*
>>
>>> Couldn't (shouldn't) thp_vma_allowable_orders() be taught to handle
>>> shmem too?
>>>
>>>> +    seq_printf(m, "THPeligible:    %8u\n", thp_eligible);
>>>>       if (arch_pkeys_enabled())
>>>>           seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>> index 212cca384d7e..f87136f38aa1 100644
>>>> --- a/include/linux/huge_mm.h
>>>> +++ b/include/linux/huge_mm.h
>>>> @@ -267,6 +267,10 @@ unsigned long thp_vma_allowable_orders(struct
>>>> vm_area_struct *vma,
>>>>       return __thp_vma_allowable_orders(vma, vm_flags, tva_flags,
>>>> orders);
>>>>   }
>>>> +unsigned long shmem_allowable_huge_orders(struct inode *inode,
>>>> +                struct vm_area_struct *vma, pgoff_t index,
>>>> +                bool global_huge);
>>>> +
>>>>   struct thpsize {
>>>>       struct kobject kobj;
>>>>       struct list_head node;
>>>> @@ -460,6 +464,13 @@ static inline unsigned long
>>>> thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>>       return 0;
>>>>   }
>>>> +static inline unsigned long shmem_allowable_huge_orders(struct
>>>> inode *inode,
>>>> +                struct vm_area_struct *vma, pgoff_t index,
>>>> +                bool global_huge)
>>>> +{
>>>> +    return 0;
>>>> +}
>>>> +
>>>>   #define transparent_hugepage_flags 0UL
>>>>   #define thp_get_unmapped_area    NULL
>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>> index d495c0701a83..aa85df9c662a 100644
>>>> --- a/mm/shmem.c
>>>> +++ b/mm/shmem.c
>>>> @@ -1622,7 +1622,7 @@ static gfp_t limit_gfp_mask(gfp_t huge_gfp,
>>>> gfp_t limit_gfp)
>>>>   }
>>>>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>> -static unsigned long shmem_allowable_huge_orders(struct inode *inode,
>>>> +unsigned long shmem_allowable_huge_orders(struct inode *inode,
>>>>                   struct vm_area_struct *vma, pgoff_t index,
>>>>                   bool global_huge)
>>>>   {
>>>> @@ -1707,13 +1707,6 @@ static unsigned long
>>>> shmem_suitable_orders(struct inode *inode, struct vm_fault
>>>>       return orders;
>>>>   }
>>>>   #else
>>>> -static unsigned long shmem_allowable_huge_orders(struct inode *inode,
>>>> -                struct vm_area_struct *vma, pgoff_t index,
>>>> -                bool global_huge)
>>>> -{
>>>> -    return 0;
>>>> -}
>>>> -
>>>>   static unsigned long shmem_suitable_orders(struct inode *inode,
>>>> struct vm_fault *vmf,
>>>>                          struct address_space *mapping, pgoff_t index,
>>>>                          unsigned long orders)
>
> Thanks for the reference code. Currently, we only implement the mTHP of
> anonymous shmem, so we only need to handle anonymous shmem specially. As
> shown in the following code:
>
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -151,10 +151,14 @@ unsigned long __thp_vma_allowable_orders(struct
> vm_area_struct *vma,
>          * Must be done before hugepage flags check since shmem has its
>          * own flags.
>          */
> -       if (!in_pf && shmem_file(vma->vm_file))
> -               return shmem_is_huge(file_inode(vma->vm_file),
> vma->vm_pgoff,
> -                                    !enforce_sysfs, vma->vm_mm, vm_flags)
> -                       ? orders : 0;
> +       if (!in_pf && shmem_file(vma->vm_file)) {
> +               bool global_huge =
> shmem_is_huge(file_inode(vma->vm_file), vma->vm_pgoff,
> +                                    !enforce_sysfs, vma->vm_mm, vm_flags);

Nit: add a blank line after the declaration. Otherwise looks good to me.

> +               if (!vma_is_anon_shmem(vma))
> +                       return global_huge? orders : 0;
> +               return
> shmem_allowable_huge_orders(file_inode(vma->vm_file),
> +                                               vma, vma->vm_pgoff,
> global_huge);
> +       }
>
>         if (!vma_is_anonymous(vma)) {
>
> Thanks,
> Bang
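
For anyone skimming the thread, the branch order in the hunk quoted above can
be modelled in plain userspace C. This is only a sketch under stated
assumptions, not kernel code: struct model_vma and model_allowable_orders()
are hypothetical names, the boolean fields stand in for the results of
shmem_file(), vma_is_anon_shmem() and shmem_is_huge(), and anon_shmem_orders
stands in for whatever shmem_allowable_huge_orders() would return. The checks
that follow the shmem branch in the real function are elided.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a VMA; none of these are kernel fields. */
struct model_vma {
	bool is_shmem;                   /* models shmem_file(vma->vm_file) */
	bool is_anon_shmem;              /* models vma_is_anon_shmem(vma) */
	bool global_huge;                /* models the result of shmem_is_huge() */
	unsigned long anon_shmem_orders; /* models shmem_allowable_huge_orders() */
};

/*
 * Models only the shmem branch discussed in this thread:
 *  - file-backed shmem keeps the old behaviour (all requested orders or none),
 *  - anonymous shmem defers to the per-order (mTHP) answer.
 */
static unsigned long model_allowable_orders(const struct model_vma *vma,
					    bool in_pf, unsigned long orders)
{
	if (!in_pf && vma->is_shmem) {
		bool global_huge = vma->global_huge;

		if (!vma->is_anon_shmem)
			return global_huge ? orders : 0;
		return vma->anon_shmem_orders;
	}

	/* Assumption: the remaining checks are out of scope for this model. */
	return orders;
}
```

In this model a file-backed shmem mapping still gets either every requested
order or none, depending on global_huge, while an anonymous shmem mapping can
report a non-zero order mask even when global_huge is false. That second case
is the one the "THPeligible" change is meant to expose.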