From: Baolin Wang
Date: Tue, 11 Jun 2024 10:04:31 +0800
Subject: Re: [PATCH v4 3/6] mm: shmem: add multi-size THP sysfs interface for anonymous shmem
To: Daniel Gomez
Cc: akpm@linux-foundation.org, hughd@google.com, willy@infradead.org,
 david@redhat.com, wangkefeng.wang@huawei.com, ying.huang@intel.com,
 21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com,
 ziy@nvidia.com, ioworker0@gmail.com, Pankaj Raghav,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Message-ID: <6c7a8602-5b88-424c-a8c4-8a9502865d94@linux.alibaba.com>
References: <119966ae28bf2e2d362ae3d369ac1a1cd27ba866.1717495894.git.baolin.wang@linux.alibaba.com>

On 2024/6/10 20:23, Daniel Gomez wrote:
> Hi Baolin,
> On Tue, Jun 04, 2024 at 06:17:47PM +0800, Baolin Wang wrote:
>> To support the use of mTHP with anonymous shmem, add a new sysfs interface
>> 'shmem_enabled' in the '/sys/kernel/mm/transparent_hugepage/hugepages-kB/'
>> directory for each mTHP to control whether shmem is enabled for that mTHP,
>> with a value similar to the top level 'shmem_enabled', which can be set to:
>> "always", "inherit (to inherit the top level setting)", "within_size", "advise",
>> "never". An 'inherit' option is added to ensure compatibility with these
>> global settings, and the options 'force' and 'deny' are dropped, which are
>> rather testing artifacts from the old ages.
>>
>> By default, PMD-sized hugepages have enabled="inherit" and all other hugepage
>> sizes have enabled="never" for '/sys/kernel/mm/transparent_hugepage/hugepages-xxkB/shmem_enabled'.
>>
>> In addition, if top level value is 'force', then only PMD-sized hugepages
>> have enabled="inherit", otherwise configuration will be failed and vice versa.
>> That means now we will avoid using non-PMD sized THP to override the global
>> huge allocation.
>>
>> Signed-off-by: Baolin Wang
>> ---
>>   Documentation/admin-guide/mm/transhuge.rst | 23 ++++++
>>   include/linux/huge_mm.h                    | 10 +++
>>   mm/huge_memory.c                           | 11 +--
>>   mm/shmem.c                                 | 96 ++++++++++++++++++++++
>>   4 files changed, 132 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
>> index d414d3f5592a..b76d15e408b3 100644
>> --- a/Documentation/admin-guide/mm/transhuge.rst
>> +++ b/Documentation/admin-guide/mm/transhuge.rst
>> @@ -332,6 +332,29 @@ deny
>>  force
>>      Force the huge option on for all - very useful for testing;
>>
>> +Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to control
>> +mTHP allocation: '/sys/kernel/mm/transparent_hugepage/hugepages-kB/shmem_enabled',
>> +and its value for each mTHP is essentially consistent with the global setting.
>> +An 'inherit' option is added to ensure compatibility with these global settings.
>> +Conversely, the options 'force' and 'deny' are dropped, which are rather testing
>> +artifacts from the old ages.
>> +always
>> +    Attempt to allocate huge pages every time we need a new page;
>> +
>> +inherit
>> +    Inherit the top-level "shmem_enabled" value. By default, PMD-sized hugepages
>> +    have enabled="inherit" and all other hugepage sizes have enabled="never";
>> +
>> +never
>> +    Do not allocate huge pages;
>> +
>> +within_size
>> +    Only allocate huge page if it will be fully within i_size.
>> +    Also respect fadvise()/madvise() hints;
>> +
>> +advise
>> +    Only allocate huge pages if requested with fadvise()/madvise();
>> +
>>  Need of application restart
>>  ===========================
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 020e2344eb86..fac21548c5de 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -6,6 +6,7 @@
>>  #include
>>
>>  #include /* only for vma_is_dax() */
>> +#include
>>
>>  vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
>>  int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>> @@ -63,6 +64,7 @@ ssize_t single_hugepage_flag_show(struct kobject *kobj,
>>  				  struct kobj_attribute *attr, char *buf,
>>  				  enum transparent_hugepage_flag flag);
>>  extern struct kobj_attribute shmem_enabled_attr;
>> +extern struct kobj_attribute thpsize_shmem_enabled_attr;
>>
>>  /*
>>   * Mask of all large folio orders supported for anonymous THP; all orders up to
>> @@ -265,6 +267,14 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  	return __thp_vma_allowable_orders(vma, vm_flags, tva_flags, orders);
>>  }
>>
>> +struct thpsize {
>> +	struct kobject kobj;
>> +	struct list_head node;
>> +	int order;
>> +};
>> +
>> +#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
>> +
>>  enum mthp_stat_item {
>>  	MTHP_STAT_ANON_FAULT_ALLOC,
>>  	MTHP_STAT_ANON_FAULT_FALLBACK,
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 8e49f402d7c7..1360a1903b66 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -449,14 +449,6 @@ static void thpsize_release(struct kobject *kobj);
>>  static DEFINE_SPINLOCK(huge_anon_orders_lock);
>>  static LIST_HEAD(thpsize_list);
>>
>> -struct thpsize {
>> -	struct kobject kobj;
>> -	struct list_head node;
>> -	int order;
>> -};
>> -
>> -#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
>> -
>>  static ssize_t thpsize_enabled_show(struct kobject *kobj,
>>  				    struct kobj_attribute *attr, char *buf)
>>  {
>> @@ -517,6 +509,9 @@ static struct kobj_attribute thpsize_enabled_attr =
>>
>>  static struct attribute *thpsize_attrs[] = {
>>  	&thpsize_enabled_attr.attr,
>> +#ifdef CONFIG_SHMEM
>> +	&thpsize_shmem_enabled_attr.attr,
>> +#endif
>>  	NULL,
>>  };
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index ae358efc397a..643ff7516b4d 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -131,6 +131,14 @@ struct shmem_options {
>>  #define SHMEM_SEEN_QUOTA 32
>>  };
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +static unsigned long huge_anon_shmem_orders_always __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_madvise __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_inherit __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_within_size __read_mostly;
>> +static DEFINE_SPINLOCK(huge_anon_shmem_orders_lock);
>> +#endif
>
> Since we are also applying the new sysfs knob controls to tmpfs and anon mm,
> should we rename this to get rid of the anon prefix?

Sure. I originally planned to do this in the patch set that adds mTHP
support for tmpfs, but yes, I can drop the 'anon' prefix now as
preparation.
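
For readers following the thread, here is a minimal userspace sketch of how the four per-policy order masks in the quoted mm/shmem.c hunk could be used. It is not the kernel code: the store/show logic is not shown in the quoted part of the patch, so the struct shmem_orders type and the shmem_orders_init(), shmem_orders_store() and shmem_orders_show() helpers are hypothetical names, and the "one policy bit per order" behaviour is an assumption based on the commit message. The field names already drop the 'anon' prefix, as discussed above. The rule about the top-level 'force' value is not modelled.

```c
#include <string.h>

/*
 * Hypothetical userspace model, not the kernel implementation.
 * Bit N set in a mask means "folio order N uses this policy".
 * An order with no bit set in any mask is treated as "never".
 */
struct shmem_orders {
	unsigned long always;
	unsigned long madvise;		/* selected by writing "advise" */
	unsigned long inherit;
	unsigned long within_size;
};

/*
 * Defaults described in the commit message: the PMD-sized order is
 * "inherit", every other order is "never". pmd_order is a parameter
 * because it depends on the architecture and base page size.
 */
void shmem_orders_init(struct shmem_orders *o, int pmd_order)
{
	memset(o, 0, sizeof(*o));
	o->inherit = 1UL << pmd_order;
}

/*
 * Record 'value' as the policy for 'order'.
 * Returns 0 on success, -1 for an unrecognised value (state unchanged).
 */
int shmem_orders_store(struct shmem_orders *o, int order, const char *value)
{
	unsigned long bit = 1UL << order;
	unsigned long *target = NULL;

	if (!strcmp(value, "always"))
		target = &o->always;
	else if (!strcmp(value, "inherit"))
		target = &o->inherit;
	else if (!strcmp(value, "within_size"))
		target = &o->within_size;
	else if (!strcmp(value, "advise"))
		target = &o->madvise;
	else if (strcmp(value, "never"))
		return -1;	/* e.g. "force" and "deny" are not accepted */

	/* An order belongs to at most one policy: clear it everywhere first. */
	o->always &= ~bit;
	o->madvise &= ~bit;
	o->inherit &= ~bit;
	o->within_size &= ~bit;
	if (target)
		*target |= bit;
	return 0;
}

/* Report the policy currently recorded for 'order'. */
const char *shmem_orders_show(const struct shmem_orders *o, int order)
{
	unsigned long bit = 1UL << order;

	if (o->always & bit)
		return "always";
	if (o->inherit & bit)
		return "inherit";
	if (o->within_size & bit)
		return "within_size";
	if (o->madvise & bit)
		return "advise";
	return "never";
}
```

In this model, keeping one bitmask per policy makes "which orders are enabled for policy X" a single word read, which is presumably why the quoted hunk marks the masks __read_mostly and guards updates with a spinlock; the sketch itself does no locking.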