From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FC22C3DA4A for ; Fri, 9 Aug 2024 07:47:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D45306B0088; Fri, 9 Aug 2024 03:47:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CCDBA6B0092; Fri, 9 Aug 2024 03:47:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B6DD86B0098; Fri, 9 Aug 2024 03:47:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 954EB6B0088 for ; Fri, 9 Aug 2024 03:47:04 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 459A2A4DA8 for ; Fri, 9 Aug 2024 07:47:04 +0000 (UTC) X-FDA: 82431925968.28.AA02EFD Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf13.hostedemail.com (Postfix) with ESMTP id 0D1D52000F for ; Fri, 9 Aug 2024 07:47:01 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf13.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723189557; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oTW9Py7nXbiBd10JK5wbEn8NA0pqMeznHr3YoMLA9tk=; b=oiCjIvnGeIPQzTen+9rfrkzebM+cx9BmuT61cvcZG/IJnbDkG9S7f0IFLqwv/e4BcJftqD C9MaEY062vn50iPTrCJqK7JkA5CLtbTlsNq91YJKH2In0qE/fYKdtg8YIYlkvqbeI2Imzv XswMPjOCO81go9nP33lIFKvGFAtRkHg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723189557; a=rsa-sha256; cv=none; b=loIrk3UHDs5lR5y/4fd/W4LTbtqvSyhrV7pUuHNlikw1LMe2RcyQDhIYFOc9Uy4jCsD9qO FFhkfIrcvIdk8v6T7KyyyzWqbMlzGmZupE902iNAIs+57ayqFh1Gvh/CNPJk+Dt7CrF4LR U33WCXp06e1RmsPd9KaKKNwUZeGG8z4= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf13.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C4D41FEC; Fri, 9 Aug 2024 00:47:26 -0700 (PDT) Received: from [10.57.95.64] (unknown [10.57.95.64]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 423683F766; Fri, 9 Aug 2024 00:46:59 -0700 (PDT) Message-ID: <9e7b7ded-62ef-4a7c-992a-dda0e540cff8@arm.com> Date: Fri, 9 Aug 2024 08:46:57 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v3 2/2] mm: Tidy up shmem mTHP controls and stats Content-Language: en-GB To: Barry Song Cc: Andrew Morton , Hugh Dickins , "Matthew Wilcox (Oracle)" , David Hildenbrand , Lance Yang , Baolin Wang , Gavin Shan , linux-kernel@vger.kernel.org, linux-mm@kvack.org References: <20240808111849.651867-1-ryan.roberts@arm.com> <20240808111849.651867-3-ryan.roberts@arm.com> From: Ryan Roberts In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 0D1D52000F X-Stat-Signature: rj3ketr51n1mcp9ituwe6o77ipkq3hpc X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1723189621-1819 X-HE-Meta: U2FsdGVkX19FpviAPiEmV+uOXT3ByVm/683YLy4+s3gka8JKrS/htV3gfiQA9f2i0vOEWuhk+rnXqH89Yr798jenstJlktyuk9O55jexa6Zy2YW0Ecd4BIqogRUOmtXEqLvXoBWc0pxbhNkNLcY2COmGJNxDKlSRoOZeFnJzIjNisOKUE5GFGWdFpb1YM7Atm/pmFDRHtVOchVKWzvU/FPkJ3fQ9LH5DEYg639V1Pd3pUkepScmzoTJyVXPLLY6/5al7aXnitCAqWVuFUrDRpQcrdi1t5T/ACTlYrMVCxKmiPJf5HzPaDyuCHalbKti/cWmKYIEMUh4wIS1oy9H9KdIkf6+Qy6nUxPvE6UbX1iRvngca9zONjdH3ZfJ6zs9vU7qCImOSmbkQ+4Ox9U4WzZwK1dcycENPsmKAjIBVi75LlaP+P6BL86RzX1hq/jjdzIumO8pIAatZOTmaxgaXQHehDShWWbHL7yceajdibq/ZJZZDCs87GOLtmuFJE8LWZhRAaV5nU448lPyAMA+bo8JfbKztHw2vPhQXfPRvuDwx5m4LPkAA5hR/dabEXwsQ8ppEKFajwWr0Ub7xAkTZEKcFgd+YEE2ntr8D6gFTGhfUKBndmvBL2gjfiK+MhVsqYQmDhGDT4Yn+kKysjxVxDbPZqU0F1TsEfNVYG/UNAvcrmgnZs5ol+mnpW8Qnu7R+IpcRGDgisVKe7fIUJAi/062Sr6W81akpts/IpGKFXneydak8ivbn8cVAGY1PmQZYWF5uHmQcIav3kaep4r/Y9BJS3MQnu6TfhxYJtm7jRVuRwfcCipqqdVjKKYhMfg9+i2DtN5h3lCHe5zjMaQnkfU11TQ77xU2HRCW3N5po2A3zgWF9cSO1N50REMIuf4y0ndYxXxVuYn0+IQx5fH7uMAa9/+p7Cf49xOprSK0rBQNa0qoqVF7mz68l6xnwHYxyW5v+rpQjZlBu0TTd5lb rS1o6XyC QfPVsDJOuiDd/fvhdtcCeiRMiAoIJdEhiUNgqOx2mUPmfqmaq4DmKivLwbSv4WE0AB4GzQSYhoAed8sFzFTUzV8G30hpObXa52DLj3yLcXn/0PbDN7Fc7K12Vaaz4VoXcMLAhc3dV4yNWNNn2F15LOUzKURz+bx8snIK07OkI2UdGJr1m1s2xz1C/Ge3tuhgjb84q0CzRrfETDkyFcDB6hMhJisD599+Lznwn1/4AxTAHnAhQZY8AsUMusCpaxv3MHh3DQj9GU4Vo5hvHgBhYK5tdFNH+D3WR+u77 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 08/08/2024 23:11, Barry Song wrote: > On Thu, Aug 8, 2024 at 11:19 PM Ryan Roberts wrote: >> >> Previously we had a situation where shmem mTHP controls and stats were >> not exposed for some supported sizes and were exposed for some >> unsupported sizes. So let's clean that up. >> >> Anon mTHP can support all large orders [2, PMD_ORDER]. But shmem can >> support all large orders [1, MAX_PAGECACHE_ORDER]. However, per-size >> shmem controls and stats were previously being exposed for all the anon >> mTHP orders, meaning order-1 was not present, and for arm64 64K base >> pages, orders 12 and 13 were exposed but were not supported internally. >> >> Tidy this all up by defining ctrl and stats attribute groups for anon >> and file separately. Anon ctrl and stats groups are populated for all >> orders in THP_ORDERS_ALL_ANON and file ctrl and stats groups are >> populated for all orders in THP_ORDERS_ALL_FILE_DEFAULT. >> >> Additionally, create "any" ctrl and stats attribute groups which are >> populated for all orders in (THP_ORDERS_ALL_ANON | >> THP_ORDERS_ALL_FILE_DEFAULT). swpout stats use this since they apply to >> anon and shmem. > > This passage is a bit confusing to me. I'm not sure whether it's about > creating any control and stats for both file and anon, or creating them > separately depending on the situation. However, the previous passage > mentions that file and anon create control and stats separately based > on different orders, and the orders they create might differ. Yes, I'll admit that I added this paragraph for the change I made since the last version, and it's not the clearest thing I've ever written on reflection. The idea is that there are 3 buckets; "anon", "file" and "any". controls/stats in the "anon" bucket are populated for all orders supported by anon memory. Same for file. controls/stats in the "any" bucket are populated for all orders supported by either anon or file. > > But after running your patches, I understood it. For instance, I'm now > seeing 8kB folder that didn't exist before: > > /sys/kernel/mm/transparent_hugepage # ls -l > drwxr-xr-x 3 root root 0 Aug 8 22:01 hugepages-1024kB > ... > drwxr-xr-x 3 root root 0 Aug 8 22:01 hugepages-8kB > drwxr-xr-x 2 root root 0 Aug 8 22:01 khugepaged > ... > > Then, when I entered the 8kB folder, I found 'shmem_enabled' > but not 'enabled' for anon: > /sys/kernel/mm/transparent_hugepage/hugepages-8kB # ls > shmem_enabled stats > > However, in the 16kB folder, I found both: > /sys/kernel/mm/transparent_hugepage/hugepages-16kB # ls > enabled shmem_enabled stats > > In the 8kB 'stats,' I see 'shmem_alloc' but not 'anon_alloc.' Since > both shmem and anon require swapout, I am seeing 'swpout' and > 'swpout_fallback': > > /sys/kernel/mm/transparent_hugepage/hugepages-8kB/stats # ls > shmem_alloc shmem_fallback shmem_fallback_charge swpout > swpout_fallback > > Everything observed seems to meet expectations, so: Yes, what you observe is exactly as intended. It gets a bit tricky if CONFIG_SHMEM is disabled; In this case, some ifdeffery ensures that the swpout stats are declared in the "anon" bucket instead of the "any" bucket. > > Tested-by: Barry Song Thanks! > >> >> The side-effect of all this is that different hugepage-*kB directories >> contain different sets of controls and stats, depending on which memory >> types support that size. This approach is preferred over the >> alternative, which is to populate dummy controls and stats for memory >> types that do not support a given size. >> >> Signed-off-by: Ryan Roberts >> --- >> mm/huge_memory.c | 144 +++++++++++++++++++++++++++++++++++++---------- >> 1 file changed, 114 insertions(+), 30 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 0c3075ee00012..082d86b7c6c2f 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -482,8 +482,8 @@ static void thpsize_release(struct kobject *kobj); >> static DEFINE_SPINLOCK(huge_anon_orders_lock); >> static LIST_HEAD(thpsize_list); >> >> -static ssize_t thpsize_enabled_show(struct kobject *kobj, >> - struct kobj_attribute *attr, char *buf) >> +static ssize_t anon_enabled_show(struct kobject *kobj, >> + struct kobj_attribute *attr, char *buf) >> { >> int order = to_thpsize(kobj)->order; >> const char *output; >> @@ -500,9 +500,9 @@ static ssize_t thpsize_enabled_show(struct kobject *kobj, >> return sysfs_emit(buf, "%s\n", output); >> } >> >> -static ssize_t thpsize_enabled_store(struct kobject *kobj, >> - struct kobj_attribute *attr, >> - const char *buf, size_t count) >> +static ssize_t anon_enabled_store(struct kobject *kobj, >> + struct kobj_attribute *attr, >> + const char *buf, size_t count) >> { >> int order = to_thpsize(kobj)->order; >> ssize_t ret = count; >> @@ -544,19 +544,35 @@ static ssize_t thpsize_enabled_store(struct kobject *kobj, >> return ret; >> } >> >> -static struct kobj_attribute thpsize_enabled_attr = >> - __ATTR(enabled, 0644, thpsize_enabled_show, thpsize_enabled_store); >> +static struct kobj_attribute anon_enabled_attr = >> + __ATTR(enabled, 0644, anon_enabled_show, anon_enabled_store); >> >> -static struct attribute *thpsize_attrs[] = { >> - &thpsize_enabled_attr.attr, >> +static struct attribute *anon_ctrl_attrs[] = { >> + &anon_enabled_attr.attr, >> + NULL, >> +}; >> + >> +static const struct attribute_group anon_ctrl_attr_grp = { >> + .attrs = anon_ctrl_attrs, >> +}; >> + >> +static struct attribute *file_ctrl_attrs[] = { >> #ifdef CONFIG_SHMEM >> &thpsize_shmem_enabled_attr.attr, >> #endif >> NULL, >> }; >> >> -static const struct attribute_group thpsize_attr_group = { >> - .attrs = thpsize_attrs, >> +static const struct attribute_group file_ctrl_attr_grp = { >> + .attrs = file_ctrl_attrs, >> +}; >> + >> +static struct attribute *any_ctrl_attrs[] = { >> + NULL, >> +}; >> + >> +static const struct attribute_group any_ctrl_attr_grp = { >> + .attrs = any_ctrl_attrs, >> }; >> >> static const struct kobj_type thpsize_ktype = { >> @@ -595,64 +611,132 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK); >> DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE); >> DEFINE_MTHP_STAT_ATTR(swpout, MTHP_STAT_SWPOUT); >> DEFINE_MTHP_STAT_ATTR(swpout_fallback, MTHP_STAT_SWPOUT_FALLBACK); >> +#ifdef CONFIG_SHMEM >> DEFINE_MTHP_STAT_ATTR(shmem_alloc, MTHP_STAT_SHMEM_ALLOC); >> DEFINE_MTHP_STAT_ATTR(shmem_fallback, MTHP_STAT_SHMEM_FALLBACK); >> DEFINE_MTHP_STAT_ATTR(shmem_fallback_charge, MTHP_STAT_SHMEM_FALLBACK_CHARGE); >> +#endif >> DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT); >> DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED); >> DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED); >> >> -static struct attribute *stats_attrs[] = { >> +static struct attribute *anon_stats_attrs[] = { >> &anon_fault_alloc_attr.attr, >> &anon_fault_fallback_attr.attr, >> &anon_fault_fallback_charge_attr.attr, >> +#ifndef CONFIG_SHMEM >> &swpout_attr.attr, >> &swpout_fallback_attr.attr, >> - &shmem_alloc_attr.attr, >> - &shmem_fallback_attr.attr, >> - &shmem_fallback_charge_attr.attr, >> +#endif >> &split_attr.attr, >> &split_failed_attr.attr, >> &split_deferred_attr.attr, >> NULL, >> }; >> >> -static struct attribute_group stats_attr_group = { >> +static struct attribute_group anon_stats_attr_grp = { >> + .name = "stats", >> + .attrs = anon_stats_attrs, >> +}; >> + >> +static struct attribute *file_stats_attrs[] = { >> +#ifdef CONFIG_SHMEM >> + &shmem_alloc_attr.attr, >> + &shmem_fallback_attr.attr, >> + &shmem_fallback_charge_attr.attr, >> +#endif >> + NULL, >> +}; >> + >> +static struct attribute_group file_stats_attr_grp = { >> + .name = "stats", >> + .attrs = file_stats_attrs, >> +}; >> + >> +static struct attribute *any_stats_attrs[] = { >> +#ifdef CONFIG_SHMEM >> + &swpout_attr.attr, >> + &swpout_fallback_attr.attr, >> +#endif >> + NULL, >> +}; >> + >> +static struct attribute_group any_stats_attr_grp = { >> .name = "stats", >> - .attrs = stats_attrs, >> + .attrs = any_stats_attrs, >> }; >> >> +static int sysfs_add_group(struct kobject *kobj, >> + const struct attribute_group *grp) >> +{ >> + int ret = -ENOENT; >> + >> + /* >> + * If the group is named, try to merge first, assuming the subdirectory >> + * was already created. This avoids the warning emitted by >> + * sysfs_create_group() if the directory already exists. >> + */ >> + if (grp->name) >> + ret = sysfs_merge_group(kobj, grp); >> + if (ret) >> + ret = sysfs_create_group(kobj, grp); >> + >> + return ret; >> +} >> + >> static struct thpsize *thpsize_create(int order, struct kobject *parent) >> { >> unsigned long size = (PAGE_SIZE << order) / SZ_1K; >> struct thpsize *thpsize; >> - int ret; >> + int ret = -ENOMEM; >> >> thpsize = kzalloc(sizeof(*thpsize), GFP_KERNEL); >> if (!thpsize) >> - return ERR_PTR(-ENOMEM); >> + goto err; >> + >> + thpsize->order = order; >> >> ret = kobject_init_and_add(&thpsize->kobj, &thpsize_ktype, parent, >> "hugepages-%lukB", size); >> if (ret) { >> kfree(thpsize); >> - return ERR_PTR(ret); >> + goto err; >> } >> >> - ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group); >> - if (ret) { >> - kobject_put(&thpsize->kobj); >> - return ERR_PTR(ret); >> + >> + ret = sysfs_add_group(&thpsize->kobj, &any_ctrl_attr_grp); >> + if (ret) >> + goto err_put; >> + >> + ret = sysfs_add_group(&thpsize->kobj, &any_stats_attr_grp); >> + if (ret) >> + goto err_put; >> + >> + if (BIT(order) & THP_ORDERS_ALL_ANON) { >> + ret = sysfs_add_group(&thpsize->kobj, &anon_ctrl_attr_grp); >> + if (ret) >> + goto err_put; >> + >> + ret = sysfs_add_group(&thpsize->kobj, &anon_stats_attr_grp); >> + if (ret) >> + goto err_put; >> } >> >> - ret = sysfs_create_group(&thpsize->kobj, &stats_attr_group); >> - if (ret) { >> - kobject_put(&thpsize->kobj); >> - return ERR_PTR(ret); >> + if (BIT(order) & THP_ORDERS_ALL_FILE_DEFAULT) { >> + ret = sysfs_add_group(&thpsize->kobj, &file_ctrl_attr_grp); >> + if (ret) >> + goto err_put; >> + >> + ret = sysfs_add_group(&thpsize->kobj, &file_stats_attr_grp); >> + if (ret) >> + goto err_put; >> } >> >> - thpsize->order = order; >> return thpsize; >> +err_put: >> + kobject_put(&thpsize->kobj); >> +err: >> + return ERR_PTR(ret); >> } >> >> static void thpsize_release(struct kobject *kobj) >> @@ -692,7 +776,7 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) >> goto remove_hp_group; >> } >> >> - orders = THP_ORDERS_ALL_ANON; >> + orders = THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DEFAULT; >> order = highest_order(orders); >> while (orders) { >> thpsize = thpsize_create(order, *hugepage_kobj); >> -- >> 2.43.0 >> > > Thanks > Barry