From: Ryan Roberts <ryan.roberts@arm.com>
Date: Tue, 9 Jul 2024 09:35:31 +0100
Subject: Re: [PATCH v1] mm: shmem: Rename mTHP shmem counters
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, corbet@lwn.net,
    da.gomez@samsung.com, david@redhat.com, hughd@google.com,
    ioworker0@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    willy@infradead.org, ziy@nvidia.com
References: <744749c3-4506-40d9-ac48-0dbc59689f92@arm.com>
 <20240709014413.18044-1-21cnbao@gmail.com>
 <70ac8a5d-f6cf-41e6-b295-cd40b81fa85e@arm.com>

On 09/07/2024 09:13, Barry Song wrote:
> On Tue, Jul 9, 2024 at 7:55 PM Ryan Roberts wrote:
>>
>> On 09/07/2024 02:44, Barry Song wrote:
>>> On Tue, Jul 9, 2024 at 12:30 AM Ryan Roberts wrote:
>>>>
>>>> On 08/07/2024 12:36, Barry Song wrote:
>>>>> On Mon, Jul 8, 2024 at 11:24 PM Ryan Roberts wrote:
>>>>>>
>>>>>> The legacy PMD-sized THP counters at /proc/vmstat include
>>>>>> thp_file_alloc, thp_file_fallback and thp_file_fallback_charge, which
>>>>>> rather confusingly refer to shmem THP and do not include any other
>>>>>> types of file pages. This is inconsistent since in most other places
>>>>>> in the kernel, THP counters are explicitly separated for anon, shmem
>>>>>> and file flavours. However, we are stuck with it since it constitutes
>>>>>> a user ABI.
>>>>>>
>>>>>> Recently, commit 66f44583f9b6 ("mm: shmem: add mTHP counters for
>>>>>> anonymous shmem") added equivalent mTHP stats for shmem, keeping the
>>>>>> same "file_" prefix in the names. But in future, we may want to add
>>>>>> extra stats to cover actual file pages, at which point, it would all
>>>>>> become very confusing.
>>>>>>
>>>>>> So let's take the opportunity to rename these new counters "shmem_"
>>>>>> before the change makes it upstream and the ABI becomes immutable.
>>>>>
>>>>> Personally, I think this approach is much clearer. However, I recall
>>>>> we discussed this before [1], and it seems that inconsistency is a
>>>>> concern?
>>>>
>>>> Embarrassingly, I don't recall that conversation at all :-| but at
>>>> least what I said then is consistent with what I've done in this
>>>> patch.
>>>>
>>>> I think David's conclusion from that thread was to call them FILE_,
>>>> and add both shmem and pagecache counts to those counters, meaning we
>>>> can keep the same names as the legacy THP counters. But those legacy
>>>> THP counters only count shmem, and I don't think we would get away
>>>> with adding pagecache counts to those at this point? (argument: they
>>>> have been around for a long time and there is a risk that user space
>>>> relies on them, and if they were to dramatically increase due to the
>>>> pagecache addition now, that could break things). In that case, there
>>>> is still inconsistency, but it's worse; the names are consistent but
>>>> the semantics are inconsistent.
>>>>
>>>> So my vote is to change to SHMEM_ as per this patch :)
>>>
>>> I have no objections. However, I dislike the documentation for
>>> thp_file_*. Perhaps we can clean it all up together?
>>
>> I agree that we should clean this documentation up and I'm happy to
>> roll it into v2. However, I don't think what you have suggested is
>> quite right.
>>
>> thp_file_alloc, thp_file_fallback and thp_file_fallback_charge *only*
>> count shmem. They don't count pagecache. So perhaps the change should
>> be "...every time a shmem huge page (despite being named after "file",
>> the counter measures only shmem) is..."?
>
> I understand what you are saying, and I know that thp_file_* has only
> included shmem so far. My question is whether it will include regular
> files in the future? If not, I am perfectly fine with your approach.

My whole reasoning for this patch is based on the assertion that since
THP_FILE_ALLOC has been there for 8 years, and in all that time has only
counted shmem, it's highly likely that someone is depending on that
semantic and we can't change it. I don't have any actual evidence of code
that relies on it though.

I propose I change the docs to reflect what's actually happening today
(i.e. shmem *only*). If we later decide we want to also report page cache
numbers through that same counter, then we can change the docs at that
point.

But if I get my way, we'll soon have mTHP counters for FILE, which is
solely for page cache. So you'll be able to get all the fine-grained info
out of those and there will be no need to mess with the legacy counters.

>
> READ_ONLY_THP_FOR_FS isn't applicable in this path as it is created by
> khugepaged collapse.
>
>>
>> thp_file_mapped includes both file and shmem, so I agree with your
>> change there.
>>
>> What do you think?
>>
>>
>>>
>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
>>> index 709fe10b60f4..65df48cb3bbb 100644
>>> --- a/Documentation/admin-guide/mm/transhuge.rst
>>> +++ b/Documentation/admin-guide/mm/transhuge.rst
>>> @@ -417,21 +417,22 @@ thp_collapse_alloc_failed
>>>  	the allocation.
>>>
>>>  thp_file_alloc
>>> -	is incremented every time a file huge page is successfully
>>> -	allocated.
>>> +	is incremented every time a file (including shmem) huge page is
>>> +	successfully allocated.
>>>
>>>  thp_file_fallback
>>> -	is incremented if a file huge page is attempted to be allocated
>>> -	but fails and instead falls back to using small pages.
>>> +	is incremented if a file (including shmem) huge page is attempted
>>> +	to be allocated but fails and instead falls back to using small
>>> +	pages.
>>>
>>>  thp_file_fallback_charge
>>> -	is incremented if a file huge page cannot be charged and instead
>>> -	falls back to using small pages even though the allocation was
>>> -	successful.
>>> +	is incremented if a file (including shmem) huge page cannot be
>>> +	charged and instead falls back to using small pages even though
>>> +	the allocation was successful.
>>>
>>>  thp_file_mapped
>>> -	is incremented every time a file huge page is mapped into
>>> -	user address space.
>>> +	is incremented every time a file (including shmem) huge page is
>>> +	mapped into user address space.
>>>
>>>  thp_split_page
>>>  	is incremented every time a huge page is split into base
>>>
>>>>
>>>>>
>>>>> [1] https://lore.kernel.org/linux-mm/05d0096e4ec3e572d1d52d33a31a661321ac1551.1713755580.git.baolin.wang@linux.alibaba.com/
>>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Ryan Roberts
>>>>>> ---
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Applies on top of today's mm-unstable (2073cda629a4) and tested with mm
>>>>>> selftests; no regressions observed.
>>>>>>
>>>>>> The backstory here is that I'd like to introduce some counters for regular file
>>>>>> folio allocations to observe how often large folio allocation succeeds, but
>>>>>> these shmem counters are named "file" which is going to make things confusing.
>>>>>> So hoping to solve that before commit 66f44583f9b6 ("mm: shmem: add mTHP
>>>>>> counters for anonymous shmem") goes upstream (it is currently in mm-stable).
>>>>>>
>>>>>> Admittedly, this change means the mTHP stat names are not the same as the legacy
>>>>>> PMD-size THP names, but I think that's a smaller issue than having "file_" mTHP
>>>>>> stats that only count shmem, then having to introduce "file2_" or "pgcache_"
>>>>>> stats for the regular file memory, which is even more inconsistent IMHO. I guess
>>>>>> the alternative is to count both shmem and file in these mTHP stats (that's how
>>>>>> they were documented anyway) but I think it's better to be able to consider them
>>>>>> separately like we do for all the other counters.
>>>>>>
>>>>>> Thanks,
>>>>>> Ryan
>>>>>>
>>>>>>  Documentation/admin-guide/mm/transhuge.rst | 12 ++++++------
>>>>>>  include/linux/huge_mm.h                    |  6 +++---
>>>>>>  mm/huge_memory.c                           | 12 ++++++------
>>>>>>  mm/shmem.c                                 |  8 ++++----
>>>>>>  4 files changed, 19 insertions(+), 19 deletions(-)
>>>>>>
>>>>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
>>>>>> index 747c811ee8f1..8b891689fc13 100644
>>>>>> --- a/Documentation/admin-guide/mm/transhuge.rst
>>>>>> +++ b/Documentation/admin-guide/mm/transhuge.rst
>>>>>> @@ -496,16 +496,16 @@ swpout_fallback
>>>>>>  	Usually because failed to allocate some continuous swap space
>>>>>>  	for the huge page.
>>>>>>
>>>>>> -file_alloc
>>>>>> -	is incremented every time a file huge page is successfully
>>>>>> +shmem_alloc
>>>>>> +	is incremented every time a shmem huge page is successfully
>>>>>>  	allocated.
>>>>>>
>>>>>> -file_fallback
>>>>>> -	is incremented if a file huge page is attempted to be allocated
>>>>>> +shmem_fallback
>>>>>> +	is incremented if a shmem huge page is attempted to be allocated
>>>>>>  	but fails and instead falls back to using small pages.
>>>>>>
>>>>>> -file_fallback_charge
>>>>>> -	is incremented if a file huge page cannot be charged and instead
>>>>>> +shmem_fallback_charge
>>>>>> +	is incremented if a shmem huge page cannot be charged and instead
>>>>>>  	falls back to using small pages even though the allocation was
>>>>>>  	successful.
>>>>>>
>>>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>>>> index acb6ac24a07e..cff002be83eb 100644
>>>>>> --- a/include/linux/huge_mm.h
>>>>>> +++ b/include/linux/huge_mm.h
>>>>>> @@ -269,9 +269,9 @@ enum mthp_stat_item {
>>>>>>  	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
>>>>>>  	MTHP_STAT_SWPOUT,
>>>>>>  	MTHP_STAT_SWPOUT_FALLBACK,
>>>>>> -	MTHP_STAT_FILE_ALLOC,
>>>>>> -	MTHP_STAT_FILE_FALLBACK,
>>>>>> -	MTHP_STAT_FILE_FALLBACK_CHARGE,
>>>>>> +	MTHP_STAT_SHMEM_ALLOC,
>>>>>> +	MTHP_STAT_SHMEM_FALLBACK,
>>>>>> +	MTHP_STAT_SHMEM_FALLBACK_CHARGE,
>>>>>>  	MTHP_STAT_SPLIT,
>>>>>>  	MTHP_STAT_SPLIT_FAILED,
>>>>>>  	MTHP_STAT_SPLIT_DEFERRED,
>>>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>>>> index 9ec64aa2be94..f9696c94e211 100644
>>>>>> --- a/mm/huge_memory.c
>>>>>> +++ b/mm/huge_memory.c
>>>>>> @@ -568,9 +568,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
>>>>>>  DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>>>>>>  DEFINE_MTHP_STAT_ATTR(swpout, MTHP_STAT_SWPOUT);
>>>>>>  DEFINE_MTHP_STAT_ATTR(swpout_fallback, MTHP_STAT_SWPOUT_FALLBACK);
>>>>>> -DEFINE_MTHP_STAT_ATTR(file_alloc, MTHP_STAT_FILE_ALLOC);
>>>>>> -DEFINE_MTHP_STAT_ATTR(file_fallback, MTHP_STAT_FILE_FALLBACK);
>>>>>> -DEFINE_MTHP_STAT_ATTR(file_fallback_charge, MTHP_STAT_FILE_FALLBACK_CHARGE);
>>>>>> +DEFINE_MTHP_STAT_ATTR(shmem_alloc, MTHP_STAT_SHMEM_ALLOC);
>>>>>> +DEFINE_MTHP_STAT_ATTR(shmem_fallback, MTHP_STAT_SHMEM_FALLBACK);
>>>>>> +DEFINE_MTHP_STAT_ATTR(shmem_fallback_charge, MTHP_STAT_SHMEM_FALLBACK_CHARGE);
>>>>>>  DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
>>>>>>  DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
>>>>>>  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
>>>>>> @@ -581,9 +581,9 @@ static struct attribute *stats_attrs[] = {
>>>>>>  	&anon_fault_fallback_charge_attr.attr,
>>>>>>  	&swpout_attr.attr,
>>>>>>  	&swpout_fallback_attr.attr,
>>>>>> -	&file_alloc_attr.attr,
>>>>>> -	&file_fallback_attr.attr,
>>>>>> -	&file_fallback_charge_attr.attr,
>>>>>> +	&shmem_alloc_attr.attr,
>>>>>> +	&shmem_fallback_attr.attr,
>>>>>> +	&shmem_fallback_charge_attr.attr,
>>>>>>  	&split_attr.attr,
>>>>>>  	&split_failed_attr.attr,
>>>>>>  	&split_deferred_attr.attr,
>>>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>>>> index 921d59c3d669..f24dfbd387ba 100644
>>>>>> --- a/mm/shmem.c
>>>>>> +++ b/mm/shmem.c
>>>>>> @@ -1777,7 +1777,7 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
>>>>>>  			if (pages == HPAGE_PMD_NR)
>>>>>>  				count_vm_event(THP_FILE_FALLBACK);
>>>>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>>> -			count_mthp_stat(order, MTHP_STAT_FILE_FALLBACK);
>>>>>> +			count_mthp_stat(order, MTHP_STAT_SHMEM_FALLBACK);
>>>>>>  #endif
>>>>>>  			order = next_order(&suitable_orders, order);
>>>>>>  		}
>>>>>> @@ -1804,8 +1804,8 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
>>>>>>  				count_vm_event(THP_FILE_FALLBACK_CHARGE);
>>>>>>  			}
>>>>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>>> -			count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK);
>>>>>> -			count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK_CHARGE);
>>>>>> +			count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK);
>>>>>> +			count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK_CHARGE);
>>>>>>  #endif
>>>>>>  		}
>>>>>>  		goto unlock;
>>>>>> @@ -2181,7 +2181,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>>>>>>  		if (folio_test_pmd_mappable(folio))
>>>>>>  			count_vm_event(THP_FILE_ALLOC);
>>>>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>>>> -		count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_ALLOC);
>>>>>> +		count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_ALLOC);
>>>>>>  #endif
>>>>>>  		goto alloced;
>>>>>>  	}
>>>>>> --
>>>>>> 2.43.0
>>>>>>
>>>>>
>
> Thanks
> Barry