Message-ID: <71fdab06-0442-4c55-811b-b38d3b024c85@arm.com>
Date: Mon, 1 Jul 2024 09:16:48 +0100
Subject: Re: [PATCH 1/2] mm: add per-order mTHP split counters
To: Lance Yang, Barry Song
Cc: akpm@linux-foundation.org, david@redhat.com, baolin.wang@linux.alibaba.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240424135148.30422-1-ioworker0@gmail.com> <20240424135148.30422-2-ioworker0@gmail.com>
From: Ryan Roberts

On 30/06/2024 12:34, Lance Yang wrote:
> Hi Barry,
>
> Thanks for following up!
>
> On Sun, Jun 30, 2024 at 5:48 PM Barry Song wrote:
>>
>> On Thu, Apr 25, 2024 at 3:41 AM Ryan Roberts wrote:
>>>
>>> + Barry
>>>
>>> On 24/04/2024 14:51, Lance Yang wrote:
>>>> At present, the split counters in THP statistics no longer include
>>>> PTE-mapped mTHP. Therefore, this commit introduces per-order mTHP split
>>>> counters to monitor the frequency of mTHP splits. This will assist
>>>> developers in better analyzing and optimizing system performance.
>>>>
>>>> /sys/kernel/mm/transparent_hugepage/hugepages-/stats
>>>>         split_page
>>>>         split_page_failed
>>>>         deferred_split_page
>>>>
>>>> Signed-off-by: Lance Yang
>>>> ---
>>>>  include/linux/huge_mm.h |  3 +++
>>>>  mm/huge_memory.c        | 14 ++++++++++++--
>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>> index 56c7ea73090b..7b9c6590e1f7 100644
>>>> --- a/include/linux/huge_mm.h
>>>> +++ b/include/linux/huge_mm.h
>>>> @@ -272,6 +272,9 @@ enum mthp_stat_item {
>>>>          MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
>>>>          MTHP_STAT_ANON_SWPOUT,
>>>>          MTHP_STAT_ANON_SWPOUT_FALLBACK,
>>>> +        MTHP_STAT_SPLIT_PAGE,
>>>> +        MTHP_STAT_SPLIT_PAGE_FAILED,
>>>> +        MTHP_STAT_DEFERRED_SPLIT_PAGE,
>>>>          __MTHP_STAT_COUNT
>>>>  };
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 055df5aac7c3..52db888e47a6 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -557,6 +557,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
>>>>  DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
>>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
>>>> +DEFINE_MTHP_STAT_ATTR(split_page, MTHP_STAT_SPLIT_PAGE);
>>>> +DEFINE_MTHP_STAT_ATTR(split_page_failed, MTHP_STAT_SPLIT_PAGE_FAILED);
>>>> +DEFINE_MTHP_STAT_ATTR(deferred_split_page, MTHP_STAT_DEFERRED_SPLIT_PAGE);
>>>>
>>>>  static struct attribute *stats_attrs[] = {
>>>>          &anon_fault_alloc_attr.attr,
>>>> @@ -564,6 +567,9 @@ static struct attribute *stats_attrs[] = {
>>>>          &anon_fault_fallback_charge_attr.attr,
>>>>          &anon_swpout_attr.attr,
>>>>          &anon_swpout_fallback_attr.attr,
>>>> +        &split_page_attr.attr,
>>>> +        &split_page_failed_attr.attr,
>>>> +        &deferred_split_page_attr.attr,
>>>>          NULL,
>>>>  };
>>>>
>>>> @@ -3083,7 +3089,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>          XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
>>>>          struct anon_vma *anon_vma = NULL;
>>>>          struct address_space *mapping = NULL;
>>>> -        bool is_thp = folio_test_pmd_mappable(folio);
>>>> +        int order = folio_order(folio);
>>>>          int extra_pins, ret;
>>>>          pgoff_t end;
>>>>          bool is_hzp;
>>>> @@ -3262,8 +3268,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>                  i_mmap_unlock_read(mapping);
>>>>  out:
>>>>          xas_destroy(&xas);
>>>> -        if (is_thp)
>>>> +        if (order >= HPAGE_PMD_ORDER)
>>>>                  count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
>>>> +        count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT_PAGE :
>>>> +                                      MTHP_STAT_SPLIT_PAGE_FAILED);
>>>>          return ret;
>>>>  }
>>>>
>>>> @@ -3327,6 +3335,8 @@ void deferred_split_folio(struct folio *folio)
>>>>          if (list_empty(&folio->_deferred_list)) {
>>>>                  if (folio_test_pmd_mappable(folio))
>>>>                          count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>>>> +                count_mthp_stat(folio_order(folio),
>>>> +                                MTHP_STAT_DEFERRED_SPLIT_PAGE);
>>>
>>> There is a very long conversation with Barry about adding a 'global "mTHP became
>>> partially mapped 1 or more processes" counter (inc only)', which terminates at
>>> [1]. There is a lot of discussion about the required semantics around the need
>>> for partial map to cover alignment and contiguity as well as whether all pages
>>> are mapped, and to trigger once it becomes partial in at least 1 process.
>>>
>>> MTHP_STAT_DEFERRED_SPLIT_PAGE is giving much simpler semantics, but less
>>> information as a result. Barry, what's your view here? I'm guessing this doesn't
>>> quite solve what you are looking for?
>>
>> This doesn't quite solve what I am looking for but I still think the
>> patch has its value.
>>
>> I'm looking for a solution that can:
>>
>> * Count the amount of memory in the system for each mTHP size.
>> * Determine how much memory for each mTHP size is partially unmapped.
>>
>> For example, in a system with 16GB of memory, we might find that we have 3GB
>> of 64KB mTHP, and within that, 512MB is partially unmapped, potentially wasting
>> memory at this moment. I'm uncertain whether Lance is interested in
>> this job :-)
>
> Nice, that's an interesting/valuable job for me ;)
>
> Let's do it separately, as 'split' and friends probably can’t be the
> solution you mentioned above, IMHO.
>
> Hmm... I don't have a good idea about the solution for now, but will
> think it over and come back to discuss it here.

I have a grad starting in a couple of weeks and I had been planning to initially
ask him to look at this to help him get up to speed on mTHP/mm stuff. But I have
plenty of other things for him to do if Lance wants to take this :)

>
>>
>> Counting deferred_split remains valuable as it can signal whether the system is
>> experiencing significant partial unmapping.
>
> Have a nice weekend!
> Lance
>
>>
>>>
>>> [1] https://lore.kernel.org/linux-mm/6cc7d781-884f-4d8f-a175-8609732b87eb@arm.com/
>>>
>>> Thanks,
>>> Ryan
>>>
>>>>                  list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
>>>>                  ds_queue->split_queue_len++;
>>>>  #ifdef CONFIG_MEMCG
>>>
>>
>> Thanks
>> Barry
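
For reference, a minimal userspace sketch of how the per-order counters
proposed in the quoted patch could be read. It assumes the per-size
directories follow the hugepages-<size>kB naming of the existing mTHP sysfs
interface and that each stats file holds a single decimal value. The helper
names mthp_stat_path() and read_counter() are hypothetical and are not part
of the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed sysfs root, taken from the path in the quoted commit message. */
#define THP_SYSFS_ROOT "/sys/kernel/mm/transparent_hugepage"

/*
 * Hypothetical helper: build "<root>/hugepages-<size>kB/stats/<name>".
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int mthp_stat_path(char *buf, size_t len, const char *root,
                          unsigned int size_kb, const char *name)
{
    int n = snprintf(buf, len, "%s/hugepages-%ukB/stats/%s",
                     root, size_kb, name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read one unsigned decimal value from a file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_counter(const char *path, unsigned long long *out)
{
    char line[64];
    char *end;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    errno = 0;
    *out = strtoull(line, &end, 10);
    return (errno || end == line) ? -1 : 0;
}
```

A caller would build the path with, for example,
mthp_stat_path(path, sizeof(path), THP_SYSFS_ROOT, 64, "deferred_split_page")
and pass it to read_counter(). Sampling the value twice and subtracting gives
the number of events over the interval, which matches the use Barry describes
for deferred_split as a signal of partial unmapping.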