From: Barry Song <21cnbao@gmail.com>
Date: Fri, 9 Aug 2024 09:27:07 +1200
Subject: Re: [PATCH 1/2] mm: add per-order mTHP split counters
To: Ryan Roberts
Cc: Lance Yang, akpm@linux-foundation.org, david@redhat.com, baolin.wang@linux.alibaba.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <71fdab06-0442-4c55-811b-b38d3b024c85@arm.com>
References: <20240424135148.30422-1-ioworker0@gmail.com> <20240424135148.30422-2-ioworker0@gmail.com> <71fdab06-0442-4c55-811b-b38d3b024c85@arm.com>
On Mon, Jul 1, 2024 at 8:16 PM Ryan Roberts wrote:
>
> On 30/06/2024 12:34, Lance Yang wrote:
> > Hi Barry,
> >
> > Thanks for following up!
> >
> > On Sun, Jun 30, 2024 at 5:48 PM Barry Song wrote:
> >>
> >> On Thu, Apr 25, 2024 at 3:41 AM Ryan Roberts wrote:
> >>>
> >>> + Barry
> >>>
> >>> On 24/04/2024 14:51, Lance Yang wrote:
> >>>> At present, the split counters in THP statistics no longer include
> >>>> PTE-mapped mTHP. Therefore, this commit introduces per-order mTHP split
> >>>> counters to monitor the frequency of mTHP splits. This will assist
> >>>> developers in better analyzing and optimizing system performance.
> >>>>
> >>>> /sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats
> >>>>         split_page
> >>>>         split_page_failed
> >>>>         deferred_split_page
> >>>>
> >>>> Signed-off-by: Lance Yang
> >>>> ---
> >>>>  include/linux/huge_mm.h |  3 +++
> >>>>  mm/huge_memory.c        | 14 ++++++++++++--
> >>>>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>>> index 56c7ea73090b..7b9c6590e1f7 100644
> >>>> --- a/include/linux/huge_mm.h
> >>>> +++ b/include/linux/huge_mm.h
> >>>> @@ -272,6 +272,9 @@ enum mthp_stat_item {
> >>>>  	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
> >>>>  	MTHP_STAT_ANON_SWPOUT,
> >>>>  	MTHP_STAT_ANON_SWPOUT_FALLBACK,
> >>>> +	MTHP_STAT_SPLIT_PAGE,
> >>>> +	MTHP_STAT_SPLIT_PAGE_FAILED,
> >>>> +	MTHP_STAT_DEFERRED_SPLIT_PAGE,
> >>>>  	__MTHP_STAT_COUNT
> >>>>  };
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 055df5aac7c3..52db888e47a6 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -557,6 +557,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page, MTHP_STAT_SPLIT_PAGE);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page_failed, MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>> +DEFINE_MTHP_STAT_ATTR(deferred_split_page, MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>>
> >>>>  static struct attribute *stats_attrs[] = {
> >>>>  	&anon_fault_alloc_attr.attr,
> >>>> @@ -564,6 +567,9 @@ static struct attribute *stats_attrs[] = {
> >>>>  	&anon_fault_fallback_charge_attr.attr,
> >>>>  	&anon_swpout_attr.attr,
> >>>>  	&anon_swpout_fallback_attr.attr,
> >>>> +	&split_page_attr.attr,
> >>>> +	&split_page_failed_attr.attr,
> >>>> +	&deferred_split_page_attr.attr,
> >>>>  	NULL,
> >>>>  };
> >>>>
> >>>> @@ -3083,7 +3089,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>  	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
> >>>>  	struct anon_vma *anon_vma = NULL;
> >>>>  	struct address_space *mapping = NULL;
> >>>> -	bool is_thp = folio_test_pmd_mappable(folio);
> >>>> +	int order = folio_order(folio);
> >>>>  	int extra_pins, ret;
> >>>>  	pgoff_t end;
> >>>>  	bool is_hzp;
> >>>> @@ -3262,8 +3268,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>  	i_mmap_unlock_read(mapping);
> >>>>  out:
> >>>>  	xas_destroy(&xas);
> >>>> -	if (is_thp)
> >>>> +	if (order >= HPAGE_PMD_ORDER)
> >>>>  		count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
> >>>> +	count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT_PAGE :
> >>>> +			MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>>  	return ret;
> >>>>  }
> >>>>
> >>>> @@ -3327,6 +3335,8 @@ void deferred_split_folio(struct folio *folio)
> >>>>  	if (list_empty(&folio->_deferred_list)) {
> >>>>  		if (folio_test_pmd_mappable(folio))
> >>>>  			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>>> +		count_mthp_stat(folio_order(folio),
> >>>> +				MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>
> >>> There is a very long conversation with Barry about adding a 'global "mTHP became
> >>> partially mapped in 1 or more processes" counter (inc only)', which terminates at
> >>> [1]. There is a lot of discussion about the required semantics around the need
> >>> for partial map to cover alignment and contiguity as well as whether all pages
> >>> are mapped, and to trigger once it becomes partial in at least 1 process.
> >>>
> >>> MTHP_STAT_DEFERRED_SPLIT_PAGE is giving much simpler semantics, but less
> >>> information as a result. Barry, what's your view here? I'm guessing this doesn't
> >>> quite solve what you are looking for?
> >>
> >> This doesn't quite solve what I am looking for, but I still think the
> >> patch has its value.
> >>
> >> I'm looking for a solution that can:
> >>
> >>  * Count the amount of memory in the system for each mTHP size.
> >>  * Determine how much memory for each mTHP size is partially unmapped.
> >>
> >> For example, in a system with 16GB of memory, we might find that we have 3GB
> >> of 64KB mTHP, and within that, 512MB is partially unmapped, potentially wasting
> >> memory at this moment. I'm uncertain whether Lance is interested in
> >> this job :-)
> >
> > Nice, that's an interesting/valuable job for me ;)
> >
> > Let's do it separately, as 'split' and friends probably can't be the
> > solution you mentioned above, IMHO.
> >
> > Hmm... I don't have a good idea about the solution for now, but I will
> > think it over and come back to discuss it here.
>
> I have a grad starting in a couple of weeks and I had been planning to initially
> ask him to look at this to help him get up to speed on mTHP/mm stuff. But I have
> plenty of other things for him to do if Lance wants to take this :)

Hi Ryan, Lance,

My performance profiling is blocked on the mTHP size and partially
unmapped mTHP size issues (understanding the distribution of folio
sizes within the system), so I couldn't wait for either Ryan's grad or
Lance. I've sent an RFC for this, and both of you are CC'd:
https://lore.kernel.org/all/20240808010457.228753-1-21cnbao@gmail.com/

Apologies for not waiting. You are still warmly welcome to participate
in the discussion and review.

> >
> >> Counting deferred_split remains valuable as it can signal whether the system is
> >> experiencing significant partial unmapping.
> >
> > Have a nice weekend!
> > Lance
> >
> >>
> >>> [1] https://lore.kernel.org/linux-mm/6cc7d781-884f-4d8f-a175-8609732b87eb@arm.com/
> >>>
> >>> Thanks,
> >>> Ryan
> >>>
> >>>>  		list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
> >>>>  		ds_queue->split_queue_len++;
> >>>>  #ifdef CONFIG_MEMCG
> >>>
> >>

Thanks
Barry
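P.S. For anyone who wants to eyeball the new counters from userspace, here is a rough shell sketch. The `hugepages-<size>/stats` layout and the stat file names (`split_page`, `split_page_failed`, `deferred_split_page`) are the ones proposed in this patch and may change before anything lands upstream; the optional root argument is only an illustration convenience so the function can be exercised against a mocked directory tree rather than a live kernel.

```shell
# Print every per-order mTHP split counter the kernel exposes, one
# "<size> <stat> <value>" line per readable stat file.
# Usage: dump_mthp_split_stats [sysfs_root]
dump_mthp_split_stats() {
    root="${1:-/sys/kernel/mm/transparent_hugepage}"
    for dir in "$root"/hugepages-*; do
        [ -d "$dir" ] || continue     # glob didn't match: no mTHP sizes
        size=${dir##*/}               # e.g. hugepages-64kB
        for stat in split_page split_page_failed deferred_split_page; do
            f="$dir/stats/$stat"
            if [ -r "$f" ]; then
                printf '%s %s %s\n' "$size" "$stat" "$(cat "$f")"
            fi
        done
    done
}
```

On a kernel carrying this series, something like `dump_mthp_split_stats | sort` should give a quick per-size view of split activity.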