From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E30ECD1284 for ; Thu, 4 Apr 2024 10:53:11 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 99F196B0089; Thu, 4 Apr 2024 06:53:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 927936B008A; Thu, 4 Apr 2024 06:53:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A1436B008C; Thu, 4 Apr 2024 06:53:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 55CA56B0089 for ; Thu, 4 Apr 2024 06:53:10 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id CC292120FFA for ; Thu, 4 Apr 2024 10:53:09 +0000 (UTC) X-FDA: 81971537298.19.1AD79E8 Received: from mail-vk1-f169.google.com (mail-vk1-f169.google.com [209.85.221.169]) by imf22.hostedemail.com (Postfix) with ESMTP id 093A8C000D for ; Thu, 4 Apr 2024 10:53:06 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=k2NB8QMW; spf=pass (imf22.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.169 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1712227987; a=rsa-sha256; cv=none; b=1SFT3WYatk13hxldPVWhY3dz/iFcFakCDfAaTppi3fPQvziLSbMFzoxx6Tw6I1mCfV77np BenUgN5eSAJD96ysOOnmTwfKqTg8Der7eaJnsNTlTWdheBEa65Art1iazpA5op9Ai63s+a 64Aifm0ieH+63wbmktoeuKZi4td7bpc= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=k2NB8QMW; spf=pass (imf22.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.169 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1712227987; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nT52WlynKBDI7yIvvGm+XCa/PlJPkqYkeiYljo8TnZU=; b=31g8NhtLdMkkmiqkPlTGff/VqkYGZM4Dx6KBNhD0ea6SAx3OZ07yqheUkfKXZ3mhmWzn6C wXGf1oAlglK6d5XaWIIs9ONIT0x2yhABcJ3EtbsBTzJHMk9zvtC6/8e6JniYh8mZrS8lzF fmyNg7i3lY/WaaHVkLqinJ8WmaqtJfU= Received: by mail-vk1-f169.google.com with SMTP id 71dfb90a1353d-4da5eb33c45so277889e0c.0 for ; Thu, 04 Apr 2024 03:53:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712227986; x=1712832786; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nT52WlynKBDI7yIvvGm+XCa/PlJPkqYkeiYljo8TnZU=; b=k2NB8QMWEJ1AvFueKvLrjW+4FQiZmtr5tEgDVSObgL5jdvwjBPQoqKXj51laV3BhXQ zFLusuDpgqW9L6GCHVMXZiUwl85haV9pWtNhJryHG1tL33Edd1D47lWHeCLClyes+aFk jwjl7bn16yfN+G+gfMTu0Hkr0w+jL0umOQkYgDx08MIHve3ZJJ87xNtj2fqGIznbL88o RcYDf6ZCW1X8lyjIMij2ZLnvteWzz7FZXHW8nVq+Igx/yeBShNxiXmcqsQvg7ZfhIX3v IaAtkAW3oEqU2+0Hg1r3T3s9Iu0RWa8p1KMNv/adbk2Zfp5bo9tfAEsFTTizIXPvVO2+ ZOaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712227986; x=1712832786; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nT52WlynKBDI7yIvvGm+XCa/PlJPkqYkeiYljo8TnZU=; b=KZbqxdILEa5/CiIWMq7Ega1wfxnx62kw2yqpwFwUQR8e2uFKybRgZnFeqzbFQYuea7 LUD3W1b/2WSrU1YKfcHY61OmJyzGuAI7p4ESmujDV/CYIIpsAUcXg4E/yE7vVe1ZcRsH iVBJDRUWe4c9UnGYx5ND7nuVmrdlu4pPXlcb4f7mNmvRQehBsuyunY1mXV6FbRWMmt9i 2NvVhocI/Q1PGk/m+W6ltxJxAQ6kBlGi62sx88MD/BYIbNlJugCoM6fV8ODoWRgUgqXJ FjtKT9/LaYTrlsdPMZOBezp+6C6X+9M3QYB0bLFIQh8uKdiqPV0xaCMqGoDAJdP/7qlo pllg== X-Forwarded-Encrypted: i=1; AJvYcCWkFspdiyRExyWdKPoAU7eRw+kmZ0mKH9eYl0NJoHSdy6nTKGAIGUzfi4qc9xvzH+rEXdOUioHztI4strtxjfglxeQ= X-Gm-Message-State: AOJu0YyFNugmrPBcUi9/CilJLfdoTkvb6+IkM1Qv7Oi+7l2iUM1z22fD 2NwidCxS4+NuhSRBd2HUu4n6S3SF8MbuGOuaECY8eycCg2DooUZkeg2J2EiaNGiZEm9GWFSEY36 9+mC9JiBeEJrkoLIRUXLJH/OEsK0= X-Google-Smtp-Source: AGHT+IH2RTCg3LPtnJLqy+MJJNrKMD+RBWlDAPltMuEvBp7nCOXthbX9dBzI8k5+1Jxmzvw+YsYfKpLcU6ZPwhZ/pmI= X-Received: by 2002:a1f:fc06:0:b0:4d8:797b:94d5 with SMTP id a6-20020a1ffc06000000b004d8797b94d5mr1496130vki.3.1712227986058; Thu, 04 Apr 2024 03:53:06 -0700 (PDT) MIME-Version: 1.0 References: <20240403035502.71356-1-21cnbao@gmail.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Thu, 4 Apr 2024 23:52:54 +1300 Message-ID: Subject: Re: [PATCH v3] mm: add per-order mTHP alloc_success and alloc_fail counters To: Ryan Roberts Cc: David Hildenbrand , akpm@linux-foundation.org, linux-mm@kvack.org, cerasuolodomenico@gmail.com, chrisl@kernel.org, kasong@tencent.com, peterx@redhat.com, surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org, yosryahmed@google.com, yuzhao@google.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 093A8C000D X-Stat-Signature: 44jf6ictmyn8dhj5z8pidmy346nrrnqj X-Rspam-User: X-HE-Tag: 1712227986-503751 X-HE-Meta: U2FsdGVkX1+q+2CG7L8HEg70lqpCsQ2M/q4SkADTj9uEh9qF5VvQxDgwxhFHCI2LnLOqSfVgI6TM5/c056/KmYEJbprC2n/HUfsgIDl8QtKG5EIKQlgI9toaz3BUiuLPq5IwytFQmeLZXuLKKAR8mXKddOVmAusrrIYyuWusLvROUIuNES2inhGGQKVzq9TAzRIYuLdh/LY78SQEC73hlauup682Xk8Ni3pHQNrUfGC4+huywdOtdAPxH19S1sQOH52JoGbUR+JVlRoGTKeRIGTD42rfGg0gYQPNvX1+XiULJGM5GbY0AzV/2n8Qsfs/YSnRo2RWNYTLPsP9tahIIpyM7+LkYPtG1s3P1tFFwj2xDTcU58A0HEtL8K6sCXECR+yH9D0JcwDasxEu9P/5u9p0nTZDD8S8nKIcZpECsD6a5k9MEWf4g2VMRWuuBUBfyTYj5rBkaAMCXVJ7Crkf4g+ysbhlfjfrZaLMNuXvE4708jOedlEY4sFnMdAQqB00ZKw8qfxkVC0I+pR9ld8Q/h5Ge3gQDlHi50bYmu5Y3IkkTUUg4MpHq3qUFZ5ovTAc6Deg2xLpAfIaZNIgRhy3c9OV+EZfALvp2anFq/wBu1oqEv3E8a8YdhFj45Z9oG7Aq6dZuG1aOOjV/TFMvw0HL1ZKNhA968B0Ox6cJhZM+PQmb734Q6Z+redDnwUyweh6seXsyKLXvwgLM+G4BynuFJ5ui9b5/e11az/VZ3Z0mH/zCbfvvrs7IyoZUXam7eq9wTcbKLdDcrgKLXKIOBmiT/NPaACrV54DPtaS2DL0f0D5kQGXK2IQfKaKAfxzyriaQVNGG/eq+DsP5X8n887MqjiQl8ZKDzVSilkbQJHGo9OSH4JDg6VjSOVeXJX9FhMAhR3/nI3qMexttjyKkxgGYV/Nw7JjodtsEq0fEuWwNZ06IgIZiCCxtHZOtpneIKunLR1fWcrBSQ7n+ubF2Wz vURR/TA3 sxyXUxZG0380ySqCt84up7BrjFhEQGLCnDdIK5Yb0wE5XF7lm8/kVaNHFd0VO7vR6dLzTyoycH6AvVqH4yvnu+HZRS3q9qNAwAep64v5Dy8Bm9d6gKwDaI7GIwRNGoPFPawreB9VIVeTvxj48KKpW0U0L+O50ykqAvVl3K+tfZt//KDCbr6iT/gtaOP2jaNOjzVF+8UIOjjuAkolka3AaYhOo03F/2J1jp0Ik/uO++aJL7wUtG0/KR2ZCszX6l+xY53OiMWAFL2MwSusYDUdRreI8IUaRr9R2tA9nUSAw4cozTsII50KzghJC2S5qpQjDnvx//Xa+a4K8Y2kzohbgBmb+4dwIaOfaV6QzNhq+v9GFWtTbYGLzUO7dX5vYtDfkKbjtMV64gX6Wu+QCU/L8kukyXejB3qODvK41CEuGikDCy2TNh8HlPKkwE3I5YN6OoShdXTqR3YSUozg88N9QApWIh2hSStUq9DHvovqkGOKN+yk0j83ktU9oRvVLXa8NZk1GCRI7ZCqLwAc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 4, 2024 at 8:21=E2=80=AFPM Ryan Roberts = wrote: > > On 03/04/2024 22:00, Barry Song wrote: > > On Thu, Apr 4, 2024 at 12:48=E2=80=AFAM Ryan Roberts wrote: > >> > >> On 03/04/2024 09:22, David Hildenbrand wrote: > >>> On 03.04.24 05:55, Barry Song wrote: > >>>> From: Barry Song > >>>> > >>>> Profiling a system blindly with mTHP has become challenging due > >>>> to the lack of visibility into its operations. Presenting the > >>>> success rate of mTHP allocations appears to be pressing need. > >>>> > >>>> Recently, I've been experiencing significant difficulty debugging > >>>> performance improvements and regressions without these figures. > >>>> It's crucial for us to understand the true effectiveness of > >>>> mTHP in real-world scenarios, especially in systems with > >>>> fragmented memory. > >>>> > >>>> This patch sets up the framework for per-order mTHP counters, > >>>> starting with the introduction of anon_alloc_success and > >>>> anon_alloc_fail counters. Incorporating additional counters > >>>> should now be straightforward as well. > >>>> > >>>> Signed-off-by: Barry Song > >>>> --- > >>>> -v3: > >>>> * save some memory as order-0 and order-1 can't be THP, Ryan; > >>>> * rename to anon_alloc as right now we only support anon to addres= s > >>>> David's comment; > >>>> * drop a redundant "else", Ryan > >>>> > >>>> include/linux/huge_mm.h | 18 ++++++++++++++ > >>>> mm/huge_memory.c | 54 +++++++++++++++++++++++++++++++++++++= ++++ > >>>> mm/memory.c | 2 ++ > >>>> 3 files changed, 74 insertions(+) > >>>> > >>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h > >>>> index e896ca4760f6..5e9af6be9537 100644 > >>>> --- a/include/linux/huge_mm.h > >>>> +++ b/include/linux/huge_mm.h > >>>> @@ -70,6 +70,7 @@ extern struct kobj_attribute shmem_enabled_attr; > >>>> * (which is a limitation of the THP implementation). > >>>> */ > >>>> #define THP_ORDERS_ALL_ANON ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0= ) | BIT(1))) > >>>> +#define THP_MIN_ORDER 2 > >>>> /* > >>>> * Mask of all large folio orders supported for file THP. > >>>> @@ -264,6 +265,23 @@ unsigned long thp_vma_allowable_orders(struct > >>>> vm_area_struct *vma, > >>>> enforce_sysfs, orders); > >>>> } > >>>> +enum thp_event_item { > >>>> + THP_ANON_ALLOC_SUCCESS, > >>>> + THP_ANON_ALLOC_FAIL, > >>>> + NR_THP_EVENT_ITEMS > >>>> +}; > >>> > >>> Maybe use a prefix that resembles matches the enum name and is "obvio= usly" > >>> different to the ones in vm_event_item.h, like > >>> > >>> enum thp_event { > >>> THP_EVENT_ANON_ALLOC_SUCCESS, > >>> THP_EVENT_ANON_ALLOC_FAIL, > >>> __THP_EVENT_COUNT, > >>> }; > >> > >> FWIW, I'd personally replace "event" with "stat". For me "event" only = ever > >> increments, but "stat" can increment and decrement. An event is a type= of stat. > >> > >> You are only adding events for now, but we have identified a need for = inc/dec > >> stats that will be added in future. > > > > What about the below? > > > > enum thp_stat { > > THP_EVENT_ANON_ALLOC, > > THP_EVENT_ANON_ALLOC_FALLBACK, > > THP_EVENT_SWPOUT, > > THP_EVENT_SWPOUT_FALLBACK, > > ... > > THP_NR_ANON_PAGES, > > THP_NR_FILE_PAGES, > > I find this ambiguous; Is it the number of THPs or the number of base pag= es? Indeed. I've noticed that the kernel seems rather unpredictable in terms of= both the number of THPs and the number of base pages. for example: __node_stat_mod_folio(folio, NR_FILE_PAGES, nr); __node_stat_mod_folio(folio, NR_FILE_PAGES, -nr); add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages); __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr); The previous statements pertain to base pages encountered when dealing with a large folio.; but the below means THP number: __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr_pmdmapped); My feeling is that when assessing the true memory size, base pages are the primary consideration. However, I haven't addressed this aspect yet, and we can defer this discussion until I delve into it further upon arrival :-) Currently, I'm exclusively employing it as a demonstration to differentiate between "event" and "stat", aiming to provide a glimpse into the future perspective= . > > I think David made the point about incorporating the enum name into the l= abels > too, so that there can be no namespace confusion. How about: > > __ > > So: > > THP_STAT_EV_ANON_ALLOC > THP_STAT_EV_ANON_ALLOC_FALLBACK > THP_STAT_EV_ANON_PARTIAL > THP_STAT_EV_SWPOUT > THP_STAT_EV_SWPOUT_FALLBACK > ... > THP_STAT_NR_ANON > THP_STAT_NR_FILE > ... > __THP_STAT_COUNT > > I'm not sure if we need to prefix SWPOUT with ANON? (might we additionall= y want > SHMEM in future? > > I'm aware I'm bikeshedding at this point... sorry :) No problem at all. I'm open to discussing it further so that I can better prepare for version 4. > > > > ... > > __THP_STAT_COUNT, > > }; > > > > THP_EVENT can only increase; THP_NR_ANON/FILE_PAGES can increase or > > decrease. Thus, their names have no "EVENT". > > Thanks Barry