From mboxrd@z Thu Jan  1 00:00:00 1970
From: Barry Song <21cnbao@gmail.com>
Date: Fri, 9 Aug 2024 16:48:02 +0800
Subject: Re: [PATCH RFC 2/2] mm: collect the number of anon large folios partially unmapped
To: Ryan Roberts
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, chrisl@kernel.org, david@redhat.com, kaleshsingh@google.com, kasong@tencent.com, linux-kernel@vger.kernel.org, ioworker0@gmail.com, baolin.wang@linux.alibaba.com, ziy@nvidia.com, hanchuanhua@oppo.com, Barry Song
In-Reply-To: <36e8f1be-868d-4bce-8f32-e2d96b8b7af3@arm.com>
References: <20240808010457.228753-1-21cnbao@gmail.com> <20240808010457.228753-3-21cnbao@gmail.com> <36e8f1be-868d-4bce-8f32-e2d96b8b7af3@arm.com>
Content-Type: text/plain; charset="UTF-8"

On Fri, Aug 9, 2024 at 4:23 PM Ryan Roberts wrote:
>
> On 08/08/2024 02:04, Barry Song wrote:
> > From: Barry Song
> >
> > When an mTHP is added to the deferred_list, its partial pages
> > are unused, leading to wasted memory and potentially increasing
> > memory reclamation pressure. Tracking this number indicates
> > the extent to which userspace is partially unmapping mTHPs.
> >
> > Detailing the specifics of how unmapping occurs is quite difficult
> > and not that useful, so we adopt a simple approach: each time an
> > mTHP enters the deferred_list, we increment the count by 1; whenever
> > it leaves for any reason, we decrement the count by 1.
> >
> > Signed-off-by: Barry Song
> > ---
> >  Documentation/admin-guide/mm/transhuge.rst | 5 +++++
> >  include/linux/huge_mm.h                    | 1 +
> >  mm/huge_memory.c                           | 6 ++++++
> >  3 files changed, 12 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > index 715f181543f6..5028d61cbe0c 100644
> > --- a/Documentation/admin-guide/mm/transhuge.rst
> > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > @@ -532,6 +532,11 @@ anon_num
> >        These huge pages could be still entirely mapped and have partially
> >        unmapped and unused subpages.
> >
> > +anon_num_partial_unused
>
> Why is the user-exposed name completely different to the internal
> (MTHP_STAT_NR_ANON_SPLIT_DEFERRED) name?

My point is that the user might not even know what a deferred split
is; they are more concerned with whether there is any temporary memory
waste, or with what the deferred list means from a user perspective.
However, since we've referred to it as SPLIT_DEFERRED elsewhere in the
sysfs ABI, I agree with you that we should continue using that term.

>
> > +        the number of anon huge pages which have been partially unmapped
> > +        we have in the whole system. These unmapped subpages are also
> > +        unused and temporarily wasting memory.
> > +
> >  As the system ages, allocating huge pages may be expensive as the
> >  system uses memory compaction to copy data around memory to free a
> >  huge page for use. There are some counters in ``/proc/vmstat`` to help
> >
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 294c348fe3cc..4b27a9797150 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -282,6 +282,7 @@ enum mthp_stat_item {
> >       MTHP_STAT_SPLIT_FAILED,
> >       MTHP_STAT_SPLIT_DEFERRED,
> >       MTHP_STAT_NR_ANON,
> > +     MTHP_STAT_NR_ANON_SPLIT_DEFERRED,
>
> So the existing MTHP_STAT_SPLIT_DEFERRED is counting all folios that were ever
> put on the list, and the new MTHP_STAT_NR_ANON_SPLIT_DEFERRED is counting the
> number of folios that are currently on the list?

Yep.

>
> In which case, do we need the "ANON" in the name? It's implicit for the existing
> split counters that they are anon-only. That would relate it more clearly to the
> existing MTHP_STAT_SPLIT_DEFERRED too?

ack.

>
> >       __MTHP_STAT_COUNT
> >  };
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index b6bc2a3791e3..6083144f9fa0 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -579,6 +579,7 @@ DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
> >  DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
> >  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> >  DEFINE_MTHP_STAT_ATTR(anon_num, MTHP_STAT_NR_ANON);
> > +DEFINE_MTHP_STAT_ATTR(anon_num_partial_unused, MTHP_STAT_NR_ANON_SPLIT_DEFERRED);
> >
> >  static struct attribute *stats_attrs[] = {
> >       &anon_fault_alloc_attr.attr,
> > @@ -593,6 +594,7 @@ static struct attribute *stats_attrs[] = {
> >       &split_failed_attr.attr,
> >       &split_deferred_attr.attr,
> >       &anon_num_attr.attr,
> > +     &anon_num_partial_unused_attr.attr,
> >       NULL,
> >  };
> >
> > @@ -3229,6 +3231,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >               if (folio_order(folio) > 1 &&
> >                   !list_empty(&folio->_deferred_list)) {
> >                       ds_queue->split_queue_len--;
> > +                     mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_SPLIT_DEFERRED, -1);
> >                       /*
> >                        * Reinitialize page_deferred_list after removing the
> >                        * page from the split_queue, otherwise a subsequent
> > @@ -3291,6 +3294,7 @@ void __folio_undo_large_rmappable(struct folio *folio)
> >       spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> >       if (!list_empty(&folio->_deferred_list)) {
> >               ds_queue->split_queue_len--;
> > +             mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_SPLIT_DEFERRED, -1);
> >               list_del_init(&folio->_deferred_list);
> >       }
> >       spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> > @@ -3332,6 +3336,7 @@ void deferred_split_folio(struct folio *folio)
> >               if (folio_test_pmd_mappable(folio))
> >                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> > +             mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_SPLIT_DEFERRED, 1);
> >               list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
> >               ds_queue->split_queue_len++;
> >  #ifdef CONFIG_MEMCG
> > @@ -3379,6 +3384,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> >                       list_move(&folio->_deferred_list, &list);
> >               } else {
> >                       /* We lost race with folio_put() */
> > +                     mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_SPLIT_DEFERRED, -1);
> >                       list_del_init(&folio->_deferred_list);
> >                       ds_queue->split_queue_len--;
> >               }
>

Thanks
Barry