From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20240220214558.3377482-1-souravpanda@google.com>
 <20240220214558.3377482-2-souravpanda@google.com>
 <20240319143320.d1b1ef7f6fa77b748579ba59@linux-foundation.org>
 <65b77d3e-d683-1e90-ebb0-5c7758143048@google.com>
In-Reply-To: <65b77d3e-d683-1e90-ebb0-5c7758143048@google.com>
From: Sourav Panda <souravpanda@google.com>
Date: Wed, 10 Apr 2024 22:42:38 -0700
Subject: Re: [PATCH v9 1/1] mm: report per-page metadata information
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton, corbet@lwn.net, gregkh@linuxfoundation.org,
 rafael@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev,
 rppt@kernel.org, david@redhat.com, rdunlap@infradead.org,
 chenlinxuan@uniontech.com, yang.yang29@zte.com.cn,
 tomas.mudrunka@gmail.com, bhelgaas@google.com, ivan@cloudflare.com,
 pasha.tatashin@soleen.com, yosryahmed@google.com, hannes@cmpxchg.org,
 shakeelb@google.com, kirill.shutemov@linux.intel.com,
 wangkefeng.wang@huawei.com, adobriyan@gmail.com, Vlastimil Babka,
 "Liam R. Howlett", surenb@google.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, Matthew Wilcox, weixugc@google.com
Content-Type: multipart/alternative; boundary="000000000000281ae30615cba077"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

--000000000000281ae30615cba077
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, Apr 10, 2024 at 4:58 PM David Rientjes <rientjes@google.com> wrote:

> On Tue, 19 Mar 2024, Andrew Morton wrote:
>
> > On Tue, 20 Feb 2024 13:45:58 -0800 Sourav Panda <souravpanda@google.com> wrote:
> >
> > > Adds two new per-node fields, namely nr_memmap and nr_memmap_boot,
> > > to /sys/devices/system/node/nodeN/vmstat and a global Memmap field
> > > to /proc/meminfo. This information can be used by users to see how
> > > much memory is being used by per-page metadata, which can vary
> > > depending on build configuration, machine architecture, and system
> > > use.
> >
> > I yield to no man in my admiration of changelogging but boy, that's a
> > lot of changelogging.  Would it be possible to consolidate the [0/N]
> > coverletter and the [1/N] changelog into a single thing please?
> >
> > >  Documentation/filesystems/proc.rst |  3 +++
> > >  fs/proc/meminfo.c                  |  4 ++++
> > >  include/linux/mmzone.h             |  4 ++++
> > >  include/linux/vmstat.h             |  4 ++++
> > >  mm/hugetlb_vmemmap.c               | 17 ++++++++++++----
> > >  mm/mm_init.c                       |  3 +++
> > >  mm/page_alloc.c                    |  1 +
> > >  mm/page_ext.c                      | 32 +++++++++++++++++++++---------
> > >  mm/sparse-vmemmap.c                |  8 ++++++++
> > >  mm/sparse.c                        |  7 ++++++-
> > >  mm/vmstat.c                        | 26 +++++++++++++++++++++++-
> > >  11 files changed, 94 insertions(+), 15 deletions(-)
> >
> > And yet we offer the users basically no documentation.  The new sysfs
> > file should be documented under Documentation/ABI somewhere and
> > perhaps we could prepare some more expansive user-facing documentation
> > elsewhere?
> >
>
> Sourav, is it possible to refresh this series into a v10 on top of the
> latest upstream kernel with a single condensed changelog that details the
> current behavior, what extension this is adding, and how it is generally
> useful?
>
> As noted here, the cover letter has great material that discusses the
> rationale for this change but would be lost if only this patch is merged.
> So typically the cover letter material gets concatenated into the
> changelog, but in this case there's a lot of overlap.
>
> A single patch that includes a succinct changelog would be awesome.
>
> And then the requested documentation in Documentation/ABI either included
> in the same patch or as a second patch in the series?
>
> I don't think the resulting patch series will actually need a cover letter
> after that, it will be able to stand on its own.
>

Thanks David, I will send v10 soon.

>
> > I'd like to hear others' views on the overall usefulness/utility of this
> > change, please?
> >
>
> Likely true for all hyperscalers, the immediate use case that this could
> be applied to is to track boot memory overhead and any regression over
> time (across kernel upgrades, firmware upgrades, etc) that may change the
> amount of total memory available.  We'd want to subtract out the boot
> overhead that we know about (like struct page here) and then alert on any
> regression where we're losing memory from reboot to reboot for any reason.
>
> This increased visibility into boot memory overhead allows us to create a
> mechanism to track changes over time when otherwise that attribution of
> that memory is not available.
>

--000000000000281ae30615cba077--