From: Pasha Tatashin
Date: Tue, 19 Mar 2024 10:25:30 -0400
Subject: Re: [PATCH v9 1/1] mm: report per-page metadata information
To: akpm@linux-foundation.org
Cc: Sourav Panda, corbet@lwn.net, gregkh@linuxfoundation.org,
    rafael@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev,
    rppt@kernel.org, david@redhat.com, rdunlap@infradead.org,
    chenlinxuan@uniontech.com, yang.yang29@zte.com.cn,
    tomas.mudrunka@gmail.com, bhelgaas@google.com, ivan@cloudflare.com,
    yosryahmed@google.com, hannes@cmpxchg.org, shakeelb@google.com,
    kirill.shutemov@linux.intel.com, wangkefeng.wang@huawei.com,
    adobriyan@gmail.com, vbabka@suse.cz, Liam.Howlett@oracle.com,
    surenb@google.com, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, willy@infradead.org, weixugc@google.com
References: <20240220214558.3377482-1-souravpanda@google.com>
    <20240220214558.3377482-2-souravpanda@google.com>

On Wed, Mar 13, 2024 at 6:40 PM Pasha Tatashin wrote:
>
> On Tue, Feb 20, 2024 at
> 4:46 PM Sourav Panda wrote:
> >
> > Adds two new per-node fields, namely nr_memmap and nr_memmap_boot,
> > to /sys/devices/system/node/nodeN/vmstat and a global Memmap field
> > to /proc/meminfo. This information can be used by users to see how
> > much memory is being used by per-page metadata, which can vary
> > depending on build configuration, machine architecture, and system
> > use.
> >
> > Per-page metadata is the amount of memory that Linux needs in order to
> > manage memory at the page granularity. The majority of such memory is
> > used by "struct page" and "page_ext" data structures. In contrast to
> > most other memory consumption statistics, per-page metadata might not
> > be included in MemTotal. For example, MemTotal does not include memblock
> > allocations but includes buddy allocations. In this patch, the exported
> > field nr_memmap in /sys/devices/system/node/nodeN/vmstat would
> > exclusively track buddy allocations while nr_memmap_boot would
> > exclusively track memblock allocations. Furthermore, Memmap in
> > /proc/meminfo would exclusively track buddy allocations, allowing it to
> > be compared against MemTotal.
> >
> > This memory depends on build configurations, machine architectures, and
> > the way the system is used:
> >
> > Build configuration may include extra fields in "struct page",
> > and enable / disable "page_ext".
> > Machine architecture defines base page sizes. For example 4K x86,
> > 8K SPARC, 64K ARM64 (optionally), etc. The per-page metadata
> > overhead is smaller on machines with larger page sizes.
> > System use can change per-page overhead by using vmemmap
> > optimizations with hugetlb pages, and emulated pmem devdax pages.
> > Also, boot parameters can determine whether page_ext needs
> > to be allocated. This memory can be part of MemTotal or be outside
> > MemTotal depending on whether the memory was hot-plugged, booted with,
> > or hugetlb memory was returned to the system.
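
As a rough illustration of the page-size point above, the sketch below
computes "struct page" overhead as a percentage of managed memory for the
three base page sizes mentioned. It assumes a 64-byte struct page, which is
common on 64-bit builds but depends on the kernel configuration, and it
ignores page_ext entirely. The function name is made up for this example.

```python
# Assumption: sizeof(struct page) is 64 bytes. The real size depends on
# the kernel build configuration, and page_ext is not counted here.
STRUCT_PAGE_BYTES = 64


def memmap_overhead_percent(page_size_bytes, struct_page_bytes=STRUCT_PAGE_BYTES):
    """Return struct page overhead as a percentage of managed memory."""
    return 100.0 * struct_page_bytes / page_size_bytes


# Base page sizes named in the patch description.
for label, size in (("4K", 4096), ("8K", 8192), ("64K", 65536)):
    print(f"{label} base pages: {memmap_overhead_percent(size):.4f}% overhead")
```

Under that assumption, larger base pages need proportionally less metadata,
since one struct page then describes more memory.
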
> >
> > Utility for userspace:
> >
> > Application Optimization: Depending on the kernel version and command
> > line options, the kernel would relinquish a different number of pages
> > (that contain struct pages) when a hugetlb page is reserved (e.g., 0, 6
> > or 7 for a 2MB hugepage). The userspace application would want to know
> > the exact savings achieved through page metadata deallocation without
> > dealing with the intricacies of the kernel.
> >
> > Observability: Struct page overhead can only be calculated on paper at
> > boot time (e.g., 1.5% machine capacity). Beyond boot, once hugepages are
> > reserved or memory is hotplugged, the computation becomes complex.
> > Per-page metrics will help explain part of the system memory overhead,
> > which shall help guide memory optimizations and memory cgroup sizing.
> >
> > Debugging: Tracking the changes or absolute value in struct pages can
> > help detect anomalies as they can be correlated with other metrics in
> > the machine (e.g., MemTotal, number of huge pages, etc).
> >
> > page_ext overheads: Some kernel features such as page_owner and
> > page_table_check that use page_ext can be optionally enabled via kernel
> > parameters. Having the total per-page metadata information helps users
> > precisely measure impact.

Hi Andrew,

Can you please give this patch another look? Does it require more
reviews before you can take it in?

Thank you,
Pasha
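
For reference, a minimal sketch of how userspace might consume the proposed
Memmap field and compare it against MemTotal. It assumes Memmap follows the
usual "Name: value kB" layout of /proc/meminfo. The helper names and the
sample values are hypothetical and are not taken from the patch.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field name: value in kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memmap_share(fields):
    """Return Memmap as a percentage of MemTotal, or None if either is missing."""
    if "Memmap" not in fields or not fields.get("MemTotal"):
        return None
    return 100.0 * fields["Memmap"] / fields["MemTotal"]


# Made-up sample values for illustration only, not real measurements.
SAMPLE = """MemTotal:       16384000 kB
MemFree:         8000000 kB
Memmap:           256000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without procfs
    share = memmap_share(parse_meminfo(text))
    if share is None:
        print("Memmap field not present in this kernel")
    else:
        print(f"Memmap is {share:.2f}% of MemTotal")
```

On a kernel without the patch the Memmap line is absent, so the sketch
reports that instead of guessing a value.
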