From: Pasha Tatashin
Date: Thu, 2 Nov 2023 12:43:39 -0400
Subject: Re: [PATCH v5 1/1] mm: report per-page metadata information
To: David Hildenbrand
Cc: Wei Xu, Sourav Panda, corbet@lwn.net, gregkh@linuxfoundation.org,
    rafael@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com,
    muchun.song@linux.dev, rppt@kernel.org, rdunlap@infradead.org,
    chenlinxuan@uniontech.com, yang.yang29@zte.com.cn,
    tomas.mudrunka@gmail.com, bhelgaas@google.com, ivan@cloudflare.com,
    yosryahmed@google.com, hannes@cmpxchg.org, shakeelb@google.com,
    kirill.shutemov@linux.intel.com, wangkefeng.wang@huawei.com,
    adobriyan@gmail.com, vbabka@suse.cz, Liam.Howlett@oracle.com,
    surenb@google.com, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, willy@infradead.org, Greg Thelen
In-Reply-To: <99113dee-6d4d-4494-9eda-62b1faafdbae@redhat.com>

On Thu, Nov 2, 2023 at 12:09 PM David Hildenbrand wrote:
>
> On 02.11.23 17:02, Pasha Tatashin wrote:
> > On Thu, Nov 2, 2023 at 11:53 AM David Hildenbrand wrote:
> >>
> >> On 02.11.23 16:50, Pasha Tatashin wrote:
> >>>>> Adding reserved memory to MemTotal is a cleaner approach IMO as well.
> >>>>> But it changes the semantics of MemTotal, which may have compatibility
> >>>>> issues.
> >>>>
> >>>> I object.
> >>>
> >>> Could you please elaborate what you object (and why): you object that
> >>> it will have compatibility issues, or you object to include memblock
> >>> reserves into MemTotal?
> >>
> >> Sorry, I object to changing the semantics of MemTotal. MemTotal is
> >> traditionally the memory managed by the buddy, not all memory in the
> >> system. I know people/scripts that are relying on that [although it's
> >> been source of confusion a couple of times].
> >
> > What if one day we change so that struct pages are allocated from
> > buddy allocator (i.e. allocate deferred struct pages from buddy) will
>
> It does on memory hotplug. But for things like crashkernel size
> detection doesn't really care about that.

"Crash kernel" is a different case: it is memory external to this
kernel. Similar to limiting the amount of physical memory via
mem=/memmap=, it sets aside memory that cannot be used by this kernel,
only by the crash kernel. Also, the crash kernel reserve is exposed in
/proc/iomem via the "Crash kernel" range.

Page metadata memory, on the other hand, is used by this kernel, and
its size can also be changed by this kernel depending on how the memory
is used: memdesc, hotplug, THP, emulated pmem, etc.

> > it break those MemTotal scripts? What if the size of struct pages
> > changes significantly, but the overhead will come from other metadata
> > (i.e. memdesc) will that break those scripts? I feel like struct page
>
> Probably; but ideally the metadata overhead will be smaller with
> memdesc. And we'll talk about that once it gets real ;)

The size and allocation of struct pages change MemTotal today, at
runtime, even without memdesc. I only brought memdesc up to emphasize
that this is something we should resolve now, before it gets worse.
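
As a rough illustration of the /proc/iomem point above, here is a minimal
sketch of how a script could read MemTotal and the "Crash kernel" reserve
as two separate numbers. The helper names and the sample inputs are made
up for this example and are not part of the patch; on a real system the
text would come from /proc/meminfo and /proc/iomem (where unprivileged
reads may not show real addresses).

```python
import re


def parse_meminfo_kb(text, field):
    """Return the value of `field` in kB from /proc/meminfo-style text, or None."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    return None


def crash_kernel_bytes(iomem_text):
    """Sum the sizes of all "Crash kernel" ranges in /proc/iomem-style text."""
    total = 0
    for line in iomem_text.splitlines():
        m = re.match(r"\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : Crash kernel\s*$", line)
        if m:
            start, end = (int(x, 16) for x in m.groups())
            total += end - start + 1  # both ends of a range are inclusive
    return total


# Hypothetical sample inputs, invented for illustration only.
SAMPLE_MEMINFO = "MemTotal:       16291512 kB\nMemFree:         1234567 kB\n"
SAMPLE_IOMEM = (
    "00100000-bffeffff : System RAM\n"
    "  2b000000-3affffff : Crash kernel\n"
)

if __name__ == "__main__":
    print("MemTotal (kB):", parse_meminfo_kb(SAMPLE_MEMINFO, "MemTotal"))
    print("Crash kernel reserve (bytes):", crash_kernel_bytes(SAMPLE_IOMEM))
```

The sketch only shows that the crash kernel reserve is discoverable
outside /proc/meminfo; it says nothing about where page metadata should
be reported.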
> > memory should really be included into MemTotal, otherwise we will have
> > this struggle in the future when we try to optimize struct page
> > memory.
> How far do we want to go, do we want to include crashkernel reserved
> memory in MemTotal because it is system memory? Only metadata? what else
> allocated using memblock?
>
> Again, right now it's simple: MemTotal is memory managed by the buddy.
>
> The spirit of this patch set is good, modifying existing counters needs
> good justification.

Wei noticed that all other fields in /proc/meminfo are part of
MemTotal, but this new field may not be (depending on where struct
pages are allocated). So what would be the best way to export page
metadata without redefining MemTotal? Keep the new field in
/proc/meminfo and accept that it is not part of MemTotal, or use two
counters? If we use two counters, would we still keep the one that
counts buddy-allocated metadata in /proc/meminfo and put the other one
somewhere outside?

Pasha