Date: Mon, 28 Oct 2024 19:24:54 +0200
From: Mike Rapoport
To: Gregory Price
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
    linux-mm@kvack.org, linux-cxl@kvack.org, Jonathan.Cameron@huawei.com,
    dan.j.williams@intel.com, rrichter@amd.com, Terry.Bowman@amd.com,
    dave.jiang@intel.com, ira.weiny@intel.com, alison.schofield@intel.com,
    dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    rafael@kernel.org, lenb@kernel.org, david@redhat.com, osalvador@suse.de,
    gregkh@linuxfoundation.org, akpm@linux-foundation.org
Subject: Re: [PATCH v3 3/3] acpi,srat: give memory block size advice based on CFMWS alignment
References: <20241022213450.15041-1-gourry@gourry.net> <20241022213450.15041-4-gourry@gourry.net>
In-Reply-To: <20241022213450.15041-4-gourry@gourry.net>

On Tue, Oct 22, 2024 at 05:34:50PM -0400, Gregory Price wrote:
> Capacity is stranded when CFMWS regions are not aligned to block size.
> On x86, block size increases with capacity (2G blocks @ 64G capacity).
>
> Use CFMWS base/size to report memory block size alignment advice.
>
> After the alignment, the acpi code begins populating numa nodes with
> memblocks, so probe the value just prior to lock it in. All future
> callers should be providing advice prior to this point.
>
> Suggested-by: Dan Williams
> Signed-off-by: Gregory Price
> ---
>  drivers/acpi/numa/srat.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> index 44f91f2c6c5d..35e6f7c17f60 100644
> --- a/drivers/acpi/numa/srat.c
> +++ b/drivers/acpi/numa/srat.c
> @@ -14,6 +14,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -333,6 +334,29 @@ acpi_parse_memory_affinity(union acpi_subtable_headers *header,
>  	return 0;
>  }
>
> +/* Advise memblock on maximum block size to avoid stranded capacity. */
> +static int __init acpi_align_cfmws(union acpi_subtable_headers *header,
> +				   void *arg, const unsigned long table_end)
> +{
> +	struct acpi_cedt_cfmws *cfmws = (struct acpi_cedt_cfmws *)header;
> +	u64 start = cfmws->base_hpa;
> +	u64 size = cfmws->window_size;
> +	unsigned long bz;

Maybe unsigned long size?

> +
> +	for (bz = SZ_64T; bz >= SZ_256M; bz >>= 1) {
> +		if (IS_ALIGNED(start, bz) && IS_ALIGNED(size, bz))
> +			break;
> +	}
> +
> +	if (bz >= SZ_256M) {
> +		if (memory_block_advise_max_size(bz) < 0)
> +			pr_warn("CFMWS: memblock size advise failed\n");
> +	} else

Nit: braces are needed for the else arm as well.

> +		pr_err("CFMWS: [BIOS BUG] base/size alignment violates spec\n");
> +
> +	return 0;
> +}
> +
>  static int __init acpi_parse_cfmws(union acpi_subtable_headers *header,
>  				   void *arg, const unsigned long table_end)
>  {
> @@ -545,6 +569,15 @@ int __init acpi_numa_init(void)
>  	 * Initialize a fake_pxm as the first available PXM to emulate.
>  	 */
>
> +	/* Align memblock size to CFMW regions if possible */
> +	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_align_cfmws, NULL);
> +
> +	/*
> +	 * Nodes start populating with blocks after this, so probe the max
> +	 * block size to prevent it from changing in the future.
> +	 */
> +	memory_block_probe_max_size();
> +

It won't change, but how will drivers/base/memory.c know about the probed
size if the architecture does not override memory_block_size_bytes()?

>  	/* fake_pxm is the next unused PXM value after SRAT parsing */
>  	for (i = 0, fake_pxm = -1; i < MAX_NUMNODES; i++) {
>  		if (node_to_pxm_map[i] > fake_pxm)
> --
> 2.43.0
>

--
Sincerely yours,
Mike.
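
For readers who want to try the alignment search outside the kernel, here is a minimal userspace sketch of the loop in the quoted hunk. It is not the patch itself. The helper names cfmws_block_size() and report_block_size() are hypothetical, the SZ_* and IS_ALIGNED definitions are local stand-ins for the kernel macros, and nothing here calls memory_block_advise_max_size(). The reporting helper braces both arms of the if/else, which is what the nit above asks for.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the kernel's size and alignment macros. */
#define SZ_256M (1ULL << 28)
#define SZ_64T  (1ULL << 46)
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * Hypothetical helper: return the largest power-of-two block size in
 * [256M, 64T] that divides both the window base and the window size,
 * or 0 if even 256M alignment is not met.
 */
static uint64_t cfmws_block_size(uint64_t base, uint64_t size)
{
	uint64_t bz;

	for (bz = SZ_64T; bz >= SZ_256M; bz >>= 1) {
		if (IS_ALIGNED(base, bz) && IS_ALIGNED(size, bz))
			return bz;
	}

	return 0;
}

/* Hypothetical reporting helper with braces on both arms. */
static void report_block_size(uint64_t bz)
{
	if (bz) {
		printf("advise max block size: 0x%" PRIx64 "\n", bz);
	} else {
		fprintf(stderr, "base/size not aligned to 256M\n");
	}
}
```

With these definitions, a window at base 4G with size 16G yields a 4G block size, a base of 4G + 256M drops the result to 256M, and a base of 4G + 128M fails the search and returns 0.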