Date: Mon, 14 Oct 2024 18:40:00 -0400
From: Gregory Price
To: David Hildenbrand
Cc: linux-cxl@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, osalvador@suse.de, gregkh@linuxfoundation.org, rafael@kernel.org, akpm@linux-foundation.org, dan.j.williams@intel.com, Jonathan.Cameron@huawei.com, alison.schofield@intel.com, rrichter@amd.com, terry.bowman@amd.com, lenb@kernel.org, dave.jiang@intel.com, ira.weiny@intel.com
Subject: Re: [PATCH 1/3] memory: extern memory_block_size_bytes and set_memory_block_size_order
References: <20241008044355.4325-1-gourry@gourry.net> <20241008044355.4325-2-gourry@gourry.net> <039e8c87-c5da-4469-b10e-e57dd5662cff@redhat.com> <2c854e5e-c200-4ed9-bf21-778779af7e5b@redhat.com>
 <01fbdcef-b923-4bb0-80b0-f1d3e57fe515@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <01fbdcef-b923-4bb0-80b0-f1d3e57fe515@redhat.com>

On Mon, Oct 14, 2024 at 10:32:36PM +0200, David Hildenbrand wrote:
> On 14.10.24 16:25, Gregory Price wrote:
> > On Mon, Oct 14, 2024 at 01:54:27PM +0200, David Hildenbrand wrote:
> > > On 08.10.24 17:21, Gregory Price wrote:
> > > > On Tue, Oct 08, 2024 at 05:02:33PM +0200, David Hildenbrand wrote:
> > > > > On 08.10.24 16:51, Gregory Price wrote:
> > > > > > > > +int __weak set_memory_block_size_order(unsigned int order)
> > > > > > > > +{
> > > > > > > > +	return -ENODEV;
> > > > > > > > +}
> > > > > > > > +EXPORT_SYMBOL_GPL(set_memory_block_size_order);
> > > > > > >
> > > > > > > I can understand what you are trying to achieve, but letting arbitrary
> > > > > > > modules mess with this sounds like a bad idea.
> > > > > > >
> > > > > >
> > > > > > I suppose the alternative is trying to scan the CEDT from inside each
> > > > > > machine, rather than the ACPI driver? Seems less maintainable.
> > > > > >
> > > > > > I don't entirely disagree with your comment. I hummed and hawwed over
> > > > > > externing this - hence the warning in the x86 machine.
> > > > > >
> > > > > > Open to better answers.
> > > > >
> > > > > Maybe an interface to add more restrictions on the maximum size might be
> > > > > better (instead of setting the size/order, you would impose another upper
> > > > > limit).
> > > >
> > > > That is effectively what set_memory_block_size_order is, though. Once
> > > > blocks are exposed to the allocators, it's no longer safe to change the
> > > > size (in part because it was built assuming it wouldn't change, but I
> > > > imagine there are other dragons waiting in the shadows to bite me).
> > >
> > > Yes, we must run very early.
> > >
> > > How is this supposed to interact with code like
> > >
> > > set_block_size()
> > >
> > > that also calls set_memory_block_size_order() on UV systems (assuming there
> > > will be CXL support sooner or later?)?
> > >
> > >
> >
> > Tying the other email to this one - just clarifying the way forward here.
> >
> > It sounds like you're saying at a minimum drop EXPORT tags to prevent
> > modules from calling it - but it also sounds like built-ins need to be
> > prevented from touching it as well after a certain point in early boot.
>
> Right, at least the EXPORT is not required.
>
> >
> > Do you think I should go down the advise() path as suggested by Ira,
> > just adding an arch_lock_blocksize() bit and have set_..._order check it,
> > or should we just move towards each architecture having to go through
> > the ACPI:CEDT itself?
>
> Let's summarize what we currently have on x86:
>
> 1) probe_memory_block_size()
>
> Triggered on first memory_block_size_bytes() invocation. Makes a decision
> based on:
>
> a) Already set size using set_memory_block_size_order()
> b) RAM size
> c) Bare metal vs. virt (bare metal -> use max)
> d) Virt: largest block size aligned to memory end
>
>
> 2) set_memory_block_size_order()
>
> Triggered by set_block_size() on UV systems.
>
>
> I don't think set_memory_block_size_order() is the right tool to use. We
> just want to leave that alone I think -- it's a direct translation of a
> kernel cmdline parameter that should win.
>
> You essentially want to tweak the b)->d) logic to take other alignment into
> consideration.
>
> Maybe have some simple callback mechanism in probe_memory_block_size() that can
> consult other sources for alignment requirements?
>

Thanks for this - I'll cobble something together. Probably this ends up
falling out similar to what Ira suggested.

drivers/acpi/numa/srat.c
    acpi_numa_init():
        order = parse_cfwm(...)
        memblock_advise_size(order);

drivers/base/memory.c
    static atomic_t memblock_size_order = ATOMIC_INIT(0); /* let arch choose */
    int memblock_advise_size(order)
        int old_order;
        int new_order;
        if (order <= 0)
            return -EINVAL;

        do {
            old_order = atomic_read(&memblock_size_order);
            /* 0 means no advice yet, so take the first suggestion as-is */
            new_order = old_order ? MIN(old_order, order) : order;
        } while (atomic_cmpxchg(&memblock_size_order, old_order, new_order) != old_order);

        /* memblock_size_order is now <= order, if -1 then the probe won */
        return new_order;

    int memblock_probe_size()
        return atomic_xchg(&memblock_size_order, -1);

drivers/base/memblock.h
    #ifdef HOTPLUG
    export memblock_advise_size()
    export memblock_probe_size()
    #else
    static memblock_advise_size() { return -ENODEV; } /* always fail */
    static memblock_probe_size() { return 0; } /* arch chooses */
    #endif

arch/*/mm/...
    probe_block_size():
        memblock_probe_size();
        /* select minimum across above suggested values */

> If that's not an option, then another way to set further min-alignment
> requirements (whereby we take MIN(old_align, new_align))?
>
> --
> Cheers,
>
> David / dhildenb
>
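
For illustration only, here is a user-space model of the advise/probe
handshake outlined above, written with C11 atomics rather than the kernel's
atomic_t helpers. The sentinel macro names, the -EBUSY return for advice that
arrives after the probe, and treating 0 as "unset" rather than as a minimum
are assumptions made for this sketch, not something settled in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/*
 * Hypothetical model of the shared state:
 *    0 -> no advice recorded yet (arch chooses)
 *   -1 -> the probe has run; the block size is final
 *   >0 -> smallest order advised so far
 */
#define MEMBLOCK_ORDER_UNSET	0
#define MEMBLOCK_ORDER_LOCKED	(-1)

static atomic_int memblock_size_order = MEMBLOCK_ORDER_UNSET;

/* Record an upper bound on the block-size order; it can only be lowered. */
int memblock_advise_size(int order)
{
	int old_order, new_order;

	if (order <= 0)
		return -EINVAL;

	old_order = atomic_load(&memblock_size_order);
	do {
		if (old_order == MEMBLOCK_ORDER_LOCKED)
			return -EBUSY;	/* assumption: late advice is an error */

		/* First advice is taken as-is, later advice only lowers it. */
		new_order = (old_order == MEMBLOCK_ORDER_UNSET ||
			     order < old_order) ? order : old_order;

		/* On failure, old_order is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&memblock_size_order,
					       &old_order, new_order));

	assert(new_order > 0 && new_order <= order);
	return new_order;
}

/* Return the advised order (0 if none) and lock out further advice. */
int memblock_probe_size(void)
{
	return atomic_exchange(&memblock_size_order, MEMBLOCK_ORDER_LOCKED);
}
```

In this model an arch probe would call memblock_probe_size() exactly once and
treat a positive result as an upper bound on the order it picks; 0 means
nobody advised, so the existing arch logic applies unchanged.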