Date: Thu, 27 Feb 2025 12:20:03 +0900
From: Honggyu Kim
To: Joshua Hahn, gourry@gourry.net, harry.yoo@oracle.com, ying.huang@linux.alibaba.com
Cc: kernel_team@skhynix.com, gregkh@linuxfoundation.org, rakie.kim@sk.com, akpm@linux-foundation.org, rafael@kernel.org, lenb@kernel.org, dan.j.williams@intel.com, Jonathan.Cameron@huawei.com, dave.jiang@intel.com, horen.chuang@linux.dev, hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com, yunjeong.mun@sk.com
Subject: Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for memoryless nodes
References: <20250226213518.767670-1-joshua.hahnjy@gmail.com> <20250226213518.767670-2-joshua.hahnjy@gmail.com>

On 2/27/2025 11:32 AM, Honggyu Kim wrote:
> Hi Joshua,
>
> On 2/27/2025 6:35 AM, Joshua Hahn wrote:
>> We should never try to allocate memory from a memoryless node. Creating a
>> sysfs knob to control its weighted interleave weight does not make sense,
>> and can be unsafe.
>>
>> Only create weighted interleave weight knobs for nodes with memory.
>>
>> Signed-off-by: Joshua Hahn
>> ---
>>   mm/mempolicy.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 4cc04ff8f12c..50cbb7c047fa 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -3721,7 +3721,7 @@ static int add_weighted_interleave_group(struct kobject *root_kobj)
>>           return err;
>>       }
>> -    for_each_node_state(nid, N_POSSIBLE) {
>
> Actually, we're aware of this issue and currently trying to fix this.
> In our system, we've attached 4ch of CXL memory for each socket as
> follows.
>
>         node0             node1
>       +-------+   UPI   +-------+
>       | CPU 0 |-+-----+-| CPU 1 |
>       +-------+         +-------+
>       | DRAM0 |         | DRAM1 |
>       +---+---+         +---+---+
>           |                 |
>       +---+---+         +---+---+
>       | CXL 0 |         | CXL 4 |
>       +---+---+         +---+---+
>       | CXL 1 |         | CXL 5 |
>       +---+---+         +---+---+
>       | CXL 2 |         | CXL 6 |
>       +---+---+         +---+---+
>       | CXL 3 |         | CXL 7 |
>       +---+---+         +---+---+
>         node2             node3
>
> The 4ch of CXL memory are detected as a single NUMA node in each socket,
> but it shows as follows with the current N_POSSIBLE loop.
>
> $ ls /sys/kernel/mm/mempolicy/weighted_interleave/
> node0 node1 node2 node3 node4 node5
> node6 node7 node8 node9 node10 node11
>
>> +    for_each_node_state(nid, N_MEMORY) {

Thinking about it again, we can leave this as a separate patch and add our
patch on top of it.

My only concern is that applying the N_MEMORY patch alone hides the weight
knobs for CXL memory, which leaves no way to set weight values for CXL
memory on my system.

IMHO, it would be better to submit this patch and ours together.

Thanks,
Honggyu

>
> But using N_MEMORY doesn't fix this problem and it hides the entire CXL
> memory nodes in our system because the CXL memory isn't detected at this
> point of creating node*.  Maybe there is some difference when multiple
> CXL memory is detected as a single node.
>
> We have to create more nodes when CXL memory is detected later.  In
> addition, this part can be changed to "for_each_online_node(nid)"
> although N_MEMORY is also fine here.
>
> We've internally fixed it using a memory hotpluging callback so we can
> upload another working version later.
>
> Do you mind if we continue fixing this work?
>
> Thanks,
> Honggyu
>
>>           err = add_weight_node(nid, wi_kobj);
>>           if (err) {
>>               pr_err("failed to add sysfs [node%d]\n", nid);
>
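
For illustration only, the behaviour discussed in this thread can be modelled
with a small userspace sketch. It assumes that an init-time pass creates knobs
only for nodes that already have memory, and that a hotplug-style callback adds
a knob when memory shows up on a node later. Every name here (MAX_NODES,
node_has_memory, knob_exists, add_weight_knob, init_weight_knobs,
memory_online, count_knobs) is a hypothetical stand-in, not a kernel API and
not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 12 /* hypothetical: number of possible nodes (node0..node11) */

/* Hypothetical userspace stand-ins for kernel state; not kernel APIs. */
static bool node_has_memory[MAX_NODES]; /* models the N_MEMORY node state */
static bool knob_exists[MAX_NODES];     /* models weighted_interleave/nodeN */

/* Create the knob for one node. Calling it twice is harmless. */
static void add_weight_knob(int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    knob_exists[nid] = true;
}

/* Init-time pass: only nodes that already have memory get a knob. */
static void init_weight_knobs(void)
{
    for (int nid = 0; nid < MAX_NODES; nid++) {
        if (node_has_memory[nid])
            add_weight_knob(nid);
    }
}

/* Models a hotplug notification: memory on node nid came online later. */
static void memory_online(int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    node_has_memory[nid] = true;
    add_weight_knob(nid);
}

/* Count how many knobs currently exist. */
static int count_knobs(void)
{
    int n = 0;

    for (int nid = 0; nid < MAX_NODES; nid++) {
        if (knob_exists[nid])
            n++;
    }
    return n;
}
```

In this model, marking only node0 and node1 as having memory before
init_weight_knobs() leaves the CXL nodes without knobs, which is the concern
raised above. Calling memory_online(2) and memory_online(3) afterwards adds
them, which is the role the hotplug callback is expected to play.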