Date: Thu, 6 Mar 2025 12:32:42 -0500
From: Gregory Price
To: Honggyu Kim
Cc: kernel_team@skhynix.com, Joshua Hahn, harry.yoo@oracle.com,
	ying.huang@linux.alibaba.com, gregkh@linuxfoundation.org,
	rakie.kim@sk.com, akpm@linux-foundation.org, rafael@kernel.org,
	lenb@kernel.org, dan.j.williams@intel.com,
	Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
	horen.chuang@linux.dev, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-mm@kvack.org, kernel-team@meta.com, yunjeong.mun@sk.com
Subject: Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for memoryless nodes
References: <20250226213518.767670-1-joshua.hahnjy@gmail.com>
	<20250226213518.767670-2-joshua.hahnjy@gmail.com>

On Thu, Mar 06, 2025 at 09:39:26PM +0900, Honggyu Kim wrote:
>
> The memoryless nodes are printed as follows after those ACPI, SRAT,
> Node N PXM M messages.
>
> [ 0.010927] Initmem setup node 0 [mem 0x0000000000001000-0x000000207effffff]
> [ 0.010930] Initmem setup node 1 [mem 0x0000060f80000000-0x0000064f7fffffff]
> [ 0.010992] Initmem setup node 2 as memoryless
> [ 0.011055] Initmem setup node 3 as memoryless
> [ 0.011115] Initmem setup node 4 as memoryless
> [ 0.011177] Initmem setup node 5 as memoryless
> [ 0.011238] Initmem setup node 6 as memoryless
> [ 0.011299] Initmem setup node 7 as memoryless
> [ 0.011361] Initmem setup node 8 as memoryless
> [ 0.011422] Initmem setup node 9 as memoryless
> [ 0.011484] Initmem setup node 10 as memoryless
> [ 0.011544] Initmem setup node 11 as memoryless
>
> This is related why the 12 nodes at sysfs knobs are provided with the
> current N_POSSIBLE loop.
>

This isn't actually why, this is another symptom. This gets printed
because someone is marking nodes 4-11 as possible and setup_nr_node_ids
reports 12 total nodes:

	void __init setup_nr_node_ids(void)
	{
		unsigned int highest;

		highest = find_last_bit(node_possible_map.bits, MAX_NUMNODES);
		nr_node_ids = highest + 1;
	}

Given your configuration data so far, we may have a bug somewhere
(or I'm missing a configuration piece).

> > Basically I need to know:
> > 1) Is each CXL device on a dedicated Host Bridge?
> > 2) Is inter-host-bridge interleaving configured?
> > 3) Is intra-host-bridge interleaving configured?
> > 4) Do SRAT entries exist for all nodes?
>
> Are there some simple commands that I can get those info?
>

The content of the CEDT would be sufficient - that will show us
the number of CXL host bridges.

> > 5) Why are there 12 nodes but only 10 sources? Are there additional
> > devices left out of your diagram? Are there 2 CFMWS but and 8 Memory
> > Affinity records - resulting in 10 nodes? This is strange.
>
> My blind guess is that there could be a logic node that combines 4ch of
> CXL memory so there are 5 nodes per each socket. Adding 2 nodes for
> local CPU/DRAM makes 12 nodes in total.
>

The issue is that nodes have associated memory regions. If there are
multiple nodes with overlapping memory regions, that seems problematic.

If there are "possible nodes" without memory and no real use case
(because the memory is associated with the aggregate node) then those
nodes probably shouldn't be reported as possible.

The tl;dr here is we should figure out what is marking those nodes
as possible.

> Not sure about this part but our approach with hotplug_memory_notifier()
> resolves this problem. Rakie will submit an initial working patchset
> soonish.

This may just be a bandaid on the issue. We should get our node
configuration correct from the get-go.

~Gregory
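
As a footnote to the setup_nr_node_ids() point above: the node count is
purely "highest possible node ID plus one", so a single stray bit in
node_possible_map is enough to report 12 nodes. Below is a minimal
userspace sketch of that arithmetic. It is not kernel code: the demo_*
names, struct demo_nodemask, and the 64-node limit are all made up for
illustration, and the helpers only imitate find_last_bit() and the
kernel's nodemask closely enough to show the effect.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for MAX_NUMNODES and nodemask_t (assumptions). */
#define DEMO_MAX_NUMNODES  64
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS \
	((DEMO_MAX_NUMNODES + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_nodemask {
	unsigned long bits[DEMO_LONGS];
};

/* Mark one node ID as "possible" in the mask. */
static void demo_node_set(struct demo_nodemask *mask, unsigned int node)
{
	assert(node < DEMO_MAX_NUMNODES);
	mask->bits[node / DEMO_BITS_PER_LONG] |=
		1UL << (node % DEMO_BITS_PER_LONG);
}

/* Return nonzero if the given node ID is set in the mask. */
static int demo_node_isset(const struct demo_nodemask *mask,
			   unsigned int node)
{
	return (mask->bits[node / DEMO_BITS_PER_LONG] >>
		(node % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Imitates find_last_bit(): highest set bit, or size if none is set. */
static unsigned int demo_find_last_bit(const struct demo_nodemask *mask,
				       unsigned int size)
{
	for (unsigned int i = size; i-- > 0; )
		if (demo_node_isset(mask, i))
			return i;
	return size;
}

/* Same arithmetic as setup_nr_node_ids(): highest possible ID plus one. */
static unsigned int demo_nr_node_ids(const struct demo_nodemask *possible)
{
	return demo_find_last_bit(possible, DEMO_MAX_NUMNODES) + 1;
}
```

With nodes 0 and 1 set, demo_nr_node_ids() gives 2. Setting node 11 as
well, even with nodes 2-10 clear, gives 12, which is why the count alone
does not tell us how many of those nodes are backed by anything.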