Date: Thu, 13 Mar 2025 13:30:58 -0400
From: Gregory Price
To: Jonathan Cameron
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM] CXL Boot to Bash - Section 0: ACPI and Linux Resources
References: <20250313165539.000001f4@huawei.com>
In-Reply-To: <20250313165539.000001f4@huawei.com>

On Thu, Mar 13, 2025 at 04:55:39PM +0000, Jonathan Cameron wrote:
>
> Maybe ignore Generic Initiators for this doc.  They are relevant for
> CXL but in the fabric they only matter for type 1 / 2 devices not
> memory and only if the BIOS wants to do HMAT for end to end.  Gets
> more fun when they are in the host side of the root bridge.
>

Fair, I wanted to reference the proposals but I personally don't have a
strong understanding of this yet. Dave Jiang mentioned wanting to write
some info on CDAT with some reference to the Generic Port work as well.

Some help understanding this a little better would be very much
appreciated, but I like your summary below. Noted for updated version.

> # Generic Port
>
> In the scenario where CXL memory devices are not present at boot, or
> not configured by the BIOS, or the BIOS has not provided full HMAT
> descriptions for the configured memory, we may still want to
> generate proximity domain configurations for those devices.
> The Generic Port structures are intended to fill this gap, so
> that performance information can still be utilized when the
> devices are available at runtime by combining host information
> with that discovered from devices.
>
> Or just
> # Generic Ports
>
> These are fun ;)
>
> >
> > ====
> > HMAT
> > ====
> > The Heterogeneous Memory Attributes Table contains information such as
> > cache attributes and bandwidth and latency details for memory proximity
> > domains.
> > For the purpose of this document, we will only discuss the
> > SLLBI entry.
>
> No fun.  You miss Intel's extensions to memory-side caches ;)
> (which is wise!)
>

Yes yes, but I'm trying to be nice.

I'm debating on writing the Section 4 interleave addendum on Zen5 too :P

> > ==================
> > NUMA node creation
> > ==================
> > NUMA nodes are *NOT* hot-pluggable.  All *POSSIBLE* NUMA nodes are
> > identified at `__init` time, more specifically during `mm_init`.
> >
> > What this means is that the CEDT and SRAT must contain sufficient
> > `proximity domain` information for Linux to identify how many NUMA
> > nodes are required (and what memory regions to associate with them).
>
> Is it worth talking about what is effectively a constraint of the spec
> and what is a current Linux constraint?
>
> SRAT is the only ACPI-defined way of getting Proximity nodes.  Linux
> chooses to at most map those 1:1 with NUMA nodes.
> CEDT adds on description of SPA ranges where there might be memory that
> Linux might want to map to 1 or more NUMA nodes.
>

Rather than asking if it's worth talking about, I'll spin that around
and ask what value the distinction adds. The source of the constraint
seems less relevant than "All nodes must be defined during mm_init by
something - be it ACPI or CXL source data".

Maybe if this turns into a book, it's worth breaking it out for
referential purposes (pointing to each point in each spec).

> >
> > Basically, the heuristic is as follows:
> > 1) Add one NUMA node per Proximity Domain described in SRAT
> >    if it contains memory, CPU or generic initiator.
>

noted

> > 2) If the SRAT describes all memory described by all CFMWS
> >    - do not create nodes for CFMWS
> > 3) If SRAT does not describe all memory described by CFMWS
> >    - create a node for that CFMWS
> >
> > Generally speaking, you will see one NUMA node per Host bridge, unless
> > inter-host-bridge interleave is in use (see Section 4 - Interleave).
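
For illustration, the three-step heuristic quoted above can be sketched
in a few lines of Python. This is not the kernel's implementation: the
names (Range, covered, count_numa_nodes) and the input shapes are
hypothetical, and reading "SRAT describes all memory described by CFMWS"
as full address-range coverage is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Hypothetical physical address range: start inclusive, end exclusive."""
    start: int
    end: int


def covered(window, srat_ranges):
    """Return True if the SRAT memory ranges cover the window with no gap."""
    pos = window.start
    for r in sorted(srat_ranges, key=lambda r: r.start):
        if r.end <= pos:
            continue          # range lies entirely below the uncovered part
        if r.start > pos:
            return False      # gap before this range begins
        pos = r.end
        if pos >= window.end:
            return True
    return pos >= window.end


def count_numa_nodes(srat_domains, cfmws_windows):
    """srat_domains maps a proximity domain id to its memory ranges
    (an empty list stands for a CPU-only or generic-initiator-only domain)."""
    # Step 1: one node per proximity domain described in SRAT.
    nodes = len(srat_domains)
    all_ranges = [r for ranges in srat_domains.values() for r in ranges]
    for window in cfmws_windows:
        # Step 2: fully described by SRAT, so no extra node.
        # Step 3: otherwise, one extra node for this CFMWS.
        if not covered(window, all_ranges):
            nodes += 1
    return nodes


# Two SRAT domains; the first CFMWS is described by SRAT, the second is not.
domains = {0: [Range(0x0000, 0x1000)], 1: [Range(0x1000, 0x2000)]}
windows = [Range(0x1000, 0x2000), Range(0x4000, 0x5000)]
print(count_numa_nodes(domains, windows))  # 3
```

The second window has no SRAT coverage, so it gets its own node on top
of the two SRAT-defined ones.
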
>
> I just love corners: QoS concerns might mean multiple CFMWS and hence
> multiple nodes per host bridge (feel free to ignore this one - has
> anyone seen this in the wild yet?)  Similar mess for properties such
> as persistence, sharing etc.

This actually came up as a result of me writing this - this does exist
in the wild and is causing all kinds of fun on the weighted_interleave
functionality.

I plan to come back and add this as an addendum, but probably not until
after LSF.

We'll probably want to expand this into a library of case studies that
cover these different choices - in hopes of getting some set of
*suggested* configurations for platform vendors to help play nice with
Linux (especially for things that actually consume these blasted nodes).

~Gregory