From: Gregory Price <gourry@gourry.net>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	dave.jiang@intel.com, Jonathan.Cameron@huawei.com,
	horenchuang@bytedance.com, linux-kernel@vger.kernel.org,
	linux-acpi@vger.kernel.org, dan.j.williams@intel.com,
	lenb@kernel.org, "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: Re: [PATCH] acpi/hmat,mm/memtier: always register hmat adist calculation callback
Date: Tue, 30 Jul 2024 02:12:27 -0400
Message-ID: <ZqiES1T6PTQHD2Bl@PC2K9PVX.TheFacebook.com>
In-Reply-To: <Zqh3-TWBkhyY5kPw@PC2K9PVX.TheFacebook.com>

On Tue, Jul 30, 2024 at 01:19:53AM -0400, Gregory Price wrote:
> On Tue, Jul 30, 2024 at 09:12:55AM +0800, Huang, Ying wrote:
> > > Right now HMAT appears to be used prescriptively, this despite the fact
> > > that there was a clear intent to separate CPU-nodes and non-CPU-nodes in
> > > the memory-tier code. So this patch simply realizes this intent when the
> > > hints are not very reasonable.
> > 
> > If HMAT isn't available, it's hard to put memory devices to
> > appropriate memory tiers without other information.  In commit
> > 992bf77591cb ("mm/demotion: add support for explicit memory tiers"),
> > Aneesh pointed out that it doesn't work for his system to put
> > non-CPU-nodes in lower tier.
> > 
> 
> Per Aneesh in 992bf77591cb - The code explicitly states the intent is
> to put non-CPU-nodes in a lower tier by default.
> 
> 
>     The current implementation puts all nodes with CPU into the highest
>     tier, and builds the tier hierarchy by establishing the per-node
>     demotion targets based on the distances between nodes.
> 
> This is accurate for the current code
> 
> 
>     The current tier initialization code always initializes each
>     memory-only NUMA node into a lower tier.
> 
> This is *broken* for the currently upstream code.
> 
> This appears to be the result of the hmat adistance callback introduction
> (though it may have been broken before that).
> 
> ~Gregory

Digging into the history further, for the sake of completeness:

6c542ab ("mm/demotion: build demotion targets based on ...")

    mm/demotion: build demotion targets based on explicit memory tiers

    This patch switch the demotion target building logic to use memory
    tiers instead of NUMA distance.  All N_MEMORY NUMA nodes will be placed
    in the default memory tier and additional memory tiers will be added by
    drivers like dax kmem.

The decision made in this patch breaks memory-tiers.c for all
BIOS-configured CXL devices that generate a DRAM node during early boot
but for which HMAT is absent or otherwise broken: those nodes land in
the default tier alongside CPU-attached DRAM. The new HMAT code only
addresses the case where HMAT is present.

Hardware supporting this style of configuration has been around for at
least a few years now. I think we should at the very least consider
adding an option to restore the (!N_CPU)=Lower Tier behavior, if not
make that behavior the default when HMAT data is not present.
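
As a rough illustration of the fallback being proposed, here is a
user-space sketch. It is not the memory-tiers.c code: struct node_info,
pick_adistance(), tier_of() and the cpuless_lower switch are
hypothetical names, and the constants are only modelled on the
MEMTIER_* definitions in include/linux/memory-tiers.h, so treat their
values as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants (assumed values, modelled on memory-tiers.h). */
#define CHUNK_BITS     7
#define CHUNK_SIZE     (1 << CHUNK_BITS)
#define ADISTANCE_DRAM ((4 * CHUNK_SIZE) + (CHUNK_SIZE >> 1))

/* Hypothetical per-node summary; not a kernel structure. */
struct node_info {
	bool has_cpu;     /* node would be in N_CPU */
	bool hmat_valid;  /* HMAT supplied usable performance data */
	int  hmat_adist;  /* abstract distance derived from HMAT, if valid */
};

/*
 * Pick an abstract distance for a node. A larger value means a slower
 * (lower) tier. When HMAT is usable it wins; otherwise a CPU-less node
 * is pushed one chunk below DRAM if the (hypothetical) option is set.
 */
int pick_adistance(const struct node_info *n, bool cpuless_lower)
{
	assert(n);
	if (n->hmat_valid)
		return n->hmat_adist;
	if (!n->has_cpu && cpuless_lower)
		return ADISTANCE_DRAM + CHUNK_SIZE;
	return ADISTANCE_DRAM;
}

/* Map an abstract distance to a tier index (one tier per chunk). */
int tier_of(int adist)
{
	return adist >> CHUNK_BITS;
}
```

With the option set, a CPU-less node without usable HMAT data gets a
larger abstract distance than DRAM and so falls into the next tier
down; with it clear, the node shares the default tier with CPU nodes,
which is the behavior described above.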

~Gregory

