Re: [RFC PATCH v3 0/4] Node Weights and Weighted Interleave

linux-mm.kvack.org archive mirror

From: "Huang, Ying" <ying.huang@intel.com>
To: Ravi Jonnalagadda <ravis.opensrc@micron.com>
Cc: <akpm@linux-foundation.org>,  <aneesh.kumar@linux.ibm.com>,
	<apopple@nvidia.com>,  <dave.hansen@intel.com>,
	<gourry.memverge@gmail.com>,  <gregkh@linuxfoundation.org>,
	<gregory.price@memverge.com>,  <hannes@cmpxchg.org>,
	<linux-cxl@vger.kernel.org>,  <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>,  <mhocko@suse.com>,  <rafael@kernel.org>,
	<shy828301@gmail.com>,  <tim.c.chen@intel.com>,
	 <weixugc@google.com>
Subject: Re: [RFC PATCH v3 0/4] Node Weights and Weighted Interleave
Date: Fri, 03 Nov 2023 15:00:18 +0800
Message-ID: <87o7gbz5h9.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <20231102093542.70-1-ravis.opensrc@micron.com> (Ravi Jonnalagadda's message of "Thu, 2 Nov 2023 15:05:42 +0530")

Ravi Jonnalagadda <ravis.opensrc@micron.com> writes:

> Whether the node based interleave solution should be considered complex
> probably depends on the number of NUMA nodes present in the system and on
> whether we are able to set up the default weights correctly to obtain
> optimum bandwidth expansion.

Node based interleave is more complex than tier based interleave,
because in general you have fewer tiers than nodes.
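
To make the comparison concrete, here is a minimal sketch in plain Python
(not kernel code). The topology, the weight values, and the helper names
expand_tier_weights() and weights_to_program() are all hypothetical. It
also assumes that a tier's weight is simply applied to every node in that
tier, which is only one possible reading of the tier based proposal.

```python
# Hypothetical topology: NUMA node id -> memory tier id.
# Two DRAM nodes share tier 0; four CXL nodes share tier 1.
NODE_TO_TIER = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}


def expand_tier_weights(tier_weights, node_to_tier):
    """Assumption: every node inherits the weight programmed for its tier."""
    return {node: tier_weights[tier] for node, tier in node_to_tier.items()}


def weights_to_program(node_to_tier):
    """Return (tier based, node based) number of weights per initiator."""
    return len(set(node_to_tier.values())), len(node_to_tier)


tier_weights = {0: 3, 1: 1}  # illustrative values only
node_weights = expand_tier_weights(tier_weights, NODE_TO_TIER)
per_tier, per_node = weights_to_program(NODE_TO_TIER)
print(f"tier based: {per_tier} weights, node based: {per_node} weights")
```

With this made-up topology, the tier based interface needs two weights per
initiator, while the node based one needs six.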

>>
>>> Pros and Cons of Memory Tier based interleave:
>>> Pros:
>>> 1. Programming weight per initiator would apply for all the nodes in the tier.
>>> 2. Weights can be calculated considering the cumulative bandwidth of all
>>> the nodes in the tier and need to be programmed once for all the nodes in a
>>> given tier.
>>> 3. It may be useful in cases where numa nodes with similar latency and bandwidth
>>> characteristics increase, possibly with pooling use cases.
>>
>>4. simpler.
>>
>>> Cons:
>>> 1. If nodes with different bandwidth and latency characteristics are placed
>>> in same tier as seen in the current mainline kernel, it will be difficult to
>>> apply a correct interleave weight policy.
>>> 2. There will be a need for functionality to move nodes between different tiers
>>> or create new tiers to place such nodes for programming correct interleave weights.
>>> We are working on a patch to support it currently.
>>
>>Thanks!  If we have such system, we will need this.
>>
>>> 3. For systems where each numa node is having different characteristics,
>>> a single node might end up existing in different memory tier, which would be
>>> equivalent to node based interleaving.
>>
>>No.  A node can only exist in one memory tier.
>
> Sorry for the confusion. What I meant was: if each node has different
> characteristics, then to program the memory tier weights correctly we need
> to place each node in a separate tier of its own. So each memory tier will
> contain only a single node and the solution would resemble node based
> interleaving.
>
>>
>>> On newer systems where all CXL memory from different devices under a
>>> port are combined to form single numa node, this scenario might be
>>> applicable.
>>
>>You mean the different memory ranges of a NUMA node may have different
>>performance?  I don't think that we can deal with this.
>
> Example Configuration: On a server that we are using now, four different
> CXL cards are combined to form a single NUMA node and two other cards are
> exposed as two individual numa nodes.
> So if we have the ability to combine multiple CXL memory ranges to a
> single NUMA node the number of NUMA nodes in the system would potentially
> decrease even if we can't combine the entire range to form a single node.

Sorry, I misunderstood you.  Yes, it's possible that there is one
tier for each node in some systems.  But I guess we will have fewer
tiers than nodes in general.
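
The one-tier-per-node case can be sketched as follows, again in plain
Python rather than kernel code. The node ids, weights, and the
interleave_order() helper are hypothetical, and the weighted round-robin
shown here is only an assumed placement order, not the behaviour of the
patch set. The point is just that when every tier holds exactly one node,
the tier weights and the node weights are the same table.

```python
def interleave_order(node_weights, pages):
    """Weighted round-robin: place `weight` pages on each node in id order."""
    if pages > 0 and sum(node_weights.values()) <= 0:
        raise ValueError("at least one node needs a positive weight")
    order = []
    nodes = sorted(node_weights)
    while len(order) < pages:
        for node in nodes:
            for _ in range(node_weights[node]):
                if len(order) == pages:
                    return order
                order.append(node)
    return order


# Hypothetical system in which every node sits in a tier of its own.
node_to_tier = {0: 0, 1: 1, 2: 2}
tier_weights = {0: 2, 1: 1, 2: 1}
node_weights = {0: 2, 1: 1, 2: 1}

# Expanding the tier weights yields exactly the node based weights.
from_tiers = {node: tier_weights[tier] for node, tier in node_to_tier.items()}
print(from_tiers == node_weights)
print(interleave_order(from_tiers, 8))
```

In that degenerate layout the two interfaces carry the same information,
so neither is simpler than the other.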

--
Best Regards,
Huang, Ying

>>
>>> 4. Users may need to keep track of different memory tiers and what nodes are present
>>> in each tier for invoking interleave policy.
>>
>>I don't think this is a con.  With node based solution, you need to know
>>your system too.
>>
>>>>
>>>>> Could you elaborate on the 'get what you pay for' usecase you
>>>>> mentioned?
>>>>
>>
>>--
>>Best Regards,
>>Huang, Ying
> --
> Best Regards,
> Ravi Jonnalagadda
