Re: [LSF/MM TOPIC] Tiered memory accounting and management

linux-mm.kvack.org archive mirror

From: Yang Shi <shy828301@gmail.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: lsf-pc@lists.linux-foundation.org, Linux MM <linux-mm@kvack.org>,
	 Michal Hocko <mhocko@suse.com>,
	Dan Williams <dan.j.williams@intel.com>,
	 Dave Hansen <dave.hansen@intel.com>,
	Shakeel Butt <shakeelb@google.com>,
	 David Rientjes <rientjes@google.com>
Subject: Re: [LSF/MM TOPIC] Tiered memory accounting and management
Date: Tue, 15 Jun 2021 17:17:33 -0700
Message-ID: <CAHbLzkqNWn7ONEC=V9z18aWB34rS-q2banDUM=OYU0B=4t91Xw@mail.gmail.com>
In-Reply-To: <475cbc62-a430-2c60-34cc-72ea8baebf2c@linux.intel.com>

On Mon, Jun 14, 2021 at 2:51 PM Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
>
> From: Tim Chen <tim.c.chen@linux.intel.com>
>
> Tiered memory accounting and management
> ------------------------------------------------------------
> Traditionally, all RAM is DRAM.  Some DRAM might be closer/faster
> than others, but a byte of media has about the same cost whether it
> is close or far.  With new memory tiers such as High-Bandwidth
> Memory or Persistent Memory, however, there is a choice between
> fast/expensive and slow/cheap.  The current memory cgroups still live
> in the old model: there is only one set of limits, which implies that all
> memory has the same cost.  We would like to extend memory cgroups to
> comprehend different memory tiers to give users a way to choose a mix
> between fast/expensive and slow/cheap.
>
> To manage such memory, we will need to account memory usage and
> impose limits for each kind of memory.
>
> A couple of approaches to partitioning memory between cgroups have
> been discussed previously; they are listed below.  We would like to
> use the LSF/MM session to come to a consensus on the approach to
> take.
>
> 1.      Per NUMA node limit and accounting for each cgroup.
> We can assign higher limits on better-performing memory nodes for higher-priority cgroups.
>
> There are some loose ends here that warrant further discussions:
> (1) A user-friendly interface for such limits.  Would a proportional
> weight for the cgroup that translates to an actual absolute limit be more suitable?
> (2) Memory misconfigurations can occur more easily, as the admin
> has a much larger number of limits, spread among the cgroups,
> to manage.  Over-restrictive limits can lead to underutilized
> and wasted memory and hurt performance.
> (3) OOM behavior when a cgroup hits its limit.
>
> 2.      Per memory tier limit and accounting for each cgroup.
> We can assign higher limits on memory in a better-performing
> memory tier for higher-priority cgroups.  I previously
> prototyped a soft-limit-based implementation to demonstrate the
> tiered limit idea.
>
> There are also a number of issues here:
> (1)     The advantage is that we have fewer limits to deal with, which
> simplifies configuration. However, a number of people have raised
> doubts about whether we can really properly classify the NUMA
> nodes into memory tiers. There could still be significant performance
> differences between NUMA nodes even for the same kind of memory.
> We will also not have the fine-grained control and flexibility that come
> with a per NUMA node limit.
> (2)     Will a memory hierarchy defined by the promotion/demotion relationships
> between memory nodes be a viable approach for defining memory tiers?
>
> These issues related to the management of systems with multiple kinds of memory
> can be ironed out in this session.

Thanks for suggesting this topic. I'm interested in the topic and
would like to attend.

Other than the above points, I'm wondering whether we should discuss
"Migrate Pages in lieu of discard" as well. Dave Hansen is driving the
development, and I have been involved in the early development and
review, but it seems there are still some open questions according to
the latest review feedback.

Some other folks may be interested in this topic too, so I have CC'ed
them on the thread.

>
