From: Gregory Price <gourry.memverge@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: linux-cxl@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
ying.huang@intel.com, akpm@linux-foundation.org,
mhocko@kernel.org, tj@kernel.org, lizefan.x@bytedance.com,
hannes@cmpxchg.org, corbet@lwn.net, roman.gushchin@linux.dev,
shakeelb@google.com, muchun.song@linux.dev,
Gregory Price <gregory.price@memverge.com>
Subject: [RFC PATCH v4 3/3] Documentation: sysfs entries for cgroup.memory.interleave_weights
Date: Wed, 8 Nov 2023 19:25:17 -0500
Message-ID: <20231109002517.106829-4-gregory.price@memverge.com>
In-Reply-To: <20231109002517.106829-1-gregory.price@memverge.com>

cgroup.memory.interleave_weights is an array of NUMA node weights
to be used for interleaving when mempolicy utilizes MPOL_F_IL_WEIGHTING.
By default, every weight is 1. Weights are displayed only for possible
NUMA nodes (ones which are or may become online).

Node weights are set individually, and by default are inherited from
the parent cgroup. Inherited weights may be overridden, and overridden
weights may be reverted to inherit from the parent.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
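As an illustration of the 3:1 example in the documentation below, here is
a minimal user-space sketch of weighted round-robin placement. It is not
the kernel implementation from patch 2/3: `parse_weights` and
`weighted_interleave` are hypothetical names, and the visiting order
within a round (ascending node id) is an assumption.

```python
from itertools import islice


def parse_weights(text):
    """Parse 'node:weight' pairs such as '0:3,1:1' into a dict."""
    weights = {}
    for pair in text.strip().split(","):
        node, weight = pair.split(":")
        weights[int(node)] = int(weight)
    return weights


def weighted_interleave(weights):
    """Yield node ids forever, visiting each node `weight` times per round."""
    if not any(weight > 0 for weight in weights.values()):
        raise ValueError("at least one node needs a positive weight")
    while True:
        for node in sorted(weights):  # assumed order: ascending node id
            for _ in range(weights[node]):
                yield node


weights = parse_weights("0:3,1:1")
placement = list(islice(weighted_interleave(weights), 8))
print(placement)  # [0, 0, 0, 1, 0, 0, 0, 1]
```

With all weights left at the default of 1, the same loop degenerates to
plain round-robin.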
Documentation/admin-guide/cgroup-v2.rst | 45 +++++++++++++++++++
.../admin-guide/mm/numa_memory_policy.rst | 11 +++++
2 files changed, 56 insertions(+)
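
The inheritance rules described in the cgroup-v2.rst hunk below can be
modelled with a small toy class. This is only a sketch of the documented
behaviour, not the memcg code from patch 1/3. The `Cgroup` class and its
methods are hypothetical. Two details are assumptions the documentation
does not spell out: that a first override starts from the inherited
values, and that the root cgroup rejects "-1:0".

```python
class Cgroup:
    """Toy model of one cgroup's memory.interleave_weights file."""

    def __init__(self, parent=None, nodes=(0, 1)):
        self.parent = parent
        # The root starts with the documented default weight of 1 per node.
        # A child starts with no local weights, meaning "inherit".
        self.local = None if parent is not None else {n: 1 for n in nodes}

    def effective(self):
        """Return a copy of the weights this cgroup currently uses."""
        if self.local is not None:
            return dict(self.local)
        return self.parent.effective()

    def write(self, text):
        """Accept one 'node:weight' pair, or '-1:0' to revert to inheriting."""
        node, weight = (int(field) for field in text.split(":"))
        if node == -1:
            if self.parent is None:
                # Assumption: the root has nothing to inherit from.
                raise ValueError("root cgroup cannot inherit")
            self.local = None
            return
        if node not in self.effective():
            raise ValueError(f"node {node} is not a possible node")
        if self.local is None:
            # Assumption: the first override starts from the inherited values.
            self.local = self.parent.effective()
        self.local[node] = weight

    def read(self):
        """Format the effective weights the way the file is shown when read."""
        pairs = sorted(self.effective().items())
        return ",".join(f"{node}:{weight}" for node, weight in pairs)


parent = Cgroup()
child = Cgroup(parent)
parent.write("0:3")
parent.write("1:1")
inherited = child.read()
child.write("0:5")
child.write("1:2")
overridden = child.read()
child.write("-1:0")
reverted = child.read()
print(inherited, overridden, reverted)  # 0:3,1:1 0:5,1:2 0:3,1:1
```

The sequence of writes mirrors the parent/child example in the
documentation hunk, and the three reads match the values shown there.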
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index b26b5274eaaf..273dbd01a7ec 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1640,6 +1640,51 @@ PAGE_SIZE multiple when read back.
Shows pressure stall information for memory. See
:ref:`Documentation/accounting/psi.rst <psi>` for details.
+ memory.interleave_weights
+ An array of weights to be used for the interleave mempolicy.
+
+ By default, every weight is 1. Weights are displayed only for
+ possible NUMA nodes (ones which are or may become online).
+
+ Example::
+
+ cat memory.interleave_weights
+ 0:1,1:1
+
+ Here both nodes 0 and 1 are set to weight 1. Node weights are
+ set individually.
+
+ Example::
+
+ echo "0:3" > memory.interleave_weights
+ echo "1:1" > memory.interleave_weights
+
+ Here we set a 3:1 ratio for nodes 0 and 1. Mempolicy will
+ allocate 3 pages on node 0 before allocating 1 page on node 1.
+
+ Child cgroups inherit weights from their parent. A child may
+ override them, and may later revert to inheriting the parent
+ weights by writing -1:0 to memory.interleave_weights.
+
+ Example::
+
+ echo "0:3" > parent/memory.interleave_weights
+ echo "1:1" > parent/memory.interleave_weights
+
+ # Child cgroup inherits these weights
+ cat parent/child/memory.interleave_weights
+ 0:3,1:1
+
+ # Override the weights
+ echo "0:5" > parent/child/memory.interleave_weights
+ echo "1:2" > parent/child/memory.interleave_weights
+ cat parent/child/memory.interleave_weights
+ 0:5,1:2
+
+ # Revert the child back to inheriting the parent weights
+ echo "-1:0" > parent/child/memory.interleave_weights
+ cat parent/child/memory.interleave_weights
+ 0:3,1:1
Usage Guidelines
~~~~~~~~~~~~~~~~
diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index eca38fa81e0f..7c82e38dbd2b 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -243,6 +243,17 @@ MPOL_INTERLEAVED
address range or file. During system boot up, the temporary
interleaved system default policy works in this mode.
+ The default interleave behavior is round-robin. Cgroups also
+ provide a memory.interleave_weights file which can be used to
+ change the interleave distribution. When weights are used,
+ the behavior above remains the same, but placement follows the
+ weights, so that successive allocations respect the configured
+ ratio. For example, if the weights for nodes 0 and 1 are
+ 3 and 1 respectively (0:3,1:1), then 3 pages will be allocated
+ on node 0 for every 1 page allocated on node 1.
+
+ For more details, see `Documentation/admin-guide/cgroup-v2.rst`.
+
MPOL_PREFERRED_MANY
This mode specifies that the allocation should be preferably
satisfied from the nodemask specified in the policy. If there is
--
2.39.1