From: Gregory Price <gourry.memverge@gmail.com>
To: linux-mm@kvack.org, jgroves@micron.com, ravis.opensrc@micron.com,
sthanneeru@micron.com, emirakhur@micron.com, Hasan.Maruf@amd.com
Cc: linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
arnd@arndb.de, tglx@linutronix.de, luto@kernel.org,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
x86@kernel.org, hpa@zytor.com, mhocko@kernel.org, tj@kernel.org,
ying.huang@intel.com, gregory.price@memverge.com, corbet@lwn.net,
rakie.kim@sk.com, hyeongtak.ji@sk.com, honggyu.kim@sk.com,
vtavarespetr@micron.com, peterz@infradead.org,
Frank van der Linden <fvdl@google.com>
Subject: [RFC PATCH 07/11] mm/mempolicy: add userland mempolicy arg structure
Date: Wed, 6 Dec 2023 19:27:55 -0500
Message-ID: <20231207002759.51418-8-gregory.price@memverge.com>
In-Reply-To: <20231207002759.51418-1-gregory.price@memverge.com>
This patch adds the new user-api argument structure intended for
set_mempolicy2, get_mempolicy2, and mbind2.
struct mpol_args {
	/* Basic mempolicy settings */
	unsigned short mode;
	unsigned short mode_flags;
	unsigned long *pol_nodes;
	unsigned long pol_maxnodes;

	/* get_mempolicy2: policy information (e.g. next interleave node) */
	int policy_node;

	/* get_mempolicy2: memory range policy */
	unsigned long addr;
	int addr_node;

	/* all operations: policy home node */
	unsigned long home_node;

	/* mbind2: address ranges to apply the policy */
	const struct iovec __user *vec;
	size_t vlen;
};
This structure is intended to be extensible as new mempolicy extensions
are added.
For example, set_mempolicy_home_node was added to allow vma mempolicies
to have a preferred/home node assigned. This structure allows the
addition of that setting at the time the mempolicy is set, rather
than requiring additional calls to modify the policy.
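
As a rough illustration of setting the home node together with the policy,
the sketch below fills a local copy of the proposed structure for an
MPOL_BIND policy. The struct is copied from this patch's header hunk (it is
not in any released uapi header), and EX_MPOL_BIND, nodemask_set() and
fill_bind_with_home() are hypothetical names used only for this example. No
system call is made, since set_mempolicy2 and mbind2 are only proposed here.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Local copy of the structure proposed in this patch (assumption: the
 * layout follows the include/uapi/linux/mempolicy.h hunk). */
struct mpol_args {
	unsigned short mode;
	unsigned short mode_flags;
	unsigned long *pol_nodes;
	unsigned long pol_maxnodes;
	int policy_node;
	unsigned long addr;
	int addr_node;
	int home_node;
	struct iovec *vec;
	size_t vlen;
};

/* Mirrors the value of MPOL_BIND in the existing uapi enum. */
enum { EX_MPOL_BIND = 2 };

#define EX_BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Set the bit for 'node' in a nodemask stored as an array of longs. */
static void nodemask_set(unsigned long *mask, unsigned int node)
{
	mask[node / EX_BITS_PER_ULONG] |= 1UL << (node % EX_BITS_PER_ULONG);
}

/* Describe "bind to the nodes in 'mask', with 'home' as the home node" in
 * one structure, instead of a policy call followed by a separate
 * set_mempolicy_home_node call. */
static void fill_bind_with_home(struct mpol_args *args, unsigned long *mask,
				unsigned long maxnodes, int home)
{
	memset(args, 0, sizeof(*args));
	args->mode = EX_MPOL_BIND;
	args->pol_nodes = mask;
	args->pol_maxnodes = maxnodes;
	args->home_node = home;
}
```

A caller would set the wanted nodes with nodemask_set() and then pass the
filled structure to whichever extended call applies.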
Another suggested extension is to allow mbind2 to operate on multiple
memory ranges with a single call. mbind presently operates on a single
(address, length) tuple. It was suggested that mbind2 should operate
on an iovec, which allows many memory ranges to have the same mempolicy
applied to them with a single system call.
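
To make the iovec idea concrete, the sketch below packs several
(address, length) tuples into a struct iovec array of the kind that the
vec/vlen fields would point at. ranges_to_iovec() is a hypothetical helper
written for this example only; nothing here calls mbind2.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Copy n (base, length) pairs into 'out' and return the total number of
 * bytes covered. The caller would then point mpol_args.vec at 'out' and
 * set mpol_args.vlen to n. */
static size_t ranges_to_iovec(void *const bases[], const size_t lens[],
			      size_t n, struct iovec *out)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		out[i].iov_base = bases[i];
		out[i].iov_len = lens[i];
		total += lens[i];
	}
	return total;
}
```

With mbind, each of these ranges would need its own system call; with the
proposed interface one call would cover the whole array.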
Full breakdown of arguments as of this patch:
mode:         Mempolicy mode (e.g. MPOL_DEFAULT, MPOL_INTERLEAVE)
mode_flags:   Flags previously or'd into mode in set_mempolicy
              (e.g. MPOL_F_STATIC_NODES, MPOL_F_RELATIVE_NODES)
pol_nodes:    Policy nodemask
pol_maxnodes: Max number of nodes in the policy nodemask
policy_node:  for get_mempolicy2. Returns extended information
              about a policy that was previously reported by
              passing MPOL_F_NODE to get_mempolicy. Instead of
              overriding the mode value, simply add a field.
addr:         for get_mempolicy2. Used with MPOL_F_ADDR to run
              get_mempolicy against the vma the address belongs
              to instead of the task.
addr_node:    for get_mempolicy2. Returns the node the address
              belongs to. Previously get_mempolicy() would
              override the output value of (mode) if MPOL_F_ADDR
              and MPOL_F_NODE were set. Instead, we extend
              mpol_args to do this by default if MPOL_F_ADDR is
              set and do away with MPOL_F_NODE.
home_node:    Policy home node, set at the time the mempolicy is
              set rather than by a separate call to
              set_mempolicy_home_node.
vec/vlen:     Used by mbind2 to apply the mempolicy to all
              address ranges described by the iovec.
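
The mode/mode_flags split can be sketched as follows: a value of the kind
passed to set_mempolicy today (mode with flags or'd in) is separated into
the two fields of the new structure. The EX_ names and split_legacy_mode()
are hypothetical and exist only for this example; the flag values mirror
the definitions in include/uapi/linux/mempolicy.h, and only the two flags
named above are handled.

```c
/* Mirror the existing uapi flag definitions. */
#define EX_MPOL_F_STATIC_NODES		(1 << 15)
#define EX_MPOL_F_RELATIVE_NODES	(1 << 14)
#define EX_MPOL_MODE_FLAGS \
	(EX_MPOL_F_STATIC_NODES | EX_MPOL_F_RELATIVE_NODES)

/* Mirrors the value of MPOL_INTERLEAVE in the existing uapi enum. */
enum { EX_MPOL_INTERLEAVE = 3 };

/* Split a legacy "mode | flags" value into separate mode and mode_flags
 * values, as they would be stored in struct mpol_args. */
static void split_legacy_mode(int legacy, unsigned short *mode,
			      unsigned short *mode_flags)
{
	*mode_flags = (unsigned short)(legacy & EX_MPOL_MODE_FLAGS);
	*mode = (unsigned short)(legacy & ~EX_MPOL_MODE_FLAGS);
}
```

For example, EX_MPOL_INTERLEAVE | EX_MPOL_F_STATIC_NODES splits into a mode
of EX_MPOL_INTERLEAVE and a mode_flags of EX_MPOL_F_STATIC_NODES.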
Suggested-by: Frank van der Linden <fvdl@google.com>
Suggested-by: Vinicius Tavares Petrucci <vtavarespetr@micron.com>
Suggested-by: Hasan Al Maruf <Hasan.Maruf@amd.com>
Signed-off-by: Gregory Price <gregory.price@memverge.com>
Co-developed-by: Vinicius Tavares Petrucci <vtavarespetr@micron.com>
Signed-off-by: Vinicius Tavares Petrucci <vtavarespetr@micron.com>
---
.../admin-guide/mm/numa_memory_policy.rst | 31 +++++++++++++++++++
include/uapi/linux/mempolicy.h | 18 +++++++++++
2 files changed, 49 insertions(+)
diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index b7b8d3dd420f..6d645519c2c1 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -488,6 +488,37 @@ closest to which page allocation will come from. Specifying the home node overri
the default allocation policy to allocate memory close to the local node for an
executing CPU.
+Extended Mempolicy Arguments::
+
+	struct mpol_args {
+		/* Basic mempolicy settings */
+		unsigned short mode;
+		unsigned short mode_flags;
+		unsigned long *pol_nodes;
+		unsigned long pol_maxnodes;
+
+		/* get_mempolicy2: policy node information */
+		int policy_node;
+
+		/* get_mempolicy2: memory range policy */
+		unsigned long addr;
+		int addr_node;
+
+		/* mbind2: policy home node */
+		unsigned long home_node;
+
+		/* mbind2: address ranges to apply the policy */
+		struct iovec *vec;
+		size_t vlen;
+	};
+
+The extended mempolicy argument structure allows the mempolicy interfaces to
+be extended in the future without the need for additional system calls.
+
+The core arguments (mode, mode_flags, pol_nodes, and pol_maxnodes) apply to
+all extended interfaces and keep the meaning they have in the non-extended
+counterparts. Each additional field may only apply to specific extended
+interfaces. See the respective extended interface man page for more details.
Memory Policy Command Line Interface
====================================
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 1f9bb10d1a47..e6b50903047c 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -27,6 +27,24 @@ enum {
MPOL_MAX, /* always last member of enum */
};
+struct mpol_args {
+	/* Basic mempolicy settings */
+	unsigned short mode;
+	unsigned short mode_flags;
+	unsigned long *pol_nodes;
+	unsigned long pol_maxnodes;
+	/* get_mempolicy2: policy node information */
+	int policy_node;
+	/* get_mempolicy2: memory range policy */
+	unsigned long addr;
+	int addr_node;
+	/* mbind2: policy home node */
+	int home_node;
+	/* mbind2: address ranges to apply the policy */
+	struct iovec *vec;
+	size_t vlen;
+};
+
/* Flags for set_mempolicy */
#define MPOL_F_STATIC_NODES (1 << 15)
#define MPOL_F_RELATIVE_NODES (1 << 14)
--
2.39.1