From: Gregory Price
To: linux-mm@kvack.org
Cc: linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	arnd@arndb.de, tglx@linutronix.de, luto@kernel.org,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, mhocko@kernel.org, tj@kernel.org,
	ying.huang@intel.com, gregory.price@memverge.com, corbet@lwn.net,
	rakie.kim@sk.com, hyeongtak.ji@sk.com, honggyu.kim@sk.com,
	vtavarespetr@micron.com, peterz@infradead.org, jgroves@micron.com,
	ravis.opensrc@micron.com, sthanneeru@micron.com,
	emirakhur@micron.com, Hasan.Maruf@amd.com, seungjun.ha@samsung.com,
	Michal Hocko, Frank van der Linden
Subject: [PATCH v2 10/11] mm/mempolicy: add the mbind2 syscall
Date: Sat, 9 Dec 2023 01:59:31 -0500
Message-Id: <20231209065931.3458-11-gregory.price@memverge.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20231209065931.3458-1-gregory.price@memverge.com>
References: <20231209065931.3458-1-gregory.price@memverge.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

mbind2 is an extensible mbind interface which allows a user to set the
mempolicy for one or more address ranges.

Defined as:

mbind2(struct iovec *vec, size_t vlen, struct mpol_args *args,
       size_t size, unsigned long flags)

Arguments:
    vec:   the vector of (address, len) memory ranges to operate on
    vlen:  the number of entries in vec
    args:  the extended mempolicy arguments (struct mpol_args)
    size:  the size of the structure passed in args
    flags: the mbind flags (see below)

Input values include the following fields of mpol_args:
    mode:         The MPOL_* policy (DEFAULT, INTERLEAVE, etc.)
    mode_flags:   The MPOL_F_* flags that mbind() takes or'd into the
                  mode.  They are split out to leave room for future
                  mode and flag extensions.
    pol_nodes:    the nodemask to apply for the memory policy
    pol_maxnodes: The max number of nodes described by pol_nodes
    home_node:    if MPOL_MF_HOME_NODE is set in flags, the home node
                  to set for the policy

The semantics are otherwise the same as mbind(), except that the
home_node can be set, and all address ranges defined by vec/vlen will
be operated on.
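
As a rough userspace illustration of the vec/vlen semantics (a sketch
only, not the kernel code), the helper below walks the ranges in order
and stops at the first failure, mirroring how the loop in this patch
breaks out when do_mbind() returns an error.  The names for_each_range,
range_fn, and reject_null are hypothetical and exist only for this
example; reject_null merely stands in for the per-range policy call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical per-range callback standing in for the kernel's do_mbind(). */
typedef int (*range_fn)(void *start, size_t len, void *ctx);

/*
 * Apply fn to each (address, len) entry of vec, in order.
 * Stops at the first failing entry and returns its error code;
 * *applied reports how many entries succeeded before that point.
 */
static int for_each_range(const struct iovec *vec, size_t vlen,
			  range_fn fn, void *ctx, size_t *applied)
{
	size_t i;
	int err = 0;

	*applied = 0;

	/* Same argument check as the syscall: an empty vector is invalid. */
	if (!vec || !vlen)
		return -EINVAL;

	for (i = 0; i < vlen; i++) {
		err = fn(vec[i].iov_base, vec[i].iov_len, ctx);
		if (err)
			break;
		(*applied)++;
	}

	return err;
}

/*
 * Illustrative callback (an assumption, not kernel behaviour): fail on a
 * NULL address, otherwise add the range length to the counter in ctx.
 */
static int reject_null(void *start, size_t len, void *ctx)
{
	if (!start)
		return -EFAULT;
	*(size_t *)ctx += len;
	return 0;
}
```

With three entries whose second entry has a NULL address,
for_each_range() returns -EFAULT with one entry applied.  In this
sketch the first range has already been processed by then and nothing
is rolled back.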

Valid flags for mbind2 include the same flags as mbind, plus
MPOL_MF_HOME_NODE, which tells the syscall to use the value of
mpol_args->home_node to set the mempolicy home node.

Suggested-by: Michal Hocko
Suggested-by: Frank van der Linden
Suggested-by: Vinicius Tavares Petrucci
Suggested-by: Rakie Kim
Suggested-by: Hyeongtak Ji
Suggested-by: Honggyu Kim
Signed-off-by: Gregory Price
Co-developed-by: Vinicius Tavares Petrucci
---
 .../admin-guide/mm/numa_memory_policy.rst     | 12 +++-
 arch/alpha/kernel/syscalls/syscall.tbl        |  1 +
 arch/arm/tools/syscall.tbl                    |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl         |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl     |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl     |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl       |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl      |  1 +
 arch/s390/kernel/syscalls/syscall.tbl         |  1 +
 arch/sh/kernel/syscalls/syscall.tbl           |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl        |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl        |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl       |  1 +
 include/linux/syscalls.h                      |  3 +
 include/uapi/asm-generic/unistd.h             |  4 +-
 include/uapi/linux/mempolicy.h                |  5 +-
 mm/mempolicy.c                                | 68 +++++++++++++++++++
 19 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index a52624ab659a..f1ba33de3a6e 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -475,12 +475,18 @@ Install VMA/Shared Policy for a Range of Task's Address Space::
 	long mbind(void *start, unsigned long len, int mode,
 		   const unsigned long *nmask, unsigned long maxnode,
 		   unsigned flags);
+	long mbind2(struct iovec *vec, size_t vlen, struct mpol_args *args,
+		    size_t size, unsigned long flags);
 
 mbind() installs the policy specified by (mode, nmask, maxnodes) as a
 VMA policy for the range of the calling task's address space specified
 by the 'start' and 'len' arguments.  Additional actions may be
 requested via the 'flags' argument.
 
+mbind2() is an extended version of mbind() capable of operating on multiple
+memory ranges in one syscall, and which is capable of setting the home node
+for the memory policy without an additional call to set_mempolicy_home_node().
+
 See the mbind(2) man page for more details.
 
 Set home node for a Range of Task's Address Spacec::
@@ -496,6 +502,9 @@ closest to which page allocation will come from. Specifying the home node overri
 the default allocation policy to allocate memory close to the local node for an
 executing CPU.
 
+mbind2() also provides a way for the home node to be set at the time the
+mempolicy is set. See the mbind(2) man page for more details.
+
 Extended Mempolicy Arguments::
 
 	struct mpol_args {
@@ -512,7 +521,8 @@ Extended Mempolicy Arguments::
 The extended mempolicy argument structure is defined to allow the mempolicy
 interfaces future extensibility without the need for additional system calls.
 
-Extended interfaces (set_mempolicy2 and get_mempolicy2) use this structure.
+Extended interfaces (set_mempolicy2, get_mempolicy2, and mbind2) use this
+argument structure.
 
 The core arguments (mode, mode_flags, pol_nodes, and pol_maxnodes) apply to
 all interfaces relative to their non-extended counterparts.
Each additional
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 0301a8b0a262..e8239293c35a 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -498,3 +498,4 @@
 566	common	futex_requeue	sys_futex_requeue
 567	common	set_mempolicy2	sys_set_mempolicy2
 568	common	get_mempolicy2	sys_get_mempolicy2
+569	common	mbind2	sys_mbind2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 771a33446e8e..a3f39750257a 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -472,3 +472,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 048a409e684c..9a12dface18e 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -458,3 +458,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 327b01bd6793..6cb740123137 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -464,3 +464,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 921d58e1da23..52cf720f8ae2 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -397,3 +397,4 @@
 456	n32	futex_requeue	sys_futex_requeue
 457	n32	set_mempolicy2	sys_set_mempolicy2
 458	n32	get_mempolicy2	sys_get_mempolicy2
+459	n32	mbind2	sys_mbind2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 9271c83c9993..fd37c5301a48 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -446,3 +446,4 @@
 456	o32	futex_requeue	sys_futex_requeue
 457	o32	set_mempolicy2	sys_set_mempolicy2
 458	o32	get_mempolicy2	sys_get_mempolicy2
+459	o32	mbind2	sys_mbind2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 0654f3f89fc7..fcd67bc405b1 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -457,3 +457,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index ac11d2064e7a..89715417014c 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -545,3 +545,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 1cdcafe1ccca..c8304e0d0aa7 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
 456	common	futex_requeue	sys_futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2	sys_mbind2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index f71742024c29..e5c51b6c367f 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -461,3 +461,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 2fbf5dbe0620..74527f585500 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -504,3 +504,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 0af813b9a118..be2e2aa17dd8 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -463,3 +463,4 @@
 456	i386	futex_requeue	sys_futex_requeue
 457	i386	set_mempolicy2	sys_set_mempolicy2
 458	i386	get_mempolicy2	sys_get_mempolicy2
+459	i386	mbind2	sys_mbind2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 0b777876fc15..6e2347eb8773 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -380,6 +380,7 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 4536c9a4227d..f00a21317dc0 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -429,3 +429,4 @@
 456	common	futex_requeue	sys_futex_requeue
 457	common	set_mempolicy2	sys_set_mempolicy2
 458	common	get_mempolicy2	sys_get_mempolicy2
+459	common	mbind2	sys_mbind2
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 774512b7934e..487dd9155b25 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -816,6 +816,9
@@ asmlinkage long sys_mbind(unsigned long start, unsigned long len,
 				const unsigned long __user *nmask,
 				unsigned long maxnode,
 				unsigned flags);
+asmlinkage long sys_mbind2(const struct iovec __user *vec, size_t vlen,
+			   const struct mpol_args __user *uargs, size_t usize,
+			   unsigned long flags);
 asmlinkage long sys_get_mempolicy(int __user *policy,
 				unsigned long __user *nmask,
 				unsigned long maxnode,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 719accc731db..cd31599bb9cc 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -832,9 +832,11 @@ __SYSCALL(__NR_futex_requeue, sys_futex_requeue)
 __SYSCALL(__NR_set_mempolicy2, sys_set_mempolicy2)
 #define __NR_get_mempolicy2 458
 __SYSCALL(__NR_get_mempolicy2, sys_get_mempolicy2)
+#define __NR_mbind2 459
+__SYSCALL(__NR_mbind2, sys_mbind2)
 
 #undef __NR_syscalls
-#define __NR_syscalls 459
+#define __NR_syscalls 460
 
 /*
  * 32 bit systems traditionally used different
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 00a673e30047..506ea0f8f34e 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -56,13 +56,14 @@ struct mpol_args {
 #define MPOL_F_ADDR	(1<<1)	/* look up vma using address */
 #define MPOL_F_MEMS_ALLOWED (1<<2) /* return allowed memories */
 
-/* Flags for mbind */
+/* Flags for mbind/mbind2 */
 #define MPOL_MF_STRICT	(1<<0)	/* Verify existing pages in the mapping */
 #define MPOL_MF_MOVE	 (1<<1)	/* Move pages owned by this process to conform
 				   to policy */
 #define MPOL_MF_MOVE_ALL (1<<2)	/* Move every page to conform to policy */
 #define MPOL_MF_LAZY	 (1<<3)	/* UNSUPPORTED FLAG: Lazy migrate on fault */
-#define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
+#define MPOL_MF_HOME_NODE (1<<4)	/* mbind2: set home node */
+#define MPOL_MF_INTERNAL (1<<5)	/* Internal flags start here */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
 			 MPOL_MF_MOVE     | 	\
diff --git a/mm/mempolicy.c
b/mm/mempolicy.c
index cfe22156ef13..8f609204fbe7 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1600,6 +1600,74 @@ SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
 	return kernel_mbind(start, len, mode, nmask, maxnode, flags);
 }
 
+SYSCALL_DEFINE5(mbind2, const struct iovec __user *, vec, size_t, vlen,
+		const struct mpol_args __user *, uargs, size_t, usize,
+		unsigned long, flags)
+{
+	struct mpol_args kargs;
+	struct mempolicy_args margs;
+	nodemask_t policy_nodes;
+	unsigned long __user *nodes_ptr;
+	struct iovec iovstack[UIO_FASTIOV];
+	struct iovec *iov = iovstack;
+	struct iov_iter iter;
+	int err;
+
+	if (!vec || !vlen)
+		return -EINVAL;
+
+	err = copy_struct_from_user(&kargs, sizeof(kargs), uargs, usize);
+	if (err)
+		return -EINVAL;
+
+	err = validate_mpol_flags(kargs.mode, &kargs.mode_flags);
+	if (err)
+		return err;
+
+	margs.mode = kargs.mode;
+	margs.mode_flags = kargs.mode_flags;
+	margs.addr = kargs.addr;
+
+	/* if home node given, validate it is online */
+	if (flags & MPOL_MF_HOME_NODE) {
+		if ((kargs.home_node >= MAX_NUMNODES) ||
+			!node_online(kargs.home_node))
+			return -EINVAL;
+		margs.home_node = kargs.home_node;
+	} else
+		margs.home_node = NUMA_NO_NODE;
+	flags &= ~MPOL_MF_HOME_NODE;
+
+	if (kargs.pol_nodes) {
+		nodes_ptr = u64_to_user_ptr(kargs.pol_nodes);
+		err = get_nodes(&policy_nodes, nodes_ptr,
+				kargs.pol_maxnodes);
+		if (err)
+			return err;
+		margs.policy_nodes = &policy_nodes;
+	} else
+		margs.policy_nodes = NULL;
+
+	/* For each address range in vector, do_mbind */
+	err = import_iovec(ITER_DEST, vec, vlen, ARRAY_SIZE(iovstack), &iov,
+			   &iter);
+	if (err < 0)
+		return err;
+	while (iov_iter_count(&iter)) {
+		unsigned long start, len;
+
+		start = untagged_addr((unsigned long)iter_iov_addr(&iter));
+		len = iter_iov_len(&iter);
+		err = do_mbind(start, len, &margs, flags);
+		if (err)
+			break;
+		iov_iter_advance(&iter, iter_iov_len(&iter));
+	}
+
+	kfree(iov);
+	return err;
+}
+
 /* Set the process memory policy */
 static long kernel_set_mempolicy(int mode, const unsigned long __user *nmask,
 				 unsigned long maxnode)
-- 
2.39.1