From: Zhongkun He <hezhongkun.hzk@bytedance.com>
To: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: corbet@lwn.net, mhocko@suse.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	wuyun.abel@bytedance.com
Subject: Re: [External] Re: [RFC] mm: add new syscall pidfd_set_mempolicy()
Date: Wed, 12 Oct 2022 16:18:16 +0800
Message-ID: <ff8d0223-a3da-35a4-8bde-f19d93e06f35@bytedance.com>
In-Reply-To: <Y0Y/oGToVk3ags7h@debian.me>

> On Mon, Oct 10, 2022 at 05:48:42PM +0800, Zhongkun He wrote:
>> There is usecase that System Management Software(SMS) want to give a
>> memory policy to other processes to make better use of memory.
>>
> 
> Better say "There are usecases when system management utilities
> want to apply memory policy to processes to make better use of memory".
> 
>> The information about how to use memory is not known to the app.
>> Instead, it is known to the userspace daemon(SMS), and that daemon
>> will decide the memory usage policy based on different factors.
>>
> 
> Better say "These utilities don't set the memory usage policy
> themselves; rather, the job of reporting memory usage and setting the
> policy is offloaded to a userspace daemon."
> 
>> To solve the issue, this patch introduces a new syscall
>> pidfd_set_mempolicy(2).  it sets the NUMA memory policy of the thread
>> specified in pidfd.
>>
> 
> Better say "To solve the issue above, introduce a new syscall,
> pidfd_set_mempolicy(2). The syscall sets the NUMA memory policy for the
> thread specified in pidfd".
> 
>> In current process context there is no locking because only the process
>> accesses its own memory policy, so task_work is used in
>> pidfd_set_mempolicy() to update the mempolicy of the process specified
>> in pidfd, avoid using locks and race conditions.
>>
> 
> Better say "In current process context there is no locking because
> only processes access their own memory policy. For this reason, task_work
> is used in pidfd_set_mempolicy() to set or update the mempolicy of the
> process specified in pidfd. Thus, it avoids race conditions."
> 
>> The API is as follows,
>>
>> 		long pidfd_set_mempolicy(int pidfd, int mode,
>>                                       const unsigned long __user *nmask,
>>                                       unsigned long maxnode,
>>                                       unsigned int flags);
>>
>> Set's the [pidfd] task's "task/process memory policy". The pidfd argument
>> is a PID file descriptor (see pidfd_open(2) man page) that specifies the
>> process to which the mempolicy is to be applied. The flags argument is
>> reserved for future use; currently, this argument must be specified as 0.
>> Please see the set_mempolicy(2) man page for more details about
>> other's arguments.
>>
> 
> Why duplicate the text from Documentation/ below?
> 
>> Suggested-by: Michal Hocko <mhocko@suse.com>
>> Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
>> ---
>>   .../admin-guide/mm/numa_memory_policy.rst     | 21 ++++-
>>   arch/alpha/kernel/syscalls/syscall.tbl        |  1 +
>>   arch/arm/tools/syscall.tbl                    |  1 +
>>   arch/arm64/include/asm/unistd.h               |  2 +-
>>   arch/arm64/include/asm/unistd32.h             |  3 +-
>>   arch/ia64/kernel/syscalls/syscall.tbl         |  1 +
>>   arch/m68k/kernel/syscalls/syscall.tbl         |  1 +
>>   arch/microblaze/kernel/syscalls/syscall.tbl   |  1 +
>>   arch/mips/kernel/syscalls/syscall_n32.tbl     |  1 +
>>   arch/mips/kernel/syscalls/syscall_n64.tbl     |  1 +
>>   arch/mips/kernel/syscalls/syscall_o32.tbl     |  1 +
>>   arch/parisc/kernel/syscalls/syscall.tbl       |  1 +
>>   arch/powerpc/kernel/syscalls/syscall.tbl      |  1 +
>>   arch/s390/kernel/syscalls/syscall.tbl         |  1 +
>>   arch/sh/kernel/syscalls/syscall.tbl           |  1 +
>>   arch/sparc/kernel/syscalls/syscall.tbl        |  1 +
>>   arch/x86/entry/syscalls/syscall_32.tbl        |  1 +
>>   arch/x86/entry/syscalls/syscall_64.tbl        |  1 +
>>   arch/xtensa/kernel/syscalls/syscall.tbl       |  1 +
>>   include/linux/mempolicy.h                     | 11 +++
>>   include/linux/syscalls.h                      |  4 +
>>   include/uapi/asm-generic/unistd.h             |  5 +-
>>   kernel/sys_ni.c                               |  1 +
>>   mm/mempolicy.c                                | 89 +++++++++++++++++++
>>   24 files changed, 146 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
>> index 5a6afecbb0d0..b864dd88b2d2 100644
>> --- a/Documentation/admin-guide/mm/numa_memory_policy.rst
>> +++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
>> @@ -408,9 +408,10 @@ follows:
>>   Memory Policy APIs
>>   ==================
>>   
>> -Linux supports 4 system calls for controlling memory policy.  These APIS
>> -always affect only the calling task, the calling task's address space, or
>> -some shared object mapped into the calling task's address space.
>> +Linux supports 5 system calls for controlling memory policy.  The first four
>> +APIS affect only the calling task, the calling task's address space, or some
>> +shared object mapped into the calling task's address space. The last one can
>> +set the mempolicy of task specified in pidfd.
>>   
>>   .. note::
>>      the headers that define these APIs and the parameter data types for
>> @@ -473,6 +474,20 @@ closest to which page allocation will come from. Specifying the home node overri
>>   the default allocation policy to allocate memory close to the local node for an
>>   executing CPU.
>>   
>> +Set [pidfd Task] Memory Policy::
>> +
>> +        long sys_pidfd_set_mempolicy(int pidfd, int mode,
>> +                                     const unsigned long __user *nmask,
>> +                                     unsigned long maxnode,
>> +                                     unsigned int flags);
>> +
>> +Set's the [pidfd] task's "task/process memory policy". The pidfd argument is
>> +a PID file descriptor (see pidfd_open(2) man page) that specifies the process
>> +to which the mempolicy is to be applied. The flags argument is reserved for
>> +future use; currently, this argument must be specified as 0. Please see the
>> +set_mempolicy(2) man page for more details about other's arguments.
>> +
>> +
>>   
>>   Memory Policy Command Line Interface
>>   ====================================
> 
> The wording can be improved:
> 
> ---- >8 ----
> 
> diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
> index b864dd88b2d236..6df35bf4f960bd 100644
> --- a/Documentation/admin-guide/mm/numa_memory_policy.rst
> +++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
> @@ -410,8 +410,8 @@ Memory Policy APIs
>   
>   Linux supports 5 system calls for controlling memory policy.  The first four
>   APIS affect only the calling task, the calling task's address space, or some
> -shared object mapped into the calling task's address space. The last one can
> -set the mempolicy of task specified in pidfd.
> +shared object mapped into the calling task's address space. The last one
> +sets the mempolicy of task specified in the pidfd.
>   
>   .. note::
>      the headers that define these APIs and the parameter data types for
> @@ -481,11 +481,11 @@ Set [pidfd Task] Memory Policy::
>                                        unsigned long maxnode,
>                                        unsigned int flags);
>   
> -Set's the [pidfd] task's "task/process memory policy". The pidfd argument is
> -a PID file descriptor (see pidfd_open(2) man page) that specifies the process
> -to which the mempolicy is to be applied. The flags argument is reserved for
> -future use; currently, this argument must be specified as 0. Please see the
> -set_mempolicy(2) man page for more details about other's arguments.
> +Sets the task/process memory policy for the [pidfd] task. The pidfd argument
> +is a PID file descriptor (see pidfd_open(2) man page for details) that
> +specifies the process for which the mempolicy is applied to. The flags
> +argument is reserved for future use; currently, it must be specified as 0.
> +For the description of all other arguments, see set_mempolicy(2) man page.
>   
>   
>   
> 
> Thanks.
> 

Hi Bagas,

Got it, thanks for your suggestions.
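
A minimal sketch of how userspace might call the interface quoted above,
assuming the RFC were applied. This is an illustration, not part of the
patch: the syscall number is a deliberate placeholder (the RFC has no
stable number, so -1 is used and the call simply fails with ENOSYS rather
than hitting an unrelated syscall), and the example_* names are
hypothetical. Only the argument order and the "flags must be 0" rule come
from the patch description; pidfd_open(2) is an existing syscall.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder number. The RFC does not define a stable syscall
 * number, so -1 is used on purpose; the kernel rejects it with ENOSYS.
 */
#define EXAMPLE_NR_PIDFD_SET_MEMPOLICY (-1L)

/* Fallback for old headers; pidfd_open(2) is 434 on most architectures. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/* Value of MPOL_BIND in <linux/mempolicy.h>, repeated to stay self-contained. */
#define EXAMPLE_MPOL_BIND 2

/* Hypothetical wrapper mirroring the argument order proposed in the RFC. */
long example_pidfd_set_mempolicy(int pidfd, int mode,
                                 const unsigned long *nmask,
                                 unsigned long maxnode,
                                 unsigned int flags)
{
    return syscall(EXAMPLE_NR_PIDFD_SET_MEMPOLICY, pidfd, mode, nmask,
                   maxnode, flags);
}

/*
 * Hypothetical helper: ask for the task identified by pid to be bound to
 * NUMA node 0. Returns 0 on success, -1 with errno set on failure.
 */
long example_bind_pid_to_node0(pid_t pid)
{
    unsigned long nodemask = 1UL;   /* bit 0 set: node 0 only */
    long ret;
    int saved_errno;
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

    if (pidfd < 0)
        return -1;

    /* flags is reserved for future use and must be 0 per the description. */
    ret = example_pidfd_set_mempolicy(pidfd, EXAMPLE_MPOL_BIND, &nodemask,
                                      8 * sizeof(nodemask), 0);

    /* Keep the errno of the policy call, not of close(). */
    saved_errno = errno;
    close(pidfd);
    errno = saved_errno;
    return ret;
}
```

With the placeholder number the helper always returns -1; on a kernel
carrying the patch, the placeholder would have to be replaced by the number
that kernel actually assigns.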

