Date: Wed, 12 Oct 2022 16:18:16 +0800
Subject: Re: [External] Re: [RFC] mm: add new syscall pidfd_set_mempolicy()
To: Bagas Sanjaya
Cc: corbet@lwn.net, mhocko@suse.com, akpm@linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-doc@vger.kernel.org, wuyun.abel@bytedance.com
References: <20221010094842.4123037-1-hezhongkun.hzk@bytedance.com>
From: Zhongkun He

> On Mon, Oct 10, 2022 at
05:48:42PM +0800, Zhongkun He wrote:
>> There is usecase that System Management Software(SMS) want to give a
>> memory policy to other processes to make better use of memory.
>>
>
> Better say "There are usecases when system management utilities
> want to apply memory policy to processes to make better use of memory".
>
>> The information about how to use memory is not known to the app.
>> Instead, it is known to the userspace daemon(SMS), and that daemon
>> will decide the memory usage policy based on different factors.
>>
>
> Better say "These utilities don't set memory usage policy, but
> rather the job of reporting memory usage and setting the policy is
> offloaded to a userspace daemon."
>
>> To solve the issue, this patch introduces a new syscall
>> pidfd_set_mempolicy(2). it sets the NUMA memory policy of the thread
>> specified in pidfd.
>>
>
> Better say "To solve the issue above, introduce a new syscall
> pidfd_set_mempolicy(2). The syscall sets the NUMA memory policy for the
> thread specified in pidfd".
>
>> In current process context there is no locking because only the process
>> accesses its own memory policy, so task_work is used in
>> pidfd_set_mempolicy() to update the mempolicy of the process specified
>> in pidfd, avoid using locks and race conditions.
>>
>
> Better say "In current process context there is no locking because
> only processes access their own memory policy. For this reason, task_work
> is used in pidfd_set_mempolicy() to set or update the mempolicy of the
> process specified in pidfd. Thus, it avoids race conditions."
>
>> The API is as follows,
>>
>> long pidfd_set_mempolicy(int pidfd, int mode,
>>                          const unsigned long __user *nmask,
>>                          unsigned long maxnode,
>>                          unsigned int flags);
>>
>> Set's the [pidfd] task's "task/process memory policy". The pidfd argument
>> is a PID file descriptor (see pidfd_open(2) man page) that specifies the
>> process to which the mempolicy is to be applied. The flags argument is
>> reserved for future use; currently, this argument must be specified as 0.
>> Please see the set_mempolicy(2) man page for more details about
>> other's arguments.
>>
>
> Why duplicate this text from the Documentation/ change below?
>
>> Suggested-by: Michal Hocko
>> Signed-off-by: Zhongkun He
>> ---
>>  .../admin-guide/mm/numa_memory_policy.rst   | 21 ++++-
>>  arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
>>  arch/arm/tools/syscall.tbl                  |  1 +
>>  arch/arm64/include/asm/unistd.h             |  2 +-
>>  arch/arm64/include/asm/unistd32.h           |  3 +-
>>  arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
>>  arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
>>  arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
>>  arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
>>  arch/mips/kernel/syscalls/syscall_n64.tbl   |  1 +
>>  arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
>>  arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
>>  arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
>>  arch/s390/kernel/syscalls/syscall.tbl       |  1 +
>>  arch/sh/kernel/syscalls/syscall.tbl         |  1 +
>>  arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
>>  arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
>>  arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
>>  arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
>>  include/linux/mempolicy.h                   | 11 +++
>>  include/linux/syscalls.h                    |  4 +
>>  include/uapi/asm-generic/unistd.h           |  5 +-
>>  kernel/sys_ni.c                             |  1 +
>>  mm/mempolicy.c                              | 89 +++++++++++++++++++
>>  24 files changed, 146 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
>> index 5a6afecbb0d0..b864dd88b2d2 100644
>> --- a/Documentation/admin-guide/mm/numa_memory_policy.rst
>> +++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
>> @@ -408,9 +408,10 @@ follows:
>>  Memory Policy APIs
>>  ==================
>>
>> -Linux supports 4 system calls for controlling memory policy. These APIS
>> -always affect only the calling task, the calling task's address space, or
>> -some shared object mapped into the calling task's address space.
>> +Linux supports 5 system calls for controlling memory policy. The first four
>> +APIS affect only the calling task, the calling task's address space, or some
>> +shared object mapped into the calling task's address space. The last one can
>> +set the mempolicy of task specified in pidfd.
>>
>>  .. note::
>>     the headers that define these APIs and the parameter data types for
>> @@ -473,6 +474,20 @@ closest to which page allocation will come from. Specifying the home node overri
>>  the default allocation policy to allocate memory close to the local node for an
>>  executing CPU.
>>
>> +Set [pidfd Task] Memory Policy::
>> +
>> +        long sys_pidfd_set_mempolicy(int pidfd, int mode,
>> +                                     const unsigned long __user *nmask,
>> +                                     unsigned long maxnode,
>> +                                     unsigned int flags);
>> +
>> +Set's the [pidfd] task's "task/process memory policy". The pidfd argument is
>> +a PID file descriptor (see pidfd_open(2) man page) that specifies the process
>> +to which the mempolicy is to be applied. The flags argument is reserved for
>> +future use; currently, this argument must be specified as 0. Please see the
>> +set_mempolicy(2) man page for more details about other's arguments.
>> +
>> +
>>
>>  Memory Policy Command Line Interface
>>  ====================================
>
> The wording can be improved:
>
> ---- >8 ----
>
> diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
> index b864dd88b2d236..6df35bf4f960bd 100644
> --- a/Documentation/admin-guide/mm/numa_memory_policy.rst
> +++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
> @@ -410,8 +410,8 @@ Memory Policy APIs
>
>  Linux supports 5 system calls for controlling memory policy. The first four
>  APIS affect only the calling task, the calling task's address space, or some
> -shared object mapped into the calling task's address space. The last one can
> -set the mempolicy of task specified in pidfd.
> +shared object mapped into the calling task's address space. The last one
> +sets the mempolicy of task specified in the pidfd.
>
>  .. note::
>     the headers that define these APIs and the parameter data types for
> @@ -481,11 +481,11 @@ Set [pidfd Task] Memory Policy::
>                                       unsigned long maxnode,
>                                       unsigned int flags);
>
> -Set's the [pidfd] task's "task/process memory policy". The pidfd argument is
> -a PID file descriptor (see pidfd_open(2) man page) that specifies the process
> -to which the mempolicy is to be applied. The flags argument is reserved for
> -future use; currently, this argument must be specified as 0. Please see the
> -set_mempolicy(2) man page for more details about other's arguments.
> +Sets the task/process memory policy for the [pidfd] task. The pidfd argument
> +is a PID file descriptor (see pidfd_open(2) man page for details) that
> +specifies the process for which the mempolicy is applied to. The flags
> +argument is reserved for future use; currently, it must be specified as 0.
> +For the description of all other arguments, see set_mempolicy(2) man page.
>
>
>
>
> Thanks.
>

Hi Bagas,

I got it, thanks for your suggestions.