From: Zhongkun He
To: Michal Hocko
Cc: corbet@lwn.net, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, wuyun.abel@bytedance.com
Subject: Re: [External] Re: [RFC] mm: add new syscall pidfd_set_mempolicy()
Date: Wed, 12 Oct 2022 15:55:44 +0800
Message-ID: <582cf257-bc0d-c96e-e72e-9164cff4fce1@bytedance.com>
References: <20221010094842.4123037-1-hezhongkun.hzk@bytedance.com>

Hi Michal,

Thanks for your reply and suggestions.

> Please add some explanation why the cpuset interface is not usable for
> that usecase.

OK.

>> To solve the issue, this patch introduces a new syscall
>> pidfd_set_mempolicy(2). it sets the NUMA memory policy of the thread
>> specified in pidfd.
>>
>> In current process context there is no locking because only the process
>> accesses its own memory policy, so task_work is used in
>> pidfd_set_mempolicy() to update the mempolicy of the process specified
>> in pidfd, avoid using locks and race conditions.
>
> Why cannot you alter kernel_set_mempolicy (and do_set_mempolicy) to
> accept a task rather than operate on current?

I tried that before this patch, but I ran into a problem. The allocation
and the update of the mempolicy both happen in the current context, so
the mempolicy is not protected by any lock. Once the mempolicy can be
modified by another process, a race condition appears. Something like
the following:

    pidfd_set_mempolicy             target task stack
                                    alloc_pages
                                    mpol = get_task_policy;
    task_lock(task);
    old = task->mempolicy;
    task->mempolicy = new;
    task_unlock(task);
    mpol_put(old);
                                    page = __alloc_pages(mpol);

In this interleaving the old mempolicy is released while the target task
is still using it. Suggestions on how to handle this case would be
welcome.

> I have to really say that I dislike the task_work approach because it
> detaches the syscall from the actual operation and the caller simply
> doesn't know when the operation has been completed.

I agree with you. This is indeed a problem.

> Please also describe the security model.

Got it.
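
To make the interleaving above concrete, here is a minimal userspace model of it in C. It is only a sketch: struct policy_model, struct task_model, policy_get(), policy_put(), remote_set_policy() and replay_race() are hypothetical names invented for illustration, not kernel code, and the steps are replayed in a single thread instead of racing for real. The "reader takes a reference" variant is just one conceivable direction, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct mempolicy: a refcount plus a "freed" flag. */
struct policy_model {
    int refcnt;
    bool freed;   /* set instead of calling free(), so the model stays well-defined */
};

/* Userspace stand-in for the part of task_struct that matters here. */
struct task_model {
    struct policy_model *mempolicy;
};

static void policy_get(struct policy_model *p)
{
    p->refcnt++;
}

static void policy_put(struct policy_model *p)
{
    if (--p->refcnt == 0)
        p->freed = true;
}

/* Remote caller: swap in the new policy and drop the task's reference to the old one. */
static void remote_set_policy(struct task_model *t, struct policy_model *next)
{
    /* In the kernel this section would run under task_lock(task). */
    struct policy_model *old = t->mempolicy;

    t->mempolicy = next;
    policy_put(old);
}

/*
 * Replay the steps from the diagram in order, in one thread.
 * Returns true if the allocation step would have used a released policy.
 */
static bool replay_race(bool reader_takes_ref)
{
    struct policy_model old = { .refcnt = 1, .freed = false };
    struct policy_model next = { .refcnt = 1, .freed = false };
    struct task_model task = { .mempolicy = &old };
    struct policy_model *mpol;
    bool used_freed;

    /* Target task: mpol = get_task_policy */
    mpol = task.mempolicy;
    if (reader_takes_ref)
        policy_get(mpol);

    /* Remote caller runs between the lookup and the allocation. */
    remote_set_policy(&task, &next);

    /* Target task: page = __alloc_pages(mpol) */
    used_freed = mpol->freed;

    if (reader_takes_ref)
        policy_put(mpol);

    return used_freed;
}
```

Calling replay_race(false) reproduces the problem: the borrowed pointer outlives the last reference. Calling replay_race(true) keeps the old policy alive until the allocation step is done, but in a real kernel the reference would have to be taken under the same lock the writer holds, and whether that cost is acceptable on the allocation path is a separate question.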