From mboxrd@z Thu Jan 1 00:00:00 1970
From: Muchun Song
Date: Thu, 18 Aug 2022 14:56:40 +0800
Subject: Re: [PATCH v2] mm/mempolicy: fix lock contention on mems_allowed
To: Abel Wu
Cc: Andrew Morton, Vlastimil Babka, Michal Hocko, Mel Gorman, Muchun Song, Linux MM, linux-kernel@vger.kernel.org
In-Reply-To: <20220811124157.74888-1-wuyun.abel@bytedance.com>
References: <20220811124157.74888-1-wuyun.abel@bytedance.com>

> On Aug 11, 2022, at 20:41, Abel Wu wrote:
>
> The mems_allowed field can be modified by other tasks, so it isn't
> safe to access it with alloc_lock unlocked even in the current
> process context.
>
> Say there are two tasks: A from cpusetA is performing set_mempolicy(2),
> and B is changing cpusetA's cpuset.mems:
>
>   A (set_mempolicy)             B (echo xx > cpuset.mems)
>   -------------------------------------------------------
>   pol = mpol_new();
>                                 update_tasks_nodemask(cpusetA) {
>                                   foreach t in cpusetA {
>                                     cpuset_change_task_nodemask(t) {
>   mpol_set_nodemask(pol) {
>                                       task_lock(t); // t could be A
>     new = f(A->mems_allowed);
>                                       update t->mems_allowed;
>     pol.create(pol, new);
>                                       task_unlock(t);
>   }
>                                     }
>                                   }
>                                 }
>   task_lock(A);
>   A->mempolicy = pol;
>   task_unlock(A);
>
> In this case A's pol->nodes is computed by old mems_allowed, and could
> be inconsistent with A's new mems_allowed.
>
> While it is different when replacing vmas' policy: the pol->nodes is
> gone wild only when current_cpuset_is_being_rebound():
>
>   A (mbind)                     B (echo xx > cpuset.mems)
>   -------------------------------------------------------
>   pol = mpol_new();
>   mmap_write_lock(A->mm);
>                                 cpuset_being_rebound = cpusetA;
>                                 update_tasks_nodemask(cpusetA) {
>                                   foreach t in cpusetA {
>                                     cpuset_change_task_nodemask(t) {
>   mpol_set_nodemask(pol) {
>                                       task_lock(t); // t could be A
>     mask = f(A->mems_allowed);
>                                       update t->mems_allowed;
>     pol.create(pol, mask);
>                                       task_unlock(t);
>   }
>                                     }
>   foreach v in A->mm {
>     if (cpuset_being_rebound == cpusetA)
>       pol.rebind(pol, cpuset.mems);
>     v->vma_policy = pol;
>   }
>   mmap_write_unlock(A->mm);
>                                     mmap_write_lock(t->mm);
>                                     mpol_rebind_mm(t->mm);
>                                     mmap_write_unlock(t->mm);
>                                   }
>                                 }
>                                 cpuset_being_rebound = NULL;
>
> In this case, the cpuset.mems, which has already done updating, is
> finally used for calculating pol->nodes, rather than A->mems_allowed.
> So it is OK to call mpol_set_nodemask() with alloc_lock unlocked when
> doing mbind(2).
>
> Fixes: 78b132e9bae9 ("mm/mempolicy: remove or narrow the lock on current")
> Signed-off-by: Abel Wu

Reviewed-by: Muchun Song

Thanks.
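
The set_mempolicy(2) scenario in the quoted description can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
kernel code: struct task, task_init(), set_policy() and change_mems() are
hypothetical names, a pthread mutex stands in for alloc_lock, plain bitmasks
stand in for nodemasks, and "rebind" is simplified to an intersection. It
assumes the fix amounts to reading mems_allowed and installing the policy
within a single lock hold.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for the task fields discussed above. */
struct task {
    pthread_mutex_t alloc_lock;  /* plays the role of task_lock()/task_unlock() */
    unsigned long mems_allowed;  /* bitmask of nodes the task may use */
    unsigned long policy_nodes;  /* plays the role of pol->nodes */
};

static void task_init(struct task *t, unsigned long mems)
{
    pthread_mutex_init(&t->alloc_lock, NULL);
    t->mems_allowed = mems;
    t->policy_nodes = 0;
}

/*
 * A's side: derive the policy mask from mems_allowed and publish it in
 * one critical section, so B cannot change mems_allowed in between.
 */
static void set_policy(struct task *t, unsigned long requested)
{
    pthread_mutex_lock(&t->alloc_lock);
    unsigned long nodes = requested & t->mems_allowed;
    /* The installed mask never exceeds the mask it was computed from. */
    assert((nodes & ~t->mems_allowed) == 0);
    t->policy_nodes = nodes;
    pthread_mutex_unlock(&t->alloc_lock);
}

/*
 * B's side: update mems_allowed and rebind the installed policy under
 * the same lock (rebind simplified to an intersection here).
 */
static void change_mems(struct task *t, unsigned long new_mems)
{
    pthread_mutex_lock(&t->alloc_lock);
    t->mems_allowed = new_mems;
    t->policy_nodes &= new_mems;
    pthread_mutex_unlock(&t->alloc_lock);
}
```

With both sides serialized on the same lock, whichever order A and B run in,
policy_nodes ends up as a subset of the current mems_allowed. In the racy
ordering from the first diagram, the mask is computed before B's update and
installed after it, so that property can be lost.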