From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21B80C433FE for ; Wed, 26 Oct 2022 14:36:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 80E858E0002; Wed, 26 Oct 2022 10:36:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 7BF918E0001; Wed, 26 Oct 2022 10:36:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 686C48E0002; Wed, 26 Oct 2022 10:36:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 594758E0001 for ; Wed, 26 Oct 2022 10:36:41 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 30F43A02DF for ; Wed, 26 Oct 2022 14:36:41 +0000 (UTC) X-FDA: 80063351802.23.3DB16A7 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf04.hostedemail.com (Postfix) with ESMTP id D083840022 for ; Wed, 26 Oct 2022 14:36:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666794999; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RuQ473Ae4BWJHrbZgeWqzD+yGxNzK8Ar53dsUNjWpn4=; b=aKs9H8iLObVx3yCLgeQm1UUUgjNA7T8XFainsf0tDzbW7tj88UXzcNvbmW/48j+0QAmVnx xZnngoMc6VbiqImEcN/S+2IdOdlD5+tOQL+1UsdVnj4uLzvN750Arqkkqg1y+CpnpCqOkk epOE0pUzhLtao5wlyZCcw6+utkTZkNE= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-336-lH-Nv13GPUiXSAnlzXpOYA-1; Wed, 26 Oct 2022 10:36:34 -0400 X-MC-Unique: lH-Nv13GPUiXSAnlzXpOYA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B39F43817A68; Wed, 26 Oct 2022 14:36:33 +0000 (UTC) Received: from [10.18.17.153] (dhcp-17-153.bos.redhat.com [10.18.17.153]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0AC6C2024CB7; Wed, 26 Oct 2022 14:36:33 +0000 (UTC) Message-ID: Date: Wed, 26 Oct 2022 10:36:32 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.0 Subject: Re: [PATCH] mm/vmscan: respect cpuset policy during page demotion Content-Language: en-US To: Feng Tang , Andrew Morton , Johannes Weiner , Michal Hocko , Tejun Heo , Zefan Li , ying.huang@intel.com, aneesh.kumar@linux.ibm.com, linux-mm@kvack.org, cgroups@vger.kernel.org Cc: linux-kernel@vger.kernel.org, dave.hansen@intel.com, tim.c.chen@intel.com, fengwei.yin@intel.com References: <20221026074343.6517-1-feng.tang@intel.com> From: Waiman Long In-Reply-To: <20221026074343.6517-1-feng.tang@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1666795000; a=rsa-sha256; cv=none; b=02eNtScZogVWUn5eZtV1TS5Yi1YPfAxfMaW68P8afO1szBeBP8esjhz0ciBiH6pUvAuHgZ 0aaZi0t7q369JaUSR0eW4GEK9943Tb6seR9nyQ+fWURJ0WfPQAZYFPuRPlwLp1OGHYVTiy dAGQQ6nXMtTw6YHNoo4WZlWCl8MX8oc= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aKs9H8iL; spf=pass (imf04.hostedemail.com: domain of longman@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1666795000; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=RuQ473Ae4BWJHrbZgeWqzD+yGxNzK8Ar53dsUNjWpn4=; b=dUoi9cEQH7BTj7AaZqaijCxREcKWg/bguO8u7/hCTd06tSxd8ABYpcMAAX7F7JpzglBoiJ RyQG/diKhmx4YtlLt4knUM/PVp7kFuRQRLAoQbBdQbuQuMY0PINpsbebi5wR3pFHmfGMyr koHqt2hWg2JdmZeX4jM7JY//kzpEhbg= X-Stat-Signature: 7waux9fxq479jdmuysf7kfsqnzwkpbza X-Rspamd-Queue-Id: D083840022 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aKs9H8iL; spf=pass (imf04.hostedemail.com: domain of longman@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam05 X-HE-Tag: 1666794999-771373 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 10/26/22 03:43, Feng Tang wrote: > In page reclaim path, memory could be demoted from faster memory tier > to slower memory tier. Currently, there is no check about cpuset's > memory policy, that even if the target demotion node is not allowd > by cpuset, the demotion will still happen, which breaks the cpuset > semantics. > > So add cpuset policy check in the demotion path and skip demotion > if the demotion targets are not allowed by cpuset. > > Signed-off-by: Feng Tang > --- > Hi reviewers, > > For easy bisectable, I combined the cpuset change and mm change > in one patch, if you prefer to separate them, I can turn it into > 2 patches. > > Thanks, > Feng > > include/linux/cpuset.h | 6 ++++++ > kernel/cgroup/cpuset.c | 29 +++++++++++++++++++++++++++++ > mm/vmscan.c | 35 ++++++++++++++++++++++++++++++++--- > 3 files changed, 67 insertions(+), 3 deletions(-) > > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h > index d58e0476ee8e..6fcce2bd2631 100644 > --- a/include/linux/cpuset.h > +++ b/include/linux/cpuset.h > @@ -178,6 +178,8 @@ static inline void set_mems_allowed(nodemask_t nodemask) > task_unlock(current); > } > > +extern void cpuset_get_allowed_mem_nodes(struct cgroup *cgroup, > + nodemask_t *nmask); > #else /* !CONFIG_CPUSETS */ > > static inline bool cpusets_enabled(void) { return false; } > @@ -299,6 +301,10 @@ static inline bool read_mems_allowed_retry(unsigned int seq) > return false; > } > > +static inline void cpuset_get_allowed_mem_nodes(struct cgroup *cgroup, > + nodemask_t *nmask) > +{ > +} > #endif /* !CONFIG_CPUSETS */ > > #endif /* _LINUX_CPUSET_H */ > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c > index 3ea2e836e93e..cbb118c0502f 100644 > --- a/kernel/cgroup/cpuset.c > +++ b/kernel/cgroup/cpuset.c > @@ -3750,6 +3750,35 @@ nodemask_t cpuset_mems_allowed(struct task_struct *tsk) > return mask; > } > > +/* > + * Retrieve the allowed memory nodemask for a cgroup. > + * > + * Set *nmask to cpuset's effective allowed nodemask for cgroup v2, > + * and NODE_MASK_ALL (means no constraint) for cgroup v1 where there > + * is no guaranteed association from a cgroup to a cpuset. > + */ > +void cpuset_get_allowed_mem_nodes(struct cgroup *cgroup, nodemask_t *nmask) > +{ > + struct cgroup_subsys_state *css; > + struct cpuset *cs; > + > + if (!is_in_v2_mode()) { > + *nmask = NODE_MASK_ALL; > + return; > + } You are allowing all nodes to be used for cgroup v1. Is there a reason why you ignore v1? > + > + rcu_read_lock(); > + css = cgroup_e_css(cgroup, &cpuset_cgrp_subsys); > + if (css) { > + css_get(css); > + cs = css_cs(css); > + *nmask = cs->effective_mems; > + css_put(css); > + } Since you are holding an RCU read lock and copying out the whole nodemask, you probably don't need to do a css_get/css_put pair. > + > + rcu_read_unlock(); > +} > + Cheers, Longman