From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 10 Feb 2021 11:08:15 +0100
From: Michal Hocko
To: Tim Chen
Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, Dave Hansen,
 Ying Huang, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm: Fix missing mem cgroup soft limit tree updates
In-Reply-To: <3b6e4e9aa8b3ee1466269baf23ed82d90a8f791c.1612902157.git.tim.c.chen@linux.intel.com>

On Tue 09-02-21 12:29:47, Tim Chen wrote:
> On a per node basis, the mem cgroup soft limit tree on each node tracks
> how much a cgroup has exceeded its soft limit and sorts the cgroups by
> their excess usage.
> On page release, the trees are not updated right away; the update is
> deferred until we have gathered a batch of pages belonging to the same
> cgroup. This reduces the frequency of updating the soft limit tree and
> of locking the tree and the associated cgroup.
>
> However, the batch of pages could contain pages from multiple nodes
> while only the soft limit tree of one node would get updated. Change
> the logic so that we update the tree in batches of pages, with each
> batch of pages all in the same mem cgroup and memory node. An update is
> issued for the batch of pages collected so far whenever we encounter a
> page belonging to a different node.

I do agree with Johannes here. This shouldn't be done unconditionally
for all memcgs. Wouldn't it be much better to do the fixup in the
mem_cgroup_soft_reclaim path instead, and simply check the excess before
doing any reclaim?

Btw. have you seen this trigger any noticeable misbehavior? I would
expect the effect to be rather small considering how many sources of
memcg_check_events we have.

Unless I have missed something, this was introduced by 747db954cab6
("mm: memcontrol: use page lists for uncharge batching"). Please add a
Fixes tag as well if this is really worth fixing.

> Reviewed-by: Ying Huang
> Signed-off-by: Tim Chen
> ---
>  mm/memcontrol.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d72449eeb85a..f5a4a0e4e2ec 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6804,6 +6804,7 @@ struct uncharge_gather {
>  	unsigned long pgpgout;
>  	unsigned long nr_kmem;
>  	struct page *dummy_page;
> +	int nid;
>  };
>
>  static inline void uncharge_gather_clear(struct uncharge_gather *ug)
> @@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  	 * exclusive access to the page.
>  	 */
>
> -	if (ug->memcg != page_memcg(page)) {
> +	if (ug->memcg != page_memcg(page) ||
> +	    /* uncharge batch update soft limit tree on a node basis */
> +	    (ug->dummy_page && ug->nid != page_to_nid(page))) {
>  		if (ug->memcg) {
>  			uncharge_batch(ug);
>  			uncharge_gather_clear(ug);
> @@ -6869,6 +6872,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  		ug->pgpgout++;
>
>  	ug->dummy_page = page;
> +	ug->nid = page_to_nid(page);
>  	page->memcg_data = 0;
>  	css_put(&ug->memcg->css);
>  }
> --
> 2.20.1

--
Michal Hocko
SUSE Labs