References: <20211220085649.8196-1-songmuchun@bytedance.com> <20211220085649.8196-11-songmuchun@bytedance.com> <20220106110051.GA470@blackbody.suse.cz>
In-Reply-To: <20220106110051.GA470@blackbody.suse.cz>
From: Muchun Song
Date: Wed, 12 Jan 2022 21:22:36 +0800
Subject: Re: [PATCH v5 10/16] mm: list_lru: allocate list_lru_one only when needed
To: Michal Koutný
Cc: Matthew Wilcox, Andrew Morton, Johannes Weiner, Michal Hocko,
    Vladimir Davydov, Shakeel Butt, Roman Gushchin, Yang Shi, Alex Shi,
    Wei Yang, Dave Chinner, trond.myklebust@hammerspace.com,
    anna.schumaker@netapp.com, jaegeuk@kernel.org, chao@kernel.org,
    Kari Argillander, linux-fsdevel, LKML,
    Linux Memory Management List, linux-nfs@vger.kernel.org, Qi Zheng,
    Xiongchun duan, Fam Zheng, Muchun Song

On Thu, Jan 6, 2022 at 7:00 PM Michal Koutný wrote:
>
> On Mon, Dec 20, 2021 at 04:56:43PM +0800, Muchun Song wrote:
> (Thanks for pointing me here.)
>
> > -void memcg_drain_all_list_lrus(int src_idx, struct mem_cgroup *dst_memcg)
> > +void memcg_drain_all_list_lrus(struct mem_cgroup *src, struct mem_cgroup *dst)
> >  {
> > +       struct cgroup_subsys_state *css;
> >         struct list_lru *lru;
> > +       int src_idx = src->kmemcg_id;
> > +
> > +       /*
> > +        * Change kmemcg_id of this cgroup and all its descendants to the
> > +        * parent's id, and then move all entries from this cgroup's list_lrus
> > +        * to ones of the parent.
> > +        *
> > +        * After we have finished, all list_lrus corresponding to this cgroup
> > +        * are guaranteed to remain empty. So we can safely free this cgroup's
> > +        * list lrus in memcg_list_lru_free().
> > +        *
> > +        * Changing ->kmemcg_id to the parent can prevent memcg_list_lru_alloc()
> > +        * from allocating list lrus for this cgroup after memcg_list_lru_free()
> > +        * call.
> > +        */
> > +       rcu_read_lock();
> > +       css_for_each_descendant_pre(css, &src->css) {
> > +               struct mem_cgroup *memcg;
> > +
> > +               memcg = mem_cgroup_from_css(css);
> > +               memcg->kmemcg_id = dst->kmemcg_id;
> > +       }
> > +       rcu_read_unlock();
>
> Do you envision using this function anywhere else beside offlining?
> If not, you shouldn't need traversing whole subtree because normally
> parents are offlined only after children (see cgroup_subsys_state.online_cnt).
>
> >
> >         mutex_lock(&list_lrus_mutex);
> >         list_for_each_entry(lru, &memcg_list_lrus, list)
> > -               memcg_drain_list_lru(lru, src_idx, dst_memcg);
> > +               memcg_drain_list_lru(lru, src_idx, dst);
> >         mutex_unlock(&list_lrus_mutex);
>
> If you do, then here you only drain list_lru of the subtree root but not
> the descendants anymore.
>
> So I do get that mem_cgroup.kmemcg_id references the "effective"
> kmemcg_id after offlining, so that proper list_lrus are used afterwards.
>
> I wonder -- is this necessary when objcgs are reparented too? IOW would
> anyone query the offlined child's kmemcg_id?
> (Maybe it's worth explaining better in the commit message, I think even
> current approach is OK (better safe than sorry).)
>

Sorry for the late reply.

Imagine a chain of memcgs as follows. C's parent is B, B's parent is A
and A's parent is root. The numbers in parentheses are their respective
kmemcg_ids.

  root(-1) -> A(0) -> B(1) -> C(2)

CPU0:                                   CPU1:
memcg_list_lru_alloc(C)
                                        memcg_drain_all_list_lrus(C)
                                        memcg_drain_all_list_lrus(B)
                                        // Now C and B are offline. The
                                        // kmemcg_ids become the following if
                                        // we do not change the kmemcg_id of
                                        // the descendants in
                                        // memcg_drain_all_list_lrus().
                                        //
                                        // root(-1) -> A(0) -> B(0) -> C(1)

  for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
      // allocate struct list_lru_per_memcg for memcg C
      table[i].mlru = memcg_init_list_lru_one(gfp);
  }

  spin_lock_irqsave(&lru->lock, flags);
  while (i--) {
      // here index = 1
      int index = table[i].memcg->kmemcg_id;

      struct list_lru_per_memcg *mlru = table[i].mlru;
      if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true))
          kfree(mlru);
      else
          // mlrus->mlru[index] will be assigned a new value even
          // though memcg C is already offline.
          rcu_assign_pointer(mlrus->mlru[index], mlru);
  }
  spin_unlock_irqrestore(&lru->lock, flags);

So changing ->kmemcg_id of all the descendants prevents
memcg_list_lru_alloc() from allocating list lrus for the offlined
cgroup after memcg_list_lru_free() has been called.

Thanks.