From: Yang Shi
Date: Mon, 15 Apr 2024 11:31:55 -0700
Subject: Re: [PATCH] mm: thp: makes the memcg THP deferred split shrinker aware of node_id
To: Barry Song <21cnbao@gmail.com>
Cc: xu.xin16@zte.com.cn, yang.shi@linux.alibaba.com, v-songbaohua@oppo.com,
 mhocko@kernel.org, ryan.roberts@arm.com, david@redhat.com,
 roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev,
 akpm@linux-foundation.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, ran.xiaokai@zte.com.cn, yang.yang29@zte.com.cn,
 lu.zhongjun@zte.com.cn
References: <20240415111123924s9IbQkgHF8S4yZv4su8LI@zte.com.cn>

On Sun, Apr 14, 2024 at 8:30 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Apr 15, 2024 at 3:11 PM wrote:
> >
> > From: Ran Xiaokai
> >
> > Since commit 87eaceb3faa5 ("mm: thp: make deferred split shrinker
> > memcg aware"), the THP deferred split queue is per memcg but not
> > per mem_cgroup_per_node. This has two aspects of impact:
> >
> > Impact1: for kswapd reclaim
> > =====================
> > kswapd
> >   balance_pgdat
> >     kswapd_shrink_node
> >       shrink_node(pgdat, sc);
> >         shrink_node_memcgs(pgdat, sc);
> >           shrink_slab(sc->gfp_mask, pgdat->node_id, memcg...);
> > the parameter "pgdat->node_id" does not take effect for
> > THP deferred_split_shrinker, as the deferred_split_queue of
> > specified memcg is not for a certain numa node but for all the nodes.
> > We want to make the memcg THP deferred split shrinker aware of
> > node_id.
> >
> > Impact2: thp-deferred_split shrinker debugfs interface
> > =========================================
> > for the "count" file:
> >
> > the output is actually the sum of all numa nodes.
> > for the "scan" file:
> >
> > Also the "numa id" input does not take effect here.
> >
> > This patch makes memcg deferred_split_queue per mem_cgroup_per_node
> > so try to conform to semantic logic.

I used to have a similar patch before:
https://lore.kernel.org/linux-mm/1569968203-64647-1-git-send-email-yang.shi@linux.alibaba.com/

But it was somehow lost in discussion. I have no objection to this
patch. However, I was thinking about using list_lru for the deferred
split queue, but I didn't have time to look deeper. Maybe we should try
now?

>
> This seems to be a correct fix to me, + Yang Shi, the original author of
> commit 87eaceb3faa5.
>
> >
> > Reviewed-by: Lu Zhongjun
> > Signed-off-by: Ran Xiaokai
> > Cc: xu xin
> > Cc: Yang Yang
> > ---
> >  include/linux/memcontrol.h |  7 +++----
> >  mm/huge_memory.c           |  6 +++---
> >  mm/memcontrol.c            | 11 +++++------
> >  3 files changed, 11 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 394fd0a887ae..7282861d5a5d 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -130,6 +130,9 @@ struct mem_cgroup_per_node {
> >          bool                    on_tree;
> >          struct mem_cgroup       *memcg;         /* Back pointer, we cannot */
> >                                                  /* use container_of        */
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +        struct deferred_split deferred_split_queue;
> > +#endif
> >  };
> >
> >  struct mem_cgroup_threshold {
> > @@ -327,10 +330,6 @@ struct mem_cgroup {
> >          struct list_head event_list;
> >          spinlock_t event_list_lock;
> >
> > -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > -        struct deferred_split deferred_split_queue;
> > -#endif
> > -
> >  #ifdef CONFIG_LRU_GEN_WALKS_MMU
> >          /* per-memcg mm_struct list */
> >          struct lru_gen_mm_list mm_list;
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 9859aa4f7553..338d071070a6 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -774,7 +774,7 @@ struct deferred_split *get_deferred_split_queue(struct folio *folio)
> >          struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));
> >
> >          if (memcg)
> > -                return &memcg->deferred_split_queue;
> > +                return &memcg->nodeinfo[pgdat->node_id]->deferred_split_queue;
> >          else
> >                  return &pgdat->deferred_split_queue;
> >  }
> > @@ -3305,7 +3305,7 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
> >
> >  #ifdef CONFIG_MEMCG
> >          if (sc->memcg)
> > -                ds_queue = &sc->memcg->deferred_split_queue;
> > +                ds_queue = &sc->memcg->nodeinfo[sc->nid]->deferred_split_queue;
> >  #endif
> >          return READ_ONCE(ds_queue->split_queue_len);
> >  }
> > @@ -3322,7 +3322,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> >
> >  #ifdef CONFIG_MEMCG
> >          if (sc->memcg)
> > -                ds_queue = &sc->memcg->deferred_split_queue;
> > +                ds_queue = &sc->memcg->nodeinfo[sc->nid]->deferred_split_queue;
> >  #endif
> >
> >          spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index fabce2b50c69..cdf9f5fa3b8e 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5445,7 +5445,11 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
> >                  kfree(pn);
> >                  return 1;
> >          }
> > -
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +        spin_lock_init(&pn->deferred_split_queue.split_queue_lock);
> > +        INIT_LIST_HEAD(&pn->deferred_split_queue.split_queue);
> > +        pn->deferred_split_queue.split_queue_len = 0;
> > +#endif
> >          lruvec_init(&pn->lruvec);
> >          pn->memcg = memcg;
> >
> > @@ -5545,11 +5549,6 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
> >          for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++)
> >                  memcg->cgwb_frn[i].done =
> >                          __WB_COMPLETION_INIT(&memcg_cgwb_frn_waitq);
> > -#endif
> > -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > -        spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
> > -        INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue);
> > -        memcg->deferred_split_queue.split_queue_len = 0;
> >  #endif
> >          lru_gen_init_memcg(memcg);
> >          return memcg;
> > --
> > 2.15.2
> >
>
> Thanks
> Barry
>
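
For readers following the thread, the difference the patch description points at can be modelled in plain userspace C. This is only a rough sketch, not kernel code: the struct and function names below (deferred_split_model, memcg_before, memcg_after, queue_*, count_*) and the MAX_NODES value are made up for illustration, and the real queue also carries a list head and a spinlock that are left out here.

```c
#include <assert.h>

#define MAX_NODES 2 /* hypothetical node count, for the model only */

/* Simplified stand-in for struct deferred_split: only the length is kept. */
struct deferred_split_model {
	unsigned long split_queue_len;
};

/* Layout before the patch: one queue per memcg, shared by every node. */
struct memcg_before {
	struct deferred_split_model deferred_split_queue;
};

/* Layout after the patch: one queue per (memcg, node) pair. */
struct memcg_after {
	struct deferred_split_model nodeinfo[MAX_NODES];
};

/* Old layout: nid is accepted but cannot select anything. */
void queue_before(struct memcg_before *m, int nid)
{
	(void)nid;
	m->deferred_split_queue.split_queue_len++;
}

/* Old layout: every nid reports the sum over all nodes. */
unsigned long count_before(const struct memcg_before *m, int nid)
{
	(void)nid;
	return m->deferred_split_queue.split_queue_len;
}

/* New layout: the folio's node picks the queue. */
void queue_after(struct memcg_after *m, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	m->nodeinfo[nid].split_queue_len++;
}

/* New layout: the count reflects only the requested node. */
unsigned long count_after(const struct memcg_after *m, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return m->nodeinfo[nid].split_queue_len;
}
```

In this model, queueing two folios on node 0 and one on node 1 makes count_before() report the total for either nid, while count_after() reports each node's own length, which is the behaviour the "count" file and the kswapd path are described as wanting.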