From: Barry Song <21cnbao@gmail.com>
Date: Fri, 13 Dec 2024 12:17:33 +1300
Subject: Re: [PATCH v4 1/1] mm: vmascan: retry folios written back while isolated for traditional LRU
To: chenridong
Cc: Chen Ridong, akpm@linux-foundation.org, mhocko@suse.com, hannes@cmpxchg.org, yosryahmed@google.com, yuzhao@google.com, david@redhat.com, willy@infradead.org, ryan.roberts@arm.com, wangkefeng.wang@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, wangweiyang2@huawei.com, xieym_ict@hotmail.com
References: <20241209083618.2889145-1-chenridong@huaweicloud.com> <20241209083618.2889145-2-chenridong@huaweicloud.com> <13223d50-6218-49db-8356-700a1907e224@huawei.com>

On Wed, Dec 11, 2024 at 1:11 AM chenridong wrote:
>
>
>
> On 2024/12/10 16:24, Barry Song wrote:
> > On Tue, Dec 10, 2024 at 2:41 PM chenridong wrote:
> >>
> >>
> >>
> >> On 2024/12/10 12:54, Barry Song wrote:
> >>> On Mon, Dec 9, 2024 at 4:46 PM Chen Ridong wrote:
> >>>>
> >>>> From: Chen Ridong
> >>>>
> >>>> The commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back
> >>>> while isolated") only fixed the issue for mglru. However, this issue
> >>>> also exists in the traditional active/inactive LRU. This issue will be
> >>>> worse if THP is split, which makes the list longer and needs longer time
> >>>> to finish a batch of folios reclaim.
> >>>>
> >>>> This issue should be fixed in the same way for the traditional LRU.
> >>>> Therefore, the common logic was extracted to the 'find_folios_written_back'
> >>>> function firstly, which is then reused in the 'shrink_inactive_list'
> >>>> function. Finally, retry reclaiming those folios that may have missed the
> >>>> rotation for traditional LRU.
> >>>
> >>> let's drop the cover-letter and refine the changelog.
> >>>
> >> Will update.
> >>
> >>>>
> >>>> Signed-off-by: Chen Ridong
> >>>> ---
> >>>>  include/linux/mmzone.h |   3 +-
> >>>>  mm/vmscan.c            | 108 +++++++++++++++++++++++++++++------------
> >>>>  2 files changed, 77 insertions(+), 34 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>> index b36124145a16..47c6e8c43dcd 100644
> >>>> --- a/include/linux/mmzone.h
> >>>> +++ b/include/linux/mmzone.h
> >>>> @@ -391,6 +391,7 @@ struct page_vma_mapped_walk;
> >>>>
> >>>>  #define LRU_GEN_MASK            ((BIT(LRU_GEN_WIDTH) - 1) << LRU_GEN_PGOFF)
> >>>>  #define LRU_REFS_MASK           ((BIT(LRU_REFS_WIDTH) - 1) << LRU_REFS_PGOFF)
> >>>> +#define LRU_REFS_FLAGS          (BIT(PG_referenced) | BIT(PG_workingset))
> >>>>
> >>>>  #ifdef CONFIG_LRU_GEN
> >>>>
> >>>> @@ -406,8 +407,6 @@ enum {
> >>>>          NR_LRU_GEN_CAPS
> >>>>  };
> >>>>
> >>>> -#define LRU_REFS_FLAGS          (BIT(PG_referenced) | BIT(PG_workingset))
> >>>> -
> >>>>  #define MIN_LRU_BATCH           BITS_PER_LONG
> >>>>  #define MAX_LRU_BATCH           (MIN_LRU_BATCH * 64)
> >>>>
> >>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>>> index 76378bc257e3..1f0d194f8b2f 100644
> >>>> --- a/mm/vmscan.c
> >>>> +++ b/mm/vmscan.c
> >>>> @@ -283,6 +283,48 @@ static void set_task_reclaim_state(struct task_struct *task,
> >>>>          task->reclaim_state = rs;
> >>>>  }
> >>>>
> >>>> +/**
> >>>> + * find_folios_written_back - Find and move the written back folios to a new list.
> >>>> + * @list: folios list
> >>>> + * @clean: the written back folios list
> >>>> + * @skip: whether to skip moving the written back folios to the clean list.
> >>>> + */
> >>>> +static inline void find_folios_written_back(struct list_head *list,
> >>>> +                                struct list_head *clean, bool skip)
> >>>> +{
> >>>> +        struct folio *folio;
> >>>> +        struct folio *next;
> >>>> +
> >>>> +        list_for_each_entry_safe_reverse(folio, next, list, lru) {
> >>>> +                if (!folio_evictable(folio)) {
> >>>> +                        list_del(&folio->lru);
> >>>> +                        folio_putback_lru(folio);
> >>>> +                        continue;
> >>>> +                }
> >>>> +
> >>>> +                if (folio_test_reclaim(folio) &&
> >>>> +                    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
> >>>> +                        /* restore LRU_REFS_FLAGS cleared by isolate_folio() */
> >>>> +                        if (lru_gen_enabled() && folio_test_workingset(folio))
> >>>> +                                folio_set_referenced(folio);
> >>>> +                        continue;
> >>>> +                }
> >>>> +
> >>>> +                if (skip || folio_test_active(folio) || folio_test_referenced(folio) ||
> >>>> +                    folio_mapped(folio) || folio_test_locked(folio) ||
> >>>> +                    folio_test_dirty(folio) || folio_test_writeback(folio)) {
> >>>> +                        /* don't add rejected folios to the oldest generation */
> >>>> +                        if (lru_gen_enabled())
> >>>> +                                set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
> >>>> +                                              BIT(PG_active));
> >>>> +                        continue;
> >>>> +                }
> >>>> +
> >>>> +                /* retry folios that may have missed folio_rotate_reclaimable() */
> >>>> +                list_move(&folio->lru, clean);
> >>>> +        }
> >>>> +}
> >>>> +
> >>>>  /*
> >>>>   * flush_reclaim_state(): add pages reclaimed outside of LRU-based reclaim to
> >>>>   * scan_control->nr_reclaimed.
> >>>> @@ -1907,6 +1949,25 @@ static int current_may_throttle(void)
> >>>>          return !(current->flags & PF_LOCAL_THROTTLE);
> >>>>  }
> >>>>
> >>>> +static inline void acc_reclaimed_stat(struct reclaim_stat *stat,
> >>>> +                                      struct reclaim_stat *curr)
> >>>> +{
> >>>> +        int i;
> >>>> +
> >>>> +        stat->nr_dirty += curr->nr_dirty;
> >>>> +        stat->nr_unqueued_dirty += curr->nr_unqueued_dirty;
> >>>> +        stat->nr_congested += curr->nr_congested;
> >>>> +        stat->nr_writeback += curr->nr_writeback;
> >>>> +        stat->nr_immediate += curr->nr_immediate;
> >>>> +        stat->nr_pageout += curr->nr_pageout;
> >>>> +        stat->nr_ref_keep += curr->nr_ref_keep;
> >>>> +        stat->nr_unmap_fail += curr->nr_unmap_fail;
> >>>> +        stat->nr_lazyfree_fail += curr->nr_lazyfree_fail;
> >>>> +        stat->nr_demoted += curr->nr_demoted;
> >>>> +        for (i = 0; i < ANON_AND_FILE; i++)
> >>>> +                stat->nr_activate[i] = curr->nr_activate[i];
> >>>> +}
> >>>
> >>> You didn't have this before; what's the purpose of it?
> >>>
> >>
> >> We may call shrink_folio_list twice, and the 'stat curr' will reset in
> >> the shrink_folio_list function. We should accumulate the stats as a
> >> whole, which will then be used to calculate the cost and return it to
> >> the caller.
> >
> > Does mglru have the same issue? If so, we may need to send a patch to
> > fix mglru's stat accounting as well. By the way, the code is rather
> > messy. Could it be implemented as shown below instead?
> >
>
> I have checked the code (in the evict_folios function) again, and it
> appears that 'reclaimed' should correspond to sc->nr_reclaimed, which
> accumulates the results twice. Should I address this issue with a
> separate patch?

I don't think there is any problem.

        reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
        sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
        sc->nr_reclaimed += reclaimed;

reclaimed is always the number of pages we have reclaimed in
shrink_folio_list(), regardless of whether it is a retry or not.
>
>         if (!cgroup_reclaim(sc))
>                 __count_vm_events(item, reclaimed);
>         __count_memcg_events(memcg, item, reclaimed);
>         __count_vm_events(PGSTEAL_ANON + type, reclaimed);
>
>
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 1f0d194f8b2f..40d2ddde21f5 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1094,7 +1094,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> >          struct swap_iocb *plug = NULL;
> >
> >          folio_batch_init(&free_folios);
> > -        memset(stat, 0, sizeof(*stat));
> >          cond_resched();
> >          do_demote_pass = can_demote(pgdat->node_id, sc);
> >
> > @@ -1949,25 +1948,6 @@ static int current_may_throttle(void)
> >          return !(current->flags & PF_LOCAL_THROTTLE);
> >  }
> >
> > -static inline void acc_reclaimed_stat(struct reclaim_stat *stat,
> > -                                      struct reclaim_stat *curr)
> > -{
> > -        int i;
> > -
> > -        stat->nr_dirty += curr->nr_dirty;
> > -        stat->nr_unqueued_dirty += curr->nr_unqueued_dirty;
> > -        stat->nr_congested += curr->nr_congested;
> > -        stat->nr_writeback += curr->nr_writeback;
> > -        stat->nr_immediate += curr->nr_immediate;
> > -        stat->nr_pageout += curr->nr_pageout;
> > -        stat->nr_ref_keep += curr->nr_ref_keep;
> > -        stat->nr_unmap_fail += curr->nr_unmap_fail;
> > -        stat->nr_lazyfree_fail += curr->nr_lazyfree_fail;
> > -        stat->nr_demoted += curr->nr_demoted;
> > -        for (i = 0; i < ANON_AND_FILE; i++)
> > -                stat->nr_activate[i] = curr->nr_activate[i];
> > -}
> > -
> >  /*
> >   * shrink_inactive_list() is a helper for shrink_node().  It returns the number
> >   * of reclaimed pages
> > @@ -1981,7 +1961,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> >          unsigned long nr_scanned;
> >          unsigned int nr_reclaimed = 0;
> >          unsigned long nr_taken;
> > -        struct reclaim_stat stat, curr;
> > +        struct reclaim_stat stat;
> >          bool file = is_file_lru(lru);
> >          enum vm_event_item item;
> >          struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> > @@ -2022,9 +2002,8 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> >
> >          memset(&stat, 0, sizeof(stat));
> >  retry:
> > -        nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &curr, false);
> > +        nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
> >          find_folios_written_back(&folio_list, &clean_list, skip_retry);
> > -        acc_reclaimed_stat(&stat, &curr);
> >
> >          spin_lock_irq(&lruvec->lru_lock);
> >          move_folios_to_lru(lruvec, &folio_list);
> >
>
> This seems much better. But we have extra work to do:
>
> 1. In the shrink_folio_list function:
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1089,12 +1089,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>          LIST_HEAD(ret_folios);
>          LIST_HEAD(demote_folios);
>          unsigned int nr_reclaimed = 0;
> -        unsigned int pgactivate = 0;
> +        unsigned int pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
> +        unsigned int nr_demote = 0;
>          bool do_demote_pass;
>          struct swap_iocb *plug = NULL;
>
>          folio_batch_init(&free_folios);
> -        memset(stat, 0, sizeof(*stat));
>          cond_resched();
>          do_demote_pass = can_demote(pgdat->node_id, sc);
>
> @@ -1558,7 +1558,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>
>          /* Migrate folios selected for demotion */
>          stat->nr_demoted = demote_folio_list(&demote_folios, pgdat);
> -        nr_reclaimed += stat->nr_demoted;
> +        stat->nr_demoted += nr_demote;
> +        nr_reclaimed += nr_demote;
>          /* Folios that could not be demoted are still in @demote_folios */
>          if (!list_empty(&demote_folios)) {
>                  /* Folios which weren't demoted go back on @folio_list */
> @@ -1586,7 +1587,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                  }
>          }
>
> -        pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
> +        pgactivate = stat->nr_activate[0] + stat->nr_activate[1] - pgactivate;
>
>          mem_cgroup_uncharge_folios(&free_folios);
>          try_to_unmap_flush();
>
>
> 2. Outside of the shrink_folio_list function, the callers should memset
> the stat.
>
> If you think this will be better, I will update like this.

No. Please goto retry after you have collected all you need from
`stat`, just as mglru does, and drop "curr" and acc_reclaimed_stat().

        sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
        sc->nr_reclaimed += reclaimed;

move_folios_to_lru() has already moved all the folios we are not interested
in back to the lruvec before you retry, so there is no duplicated counting.
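
To make the suggestion above concrete, here is a minimal userspace sketch of the "consume the stats of each pass, then retry" pattern. It is not kernel code: struct pass_input, struct pass_stat, struct scan_totals, shrink_pass() and reclaim_with_retry() are hypothetical stand-ins, and the loop plays the role of the "goto retry". The only behaviour borrowed from the thread is that the shrink function resets its stat argument on entry and returns the pages reclaimed in that pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the predetermined outcome of one reclaim pass. */
struct pass_input {
	unsigned int reclaimed;
	unsigned int unqueued_dirty;
	unsigned int demoted;
};

/* Hypothetical, heavily reduced stand-in for struct reclaim_stat. */
struct pass_stat {
	unsigned int nr_unqueued_dirty;
	unsigned int nr_demoted;
};

/* Hypothetical stand-in for the counters the caller keeps across passes. */
struct scan_totals {
	unsigned long nr_reclaimed;
	unsigned long nr_unqueued_dirty;
	unsigned long nr_demoted;
};

/*
 * Stand-in for shrink_folio_list(): it zeroes *stat on entry, fills it
 * for this pass only, and returns the pages reclaimed in this pass.
 */
static unsigned int shrink_pass(const struct pass_input *in,
				struct pass_stat *stat)
{
	memset(stat, 0, sizeof(*stat));
	stat->nr_unqueued_dirty = in->unqueued_dirty;
	stat->nr_demoted = in->demoted;
	return in->reclaimed;
}

/*
 * Each iteration is one pass. Everything needed from stat is added to
 * the totals before the next pass overwrites it, so no second
 * "curr" structure and no accumulate helper are required.
 */
static void reclaim_with_retry(const struct pass_input *passes,
			       size_t nr_passes, struct scan_totals *tot)
{
	struct pass_stat stat;
	size_t i;

	assert(tot != NULL);

	for (i = 0; i < nr_passes; i++) {
		unsigned int reclaimed = shrink_pass(&passes[i], &stat);

		tot->nr_reclaimed += reclaimed;
		tot->nr_unqueued_dirty += stat.nr_unqueued_dirty;
		tot->nr_demoted += stat.nr_demoted;
	}
}
```

Because each pass only reports what it did itself, adding the per-pass values to the running totals counts every page once, whether the pass is the first one or the retry.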
>
> >>
> >> Thanks,
> >> Ridong
> >>
> >>>> +
> >>>>  /*
> >>>>   * shrink_inactive_list() is a helper for shrink_node().  It returns the number
> >>>>   * of reclaimed pages
> >>>> @@ -1916,14 +1977,16 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> >>>>                  enum lru_list lru)
> >>>>  {
> >>>>          LIST_HEAD(folio_list);
> >>>> +        LIST_HEAD(clean_list);
> >>>>          unsigned long nr_scanned;
> >>>>          unsigned int nr_reclaimed = 0;
> >>>>          unsigned long nr_taken;
> >>>> -        struct reclaim_stat stat;
> >>>> +        struct reclaim_stat stat, curr;
> >>>>          bool file = is_file_lru(lru);
> >>>>          enum vm_event_item item;
> >>>>          struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> >>>>          bool stalled = false;
> >>>> +        bool skip_retry = false;
> >>>>
> >>>>          while (unlikely(too_many_isolated(pgdat, file, sc))) {
> >>>>                  if (stalled)
> >>>> @@ -1957,10 +2020,20 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> >>>>          if (nr_taken == 0)
> >>>>                  return 0;
> >>>>
> >>>> -        nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
> >>>> +        memset(&stat, 0, sizeof(stat));
> >>>> +retry:
> >>>> +        nr_reclaimed += shrink_folio_list(&folio_list, pgdat, sc, &curr, false);
> >>>> +        find_folios_written_back(&folio_list, &clean_list, skip_retry);
> >>>> +        acc_reclaimed_stat(&stat, &curr);
> >>>>
> >>>>          spin_lock_irq(&lruvec->lru_lock);
> >>>>          move_folios_to_lru(lruvec, &folio_list);
> >>>> +        if (!list_empty(&clean_list)) {
> >>>> +                list_splice_init(&clean_list, &folio_list);
> >>>> +                skip_retry = true;
> >>>> +                spin_unlock_irq(&lruvec->lru_lock);
> >>>> +                goto retry;
> >
> > This is rather confusing. We're still jumping to retry even though
> > skip_retry=true is set. Can we find a clearer approach for this?
> >
> > It was somewhat acceptable before we introduced the extracted
> > function find_folios_written_back(). However, it has become
> > harder to follow now that skip_retry is passed across functions.
> >
> > I find renaming skip_retry to is_retry more intuitive. The logic
> > is that since we are already retrying, find_folios_written_back()
> > shouldn't move folios to the clean list again. The intended semantics
> > are: we have already retried, so don't retry again.
> >
>
> Reasonable. Will update.
>
> Thanks,
> Ridong
>
> >
> >>>> +        }
> >>>>
> >>>>          __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
> >>>>                             stat.nr_demoted);
> >>>> @@ -4567,8 +4640,6 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
> >>>>          int reclaimed;
> >>>>          LIST_HEAD(list);
> >>>>          LIST_HEAD(clean);
> >>>> -        struct folio *folio;
> >>>> -        struct folio *next;
> >>>>          enum vm_event_item item;
> >>>>          struct reclaim_stat stat;
> >>>>          struct lru_gen_mm_walk *walk;
> >>>> @@ -4597,34 +4668,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
> >>>>                          scanned, reclaimed, &stat, sc->priority,
> >>>>                          type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
> >>>>
> >>>> -        list_for_each_entry_safe_reverse(folio, next, &list, lru) {
> >>>> -                if (!folio_evictable(folio)) {
> >>>> -                        list_del(&folio->lru);
> >>>> -                        folio_putback_lru(folio);
> >>>> -                        continue;
> >>>> -                }
> >>>> -
> >>>> -                if (folio_test_reclaim(folio) &&
> >>>> -                    (folio_test_dirty(folio) || folio_test_writeback(folio))) {
> >>>> -                        /* restore LRU_REFS_FLAGS cleared by isolate_folio() */
> >>>> -                        if (folio_test_workingset(folio))
> >>>> -                                folio_set_referenced(folio);
> >>>> -                        continue;
> >>>> -                }
> >>>> -
> >>>> -                if (skip_retry || folio_test_active(folio) || folio_test_referenced(folio) ||
> >>>> -                    folio_mapped(folio) || folio_test_locked(folio) ||
> >>>> -                    folio_test_dirty(folio) || folio_test_writeback(folio)) {
> >>>> -                        /* don't add rejected folios to the oldest generation */
> >>>> -                        set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS,
> >>>> -                                      BIT(PG_active));
> >>>> -                        continue;
> >>>> -                }
> >>>> -
> >>>> -                /* retry folios that may have missed folio_rotate_reclaimable() */
> >>>> -                list_move(&folio->lru, &clean);
> >>>> -        }
> >>>> -
> >>>> +        find_folios_written_back(&list, &clean, skip_retry);
> >>>>          spin_lock_irq(&lruvec->lru_lock);
> >>>>
> >>>>          move_folios_to_lru(lruvec, &list);
> >>>> --
> >>>> 2.34.1
> >>>>
> >>>
> >
> > Thanks
> > Barry
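
As an illustration of the is_retry semantics discussed earlier in this thread ("we have already retried, so don't retry again"), the following userspace sketch models a reclaim loop that can run at most one extra pass. It is only a sketch under stated assumptions: struct fake_folio, find_written_back() and reclaim_passes() are hypothetical names, and the boolean fields stand in for the real folio flag tests used by find_folios_written_back().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-folio state standing in for the folio flag tests. */
struct fake_folio {
	bool under_writeback;	/* still dirty or under writeback */
	bool referenced;	/* active, referenced, mapped or locked */
	bool retry_candidate;	/* set by find_written_back() */
};

/*
 * Plays the role of find_folios_written_back(): select folios whose
 * writeback finished while they were isolated. When is_retry is true
 * nothing is selected, so the caller cannot loop a second time.
 */
static size_t find_written_back(struct fake_folio *folios, size_t n,
				bool is_retry)
{
	size_t i, picked = 0;

	assert(folios != NULL || n == 0);

	for (i = 0; i < n; i++) {
		folios[i].retry_candidate = false;
		if (is_retry || folios[i].under_writeback ||
		    folios[i].referenced)
			continue;
		folios[i].retry_candidate = true;
		picked++;
	}
	return picked;
}

/* Returns the number of reclaim passes performed: 1 or 2. */
static int reclaim_passes(struct fake_folio *folios, size_t n)
{
	bool is_retry = false;
	int passes = 0;

	for (;;) {
		passes++;	/* a real pass would shrink the folio list here */
		if (find_written_back(folios, n, is_retry) == 0)
			break;
		is_retry = true;	/* already retried once: no further retry */
	}
	return passes;
}
```

With this naming, the flag reads the same way in the caller and in the helper: it describes the state of the loop ("this is the retry pass") rather than an instruction that appears to contradict the jump that follows it.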