From: Barry Song <21cnbao@gmail.com>
Date: Fri, 18 Mar 2022 14:15:48 +1300
Subject: Re: [PATCH v9 03/14] mm/vmscan.c: refactor shrink_node()
To: Yu Zhao
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar,
 Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
 Vlastimil Babka, Will Deacon, Ying Huang, LAK, Linux Doc Mailing List,
 LKML, Linux-MM, Kernel Page Reclaim v2, x86, Brian Geffon,
 Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
 Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
 Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain
References: <20220309021230.721028-1-yuzhao@google.com>
 <20220309021230.721028-4-yuzhao@google.com>
In-Reply-To: <20220309021230.721028-4-yuzhao@google.com>

On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao wrote:
>
> This patch refactors shrink_node() to improve readability for the
> upcoming changes to mm/vmscan.c.
>
> Signed-off-by: Yu Zhao
> Acked-by: Brian Geffon
> Acked-by: Jan Alexander Steffens (heftig)
> Acked-by: Oleksandr Natalenko
> Acked-by: Steven Barrett
> Acked-by: Suleiman Souhlal
> Tested-by: Daniel Byrne
> Tested-by: Donald Carr
> Tested-by: Holger Hoffstätte
> Tested-by: Konstantin Kharlamov
> Tested-by: Shuang Zhai
> Tested-by: Sofia Trinh
> Tested-by: Vaibhav Jain

Reviewed-by: Barry Song

This seems like a nice refactoring, since we are going to skip the whole
function for lru_gen later:

static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
{
        unsigned long file;
        struct lruvec *target_lruvec;

        if (lru_gen_enabled())
                return;

        ...
}

> ---
>  mm/vmscan.c | 198 +++++++++++++++++++++++++++-------------------------
>  1 file changed, 104 insertions(+), 94 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 59b14e0d696c..8e744cdf802f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2718,6 +2718,109 @@ enum scan_balance {
>         SCAN_FILE,
>  };
>
> +static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
> +{
> +       unsigned long file;
> +       struct lruvec *target_lruvec;
> +
> +       target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
> +
> +       /*
> +        * Flush the memory cgroup stats, so that we read accurate per-memcg
> +        * lruvec stats for heuristics.
> +        */
> +       mem_cgroup_flush_stats();
> +
> +       /*
> +        * Determine the scan balance between anon and file LRUs.
> +        */
> +       spin_lock_irq(&target_lruvec->lru_lock);
> +       sc->anon_cost = target_lruvec->anon_cost;
> +       sc->file_cost = target_lruvec->file_cost;
> +       spin_unlock_irq(&target_lruvec->lru_lock);
> +
> +       /*
> +        * Target desirable inactive:active list ratios for the anon
> +        * and file LRU lists.
> +        */
> +       if (!sc->force_deactivate) {
> +               unsigned long refaults;
> +
> +               refaults = lruvec_page_state(target_lruvec,
> +                               WORKINGSET_ACTIVATE_ANON);
> +               if (refaults != target_lruvec->refaults[0] ||
> +                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
> +                       sc->may_deactivate |= DEACTIVATE_ANON;
> +               else
> +                       sc->may_deactivate &= ~DEACTIVATE_ANON;
> +
> +               /*
> +                * When refaults are being observed, it means a new
> +                * workingset is being established. Deactivate to get
> +                * rid of any stale active pages quickly.
> +                */
> +               refaults = lruvec_page_state(target_lruvec,
> +                               WORKINGSET_ACTIVATE_FILE);
> +               if (refaults != target_lruvec->refaults[1] ||
> +                   inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
> +                       sc->may_deactivate |= DEACTIVATE_FILE;
> +               else
> +                       sc->may_deactivate &= ~DEACTIVATE_FILE;
> +       } else
> +               sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
> +
> +       /*
> +        * If we have plenty of inactive file pages that aren't
> +        * thrashing, try to reclaim those first before touching
> +        * anonymous pages.
> +        */
> +       file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> +       if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> +               sc->cache_trim_mode = 1;
> +       else
> +               sc->cache_trim_mode = 0;
> +
> +       /*
> +        * Prevent the reclaimer from falling into the cache trap: as
> +        * cache pages start out inactive, every cache fault will tip
> +        * the scan balance towards the file LRU. And as the file LRU
> +        * shrinks, so does the window for rotation from references.
> +        * This means we have a runaway feedback loop where a tiny
> +        * thrashing file LRU becomes infinitely more attractive than
> +        * anon pages. Try to detect this based on file LRU size.
> +        */
> +       if (!cgroup_reclaim(sc)) {
> +               unsigned long total_high_wmark = 0;
> +               unsigned long free, anon;
> +               int z;
> +
> +               free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
> +               file = node_page_state(pgdat, NR_ACTIVE_FILE) +
> +                          node_page_state(pgdat, NR_INACTIVE_FILE);
> +
> +               for (z = 0; z < MAX_NR_ZONES; z++) {
> +                       struct zone *zone = &pgdat->node_zones[z];
> +
> +                       if (!managed_zone(zone))
> +                               continue;
> +
> +                       total_high_wmark += high_wmark_pages(zone);
> +               }
> +
> +               /*
> +                * Consider anon: if that's low too, this isn't a
> +                * runaway file reclaim problem, but rather just
> +                * extreme pressure. Reclaim as per usual then.
> +                */
> +               anon = node_page_state(pgdat, NR_INACTIVE_ANON);
> +
> +               sc->file_is_tiny =
> +                       file + free <= total_high_wmark &&
> +                       !(sc->may_deactivate & DEACTIVATE_ANON) &&
> +                       anon >> sc->priority;
> +       }
> +}
> +
>  /*
>   * Determine how aggressively the anon and file LRU lists should be
>   * scanned. The relative value of each set of LRU lists is determined
> @@ -3188,109 +3291,16 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>         unsigned long nr_reclaimed, nr_scanned;
>         struct lruvec *target_lruvec;
>         bool reclaimable = false;
> -       unsigned long file;
>
>         target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
>
>  again:
> -       /*
> -        * Flush the memory cgroup stats, so that we read accurate per-memcg
> -        * lruvec stats for heuristics.
> -        */
> -       mem_cgroup_flush_stats();
> -
>         memset(&sc->nr, 0, sizeof(sc->nr));
>
>         nr_reclaimed = sc->nr_reclaimed;
>         nr_scanned = sc->nr_scanned;
>
> -       /*
> -        * Determine the scan balance between anon and file LRUs.
> -        */
> -       spin_lock_irq(&target_lruvec->lru_lock);
> -       sc->anon_cost = target_lruvec->anon_cost;
> -       sc->file_cost = target_lruvec->file_cost;
> -       spin_unlock_irq(&target_lruvec->lru_lock);
> -
> -       /*
> -        * Target desirable inactive:active list ratios for the anon
> -        * and file LRU lists.
> -        */
> -       if (!sc->force_deactivate) {
> -               unsigned long refaults;
> -
> -               refaults = lruvec_page_state(target_lruvec,
> -                               WORKINGSET_ACTIVATE_ANON);
> -               if (refaults != target_lruvec->refaults[0] ||
> -                       inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
> -                       sc->may_deactivate |= DEACTIVATE_ANON;
> -               else
> -                       sc->may_deactivate &= ~DEACTIVATE_ANON;
> -
> -               /*
> -                * When refaults are being observed, it means a new
> -                * workingset is being established. Deactivate to get
> -                * rid of any stale active pages quickly.
> -                */
> -               refaults = lruvec_page_state(target_lruvec,
> -                               WORKINGSET_ACTIVATE_FILE);
> -               if (refaults != target_lruvec->refaults[1] ||
> -                   inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
> -                       sc->may_deactivate |= DEACTIVATE_FILE;
> -               else
> -                       sc->may_deactivate &= ~DEACTIVATE_FILE;
> -       } else
> -               sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
> -
> -       /*
> -        * If we have plenty of inactive file pages that aren't
> -        * thrashing, try to reclaim those first before touching
> -        * anonymous pages.
> -        */
> -       file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> -       if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> -               sc->cache_trim_mode = 1;
> -       else
> -               sc->cache_trim_mode = 0;
> -
> -       /*
> -        * Prevent the reclaimer from falling into the cache trap: as
> -        * cache pages start out inactive, every cache fault will tip
> -        * the scan balance towards the file LRU. And as the file LRU
> -        * shrinks, so does the window for rotation from references.
> -        * This means we have a runaway feedback loop where a tiny
> -        * thrashing file LRU becomes infinitely more attractive than
> -        * anon pages. Try to detect this based on file LRU size.
> -        */
> -       if (!cgroup_reclaim(sc)) {
> -               unsigned long total_high_wmark = 0;
> -               unsigned long free, anon;
> -               int z;
> -
> -               free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
> -               file = node_page_state(pgdat, NR_ACTIVE_FILE) +
> -                          node_page_state(pgdat, NR_INACTIVE_FILE);
> -
> -               for (z = 0; z < MAX_NR_ZONES; z++) {
> -                       struct zone *zone = &pgdat->node_zones[z];
> -                       if (!managed_zone(zone))
> -                               continue;
> -
> -                       total_high_wmark += high_wmark_pages(zone);
> -               }
> -
> -               /*
> -                * Consider anon: if that's low too, this isn't a
> -                * runaway file reclaim problem, but rather just
> -                * extreme pressure. Reclaim as per usual then.
> -                */
> -               anon = node_page_state(pgdat, NR_INACTIVE_ANON);
> -
> -               sc->file_is_tiny =
> -                       file + free <= total_high_wmark &&
> -                       !(sc->may_deactivate & DEACTIVATE_ANON) &&
> -                       anon >> sc->priority;
> -       }
> +       prepare_scan_count(pgdat, sc);
>
>         shrink_node_memcgs(pgdat, sc);
>
> --
> 2.35.1.616.g0bdcbb4464-goog
>

Thanks
Barry
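
For readers following the quoted hunk, the two heuristics that move into
prepare_scan_count() can be modelled in userspace. This is only a rough
sketch, not kernel code: struct scan_model, struct page_counts and
model_prepare_scan_count() are hypothetical names, the DEACTIVATE_* values
are assumed for the model, and the !cgroup_reclaim(sc) guard around
file_is_tiny is left out. The page counts that the kernel reads from
lruvec and node stats are passed in as plain numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed flag values for this model only; the quoted hunk does not show them. */
#define DEACTIVATE_ANON 1u
#define DEACTIVATE_FILE 2u

/* Hypothetical, trimmed-down stand-in for struct scan_control. */
struct scan_model {
	int priority;
	unsigned int may_deactivate;
	bool cache_trim_mode;
	bool file_is_tiny;
};

/* Hypothetical inputs; the kernel reads these from lruvec and node stats. */
struct page_counts {
	unsigned long inactive_file;
	unsigned long file;		/* active + inactive file pages */
	unsigned long free;
	unsigned long inactive_anon;
	unsigned long total_high_wmark;
};

static void model_prepare_scan_count(struct scan_model *sc,
				     const struct page_counts *pc)
{
	/* Shifting by the type width or more is undefined in C, so guard it. */
	assert(sc->priority >= 0 &&
	       sc->priority < (int)(sizeof(unsigned long) * CHAR_BIT));

	/* Plenty of inactive file pages and no need to deactivate file pages. */
	sc->cache_trim_mode = (pc->inactive_file >> sc->priority) != 0 &&
			      !(sc->may_deactivate & DEACTIVATE_FILE);

	/* File LRU plus free pages fit under the watermarks, anon is still sizeable. */
	sc->file_is_tiny = pc->file + pc->free <= pc->total_high_wmark &&
			   !(sc->may_deactivate & DEACTIVATE_ANON) &&
			   (pc->inactive_anon >> sc->priority) != 0;
}
```

In this model, `x >> priority` is nonzero only when x is at least
2^priority pages, so a larger priority value raises the bar for both
checks.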