From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20201105131012.82457-1-laoar.shao@gmail.com> <20201105133536.GJ21348@dhcp22.suse.cz>
In-Reply-To: <20201105133536.GJ21348@dhcp22.suse.cz>
From: Yafang Shao
Date: Thu, 5 Nov 2020 22:16:10 +0800
Subject: Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES
To: Michal Hocko
Cc: Andrew Morton, minchan@kernel.org, Johannes Weiner, Linux MM

On Thu, Nov 5, 2020 at 9:35 PM Michal Hocko wrote:
>
> On Thu 05-11-20 21:10:12, Yafang Shao wrote:
> > We use the memory utilization (Used / Total) to monitor memory
> > pressure. If it is too high, the system may go OOM sooner or later
> > when swap is off, and we then make adjustments on that system.
> >
> > However, this method has been broken since MADV_FREE was introduced,
> > because these lazily freed anonymous pages can be reclaimed under
> > memory pressure while they are still accounted in NR_ANON_MAPPED.
> >
> > Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> > LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> > from the anon LRU list onto the file LRU list. That means
> > (Inactive(file) + Active(file)) may be much larger than Cached in
> > /proc/meminfo, which confuses our users.
> >
> > So we'd better account the lazily freed anonymous pages in
> > NR_FILE_PAGES as well.
>
> Can you simply subtract lazyfree pages in the userspace?

Could you please tell me how to subtract lazyfree pages in userspace?
Please note that we can't use (pglazyfree - pglazyfreed), because
pglazyfreed is only counted in the regular reclaim path and not in the
process exit path; that means we would have to introduce another
counter like LazyPage....

> I am afraid your
> patch just makes the situation even more muddy.
> NR_ANON_MAPPED is really
> meant to tell how many anonymous pages are mapped, and MADV_FREE pages
> are mapped until they are freed. NR_*_FILE reflects the size of the LRU
> lists, and NR_FILE_PAGES reflects the number of page cache pages, but
> MADV_FREE pages are not page cache. They are aged together with file
> pages, but they are not the same thing. Similarly, shmem pages are page
> cache living on the anon LRUs.
>
> Confusing? Tricky? Yes, likely. But I do not think we want to bend those
> counters even further.
>
> > Signed-off-by: Yafang Shao
> > Cc: Minchan Kim
> > Cc: Johannes Weiner
> > Cc: Michal Hocko
> > ---
> >  mm/memcontrol.c | 11 +++++++++--
> >  mm/rmap.c       | 26 ++++++++++++++++++--------
> >  mm/swap.c       |  2 ++
> >  mm/vmscan.c     |  2 ++
> >  4 files changed, 31 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 3dcbf24d2227..217a6f10fa8d 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5659,8 +5659,15 @@ static int mem_cgroup_move_account(struct page *page,
> >
> >  	if (PageAnon(page)) {
> >  		if (page_mapped(page)) {
> > -			__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> > -			__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > +			if (!PageSwapBacked(page) && !PageSwapCache(page) &&
> > +			    !PageUnevictable(page)) {
> > +				__mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages);
> > +				__mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages);
> > +			} else {
> > +				__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> > +				__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > +			}
> > +
> >  			if (PageTransHuge(page)) {
> >  				__mod_lruvec_state(from_vec, NR_ANON_THPS,
> >  						   -nr_pages);
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 1b84945d655c..690ca7ff2392 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
> >  	if (unlikely(PageMlocked(page)))
> >  		clear_page_mlock(page);
> >
> > -	if (nr)
> > -		__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> > +	if (nr) {
> > +		if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> > +		    !PageSwapCache(page) && !PageUnevictable(page))
> > +			__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> > +		else
> > +			__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> > +	}
> >  }
> >
> >  /**
> > @@ -1341,12 +1346,17 @@ void page_remove_rmap(struct page *page, bool compound)
> >  	if (!atomic_add_negative(-1, &page->_mapcount))
> >  		goto out;
> >
> > -	/*
> > -	 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> > -	 * these counters are not modified in interrupt context, and
> > -	 * pte lock(a spinlock) is held, which implies preemption disabled.
> > -	 */
> > -	__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> > +	if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> > +	    !PageSwapCache(page) && !PageUnevictable(page)) {
> > +		__dec_lruvec_page_state(page, NR_FILE_PAGES);
> > +	} else {
> > +		/*
> > +		 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> > +		 * these counters are not modified in interrupt context, and
> > +		 * pte lock(a spinlock) is held, which implies preemption disabled.
> > +		 */
> > +		__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> > +	}
> >
> >  	if (unlikely(PageMlocked(page)))
> >  		clear_page_mlock(page);
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47a47681c86b..340c5276a0f3 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -601,6 +601,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
> >
> >  		del_page_from_lru_list(page, lruvec,
> >  				       LRU_INACTIVE_ANON + active);
> > +		__mod_lruvec_state(lruvec, NR_ANON_MAPPED, -nr_pages);
> >  		ClearPageActive(page);
> >  		ClearPageReferenced(page);
> >  		/*
> > @@ -610,6 +611,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
> >  		 */
> >  		ClearPageSwapBacked(page);
> >  		add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> > +		__mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_pages);
> >
> >  		__count_vm_events(PGLAZYFREE, nr_pages);
> >  		__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 1b8f0e059767..4821124c70f7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1428,6 +1428,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
> >  				goto keep_locked;
> >  			}
> >
> > +			mod_lruvec_page_state(page, NR_ANON_MAPPED, nr_pages);
> > +			mod_lruvec_page_state(page, NR_FILE_PAGES, -nr_pages);
> >  			count_vm_event(PGLAZYFREED);
> >  			count_memcg_page_event(page, PGLAZYFREED);
> >  		} else if (!mapping || !__remove_mapping(mapping, page, true,
> > --
> > 2.18.4
> >
>
> --
> Michal Hocko
> SUSE Labs

--
Thanks
Yafang