Date: Thu, 5 Nov 2020 11:22:19 -0500
From: Johannes Weiner
To: Yafang Shao
Cc: akpm@linux-foundation.org, mhocko@suse.com, minchan@kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES
Message-ID: <20201105162219.GG744831@cmpxchg.org>
References: <20201105131012.82457-1-laoar.shao@gmail.com>
In-Reply-To: <20201105131012.82457-1-laoar.shao@gmail.com>

On Thu, Nov 05, 2020 at 09:10:12PM +0800, Yafang Shao wrote:
> We use the memory utilization (Used / Total) to monitor memory
> pressure. If it is too high, the system may hit OOM sooner or later
> when swap is off, and we then make adjustments on that system.
>
> However, this method has been broken since MADV_FREE was introduced,
> because lazily freed anonymous pages can be reclaimed under memory
> pressure while they are still accounted in NR_ANON_MAPPED.
>
> Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> from the anon LRU list to the file LRU list. That means
> (Inactive(file) + Active(file)) may be much larger than Cached in
> /proc/meminfo, which confuses our users.
>
> So we'd better account the lazily freed anonymous pages in
> NR_FILE_PAGES as well.

What about the share of pages that have been reused? After all, the
idea behind deferred reclaim is cheap reuse of already allocated and
faulted-in pages.

Anywhere between 0% and 100% of MADV_FREEd pages may be dirty again and
need swap-out to be reclaimed. That means even after this patch, your
formula would still have an error margin of up to 100%.

The tradeoff of saving the reuse fault and relying on the MMU is that
the kernel simply *cannot do* lazy-free accounting. Userspace needs to
do it. E.g. if a malloc implementation or similar uses MADV_FREE, it
has to keep track of what is and isn't in use and make those stats
available (a rough sketch of what that could look like follows at the
end of this mail).

If that's not practical, I don't see an alternative to trapping minor
faults on page reuse, eating the additional TLB flush, and doing the
accounting properly inside the kernel.

> @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
> 	if (unlikely(PageMlocked(page)))
> 		clear_page_mlock(page);
> 
> -	if (nr)
> -		__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	if (nr) {
> +		if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +		    !PageSwapCache(page) && !PageUnevictable(page))
> +			__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> +		else
> +			__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);

I don't think this would work.
The page can be temporarily off the LRU for compaction, migration,
reclaim, etc., and then you'd misaccount it here.
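
To make the userspace-side bookkeeping above concrete, here is a rough
sketch (not from the patch or this thread; the names lazy_free_bytes,
lazy_free(), lazy_reuse() and report_lazy_free() are made up for
illustration) of how an allocator built on MADV_FREE could export its
own "lazily freed" counter for a Used/Total style monitor to subtract:

	/* Hypothetical sketch: the allocator, not the kernel, tracks how
	 * many bytes it has handed back via MADV_FREE and not yet reused. */
	#include <stdatomic.h>
	#include <stddef.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	static atomic_size_t lazy_free_bytes;	/* MADV_FREEd, not yet reused */

	/* Mark a page-aligned range as lazily freeable and remember its size. */
	static int lazy_free(void *addr, size_t len)
	{
		if (madvise(addr, len, MADV_FREE) != 0)
			return -1;
		atomic_fetch_add(&lazy_free_bytes, len);
		return 0;
	}

	/* The allocator hands the range back out: writing to it dirties the
	 * pages again and cancels the lazy free, so drop the counter too. */
	static void lazy_reuse(void *addr, size_t len)
	{
		atomic_fetch_sub(&lazy_free_bytes, len);
		memset(addr, 0, len);	/* touch the pages; dirty pages are kept */
	}

	/* Export the stat for the monitoring formula to subtract from "used". */
	static void report_lazy_free(void)
	{
		printf("lazily freed (reclaimable) bytes: %zu\n",
		       atomic_load(&lazy_free_bytes));
	}

	int main(void)
	{
		size_t len = 16 * 4096;
		void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (buf == MAP_FAILED)
			return 1;

		memset(buf, 1, len);	/* fault the pages in */
		lazy_free(buf, len);	/* hand them back lazily */
		report_lazy_free();

		lazy_reuse(buf, len);	/* allocator reuses the range */
		report_lazy_free();

		munmap(buf, len);
		return 0;
	}

The counter is only as good as the allocator's own reuse tracking,
which is exactly the tradeoff described above: once reuse relies on the
MMU alone, the kernel no longer knows, so userspace has to.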