From: Johannes Weiner <hannes@cmpxchg.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, mhocko@suse.com, minchan@kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES
Date: Thu, 5 Nov 2020 11:22:19 -0500
Message-ID: <20201105162219.GG744831@cmpxchg.org>
In-Reply-To: <20201105131012.82457-1-laoar.shao@gmail.com>

On Thu, Nov 05, 2020 at 09:10:12PM +0800, Yafang Shao wrote:
> We use the memory utilization (Used / Total) to monitor memory
> pressure. If it is too high, the system may OOM sooner or later
> when swap is off, so we then make adjustments on that system.
>
> However, this method has been broken since MADV_FREE was introduced,
> because these lazily freed anonymous pages can be reclaimed under memory
> pressure while they are still accounted in NR_ANON_MAPPED.
>
> Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> from the anon LRU list to the file LRU list. That means
> (Inactive(file) + Active(file)) may be much larger than Cached in
> /proc/meminfo. That makes our users confused.
>
> So we'd better account the lazily freed anonymous pages in
> NR_FILE_PAGES as well.

What about the share of pages that have been reused? After all, the
idea behind deferred reclaim is cheap reuse of already allocated and
faulted in pages.

Anywhere between 0% and 100% of MADV_FREEd pages may be dirty and need
swap-out to reclaim. That means even after this patch, your formula
would still have an error margin of 100%.
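
To put numbers on that, here is a minimal sketch of the bound. It is not
anything the kernel exports: the struct and function names are
hypothetical, and "used" and "lazyfree" are assumed to be page counts the
monitoring tool has already obtained by its own means.

```c
#include <stdint.h>

/* Hypothetical container for the two bounds, in pages. */
struct used_bounds {
	uint64_t lo;	/* every MADV_FREEd page is still clean */
	uint64_t hi;	/* every MADV_FREEd page has been reused (dirty) */
};

/*
 * The reused share of lazily freed pages is unknown to the kernel, so
 * the best a monitor can state is a range for truly used memory.
 */
static struct used_bounds effective_used(uint64_t used, uint64_t lazyfree)
{
	struct used_bounds b;

	if (lazyfree > used)	/* guard against inconsistent inputs */
		lazyfree = used;
	b.lo = used - lazyfree;
	b.hi = used;
	return b;
}
```

With used = 1000 and lazyfree = 400, the real figure lies somewhere
between 600 and 1000 pages, and moving the pages to a different counter
does not narrow that range.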

The tradeoff with saving the reuse fault and relying on the MMU is
that the kernel simply *cannot do* lazy free accounting. Userspace
needs to do it. E.g. if a malloc implementation or similar uses
MADV_FREE, it has to keep track of what is and isn't used and make
those stats available.
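
As a rough illustration of that bookkeeping, here is a sketch only. It
assumes a Linux allocator that releases page-aligned ranges with
madvise(MADV_FREE); the struct, the counters and the function names are
hypothetical and not taken from any real malloc.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* Linux value, for older libc headers */
#endif

/* Hypothetical per-process counters an allocator could export. */
struct lazy_stats {
	size_t in_use;		/* bytes handed out to the application */
	size_t lazy_free;	/* bytes MADV_FREEd and not reused since */
};

static struct lazy_stats stats;	/* not thread-safe as written */

/* Release a page-aligned range lazily and record it as reclaimable. */
static int range_release(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_FREE) != 0)
		return -1;	/* counters stay untouched on failure */
	stats.in_use -= len;
	stats.lazy_free += len;
	return 0;
}

/*
 * Hand a previously released range back out. No syscall is needed;
 * the first write simply makes the pages dirty again, which is
 * exactly the event the kernel does not see.
 */
static void range_reuse(size_t len)
{
	stats.lazy_free -= len;
	stats.in_use += len;
}
```

An allocator would call range_release() when it caches a free range and
range_reuse() when it hands that range out again. A monitoring agent
could then read lazy_free and subtract it from the process's anon usage.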

If that's not practical, I don't see an alternative to trapping minor
faults upon page reuse, eating the additional TLB flush, and doing the
accounting properly inside the kernel.

> @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
>  	if (unlikely(PageMlocked(page)))
>  		clear_page_mlock(page);
>
> -	if (nr)
> -		__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	if (nr) {
> +		if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +		    !PageSwapCache(page) && !PageUnevictable(page))
> +			__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> +		else
> +			__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);

I don't think this would work. The page can be temporarily off-LRU for
compaction, migration, reclaim etc. and then you'd misaccount it here.