From: Michal Hocko <mhocko@suse.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, minchan@kernel.org,
	hannes@cmpxchg.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES
Date: Thu, 5 Nov 2020 14:35:36 +0100	[thread overview]
Message-ID: <20201105133536.GJ21348@dhcp22.suse.cz> (raw)
In-Reply-To: <20201105131012.82457-1-laoar.shao@gmail.com>

On Thu 05-11-20 21:10:12, Yafang Shao wrote:
> We use the memory utilization (Used / Total) to monitor memory
> pressure. If it is too high, the system may be OOM sooner or later when
> swap is off, so we then make adjustments on that system.
> 
> However, this method has been broken since MADV_FREE was introduced,
> because these lazily freed anonymous pages can be reclaimed under
> memory pressure while they are still accounted in NR_ANON_MAPPED.
> 
> Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> from the anon LRU list to the file LRU list. That means
> (Inactive(file) + Active(file)) may be much larger than Cached in
> /proc/meminfo, which confuses our users.
> 
> So we'd better account the lazily freed anonymous pages in
> NR_FILE_PAGES as well.
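
The gap the changelog describes can be observed directly from the standard
/proc/meminfo fields. A minimal sketch (the field names are the usual
meminfo keys; the interpretation of the gap is a rough one, since Buffers
and shmem also shift it, as discussed below):

```python
def parse_meminfo(lines):
    """Parse meminfo-style 'Key:   value kB' lines into {key: kB}."""
    info = {}
    for line in lines:
        key, rest = line.split(":", 1)
        info[key.strip()] = int(rest.split()[0])  # first token is the kB value
    return info

def file_lru_gap(info):
    """Return (file LRU total, Cached, gap). On a kernel that keeps
    MADV_FREE pages on the file LRUs, the gap includes the lazily freed
    anon pages, but also Buffers (counted outside Cached) minus shmem
    (counted in Cached while living on the anon LRUs)."""
    file_lru = info["Active(file)"] + info["Inactive(file)"]
    return file_lru, info["Cached"], file_lru - info["Cached"]
```

Against a live system, feed it `open("/proc/meminfo")` and compare the
first two numbers it returns.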

Can you simply subtract the lazyfree pages in userspace? I am afraid your
patch just makes the situation even muddier. NR_ANON_MAPPED is really
meant to tell how many anonymous pages are mapped, and MADV_FREE pages
remain mapped until they are freed. The NR_*_FILE counters reflect the
size of the LRU lists, and NR_FILE_PAGES reflects the number of page
cache pages, but MADV_FREE pages are not page cache. They are aged
together with file pages, but they are not the same thing. Similarly,
shmem pages are page cache that lives on the anon LRUs.
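
The userspace subtraction suggested above can be sketched as follows, on
kernels that expose per-process smaps_rollup (the "LazyFree:" field has
been reported in smaps since v4.12). This is a hedged sketch, not a
definitive tool: it assumes the monitor has permission to read every
process's smaps_rollup, and since MADV_FREE only applies to private
anonymous mappings, summing per-process values should not double-count
shared pages:

```python
import glob

def parse_lazyfree(lines):
    """Return the LazyFree value (kB) from smaps_rollup-style lines,
    or 0 if the field is absent."""
    for line in lines:
        if line.startswith("LazyFree:"):
            return int(line.split()[1])
    return 0

def total_lazyfree_kb(proc="/proc"):
    """Sum LazyFree over all processes. Processes may exit between the
    glob and the read, hence the OSError guard."""
    total = 0
    for path in glob.glob(f"{proc}/[0-9]*/smaps_rollup"):
        try:
            with open(path) as f:
                total += parse_lazyfree(f)
        except OSError:
            continue
    return total
```

A monitor would subtract `total_lazyfree_kb()` from its "used" figure
before comparing against its pressure threshold.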

Confusing? Tricky? Yes, likely. But I do not think we want to bend those
counters even further.

> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memcontrol.c | 11 +++++++++--
>  mm/rmap.c       | 26 ++++++++++++++++++--------
>  mm/swap.c       |  2 ++
>  mm/vmscan.c     |  2 ++
>  4 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 3dcbf24d2227..217a6f10fa8d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5659,8 +5659,15 @@ static int mem_cgroup_move_account(struct page *page,
>  
>  	if (PageAnon(page)) {
>  		if (page_mapped(page)) {
> -			__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> -			__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> +			if (!PageSwapBacked(page) && !PageSwapCache(page) &&
> +			    !PageUnevictable(page)) {
> +				__mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages);
> +				__mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages);
> +			} else {
> +				__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> +				__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> +			}
> +
>  			if (PageTransHuge(page)) {
>  				__mod_lruvec_state(from_vec, NR_ANON_THPS,
>  						   -nr_pages);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1b84945d655c..690ca7ff2392 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
>  	if (unlikely(PageMlocked(page)))
>  		clear_page_mlock(page);
>  
> -	if (nr)
> -		__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	if (nr) {
> +		if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +		    !PageSwapCache(page) && !PageUnevictable(page))
> +			__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> +		else
> +			__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	}
>  }
>  
>  /**
> @@ -1341,12 +1346,17 @@ void page_remove_rmap(struct page *page, bool compound)
>  	if (!atomic_add_negative(-1, &page->_mapcount))
>  		goto out;
>  
> -	/*
> -	 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> -	 * these counters are not modified in interrupt context, and
> -	 * pte lock(a spinlock) is held, which implies preemption disabled.
> -	 */
> -	__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> +	if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +	    !PageSwapCache(page) && !PageUnevictable(page)) {
> +		__dec_lruvec_page_state(page, NR_FILE_PAGES);
> +	} else {
> +		/*
> +		 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> +		 * these counters are not modified in interrupt context, and
> +		 * pte lock(a spinlock) is held, which implies preemption disabled.
> +		 */
> +		__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> +	}
>  
>  	if (unlikely(PageMlocked(page)))
>  		clear_page_mlock(page);
> diff --git a/mm/swap.c b/mm/swap.c
> index 47a47681c86b..340c5276a0f3 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -601,6 +601,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
>  
>  		del_page_from_lru_list(page, lruvec,
>  				       LRU_INACTIVE_ANON + active);
> +		__mod_lruvec_state(lruvec, NR_ANON_MAPPED, -nr_pages);
>  		ClearPageActive(page);
>  		ClearPageReferenced(page);
>  		/*
> @@ -610,6 +611,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
>  		 */
>  		ClearPageSwapBacked(page);
>  		add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> +		__mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_pages);
>  
>  		__count_vm_events(PGLAZYFREE, nr_pages);
>  		__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1b8f0e059767..4821124c70f7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1428,6 +1428,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  				goto keep_locked;
>  			}
>  
> +			mod_lruvec_page_state(page, NR_ANON_MAPPED, nr_pages);
> +			mod_lruvec_page_state(page, NR_FILE_PAGES, -nr_pages);
>  			count_vm_event(PGLAZYFREED);
>  			count_memcg_page_event(page, PGLAZYFREED);
>  		} else if (!mapping || !__remove_mapping(mapping, page, true,
> -- 
> 2.18.4
> 

-- 
Michal Hocko
SUSE Labs



Thread overview: 9+ messages
2020-11-05 13:10 Yafang Shao
2020-11-05 13:35 ` Michal Hocko [this message]
2020-11-05 14:16   ` Yafang Shao
2020-11-05 15:22     ` Michal Hocko
2020-11-05 17:47       ` Michal Hocko
2020-11-05 15:18 ` Vlastimil Babka
2020-11-06  1:57   ` Yafang Shao
2020-11-05 16:22 ` Johannes Weiner
2020-11-06  2:09   ` Yafang Shao
