linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Yafang Shao <laoar.shao@gmail.com>,
	akpm@linux-foundation.org, mhocko@suse.com, minchan@kernel.org,
	hannes@cmpxchg.org
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] mm: account lazily freed anon pages in NR_FILE_PAGES
Date: Thu, 5 Nov 2020 16:18:16 +0100	[thread overview]
Message-ID: <4c0a7ea6-4817-2dae-7473-8d0fe6110a45@suse.cz> (raw)
In-Reply-To: <20201105131012.82457-1-laoar.shao@gmail.com>

On 11/5/20 2:10 PM, Yafang Shao wrote:
> We use the memory utilization (Used / Total) to monitor memory
> pressure. If it is too high, the system may be OOM sooner or later
> when swap is off, so we then make adjustments on such a system.

Hmm, I would say that any system looking just at memory utilization
(Used / Total) and not at the file LRU size is flawed.
There's a reason MemAvailable exists, and does count file lru sizes.
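As a sketch of why that matters, with made-up /proc/meminfo numbers (the field names are real, the values are hypothetical):

```python
# Hypothetical /proc/meminfo snapshot, values in kB (made-up numbers).
meminfo = {
    "MemTotal": 16_000_000,
    "MemFree": 1_000_000,
    "MemAvailable": 9_000_000,  # includes reclaimable file LRU pages
}

# Naive utilization: treats every non-free page as "used", even though
# clean file cache (and lazily freed pages) can be reclaimed cheaply.
naive = (meminfo["MemTotal"] - meminfo["MemFree"]) / meminfo["MemTotal"]

# MemAvailable-based estimate: what the kernel expects it could actually
# hand out before OOM, so it discounts the reclaimable file LRU.
realistic = (meminfo["MemTotal"] - meminfo["MemAvailable"]) / meminfo["MemTotal"]

print(f"naive utilization:     {naive:.0%}")
print(f"MemAvailable estimate: {realistic:.0%}")
```

A monitor built on MemAvailable would not flag this system, while the naive ratio makes it look nearly full.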

> However, this method has been broken since MADV_FREE was introduced,
> because these lazily freed anonymous pages can be reclaimed under
> memory pressure while they are still accounted in NR_ANON_MAPPED.
> 
> Furthermore, since commit f7ad2a6cb9f7 ("mm: move MADV_FREE pages into
> LRU_INACTIVE_FILE list"), these lazily freed anonymous pages are moved
> from the anon LRU list onto the file LRU list. That means
> (Inactive(file) + Active(file)) may be much larger than Cached in
> /proc/meminfo, which confuses our users.

Yeah the counters are tricky for multiple reasons as Michal said...
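To put numbers on the confusion described in the changelog (all values made up): after f7ad2a6cb9f7 the lazyfree pages sit on the inactive file LRU list, which feeds Inactive(file), while NR_FILE_PAGES, which backs Cached, is left alone:

```python
# Made-up page counts modelling the current (pre-patch) accounting.
nr_file_pages    = 100_000  # NR_FILE_PAGES, backs "Cached"
nr_active_file   =  60_000  # file LRU list counters, back
nr_inactive_file =  40_000  # "Active(file)" / "Inactive(file)"
lazyfree         =  30_000  # MADV_FREE'd anon pages

# MADV_FREE moves the pages onto the inactive file LRU list ...
nr_inactive_file += lazyfree
# ... but NR_FILE_PAGES (and hence Cached) is not touched.

gap = (nr_active_file + nr_inactive_file) - nr_file_pages
print(gap)  # file LRU pages not explained by Cached
```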

> So we'd better account the lazily freed anonymous pages in
> NR_FILE_PAGES as well.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> ---
>   mm/memcontrol.c | 11 +++++++++--
>   mm/rmap.c       | 26 ++++++++++++++++++--------
>   mm/swap.c       |  2 ++
>   mm/vmscan.c     |  2 ++
>   4 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 3dcbf24d2227..217a6f10fa8d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5659,8 +5659,15 @@ static int mem_cgroup_move_account(struct page *page,
>   
>   	if (PageAnon(page)) {
>   		if (page_mapped(page)) {
> -			__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> -			__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> +			if (!PageSwapBacked(page) && !PageSwapCache(page) &&
> +			    !PageUnevictable(page)) {
> +				__mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages);
> +				__mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages);
> +			} else {
> +				__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> +				__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> +			}
> +
>   			if (PageTransHuge(page)) {
>   				__mod_lruvec_state(from_vec, NR_ANON_THPS,
>   						   -nr_pages);
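Side note: the flag combination used here (and repeated below in rmap.c) is what identifies a lazyfree page: anon, but no longer swap-backed, not in the swap cache, not unevictable. A toy model of the predicate (plain Python with pages as flag sets, not kernel code):

```python
# Toy model of the page-flag test the patch introduces; each "page" is
# just a set of flag names here, not a real struct page.
def is_lazyfree(flags):
    return ("anon" in flags
            and "swapbacked" not in flags
            and "swapcache" not in flags
            and "unevictable" not in flags)

assert is_lazyfree({"anon", "lru"})             # MADV_FREE'd page
assert not is_lazyfree({"anon", "swapbacked"})  # ordinary anon page
assert not is_lazyfree({"anon", "swapcache"})   # anon page in swap cache
```

Factoring the real test into a single helper might avoid the three call sites drifting apart.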
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1b84945d655c..690ca7ff2392 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1312,8 +1312,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
>   	if (unlikely(PageMlocked(page)))
>   		clear_page_mlock(page);
>   
> -	if (nr)
> -		__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	if (nr) {
> +		if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +		    !PageSwapCache(page) && !PageUnevictable(page))
> +			__mod_lruvec_page_state(page, NR_FILE_PAGES, -nr);
> +		else
> +			__mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr);
> +	}
>   }
>   
>   /**
> @@ -1341,12 +1346,17 @@ void page_remove_rmap(struct page *page, bool compound)
>   	if (!atomic_add_negative(-1, &page->_mapcount))
>   		goto out;
>   
> -	/*
> -	 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> -	 * these counters are not modified in interrupt context, and
> -	 * pte lock(a spinlock) is held, which implies preemption disabled.
> -	 */
> -	__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> +	if (PageLRU(page) && PageAnon(page) && !PageSwapBacked(page) &&
> +	    !PageSwapCache(page) && !PageUnevictable(page)) {
> +		__dec_lruvec_page_state(page, NR_FILE_PAGES);
> +	} else {
> +		/*
> +		 * We use the irq-unsafe __{inc|mod}_zone_page_stat because
> +		 * these counters are not modified in interrupt context, and
> +		 * pte lock(a spinlock) is held, which implies preemption disabled.
> +		 */
> +		__dec_lruvec_page_state(page, NR_ANON_MAPPED);
> +	}
>   
>   	if (unlikely(PageMlocked(page)))
>   		clear_page_mlock(page);
> diff --git a/mm/swap.c b/mm/swap.c
> index 47a47681c86b..340c5276a0f3 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -601,6 +601,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
>   
>   		del_page_from_lru_list(page, lruvec,
>   				       LRU_INACTIVE_ANON + active);
> +		__mod_lruvec_state(lruvec, NR_ANON_MAPPED, -nr_pages);
>   		ClearPageActive(page);
>   		ClearPageReferenced(page);
>   		/*
> @@ -610,6 +611,7 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
>   		 */
>   		ClearPageSwapBacked(page);
>   		add_page_to_lru_list(page, lruvec, LRU_INACTIVE_FILE);
> +		__mod_lruvec_state(lruvec, NR_FILE_PAGES, nr_pages);
>   
>   		__count_vm_events(PGLAZYFREE, nr_pages);
>   		__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1b8f0e059767..4821124c70f7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1428,6 +1428,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>   				goto keep_locked;
>   			}
>   
> +			mod_lruvec_page_state(page, NR_ANON_MAPPED, nr_pages);
> +			mod_lruvec_page_state(page, NR_FILE_PAGES, -nr_pages);
>   			count_vm_event(PGLAZYFREED);
>   			count_memcg_page_event(page, PGLAZYFREED);
>   		} else if (!mapping || !__remove_mapping(mapping, page, true,
> 
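If I read the patch right, the intended bookkeeping is a plain transfer between the two counters at MADV_FREE time (the swap.c hunk), reversed in shrink_page_list() when the page is actually reclaimed. As a toy model of the forward transfer (made-up numbers, not kernel code):

```python
# Toy model of the counter transfer proposed in lru_lazyfree_fn().
counters = {"NR_ANON_MAPPED": 50_000, "NR_FILE_PAGES": 100_000}

def madv_free_transfer(nr):
    # The page moves from the anon LRU to the inactive file LRU;
    # with this patch the vmstat counters follow the move.
    counters["NR_ANON_MAPPED"] -= nr
    counters["NR_FILE_PAGES"] += nr

before = sum(counters.values())
madv_free_transfer(10_000)
assert sum(counters.values()) == before  # sum conserved: pages only moved
print(counters)
```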

Thread overview: 9+ messages
2020-11-05 13:10 Yafang Shao
2020-11-05 13:35 ` Michal Hocko
2020-11-05 14:16   ` Yafang Shao
2020-11-05 15:22     ` Michal Hocko
2020-11-05 17:47       ` Michal Hocko
2020-11-05 15:18 ` Vlastimil Babka [this message]
2020-11-06  1:57   ` Yafang Shao
2020-11-05 16:22 ` Johannes Weiner
2020-11-06  2:09   ` Yafang Shao
