From: Johannes Weiner <hannes@cmpxchg.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Linux MM <linux-mm@kvack.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/4] memcg, inode: protect page cache from freeing inode
Date: Tue, 17 Dec 2019 11:54:22 -0500
Message-ID: <20191217165422.GA213613@cmpxchg.org>
In-Reply-To: <CALOAHbBQ+XkQk6HN53O4e1=qfFiow2kvQO3ajDj=fwQEhcZ3uw@mail.gmail.com>

CCing Dave

On Tue, Dec 17, 2019 at 08:19:08PM +0800, Yafang Shao wrote:
> On Tue, Dec 17, 2019 at 7:56 PM Michal Hocko <mhocko@kernel.org> wrote:
> > What do you mean by this exactly. Are those inodes reclaimed by the
> > regular memory reclaim or by other means? Because shrink_node does
> > exclude shrinking slab for protected memcgs.
> 
> By the regular memory reclaim, kswapd, direct reclaimer or memcg reclaimer.
> IOW, the current->reclaim_state is set.
> 
> Here is an example.
> 
> kswapd
>     balance_pgdat
>         shrink_node_memcgs
>             switch (mem_cgroup_protected)  <<<< memory.current = 1024M,
>                                                 memory.min = 512M, and a
>                                                 file has 800M of page cache
>                 case MEMCG_PROT_NONE:  <<<< hard limit is not reached.
>                       break;
>             shrink_lruvec
>             shrink_slab <<<< it may free the inode and then free all of
>                              its page cache (800M)

This problem exists independent of cgroup protection.

The inode shrinker may take down an inode that's still holding a ton
of (potentially active) page cache pages when the inode hasn't been
referenced recently.

IMO we shouldn't be dropping data that the VM still considers hot
compared to other data, just because the inode object hasn't been used
as recently as other inode objects (e.g. drowned in a stream of
one-off inode accesses).

I've carried the below patch in my private tree for testing cache
aging decisions that the shrinker interfered with. (It would be nicer
if page cache pages could pin the inode of course, but reclaim cannot
easily participate in the inode refcounting scheme.)

Thoughts?

diff --git a/fs/inode.c b/fs/inode.c
index fef457a42882..bfcaaaf6314f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -753,7 +753,13 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
 		return LRU_ROTATE;
 	}
 
-	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
+	/* Leave the pages to page reclaim */
+	if (inode->i_data.nrpages) {
+		spin_unlock(&inode->i_lock);
+		return LRU_ROTATE;
+	}
+
+	if (inode_has_buffers(inode)) {
 		__iget(inode);
 		spin_unlock(&inode->i_lock);
 		spin_unlock(lru_lock);
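
To make the behavior change in the hunk above easier to reason about, here
is a minimal userspace sketch of the decision before and after the patch.
It is a toy model only: struct toy_inode, enum toy_status and both
toy_isolate_*() helpers are hypothetical names invented for illustration,
locking is omitted, and TOY_STRIP_AND_RETRY merely stands in for the
existing path that takes a reference and strips buffers and pages before
the inode is looked at again.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few inode fields the check looks at. */
struct toy_inode {
	unsigned long nrpages;	/* page cache pages attached to the inode */
	bool has_buffers;	/* buffer heads attached to the inode */
	bool referenced;	/* inode object was used recently */
};

/* Simplified outcomes; these are not the kernel's enum lru_status. */
enum toy_status {
	TOY_ROTATE,		/* keep the inode, move on */
	TOY_STRIP_AND_RETRY,	/* drop buffers/pages, revisit the inode */
	TOY_FREE,		/* inode can be reclaimed */
};

/* Before the patch: attached page cache does not protect the inode. */
static enum toy_status toy_isolate_old(struct toy_inode *inode)
{
	if (inode->referenced) {
		inode->referenced = false;
		return TOY_ROTATE;
	}
	if (inode->has_buffers || inode->nrpages)
		return TOY_STRIP_AND_RETRY;
	return TOY_FREE;
}

/* After the patch: leave the pages to page reclaim. */
static enum toy_status toy_isolate_new(struct toy_inode *inode)
{
	if (inode->referenced) {
		inode->referenced = false;
		return TOY_ROTATE;
	}
	if (inode->nrpages)
		return TOY_ROTATE;
	if (inode->has_buffers)
		return TOY_STRIP_AND_RETRY;
	return TOY_FREE;
}
```

With the numbers from the quoted example (800M of page cache, i.e. 204800
pages assuming 4K pages) and an inode that has not been referenced
recently, the old check goes down the strip path, while the new check
rotates the inode and leaves its pages for page reclaim to age.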

