linux-mm.kvack.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Shakeel Butt <shakeelb@google.com>,
	Michal Hocko <mhocko@kernel.org>,
	kernel-team@fb.com, linux-kernel@vger.kernel.org,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH] mm: workingset: ignore slab memory size when calculating shadows pressure
Date: Wed, 9 Sep 2020 10:55:34 -0400
Message-ID: <20200909145534.GA100698@cmpxchg.org>
In-Reply-To: <20200903230055.1245058-1-guro@fb.com>

On Thu, Sep 03, 2020 at 04:00:55PM -0700, Roman Gushchin wrote:
> In the memcg case count_shadow_nodes() sums the number of pages in lru
> lists and the amount of slab memory (reclaimable and non-reclaimable)
> as a baseline for the allowed number of shadow entries.
> 
> It seems to be a good analogy for the !memcg case, where
> node_present_pages() is used. However, it's not quite true, as there
> are two problems:
> 
> 1) Due to slab reparenting introduced by commit fb2f2b0adb98 ("mm:
> memcg/slab: reparent memcg kmem_caches on cgroup removal") local
> per-lruvec slab counters might be inaccurate on non-leaf levels.
> It's the only place where local slab counters are used.

Hm, that sounds like a bug tbh. We're reparenting the kmem caches and
the individual objects on the list_lru when a cgroup is removed -
shouldn't we also reparent the corresponding memory counters?
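
For reference, the baseline in question is built along these lines - a
condensed sketch of count_shadow_nodes() in mm/workingset.c, with the
counter names written from memory, so they may not match your tree
exactly:

	unsigned long pages, max_nodes;

	if (sc->memcg) {
		struct lruvec *lruvec;
		int i;

		lruvec = mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid));

		/* Sum this cgroup's LRU pages on this node... */
		for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
			pages += lruvec_page_state_local(lruvec,
							 NR_LRU_BASE + i);
		/*
		 * ...plus its slab memory, read from the *local*
		 * per-lruvec counters that 1) above is worried about.
		 */
		pages += lruvec_page_state_local(lruvec,
				NR_SLAB_RECLAIMABLE_B) >> PAGE_SHIFT;
		pages += lruvec_page_state_local(lruvec,
				NR_SLAB_UNRECLAIMABLE_B) >> PAGE_SHIFT;
	} else
		pages = node_present_pages(sc->nid);

	/* 64-slot nodes, so this is pages >> 3 */
	max_nodes = pages >> (XA_CHUNK_SHIFT - 3);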

> 2) Shadow nodes by themselves are backed by slabs. So there is a loop
> dependency: the more shadow entries there are, the less pressure the
> kernel applies to reclaim them.

This effect is negligible in practice.

The permitted shadow nodes are a tiny percentage of memory consumed by
the cgroup. If shadow nodes make up a significant part of the cgroup's
footprint, or are the only thing left, they will be pushed out fast.

The formula is max_nodes = total_pages >> 3, and one page can hold 28
nodes. So if the cgroup holds nothing but 262,144 pages (1G) of shadow
nodes, the shrinker target is 32,768 nodes, which is 32,768 pages
(128M) in the worst packing case and 1,170 pages (4M) at best.
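
Spelled out, assuming 4K pages (just a back-of-the-envelope using the
figures above, not code from the kernel):

	unsigned long total_pages = 262144;            /* 1G of 4K pages        */
	unsigned long max_nodes   = total_pages >> 3;  /* 32768 permitted nodes */
	unsigned long worst_case  = max_nodes;         /*  1 node/page  -> 128M */
	unsigned long best_case   = max_nodes / 28;    /* 28 nodes/page -> ~4M  */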

However, if you don't take slab into account here, the shrinker can
evict shadow entries with undue aggression when they are needed the
most. If, say, the inode or dentry caches explode temporarily and
displace the page cache, it would be a big problem to drop the page
cache's non-resident info at the same time! That is precisely when
this information matters most.

Let's drop this patch, please.


Thread overview: 6+ messages
2020-09-03 23:00 Roman Gushchin
2020-09-04  4:10 ` Andrew Morton
2020-09-04  5:02   ` Roman Gushchin
2020-09-09 14:55 ` Johannes Weiner [this message]
2020-09-09 16:55   ` Roman Gushchin
2020-09-10 17:50     ` Johannes Weiner
