From: "T.J. Mercier" <tjmercier@google.com>
To: hailong <hailong.liu@oppo.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, yuzhao@google.com,
	21cnbao@gmail.com
Subject: Re: [RFC PATCH] mm/mglru: keep the root_memcg reclaim behavior the same as memcg reclaim
Date: Mon, 16 Dec 2024 09:13:46 -0800
Message-ID: <CABdmKX1XRw3z9-vXKzin+Ee601vw2remSHheXxeVwv51r2Nxiw@mail.gmail.com>
In-Reply-To: <20241216015414.ujbwsr6mtwgo4goe@oppo.com>

On Sun, Dec 15, 2024 at 5:54 PM hailong <hailong.liu@oppo.com> wrote:
>
> On Fri, 13. Dec 09:06, T.J. Mercier wrote:
> > On Thu, Dec 12, 2024 at 6:26 PM hailong <hailong.liu@oppo.com> wrote:
> > >
> > > On Thu, 12. Dec 10:22, T.J. Mercier wrote:
> > > > On Thu, Dec 12, 2024 at 1:57 AM hailong <hailong.liu@oppo.com> wrote:
> > > > >
> > > > > From: Hailong Liu <hailong.liu@oppo.com>
> > > > >
> > > > > commit a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard") said
> > > > > Note that memcg LRU only applies to global reclaim. For memcg reclaim,
> > > > > the eviction will continue, even if it is overshooting. This becomes
> > > > > unconditional due to code simplification.
> > > > >
> > > > > > However, if we reclaim the root memcg through sysfs (memory.reclaim), it
> > > > > > behaves like kswapd or direct reclaim.
> > > >
> > > > Hi Hailong,
> > > >
> > > > Why do you think this is a problem?
> > > >
> > > > > > Fix this by removing the mem_cgroup_is_root() condition from
> > > > > > root_reclaim().
> > > > > Signed-off-by: Hailong Liu <hailong.liu@oppo.com>
> > > > > ---
> > > > >  mm/vmscan.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 76378bc257e3..1f74f3ba0999 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -216,7 +216,7 @@ static bool cgroup_reclaim(struct scan_control *sc)
> > > > >   */
> > > > >  static bool root_reclaim(struct scan_control *sc)
> > > > >  {
> > > > > -       return !sc->target_mem_cgroup || mem_cgroup_is_root(sc->target_mem_cgroup);
> > > > > +       return !sc->target_mem_cgroup;
> > > > >  }
> > > > >
> > > > >  /**
> > > > > --
> > > > > > Actually, we switched to MGLRU on kernel 6.1 and see different behavior for
> > > > > > root_mem_cgroup reclaim. Is there any background for this?
> > > >
> > > > Reclaim behavior differs with MGLRU.
> > > > https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@google.com/
> > > >
> > > > On even more recent kernels, regular LRU reclaim has also changed.
> > > > https://lore.kernel.org/lkml/20240514202641.2821494-1-hannes@cmpxchg.org/
> > >
> > > Thanks for the details.
> > >
> > > Take this as an example.
> > >       root
> > >      /  |  \
> > >     a   b   c
> > >         | \
> > >         d  e
> > > IIUC, MGLRU can address the direct reclaim latency thanks to the
> > > sharding. However, for proactive reclaim, if we reclaim b, the path
> > > is b->d->e, whereas if we reclaim the root, the reclaim path is
> > > uncertain. The call stack is as follows:
> > > lru_gen_shrink_node()->shrink_many()->hlist_nulls_for_each_entry_rcu()->shrink_one()
> > >
> > > So, for proactive reclaim of root_memcg, whether it is MGLRU or the
> > > regular LRU, calling shrink_node_memcgs() makes the behavior certain
> > > and reasonable to me.
> >
> > The ordering is uncertain, but ordering has never been specified as
> > part of that interface AFAIK, and you'll still get what you ask for (X
> > bytes from the root or under). Assuming partial reclaim of a cgroup
> > (which I hope is true if you're reclaiming from the root?) if I have
> > the choice I'd rather have the memcg LRU ordering to try to reclaim
> > from colder memcgs first, rather than a static pre-order traversal
> > that always hits the same children first.
> >
> > The reason it's a choice only for the root is because the memcg LRU is
> > maintained at the pgdat level, not at each individual cgroup. So there
> > is no mechanism to get memcg LRU ordering from a subset of cgroups,
> > which would be pretty cool but that sounds expensive.
>
> Got it, thanks for clarifying. From the perspective of memcg, it
> behaves differently. But if we switch to the perspective of global
> reclaim, it is reasonable, because reclaiming the root memcg is another
> form of global reclaim. It keeps global reclaim consistent. NACK myself :)

Yeah, that's another way to look at it. :)

> >
> > - T.J.
> >
> > > Help you, Help me,
> > > Hailong.
> --
> Help you, Help me,
> Hailong.


Thread overview: 6+ messages
2024-12-12  9:56 hailong
2024-12-12 18:22 ` T.J. Mercier
2024-12-13  2:26   ` hailong
2024-12-13 17:06     ` T.J. Mercier
2024-12-16  1:54       ` hailong
2024-12-16 17:13         ` T.J. Mercier [this message]
