linux-mm.kvack.org archive mirror
From: "T.J. Mercier" <tjmercier@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Shakeel Butt <shakeelb@google.com>,
	Muchun Song <muchun.song@linux.dev>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Efly Young <yangyifei03@kuaishou.com>,
	 android-mm@google.com, yuzhao@google.com, mkoutny@suse.com,
	 Yosry Ahmed <yosryahmed@google.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm: memcg: Use larger batches for proactive reclaim
Date: Mon, 5 Feb 2024 20:01:40 -0800
Message-ID: <CABdmKX35GV3VFar0_pNR_vAXLpvxo+APALXMharsXh6TO+0mrQ@mail.gmail.com>
In-Reply-To: <ZcFQMru5_oATGbuP@tiehlicka>

On Mon, Feb 5, 2024 at 1:16 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Mon 05-02-24 12:47:47, T.J. Mercier wrote:
> > On Mon, Feb 5, 2024 at 12:36 PM Michal Hocko <mhocko@suse.com> wrote:
> [...]
> > > Think of something like
> > > timeout $TIMEOUT echo $TARGET > $MEMCG_PATH/memory.reclaim
> > > where timeout acts as a stop gap if the reclaim cannot finish in
> > > TIMEOUT.
> >
> > Yeah I get the desired behavior, but using sc->nr_reclaimed to achieve
> > it is what's bothering me.
>
> I am not really happy about this subtlety. If we have a better way then
> let's do it. Better in its own patch, though.
>
> > It's already wired up that way though, so if you want to make this
> > change now then I can try to test for the difference using really
> > large reclaim targets.
>
> Yes, please. If you want it a separate patch then no objection from me
> of course. If you do not like the nr_to_reclaim bailout then maybe we can
> go with a simple break out flag in scan_control.
>
> Thanks!

It's a bit difficult to test under the too_many_isolated check, so I
moved the fatal_signal_pending check outside and tried with that.
Performing full reclaim on the /uid_0 cgroup with a 250ms delay before
SIGKILL, I got an average of 16ms better latency with
sc->nr_to_reclaim across 20 runs, ignoring one 1s outlier with
SWAP_CLUSTER_MAX.

The return values from memory_reclaim are different since with
sc->nr_to_reclaim we "succeed" and don't reach the signal_pending
check to return -EINTR, but I don't think it matters since the return
code is 137 (SIGKILL) in both cases.
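
For reference, 137 is the usual shell convention of 128 plus the
signal number, and SIGKILL is 9. A minimal Python sketch of that
arithmetic follows. The sleeping child is only a stand-in for the
process writing to memory.reclaim, and the helper name is made up for
illustration; neither is part of the test setup described here.

```python
import signal
import subprocess
import sys


def shell_status_after_sigkill() -> int:
    """Kill a sleeping child with SIGKILL and return the status a shell would report."""
    # Stand-in for the process blocked in a memory.reclaim write (assumption).
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    child.send_signal(signal.SIGKILL)
    child.wait()
    # On POSIX, Popen reports death by signal N as returncode -N,
    # while POSIX shells report the same event as 128 + N.
    return 128 + (-child.returncode)


if __name__ == "__main__":
    print(shell_status_after_sigkill())
```

Either way the caller only sees that the process died from SIGKILL,
which is why the differing return value inside the kernel is not
visible from the shell.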

With SWAP_CLUSTER_MAX there was an outlier at nearly 1s, and in
general the latency numbers were noisier: 2% RSD with
sc->nr_to_reclaim vs 13% RSD with SWAP_CLUSTER_MAX. I'm guessing
that's a function of nr_to_scan being occasionally much less than
SWAP_CLUSTER_MAX, causing nr[lru] to drain slowly. But it could
also have simply been scheduled out more often at the cond_resched in
shrink_lruvec, and that would help explain the 1s outlier. I don't
have enough debug info on the outlier to say much more.
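
For anyone repeating the comparison, RSD here means relative standard
deviation: the standard deviation as a percentage of the mean. A small
sketch of the calculation is below. It assumes the sample (n-1)
standard deviation, which may not match how the figures above were
computed, and the latencies in the example are invented placeholders,
not the measured data.

```python
import statistics


def rsd_percent(samples):
    """Relative standard deviation: sample stdev as a percentage of the mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)


def drop_above(samples, cutoff):
    """Discard samples above a cutoff, e.g. a single ~1 s outlier."""
    return [s for s in samples if s <= cutoff]


if __name__ == "__main__":
    # Hypothetical latencies in ms, NOT the measurements from these runs.
    made_up = [262.0, 270.0, 281.0, 259.0, 990.0]
    trimmed = drop_above(made_up, cutoff=900.0)
    print(f"RSD with outlier:    {rsd_percent(made_up):.1f}%")
    print(f"RSD without outlier: {rsd_percent(trimmed):.1f}%")
```

A single large outlier dominates the RSD of a 20-run sample, so it is
worth stating whether it was included when quoting the number.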

With sc->nr_to_reclaim, the largest sc->nr_reclaimed value I saw was
about 2^53 for a sc->nr_to_reclaim of 2^51, but for large memcg
hierarchies I think it's possible to get more than that. There were
only 15 cgroups under /uid_0. This is the only thing that gives me
pause, since we could touch more than 2k cgroups in
shrink_node_memcgs, each one adding 4 * 2^51, potentially overflowing
sc->nr_reclaimed. Looks testable, but I didn't get to it.
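
The "more than 2k cgroups" figure comes from simple arithmetic on a
64-bit unsigned long: 4 * 2^51 = 2^53 per cgroup, and 2^64 / 2^53 =
2048. A rough sketch of the wraparound, emulating the counter in
Python with a 64-bit mask, is below. It assumes a 64-bit kernel and
that every cgroup adds the full 4 * 2^51 to the accumulated
sc->nr_reclaimed, which is the worst case described above rather than
something I measured. The function names are illustrative only.

```python
ULONG_BITS = 64                      # assumes a 64-bit kernel
ULONG_MASK = (1 << ULONG_BITS) - 1

NR_TO_RECLAIM = 1 << 51              # reclaim target from the test above
PER_CGROUP = 4 * NR_TO_RECLAIM       # 4 * 2^51 = 2^53 added per cgroup


def nr_reclaimed_after(cgroups: int) -> int:
    """Emulate an unsigned long accumulating PER_CGROUP once per cgroup."""
    total = 0
    for _ in range(cgroups):
        total = (total + PER_CGROUP) & ULONG_MASK  # wraps like unsigned long
    return total


def cgroups_until_wrap() -> int:
    """Smallest number of cgroups whose sum no longer fits in an unsigned long."""
    return (1 << ULONG_BITS) // PER_CGROUP


if __name__ == "__main__":
    print(cgroups_until_wrap())                        # 2048
    print(nr_reclaimed_after(2047) >= NR_TO_RECLAIM)   # True
    print(nr_reclaimed_after(2048))                    # 0
```

If the counter did wrap back below sc->nr_to_reclaim, the
nr_reclaimed >= nr_to_reclaim bailout would presumably stop firing,
which is the case worth testing.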

