linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Rik van Riel <riel@surriel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	 Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	 cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,  kernel-team@meta.com,
	Nhat Pham <nphamcs@gmail.com>
Subject: Re: [PATCH] memcg: allow exiting tasks to write back data to swap
Date: Wed, 11 Dec 2024 09:30:24 -0800	[thread overview]
Message-ID: <CAJD7tkZ9gSxdPUCgz_NaHSDPTC+HEhxNRbinp619sNSshScJ0A@mail.gmail.com> (raw)
In-Reply-To: <6bc895883abca3522c9efc0c56189741194581e5.camel@surriel.com>

On Wed, Dec 11, 2024 at 9:20 AM Rik van Riel <riel@surriel.com> wrote:
>
> On Wed, 2024-12-11 at 09:00 -0800, Yosry Ahmed wrote:
> > On Wed, Dec 11, 2024 at 8:34 AM Rik van Riel <riel@surriel.com>
> > wrote:
> > >
> > > On Wed, 2024-12-11 at 08:26 -0800, Yosry Ahmed wrote:
> > > > On Wed, Dec 11, 2024 at 7:54 AM Rik van Riel <riel@surriel.com>
> > > > wrote:
> > > > >
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -5371,6 +5371,15 @@ bool
> > > > > mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> > > > >         if (!zswap_is_enabled())
> > > > >                 return true;
> > > > >
> > > > > +       /*
> > > > > +        * Always allow exiting tasks to push data to swap. A
> > > > > process in
> > > > > +        * the middle of exit cannot get OOM killed, but may
> > > > > need
> > > > > to push
> > > > > +        * uncompressible data to swap in order to get the
> > > > > cgroup
> > > > > memory
> > > > > +        * use below the limit, and make progress with the
> > > > > exit.
> > > > > +        */
> > > > > +       if ((current->flags & PF_EXITING) && memcg ==
> > > > > mem_cgroup_from_task(current))
> > > > > +               return true;
> > > > > +
> > > >
> > > > I have a few questions:
> > > > (a) If the task is being OOM killed it should be able to charge
> > > > memory
> > > > beyond memory.max, so why do we need to get the usage down below
> > > > the
> > > > limit?
> > > >
> > > If it is a kernel directed memcg OOM kill, that is
> > > true.
> > >
> > > However, if the exit comes from somewhere else,
> > > like a userspace oomd kill, we might not hit that
> > > code path.
> >
> > Why do we treat dying tasks differently based on the source of the
> > kill?
> >
> Are you saying we should fail allocations for
> every dying task, and add a check for PF_EXITING
> in here?

I am asking, not really suggesting anything :)

Does it matter from the kernel perspective if the task is dying due to
a kernel OOM kill or a userspace SIGKILL?

>
>
>         if (unlikely(task_in_memcg_oom(current)))
>                 goto nomem;
>
>
> > > However, we don't know until the attempted zswap write
> > > whether the memory is compressible, and whether doing
> > > a bunch of zswap writes will help us bring our memcg
> > > down below its memory.max limit.
> >
> > If we are at memory.max (or memory.zswap.max), we can't compress
> > pages
> > into zswap anyway, regardless of their compressibility.
> >
> Wait, this is news to me.
>
> This seems like something we should fix, rather
> than live with, since compressing the data to
> a smaller size could bring us below memory.max.
>
> Is this "cannot compress when at memory.max"
> behavior intentional, or just a side effect of
> how things happen to be?
>
> Won't the allocations made from zswap_store
> ignore the memory.max limit because PF_MEMALLOC
> is set?

My bad, obj_cgroup_may_zswap() only checks the zswap limit, not
memory.max. Please ignore this.

The scenario I described where we scan the LRUs needlessly is if the
*zswap limit* is hit, and writeback is disabled. I am guessing this is
not the case you're running into.

So yeah my only outstanding question is the one above about handling
userspace OOM kills differently.

Thanks for bearing with me.


  reply	other threads:[~2024-12-11 17:31 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-11 15:53 Rik van Riel
2024-12-11 16:26 ` Yosry Ahmed
2024-12-11 16:34   ` Rik van Riel
2024-12-11 17:00     ` Yosry Ahmed
2024-12-11 17:19       ` Rik van Riel
2024-12-11 17:30         ` Yosry Ahmed [this message]
2024-12-11 17:49           ` Rik van Riel
2024-12-11 23:15 ` Balbir Singh
2024-12-12  1:21   ` Rik van Riel
2024-12-12  3:25     ` Balbir Singh
2024-12-12 14:03       ` Rik van Riel
2024-12-12 23:39         ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJD7tkZ9gSxdPUCgz_NaHSDPTC+HEhxNRbinp619sNSshScJ0A@mail.gmail.com \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=nphamcs@gmail.com \
    --cc=riel@surriel.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox