linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Balbir Singh <balbirs@nvidia.com>
To: Rik van Riel <riel@surriel.com>,
	Roman Gushchin <roman.gushchin@linux.dev>
Cc: Yosry Ahmed <yosryahmed@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	hakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@meta.com,
	Nhat Pham <nphamcs@gmail.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
Date: Fri, 13 Dec 2024 13:54:10 +1100	[thread overview]
Message-ID: <28af480e-b99a-447d-999e-1dcb961d2709@nvidia.com> (raw)
In-Reply-To: <20241212150003.1a0ed845@fangorn>

On 12/13/24 07:00, Rik van Riel wrote:
> On Thu, 12 Dec 2024 18:31:57 +0000
> Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
>> Is it about a single task or groups of tasks or the entire cgroup?
>> If former, why it's a problem? A tight memcg limit can slow things down
>> in general and I don't see why we should treat the exit() path differently.
>>
> I think the exit path does need to be treated a little differently,
> since this exit may be the only way such a cgroup can free up memory.
> 
>> If it's about the entire cgroup and we have essentially a deadlock,
>> I feel like we need to look into the oom reaper side.
> 
> You mean something like the below?
> 
> I have not tested it yet, because we don't have any stuck
> cgroups right now among the workloads that I'm monitoring.
> 
> ---8<---
> 
> From c0e545fd45bd3ee24cd79b3d3e9b375e968ef460 Mon Sep 17 00:00:00 2001
> From: Rik van Riel <riel@surriel.com>
> Date: Thu, 12 Dec 2024 14:50:49 -0500
> Subject: [PATCH] memcg,oom: speed up reclaim for exiting tasks
> 
> When a memcg reaches its memory limit, and reclaim becomes unavailable
> or slow for some reason, for example only zswap is available, but zswap
> writeback is disabled, it can take a long time for tasks to exit, and
> for the cgroup to get back to normal (or cleaned up).
> 
> Speed up memcg reclaim for exiting tasks by limiting how much work
> reclaim does, and by invoking the OOM reaper if reclaim does not
> free up enough memory to allow the task to make progress.
> 
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  include/linux/oom.h |  8 ++++++++
>  mm/memcontrol.c     | 11 +++++++++++
>  mm/oom_kill.c       |  6 +-----
>  3 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 1e0fc6931ce9..b2d9cf936664 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -111,4 +111,12 @@ extern void oom_killer_enable(void);
>  
>  extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>  
> +#ifdef CONFIG_MMU
> +extern void queue_oom_reaper(struct task_struct *tsk);
> +#else
> +static intern void queue_oom_reaper(struct task_struct *tsk)
> +{
> +}
> +#endif
> +
>  #endif /* _INCLUDE_LINUX_OOM_H */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..21f42758d430 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2231,6 +2231,9 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	if (!gfpflags_allow_blocking(gfp_mask))
>  		goto nomem;
>  
> +	if (unlikely(current->flags & PF_EXITING))
> +		gfp_mask |= __GFP_NORETRY;
> +

if (task_is_dying())
	gfp_mask |= __GFP_NORETRY


>  	memcg_memory_event(mem_over_limit, MEMCG_MAX);
>  	raised_max_event = true;
>  
> @@ -2284,6 +2287,14 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  		goto retry;
>  	}
>  nomem:
> +	/*
> +	 * We ran out of memory while inside exit. Maybe the OOM
> +	 * reaper can help reduce cgroup memory use and get things
> +	 * moving again?
> +	 */
> +	if (unlikely(current->flags & PF_EXITING))
> +		queue_oom_reaper(current);
> +

I am not sure this is helpful without task_will_free_mem(), the dying
task shouldn't get sent to the OOM killer when we run out of memory.

I did notice that we have heuristics around task_is_dying() and
passed_oom, sounds like the end result of your changes would be to
ignore the heuristics of passed_oom


Balbir Singh.


      parent reply	other threads:[~2024-12-13  2:54 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12 16:57 Rik van Riel
2024-12-12 17:06 ` Yosry Ahmed
2024-12-12 17:51   ` Shakeel Butt
2024-12-12 18:02     ` Rik van Riel
2024-12-12 18:18       ` Nhat Pham
2024-12-12 18:11   ` Nhat Pham
2024-12-12 18:30   ` Johannes Weiner
2024-12-12 21:35     ` Shakeel Butt
2024-12-12 21:41       ` Yosry Ahmed
2024-12-13  0:32     ` Roman Gushchin
2024-12-13  4:42       ` Johannes Weiner
2024-12-16 15:39     ` Michal Hocko
2025-01-14 16:09       ` Johannes Weiner
2025-01-14 16:46         ` Michal Hocko
2025-01-14 16:51           ` Rik van Riel
2025-01-14 17:00             ` Michal Hocko
2025-01-14 17:11               ` Rik van Riel
2025-01-14 18:13                 ` Michal Hocko
2025-01-14 19:23                   ` Johannes Weiner
2025-01-14 19:42                     ` Michal Hocko
2025-01-15 17:35                       ` Rik van Riel
2025-01-15 19:41                         ` Michal Hocko
2025-01-14 16:54           ` Michal Hocko
2025-01-14 16:56             ` Rik van Riel
2025-01-14 16:56             ` Michal Hocko
2024-12-12 18:31 ` Roman Gushchin
2024-12-12 20:00   ` Rik van Riel
2024-12-13  0:49     ` Roman Gushchin
2024-12-13  2:54     ` Balbir Singh [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=28af480e-b99a-447d-999e-1dcb961d2709@nvidia.com \
    --to=balbirs@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=nphamcs@gmail.com \
    --cc=riel@surriel.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox