From: Balbir Singh <balbirs@nvidia.com>
To: Rik van Riel <riel@surriel.com>,
Roman Gushchin <roman.gushchin@linux.dev>
Cc: Yosry Ahmed <yosryahmed@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
Nhat Pham <nphamcs@gmail.com>
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
Date: Fri, 13 Dec 2024 13:54:10 +1100
Message-ID: <28af480e-b99a-447d-999e-1dcb961d2709@nvidia.com>
In-Reply-To: <20241212150003.1a0ed845@fangorn>
On 12/13/24 07:00, Rik van Riel wrote:
> On Thu, 12 Dec 2024 18:31:57 +0000
> Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
>> Is it about a single task or groups of tasks or the entire cgroup?
>> If former, why it's a problem? A tight memcg limit can slow things down
>> in general and I don't see why we should treat the exit() path differently.
>>
> I think the exit path does need to be treated a little differently,
> since this exit may be the only way such a cgroup can free up memory.
>
>> If it's about the entire cgroup and we have essentially a deadlock,
>> I feel like we need to look into the oom reaper side.
>
> You mean something like the below?
>
> I have not tested it yet, because we don't have any stuck
> cgroups right now among the workloads that I'm monitoring.
>
> ---8<---
>
> From c0e545fd45bd3ee24cd79b3d3e9b375e968ef460 Mon Sep 17 00:00:00 2001
> From: Rik van Riel <riel@surriel.com>
> Date: Thu, 12 Dec 2024 14:50:49 -0500
> Subject: [PATCH] memcg,oom: speed up reclaim for exiting tasks
>
> When a memcg reaches its memory limit, and reclaim becomes unavailable
> or slow for some reason, for example only zswap is available, but zswap
> writeback is disabled, it can take a long time for tasks to exit, and
> for the cgroup to get back to normal (or cleaned up).
>
> Speed up memcg reclaim for exiting tasks by limiting how much work
> reclaim does, and by invoking the OOM reaper if reclaim does not
> free up enough memory to allow the task to make progress.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
> include/linux/oom.h | 8 ++++++++
> mm/memcontrol.c | 11 +++++++++++
> mm/oom_kill.c | 6 +-----
> 3 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 1e0fc6931ce9..b2d9cf936664 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -111,4 +111,12 @@ extern void oom_killer_enable(void);
>
> extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>
> +#ifdef CONFIG_MMU
> +extern void queue_oom_reaper(struct task_struct *tsk);
> +#else
> +static inline void queue_oom_reaper(struct task_struct *tsk)
> +{
> +}
> +#endif
> +
> #endif /* _INCLUDE_LINUX_OOM_H */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..21f42758d430 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2231,6 +2231,9 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
> if (!gfpflags_allow_blocking(gfp_mask))
> goto nomem;
>
> + if (unlikely(current->flags & PF_EXITING))
> + gfp_mask |= __GFP_NORETRY;
> +
if (task_is_dying())
gfp_mask |= __GFP_NORETRY;
> memcg_memory_event(mem_over_limit, MEMCG_MAX);
> raised_max_event = true;
>
> @@ -2284,6 +2287,14 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
> goto retry;
> }
> nomem:
> + /*
> + * We ran out of memory while inside exit. Maybe the OOM
> + * reaper can help reduce cgroup memory use and get things
> + * moving again?
> + */
> + if (unlikely(current->flags & PF_EXITING))
> + queue_oom_reaper(current);
> +
I am not sure this is helpful without task_will_free_mem(). The dying
task shouldn't get sent to the OOM killer when we run out of memory.
I did notice that we have heuristics around task_is_dying() and
passed_oom. It sounds like the end result of your changes would be
to ignore the passed_oom heuristics.
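To make the passed_oom concern concrete, here is a rough userspace model of
the failure path of the charge loop. It is a sketch only, not kernel code:
it assumes the ordering I remember from try_charge_memcg() (the
__GFP_NORETRY check comes before the "passed_oom && task_is_dying()" check),
and it assumes reclaim never frees anything. The names model_charge,
model_task, MODEL_GFP_NORETRY and the charge_exit enum are made up for
illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_NORETRY; the value is arbitrary. */
#define MODEL_GFP_NORETRY 0x1u
/* Stands in for MAX_RECLAIM_RETRIES. */
#define MODEL_MAX_RETRIES 16

enum charge_exit {
	EXIT_NORETRY,     /* bailed out at the NORETRY check */
	EXIT_PASSED_OOM,  /* bailed out at "passed_oom && task_is_dying()" */
	EXIT_OOM_FAILED,  /* the modelled OOM killer made no progress */
};

struct model_task {
	bool exiting;      /* models current->flags & PF_EXITING */
	bool fatal_signal; /* models fatal_signal_pending(current) */
	bool oom_victim;   /* models tsk_is_oom_victim(current) */
};

static bool model_task_is_dying(const struct model_task *t)
{
	return t->oom_victim || t->fatal_signal || t->exiting;
}

/*
 * Models only the case where every reclaim attempt fails.
 * patched:    apply the proposed "PF_EXITING => NORETRY" hunk.
 * oom_rounds: how many times the modelled OOM killer reports progress.
 */
static enum charge_exit model_charge(const struct model_task *t,
				     unsigned int gfp_mask,
				     bool patched, int oom_rounds)
{
	bool passed_oom = false;
	int nr_retries = MODEL_MAX_RETRIES;

	if (patched && t->exiting)
		gfp_mask |= MODEL_GFP_NORETRY;

retry:
	/* Reclaim is assumed to free nothing, so we always get here. */
	if (gfp_mask & MODEL_GFP_NORETRY)
		return EXIT_NORETRY;

	if (nr_retries--)
		goto retry;

	/* The heuristic: do not loop forever for a task the OOM killer bypassed. */
	if (passed_oom && model_task_is_dying(t))
		return EXIT_PASSED_OOM;

	if (oom_rounds-- > 0) {
		passed_oom = true;
		nr_retries = MODEL_MAX_RETRIES;
		goto retry;
	}

	return EXIT_OOM_FAILED;
}
```

In this model an exiting task without the patch leaves through the
passed_oom check after one OOM round, while with the patch it leaves at the
NORETRY check and never reaches the passed_oom logic at all.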
Balbir Singh.