From: Michal Hocko <mhocko@suse.com>
To: zhongjinji <zhongjinji@honor.com>
Cc: akpm@linux-foundation.org, feng.han@honor.com, lenb@kernel.org,
liam.howlett@oracle.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-pm@vger.kernel.org,
liulu.liu@honor.com, lorenzo.stoakes@oracle.com,
pavel@kernel.org, rafael@kernel.org, rientjes@google.com,
shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de
Subject: Re: [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims
Date: Tue, 9 Sep 2025 16:02:57 +0200 [thread overview]
Message-ID: <aMAzkQQ4XAFh9xlm@tiehlicka> (raw)
In-Reply-To: <20250909135152.20477-1-zhongjinji@honor.com>
On Tue 09-09-25 21:51:52, zhongjinji wrote:
> > On Tue 09-09-25 19:41:31, zhongjinji wrote:
> > > > On Tue 09-09-25 17:06:57, zhongjinji wrote:
> > > > > OOM killer is a mechanism that selects and kills processes when the system
> > > > > runs out of memory to reclaim resources and keep the system stable.
> > > > > However, the oom victim cannot terminate on its own when it is frozen,
> > > > > because __thaw_task() only thaws one thread of the victim, while
> > > > > the other threads remain in the frozen state.
> > > > >
> > > > > Since __thaw_task did not fully thaw the OOM victim for self-termination,
> > > > > introduce thaw_oom_process() to properly thaw OOM victims.
> > > >
> > > > You will need s@thaw_oom_process@thaw_processes@
> > >
> > > The reason for using thaw_oom_process is that the TIF_MEMDIE flag of the
> > > thawed thread will be set, which means this function can only be used to
> > > thaw processes terminated by the OOM killer.
> >
> > Just do not set the flag inside the function. I would even say do not
> > set TIF_MEMDIE to the rest of the thread group at all. More on that
> > below
> >
> > > thaw_processes has already been defined in kernel/power/process.c.
> > > Would it be better to use thaw_process instead?
> >
> > Sorry I meant thaw_process as thaw_processes is handling all the
> > processes.
> >
> > > I am concerned that others might misunderstand the thaw_process function.
> > > thaw_process sets all threads to the TIF_MEMDIE state, so it can only be
> > > used to thaw processes killed by the OOM killer.
> >
> > And that is the reason why it shouldn't be doing that. It should thaw
> > the whole thread group. That's it.
> >
> > > If the TIF_MEMDIE flag of a thread is not set, the thread cannot be thawed
> > > regardless of the cgroup state.
> >
> > Why would that be the case. TIF_MEMDIE should only denote the victim
> > should be able to access memory reserves. Why the whole thread group
> > needs that? While more threads could be caught in the allocation path
> > this is a sort of boost at best. It cannot guarantee any forward
> > progress and we have kept marking only the first thread that way without
> > any issues.
>
> When a process is frozen, all its threads enter __refrigerator() (in kernel/freezer.c).
> When __thaw_task is called, the threads are woken up and check the freezing(current)
> state (in __refrigerator). The freezing check is implemented via freezing_slow_path.
> When TIF_MEMDIE is set for a thread, freezing_slow_path will return false, allowing
> the thread to exit the infinite loop in __refrigerator(), and thus the thread will
> be thawed.
>
> The following code can explain how TIF_MEMDIE works in thread thawing.
> __refrigerator
> for (;;) {
> freezing = freezing(current)
> freezing_slow_path
> if (test_tsk_thread_flag(p, TIF_MEMDIE))
> return false;
> if (!freezing)
> break;
> schedule();
> }
OK, I see. We could deal with that by checking tsk_is_oom_victim()
instead of TIF_MEMDIE
> Since thread_info is not shared within a thread group, TIF_MEMDIE for each thread
> must be set so that all threads can be thawed.
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2025-09-09 14:03 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-09 9:06 [PATCH v8 0/3] Improvements to Victim Process Thawing and OOM Reaper Traversal Order zhongjinji
2025-09-09 9:06 ` [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims zhongjinji
2025-09-09 9:15 ` Michal Hocko
2025-09-09 16:27 ` Suren Baghdasaryan
2025-09-09 16:44 ` Michal Hocko
2025-09-09 16:53 ` Suren Baghdasaryan
2025-09-09 9:06 ` [PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process zhongjinji
2025-09-09 9:15 ` Michal Hocko
2025-09-09 11:41 ` [PATCH v8 1/3] mm/oom_kill: Introduce thaw_oom_process() for thawing OOM victims zhongjinji
2025-09-09 11:59 ` Michal Hocko
2025-09-09 13:51 ` zhongjinji
2025-09-09 14:02 ` Michal Hocko [this message]
2025-09-09 14:47 ` zhongjinji
2025-09-09 16:23 ` [PATCH v8 2/3] mm/oom_kill: Thaw the entire OOM victim process Suren Baghdasaryan
2025-09-09 9:06 ` [PATCH v8 3/3] mm/oom_kill: The OOM reaper traverses the VMA maple tree in reverse order zhongjinji
2025-09-09 16:29 ` Suren Baghdasaryan
2025-09-09 16:30 ` Suren Baghdasaryan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aMAzkQQ4XAFh9xlm@tiehlicka \
--to=mhocko@suse.com \
--cc=akpm@linux-foundation.org \
--cc=feng.han@honor.com \
--cc=lenb@kernel.org \
--cc=liam.howlett@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-pm@vger.kernel.org \
--cc=liulu.liu@honor.com \
--cc=lorenzo.stoakes@oracle.com \
--cc=pavel@kernel.org \
--cc=rafael@kernel.org \
--cc=rientjes@google.com \
--cc=shakeel.butt@linux.dev \
--cc=surenb@google.com \
--cc=tglx@linutronix.de \
--cc=zhongjinji@honor.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox