linux-mm.kvack.org archive mirror
From: Christian Brauner <brauner@kernel.org>
To: Minchan Kim <minchan@kernel.org>
Cc: akpm@linux-foundation.org, david@kernel.org, mhocko@suse.com,
	 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	surenb@google.com,  timmurray@google.com
Subject: Re: [RFC 3/3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag
Date: Thu, 16 Apr 2026 11:13:35 +0200
Message-ID: <20260416-planktont-abwinken-b9499483b939@brauner>
In-Reply-To: <20260413223948.556351-4-minchan@kernel.org>

On Mon, Apr 13, 2026 at 03:39:48PM -0700, Minchan Kim wrote:
> Currently, process_mrelease() requires userspace to send a SIGKILL signal
> prior to invocation. This separation introduces a race window where the
> victim task may receive the signal and enter the exit path before the
> reaper can invoke process_mrelease().
> 
> In this case, the victim task frees its memory via the standard, unoptimized
> exit path, bypassing the expedited clean file folio reclamation optimization
> introduced in the previous patch (which relies on the MMF_UNSTABLE flag).
> 
> This patch introduces the PROCESS_MRELEASE_REAP_KILL UAPI flag to support
> an integrated auto-kill mode. When specified, process_mrelease() directly
> injects a SIGKILL into the target task.
> 
> Crucially, this patch utilizes a dedicated signal code (KILL_MRELEASE)
> during signal injection, belonging to a new SIGKILL si_codes section.
> This special code ensures that the kernel's signal delivery path reliably
> intercepts the request and marks the target address space as unstable
> (MMF_UNSTABLE). This mechanism guarantees that the MMF_UNSTABLE flag is set
> before either the victim task or the reaper proceeds, ensuring that the
> expedited reclamation optimization is utilized regardless of scheduling
> order.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  include/uapi/asm-generic/siginfo.h |  6 ++++++
>  include/uapi/linux/mman.h          |  4 ++++
>  kernel/signal.c                    |  4 ++++
>  mm/oom_kill.c                      | 20 +++++++++++++++++++-
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
> index 5a1ca43b5fc6..0f59b791dab4 100644
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -252,6 +252,12 @@ typedef struct siginfo {
>  #define BUS_MCEERR_AO	5
>  #define NSIGBUS		5
>  
> +/*
> + * SIGKILL si_codes
> + */
> +#define KILL_MRELEASE	1	/* sent by process_mrelease */
> +#define NSIGKILL	1
> +
>  /*
>   * SIGTRAP si_codes
>   */
> diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
> index e89d00528f2f..4266976b45ad 100644
> --- a/include/uapi/linux/mman.h
> +++ b/include/uapi/linux/mman.h
> @@ -56,4 +56,8 @@ struct cachestat {
>  	__u64 nr_recently_evicted;
>  };
>  
> +/* Flags for process_mrelease */
> +#define PROCESS_MRELEASE_REAP_KILL	(1 << 0)
> +#define PROCESS_MRELEASE_VALID_FLAGS	(PROCESS_MRELEASE_REAP_KILL)
> +
>  #endif /* _UAPI_LINUX_MMAN_H */
> diff --git a/kernel/signal.c b/kernel/signal.c
> index d65d0fe24bfb..c21b2176dc5e 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1134,6 +1134,10 @@ static int __send_signal_locked(int sig, struct kernel_siginfo *info,
>  
>  out_set:
>  	signalfd_notify(t, sig);
> +
> +	if (sig == SIGKILL && !is_si_special(info) &&
> +	    info->si_code == KILL_MRELEASE && t->mm)
> +		mm_flags_set(MMF_UNSTABLE, t->mm);
>  	sigaddset(&pending->signal, sig);
>  
>  	/* Let multiprocess signals appear after on-going forks */
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5c6c95c169ee..0b5da5208707 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -20,6 +20,8 @@
>  
>  #include <linux/oom.h>
>  #include <linux/mm.h>
> +#include <uapi/linux/mman.h>
> +#include <linux/capability.h>
>  #include <linux/err.h>
>  #include <linux/gfp.h>
>  #include <linux/sched.h>
> @@ -1218,13 +1220,29 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
>  	bool reap = false;
>  	long ret = 0;
>  
> -	if (flags)
> +	if (flags & ~PROCESS_MRELEASE_VALID_FLAGS)
>  		return -EINVAL;
>  
>  	task = pidfd_get_task(pidfd, &f_flags);
>  	if (IS_ERR(task))
>  		return PTR_ERR(task);
>  
> +	if (flags & PROCESS_MRELEASE_REAP_KILL) {
> +		struct kernel_siginfo info;
> +
> +		if (!capable(CAP_KILL)) {

Why capable(CAP_KILL)? Why not just call a function that uses
check_kill_permission() before firing the signal? What's the rationale
for doing it this way?

Tbh, I really hate that process_mrelease() now has a kill side effect
with non-standard permission handling as well.

Seems like bad API design. Why can't you just raise the MMF_UNSTABLE bit
before the SIGKILL, since that's the problem you're trying to solve?

> +			ret = -EPERM;
> +			goto put_task;
> +		}
> +		clear_siginfo(&info);
> +		info.si_signo = SIGKILL;
> +		info.si_code = KILL_MRELEASE;
> +		info.si_pid = task_tgid_vnr(current);
> +		info.si_uid = from_kuid_munged(current_user_ns(), current_uid());

This siginfo setup should not be open-coded like this.

> +
> +		do_send_sig_info(SIGKILL, &info, task, PIDTYPE_TGID);
> +	}
> +
>  	/*
>  	 * Make sure to choose a thread which still has a reference to mm
>  	 * during the group exit
> -- 
> 2.54.0.rc0.605.g598a273b03-goog
> 
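
For reference, the existing two-step sequence that the patch description
refers to (SIGKILL first, then process_mrelease()) looks roughly like this
from a userspace reaper. This is a minimal sketch, not code from the
series: kill_then_reap() is a hypothetical helper name, the raw syscall
numbers are fallbacks for older libc headers, and it assumes a kernel that
provides pidfd_open(2), pidfd_send_signal(2) and process_mrelease(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback syscall numbers in case the libc headers predate them. */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448
#endif

/*
 * Hypothetical helper: kill the process identified by pid, then ask the
 * kernel to reap its memory. Returns 0 on success or a negative errno
 * from the first step that failed.
 */
static int kill_then_reap(pid_t pid)
{
	int ret = 0;
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

	if (pidfd < 0)
		return -errno;

	/* Step 1: userspace sends SIGKILL itself. */
	if (syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, 0) < 0)
		ret = -errno;
	/*
	 * Step 2: reap. The victim may already be running its own exit
	 * path by the time this call is made.
	 */
	else if (syscall(SYS_process_mrelease, pidfd, 0) < 0)
		ret = -errno;

	close(pidfd);
	return ret;
}
```

Because the two steps are separate syscalls, the victim can start tearing
down its mm before step 2 runs, in which case process_mrelease() may fail
(for example with ESRCH) even though the kill itself succeeded. That gap
between the two calls is the window under discussion in this thread.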

