[PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
@ 2026-01-04 21:17 Mikulas Patocka
  2026-01-05 10:42 ` Vlastimil Babka
  2026-01-05 18:15 ` Liam R. Howlett
  0 siblings, 2 replies; 12+ messages in thread
From: Mikulas Patocka @ 2026-01-04 21:17 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Alex Deucher, Christian König, Andrew Morton,
	David Hildenbrand, amd-gfx, linux-mm, Liam R. Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato

If a process sets up a timer that periodically sends a signal in short
intervals and if it executes some kernel code that calls
mm_take_all_locks, we get random -EINTR failures.

The function mm_take_all_locks fails with -EINTR if there is pending
signal. The -EINTR is propagated up the call stack to userspace and
userspace fails if it gets this error.

In order to fix these failures, this commit changes
signal_pending(current) to fatal_signal_pending(current) in
mm_take_all_locks, so that it is interrupted only if the signal is
actually killing the process.

For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
probing the OpenCL device fails (strace shows that open("/dev/kfd")
failed with -EINTR). Sometimes we get the message "amdgpu:
init_user_pages: Failed to register MMU notifier: -4" in the syslog.

The bug can be reproduced with the following program.

To run this program, you need AMD graphics card and the package
"rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
installed, because it redirects the default OpenCL implementation to
itself.

include <stdio.h>
include <stdlib.h>
include <unistd.h>
include <string.h>
include <signal.h>
include <sys/time.h>

define CL_TARGET_OPENCL_VERSION	300
include <CL/opencl.h>

static void fn(void)
{
	while (1) {
		int32_t err;
		cl_device_id device;
		err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
		if (err != CL_SUCCESS) {
			fprintf(stderr, "clGetDeviceIDs failed: %d\n", err);
			exit(1);
		}
		write(2, "-", 1);
	}
}

static void alrm(int sig)
{
	write(2, ".", 1);
}

int main(void)
{
	struct itimerval it;
	struct sigaction sa;
	memset(&sa, 0, sizeof sa);
	sa.sa_handler = alrm;
	sa.sa_flags = SA_RESTART;
	sigaction(SIGALRM, &sa, NULL);
	it.it_interval.tv_sec = 0;
	it.it_interval.tv_usec = 50;
	it.it_value.tv_sec = 0;
	it.it_value.tv_usec = 50;
	setitimer(ITIMER_REAL, &it, NULL);
	fn();
	return 1;
}

I'm submitting this patch for the stable kernels, because this bug may
cause random failures in any code that calls mm_take_all_locks.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-November/133141.html
Link: https://yhbt.net/lore/linux-mm/6f16b618-26fc-3031-abe8-65c2090262e7@redhat.com/T/#u
Cc: stable@vger.kernel.org
Fixes: 7906d00cd1f6 ("mmu-notifiers: add mm_take_all_locks() operation")

---
 mm/vma.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: mm/mm/vma.c
===================================================================
--- mm.orig/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
+++ mm/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
@@ -2166,14 +2166,14 @@ int mm_take_all_locks(struct mm_struct *
 	 * is reached.
 	 */
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (fatal_signal_pending(current))
 			goto out_unlock;
 		vma_start_write(vma);
 	}
 
 	vma_iter_init(&vmi, mm, 0);
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (fatal_signal_pending(current))
 			goto out_unlock;
 		if (vma->vm_file && vma->vm_file->f_mapping &&
 				is_vm_hugetlb_page(vma))
@@ -2182,7 +2182,7 @@ int mm_take_all_locks(struct mm_struct *
 
 	vma_iter_init(&vmi, mm, 0);
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (fatal_signal_pending(current))
 			goto out_unlock;
 		if (vma->vm_file && vma->vm_file->f_mapping &&
 				!is_vm_hugetlb_page(vma))
@@ -2191,7 +2191,7 @@ int mm_take_all_locks(struct mm_struct *
 
 	vma_iter_init(&vmi, mm, 0);
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (fatal_signal_pending(current))
 			goto out_unlock;
 		if (vma->anon_vma)
 			list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-04 21:17 [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal Mikulas Patocka
@ 2026-01-05 10:42 ` Vlastimil Babka
  2026-01-05 12:15   ` Lorenzo Stoakes
  2026-01-05 18:15 ` Liam R. Howlett
  1 sibling, 1 reply; 12+ messages in thread
From: Vlastimil Babka @ 2026-01-05 10:42 UTC (permalink / raw)
  To: Mikulas Patocka, Lorenzo Stoakes
  Cc: Alex Deucher, Christian König, Andrew Morton,
	David Hildenbrand, amd-gfx, linux-mm, Liam R. Howlett, Jann Horn,
	Pedro Falcato, Matthew Wilcox

On 1/4/26 22:17, Mikulas Patocka wrote:
> If a process sets up a timer that periodically sends a signal in short
> intervals and if it executes some kernel code that calls
> mm_take_all_locks, we get random -EINTR failures.
> 
> The function mm_take_all_locks fails with -EINTR if there is pending
> signal. The -EINTR is propagated up the call stack to userspace and
> userspace fails if it gets this error.
> 
> In order to fix these failures, this commit changes
> signal_pending(current) to fatal_signal_pending(current) in
> mm_take_all_locks, so that it is interrupted only if the signal is
> actually killing the process.
> 
> For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> probing the OpenCL device fails (strace shows that open("/dev/kfd")
> failed with -EINTR). Sometimes we get the message "amdgpu:
> init_user_pages: Failed to register MMU notifier: -4" in the syslog.
> 
> The bug can be reproduced with the following program.
> 
> To run this program, you need AMD graphics card and the package
> "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> installed, because it redirects the default OpenCL implementation to
> itself.
> 
> include <stdio.h>
> include <stdlib.h>
> include <unistd.h>
> include <string.h>
> include <signal.h>
> include <sys/time.h>
> 
> define CL_TARGET_OPENCL_VERSION	300
> include <CL/opencl.h>
> 
> static void fn(void)
> {
> 	while (1) {
> 		int32_t err;
> 		cl_device_id device;
> 		err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
> 		if (err != CL_SUCCESS) {
> 			fprintf(stderr, "clGetDeviceIDs failed: %d\n", err);
> 			exit(1);
> 		}
> 		write(2, "-", 1);
> 	}
> }
> 
> static void alrm(int sig)
> {
> 	write(2, ".", 1);
> }
> 
> int main(void)
> {
> 	struct itimerval it;
> 	struct sigaction sa;
> 	memset(&sa, 0, sizeof sa);
> 	sa.sa_handler = alrm;
> 	sa.sa_flags = SA_RESTART;
> 	sigaction(SIGALRM, &sa, NULL);
> 	it.it_interval.tv_sec = 0;
> 	it.it_interval.tv_usec = 50;
> 	it.it_value.tv_sec = 0;
> 	it.it_value.tv_usec = 50;
> 	setitimer(ITIMER_REAL, &it, NULL);
> 	fn();
> 	return 1;
> }
> 
> I'm submitting this patch for the stable kernels, because this bug may
> cause random failures in any code that calls mm_take_all_locks.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-November/133141.html
> Link: https://yhbt.net/lore/linux-mm/6f16b618-26fc-3031-abe8-65c2090262e7@redhat.com/T/#u
> Cc: stable@vger.kernel.org
> Fixes: 7906d00cd1f6 ("mmu-notifiers: add mm_take_all_locks() operation")

Acked-by: Vlastimil Babka <vbabka@suse.cz>

This makes sense to me as a backportable bugfix. But I wonder if going
forward we should rather make all that locking killable instead of the
hopeful checks between individual lock attempts.

> 
> ---
>  mm/vma.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> Index: mm/mm/vma.c
> ===================================================================
> --- mm.orig/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
> +++ mm/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
> @@ -2166,14 +2166,14 @@ int mm_take_all_locks(struct mm_struct *
>  	 * is reached.
>  	 */
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		vma_start_write(vma);

E.g. here I think we already added a killable variant recently?

>  	}
>  
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->vm_file && vma->vm_file->f_mapping &&
>  				is_vm_hugetlb_page(vma))
> @@ -2182,7 +2182,7 @@ int mm_take_all_locks(struct mm_struct *
>  
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->vm_file && vma->vm_file->f_mapping &&
>  				!is_vm_hugetlb_page(vma))
> @@ -2191,7 +2191,7 @@ int mm_take_all_locks(struct mm_struct *
>  
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->anon_vma)
>  			list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-05 10:42 ` Vlastimil Babka
@ 2026-01-05 12:15   ` Lorenzo Stoakes
  0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2026-01-05 12:15 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Mikulas Patocka, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm,
	Liam R. Howlett, Jann Horn, Pedro Falcato, Matthew Wilcox

On Mon, Jan 05, 2026 at 11:42:26AM +0100, Vlastimil Babka wrote:
> On 1/4/26 22:17, Mikulas Patocka wrote:
> > If a process sets up a timer that periodically sends a signal in short
> > intervals and if it executes some kernel code that calls
> > mm_take_all_locks, we get random -EINTR failures.
> >
> > The function mm_take_all_locks fails with -EINTR if there is pending
> > signal. The -EINTR is propagated up the call stack to userspace and
> > userspace fails if it gets this error.
> >
> > In order to fix these failures, this commit changes
> > signal_pending(current) to fatal_signal_pending(current) in
> > mm_take_all_locks, so that it is interrupted only if the signal is
> > actually killing the process.
> >
> > For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> > probing the OpenCL device fails (strace shows that open("/dev/kfd")
> > failed with -EINTR). Sometimes we get the message "amdgpu:
> > init_user_pages: Failed to register MMU notifier: -4" in the syslog.
> >
> > The bug can be reproduced with the following program.
> >
> > To run this program, you need AMD graphics card and the package
> > "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> > installed, because it redirects the default OpenCL implementation to
> > itself.
> >
> > include <stdio.h>
> > include <stdlib.h>
> > include <unistd.h>
> > include <string.h>
> > include <signal.h>
> > include <sys/time.h>
> >
> > define CL_TARGET_OPENCL_VERSION	300
> > include <CL/opencl.h>
> >
> > static void fn(void)
> > {
> > 	while (1) {
> > 		int32_t err;
> > 		cl_device_id device;
> > 		err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
> > 		if (err != CL_SUCCESS) {
> > 			fprintf(stderr, "clGetDeviceIDs failed: %d\n", err);
> > 			exit(1);
> > 		}
> > 		write(2, "-", 1);
> > 	}
> > }
> >
> > static void alrm(int sig)
> > {
> > 	write(2, ".", 1);
> > }
> >
> > int main(void)
> > {
> > 	struct itimerval it;
> > 	struct sigaction sa;
> > 	memset(&sa, 0, sizeof sa);
> > 	sa.sa_handler = alrm;
> > 	sa.sa_flags = SA_RESTART;
> > 	sigaction(SIGALRM, &sa, NULL);
> > 	it.it_interval.tv_sec = 0;
> > 	it.it_interval.tv_usec = 50;
> > 	it.it_value.tv_sec = 0;
> > 	it.it_value.tv_usec = 50;
> > 	setitimer(ITIMER_REAL, &it, NULL);
> > 	fn();
> > 	return 1;
> > }
> >
> > I'm submitting this patch for the stable kernels, because this bug may
> > cause random failures in any code that calls mm_take_all_locks.
> >
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Link: https://lists.freedesktop.org/archives/amd-gfx/2025-November/133141.html
> > Link: https://yhbt.net/lore/linux-mm/6f16b618-26fc-3031-abe8-65c2090262e7@redhat.com/T/#u
> > Cc: stable@vger.kernel.org
> > Fixes: 7906d00cd1f6 ("mmu-notifiers: add mm_take_all_locks() operation")
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Also feel free to add:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

On assumption that nobody is explicitly relying on bizarre-o 'interrupt this if
_any_ signal arises'. But since it's making _actual workloads_ break I don't see
how this can be wrong.

>
> This makes sense to me as a backportable bugfix. But I wonder if going
> forward we should rather make all that locking killable instead of the
> hopeful checks between individual lock attempts.

Agreed. But anything like that should be a follow-up, let's get this
backported first.

>
> >
> > ---
> >  mm/vma.c |    8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > Index: mm/mm/vma.c
> > ===================================================================
> > --- mm.orig/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
> > +++ mm/mm/vma.c	2026-01-04 21:19:13.000000000 +0100
> > @@ -2166,14 +2166,14 @@ int mm_take_all_locks(struct mm_struct *
> >  	 * is reached.
> >  	 */
> >  	for_each_vma(vmi, vma) {
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> >  			goto out_unlock;
> >  		vma_start_write(vma);
>
> E.g. here I think we already added a killable variant recently?

Definitely unbackportable ;)

>
> >  	}
> >
> >  	vma_iter_init(&vmi, mm, 0);
> >  	for_each_vma(vmi, vma) {
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> >  			goto out_unlock;
> >  		if (vma->vm_file && vma->vm_file->f_mapping &&
> >  				is_vm_hugetlb_page(vma))
> > @@ -2182,7 +2182,7 @@ int mm_take_all_locks(struct mm_struct *
> >
> >  	vma_iter_init(&vmi, mm, 0);
> >  	for_each_vma(vmi, vma) {
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> >  			goto out_unlock;
> >  		if (vma->vm_file && vma->vm_file->f_mapping &&
> >  				!is_vm_hugetlb_page(vma))
> > @@ -2191,7 +2191,7 @@ int mm_take_all_locks(struct mm_struct *
> >
> >  	vma_iter_init(&vmi, mm, 0);
> >  	for_each_vma(vmi, vma) {
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> >  			goto out_unlock;
> >  		if (vma->anon_vma)
> >  			list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
> >
>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-04 21:17 [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal Mikulas Patocka
  2026-01-05 10:42 ` Vlastimil Babka
@ 2026-01-05 18:15 ` Liam R. Howlett
  2026-01-05 20:08   ` Mikulas Patocka
  2026-01-06 11:36   ` Michel Dänzer
  1 sibling, 2 replies; 12+ messages in thread
From: Liam R. Howlett @ 2026-01-05 18:15 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Lorenzo Stoakes, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm,
	Vlastimil Babka, Jann Horn, Pedro Falcato

* Mikulas Patocka <mpatocka@redhat.com> [260104 16:17]:
> If a process sets up a timer that periodically sends a signal in short
> intervals and if it executes some kernel code that calls
> mm_take_all_locks, we get random -EINTR failures.
> 
> The function mm_take_all_locks fails with -EINTR if there is pending
> signal. The -EINTR is propagated up the call stack to userspace and
> userspace fails if it gets this error.
> 
> In order to fix these failures, this commit changes
> signal_pending(current) to fatal_signal_pending(current) in
> mm_take_all_locks, so that it is interrupted only if the signal is
> actually killing the process.

I may be missing context because I didn't get 1/3 of this patch set,
nor can I find it in my ML searching.  Nor did I get the cover letter,
or find it.  Is this series threaded?

What you are doing is changing a really horrible loop across all VMAs,
that happens 4 times, to a less interruptible method.

I'm not sure I'm okay with this.  Everyone else does seem fine with it,
because userspace basically never checks error codes for a retry (or
really anything, and sometimes not even for an error at all).

But this could potentially have larger consequences for those
applications that register signal handlers, couldn't it?  That is, they
expect to get a return based on some existing code but now it won't
return and the application is forced to wait for all locks to be taken
regardless of how long it takes - or to force kill the application?

We regularly get people requesting the default number of vmas be
increased.  This means that processes that approach max_map_count will
wait until 4 loops through the vmas before they can be interrupted, or
they have to kill the process.

> 
> For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> probing the OpenCL device fails (strace shows that open("/dev/kfd")
> failed with -EINTR). Sometimes we get the message "amdgpu:
> init_user_pages: Failed to register MMU notifier: -4" in the syslog.

If you only get the error message sometimes, does that mean there is
another signal check that isn't covered by this change - or another call
path?

> 
> The bug can be reproduced with the following program.
> 
> To run this program, you need AMD graphics card and the package
> "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> installed, because it redirects the default OpenCL implementation to
> itself.

I'm not saying it's wrong to change the signal handling, but this is
very much working around a bug in userspace constantly hammering a task
with signals and then is surprised there is a response that the kernel
was interrupted.

This seems to imply that all signal handling should only happen on fatal
signals?

...
> 
> I'm submitting this patch for the stable kernels, because this bug may
> cause random failures in any code that calls mm_take_all_locks.

They aren't random failures, they are a response to a signal sent to the
process that may be taking a very long time to do something.

I really don't see how continuously sending signals at a short interval
interrupting system calls can be considered random failures, especially
when the return is -EINTR which literally means "Interrupted system
call".  How else would you interrupt a system call, if not a signal?

I feel like we are making the code less versatile to work around the
fact that userspace didn't realise that system calls could be
interrupted without a fatal signal.

And from that view, I consider this a functional change and not a bug
fix.

Thanks,
Liam

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-05 18:15 ` Liam R. Howlett
@ 2026-01-05 20:08   ` Mikulas Patocka
  2026-01-06 17:40     ` Liam R. Howlett
  2026-01-06 11:36   ` Michel Dänzer
  1 sibling, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2026-01-05 20:08 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: Lorenzo Stoakes, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm,
	Vlastimil Babka, Jann Horn, Pedro Falcato



On Mon, 5 Jan 2026, Liam R. Howlett wrote:

> I may be missing context because I didn't get 1/3 of this patch set,
> nor can I find it in my ML searching.  Nor did I get the cover letter,
> or find it.  Is this series threaded?

You can ignore the patch 1/3, it just changes memory management test 
suite.

> What you are doing is changing a really horrible loop across all VMAs,
> that happens 4 times, to a less interruptible method.
> 
> I'm not sure I'm okay with this.  Everyone else does seem fine with it,
> because userspace basically never checks error codes for a retry (or
> really anything, and sometimes not even for an error at all).

Everyone else does seem fine because they don't use periodic signals :-)

OpenCL is not the first thing that got broken by periodic signals. In the 
past, I found bugs in Linux/alpha, Linux/hppa, Cygwin, Intel Software 
Developer's Emulator regarding periodic signals. fork() was also buggy and 
it was fixed by c3ad2c3b02e953ead2b8d52a0c9e70312930c3d0.

> But this could potentially have larger consequences for those
> applications that register signal handlers, couldn't it?  That is, they
> expect to get a return based on some existing code but now it won't
> return and the application is forced to wait for all locks to be taken
> regardless of how long it takes - or to force kill the application?

Do you have any userspace application that expects open("/dev/kfd") to be 
interrupted with -EINTR and breaks when it isn't?

> We regularly get people requesting the default number of vmas be
> increased.  This means that processes that approach max_map_count will
> wait until 4 loops through the vmas before they can be interrupted, or
> they have to kill the process.
> 
> > 
> > For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> > probing the OpenCL device fails (strace shows that open("/dev/kfd")
> > failed with -EINTR). Sometimes we get the message "amdgpu:
> > init_user_pages: Failed to register MMU notifier: -4" in the syslog.
> 
> If you only get the error message sometimes, does that mean there is
> another signal check that isn't covered by this change - or another call
> path?

This call path is also triggered by -EINTR from mm_take_all_locks: 
"init_user_pages -> amdgpu_hmm_register -> mmu_interval_notifier_insert -> 
mmu_notifier_register -> __mmu_notifier_register -> mm_take_all_locks -> 
return -EINTR". I am not expert in the GPU code, so I don't know how much 
serious it is.

> > The bug can be reproduced with the following program.
> > 
> > To run this program, you need AMD graphics card and the package
> > "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> > installed, because it redirects the default OpenCL implementation to
> > itself.
> 
> I'm not saying it's wrong to change the signal handling, but this is
> very much working around a bug in userspace constantly hammering a task
> with signals and then is surprised there is a response that the kernel
> was interrupted.
> 
> This seems to imply that all signal handling should only happen on fatal
> signals?

No - the kernel should do what applications expect. open is (according to 
the man page) supposed to be interrupted when opening slow devices (for 
example fifo). I'm wondering whether /dev/kfd should be considered a slow 
device or not.

> ...
> > 
> > I'm submitting this patch for the stable kernels, because this bug may
> > cause random failures in any code that calls mm_take_all_locks.
> 
> They aren't random failures, they are a response to a signal sent to the
> process that may be taking a very long time to do something.
> 
> I really don't see how continuously sending signals at a short interval
> interrupting system calls can be considered random failures, especially
> when the return is -EINTR which literally means "Interrupted system
> call".  How else would you interrupt a system call, if not a signal?

The AMDGPU OpenCL implementation attempts to open /dev/kfd and if it gets 
-EINTR, it behaves as if OpenCL were unavailable - it won't report itself 
in clGetPlatformIDs and it will make clGetDeviceIDs fail.

So, we are dealing with random failures - any single signal received at 
wrong time can make OpenCL fail.

Even if I disabled the periodic timer, the failure could be triggered by 
other signals, for example SIGWINCH when the user resizes the terminal, or 
SIGCHLD when a subprocess exits.

> I feel like we are making the code less versatile to work around the
> fact that userspace didn't realise that system calls could be
> interrupted without a fatal signal.
> 
> And from that view, I consider this a functional change and not a bug
> fix.
> 
> Thanks,
> Liam

In practice, I use 10ms timer and I get occasional OpenCL failures. In the 
test example, I used 50us timer, so that it is reproduced reliably - but 
decreasing the timer frequency doesn't fix the failure, it just makes it 
happen less often.

Mikulas



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-05 18:15 ` Liam R. Howlett
  2026-01-05 20:08   ` Mikulas Patocka
@ 2026-01-06 11:36   ` Michel Dänzer
  2026-01-06 12:52     ` Mikulas Patocka
  2026-01-06 14:57     ` Vlastimil Babka
  1 sibling, 2 replies; 12+ messages in thread
From: Michel Dänzer @ 2026-01-06 11:36 UTC (permalink / raw)
  To: Liam R. Howlett, Mikulas Patocka, Lorenzo Stoakes, Alex Deucher,
	Christian König, Andrew Morton, David Hildenbrand, amd-gfx,
	linux-mm, Vlastimil Babka, Jann Horn, Pedro Falcato

On 1/5/26 19:15, Liam R. Howlett wrote:
> * Mikulas Patocka <mpatocka@redhat.com> [260104 16:17]:
>
> I'm not saying it's wrong to change the signal handling, but this is
> very much working around a bug in userspace constantly hammering a task
> with signals and then is surprised there is a response that the kernel
> was interrupted.

I'd go further than that. If user space fails to retry the system call in response to -EINTR, that's a user-space bug, period. It can happen anytime for any number of other reasons. (That most system calls happen to get away without it most of the time doesn't make it not a bug)

(This is assuming the system call can be safely retried, otherwise returning -EINTR to user space for non-fatal signals would be a kernel bug)


>> I'm submitting this patch for the stable kernels, because this bug may
>> cause random failures in any code that calls mm_take_all_locks.
> 
> They aren't random failures, they are a response to a signal sent to the
> process that may be taking a very long time to do something.
> 
> I really don't see how continuously sending signals at a short interval
> interrupting system calls can be considered random failures, especially
> when the return is -EINTR which literally means "Interrupted system
> call".  How else would you interrupt a system call, if not a signal?
> 
> I feel like we are making the code less versatile to work around the
> fact that userspace didn't realise that system calls could be
> interrupted without a fatal signal.
> 
> And from that view, I consider this a functional change and not a bug
> fix.

I agree.


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-06 11:36   ` Michel Dänzer
@ 2026-01-06 12:52     ` Mikulas Patocka
  2026-01-06 15:03       ` David Hildenbrand (Red Hat)
  2026-01-06 14:57     ` Vlastimil Babka
  1 sibling, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2026-01-06 12:52 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Liam R. Howlett, Lorenzo Stoakes, Alex Deucher,
	Christian König, Andrew Morton, David Hildenbrand, amd-gfx,
	linux-mm, Vlastimil Babka, Jann Horn, Pedro Falcato

[-- Attachment #1: Type: text/plain, Size: 1490 bytes --]



On Tue, 6 Jan 2026, Michel Dänzer wrote:

> On 1/5/26 19:15, Liam R. Howlett wrote:
> > * Mikulas Patocka <mpatocka@redhat.com> [260104 16:17]:
> >
> > I'm not saying it's wrong to change the signal handling, but this is
> > very much working around a bug in userspace constantly hammering a task
> > with signals and then is surprised there is a response that the kernel
> > was interrupted.
> 
> I'd go further than that. If user space fails to retry the system call 
> in response to -EINTR, that's a user-space bug, period. It can happen 
> anytime for any number of other reasons. (That most system calls happen 
> to get away without it most of the time doesn't make it not a bug)

So, I tried this - just for fun - and the machine doesn't even boot. I get 
a lot of errors about inability to open particular files on the console.

Userspace is buggy, according to your definition, regardless of whether 
you like it or not.

Mikulas

---
 fs/open.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6/fs/open.c
===================================================================
--- linux-2.6.orig/fs/open.c	2025-12-31 20:10:31.000000000 +0100
+++ linux-2.6/fs/open.c	2026-01-06 13:28:01.000000000 +0100
@@ -1419,6 +1419,9 @@ static int do_sys_openat2(int dfd, const
 	struct filename *tmp __free(putname) = NULL;
 	int err;
 
+	if (current->pid != 1 && !(get_random_u8() & 0x1))
+		return -EINTR;
+
 	err = build_open_flags(how, &op);
 	if (unlikely(err))
 		return err;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-06 11:36   ` Michel Dänzer
  2026-01-06 12:52     ` Mikulas Patocka
@ 2026-01-06 14:57     ` Vlastimil Babka
  1 sibling, 0 replies; 12+ messages in thread
From: Vlastimil Babka @ 2026-01-06 14:57 UTC (permalink / raw)
  To: Michel Dänzer, Liam R. Howlett, Mikulas Patocka,
	Lorenzo Stoakes, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm, Jann Horn,
	Pedro Falcato

On 1/6/26 12:36, Michel Dänzer wrote:
> On 1/5/26 19:15, Liam R. Howlett wrote:
>> * Mikulas Patocka <mpatocka@redhat.com> [260104 16:17]:
>>
>> I'm not saying it's wrong to change the signal handling, but this is
>> very much working around a bug in userspace constantly hammering a task
>> with signals and then is surprised there is a response that the kernel
>> was interrupted.
> 
> I'd go further than that. If user space fails to retry the system call
> in response to -EINTR, that's a user-space bug, period. It can happen
> anytime for any number of other reasons. (That most system calls happen
> to get away without it most of the time doesn't make it not a bug)
> 
> (This is assuming the system call can be safely retried, otherwise
> returning -EINTR to user space for non-fatal signals would be a kernel
> bug)

This is true and the userspace should be fixed. But the kernel still tries
hard to support even broken userspace.
Even though commit 7906d00cd1f6 is very old now, from 2.6.27, we could say
that technically it violated the "do not break userspace" rule.
It's not always easy to undo such breakages, especially after many years,
and if some other userspace meanwhile starts depending on the new behavior.
But I doubt that's the case here and anything will have issues with
non-fatal signals being delayed.

>>> I'm submitting this patch for the stable kernels, because this bug may
>>> cause random failures in any code that calls mm_take_all_locks.
>> 
>> They aren't random failures, they are a response to a signal sent to the
>> process that may be taking a very long time to do something.
>> 
>> I really don't see how continuously sending signals at a short interval
>> interrupting system calls can be considered random failures, especially
>> when the return is -EINTR which literally means "Interrupted system
>> call".  How else would you interrupt a system call, if not a signal?
>> 
>> I feel like we are making the code less versatile to work around the
>> fact that userspace didn't realise that system calls could be
>> interrupted without a fatal signal.
>> 
>> And from that view, I consider this a functional change and not a bug
>> fix.
> 
> I agree.
> 
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-06 12:52     ` Mikulas Patocka
@ 2026-01-06 15:03       ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2026-01-06 15:03 UTC (permalink / raw)
  To: Mikulas Patocka, Michel Dänzer
  Cc: Liam R. Howlett, Lorenzo Stoakes, Alex Deucher,
	Christian König, Andrew Morton, amd-gfx, linux-mm,
	Vlastimil Babka, Jann Horn, Pedro Falcato

On 1/6/26 13:52, Mikulas Patocka wrote:
> 
> 
> On Tue, 6 Jan 2026, Michel Dänzer wrote:
> 
>> On 1/5/26 19:15, Liam R. Howlett wrote:
>>> * Mikulas Patocka <mpatocka@redhat.com> [260104 16:17]:
>>>
>>> I'm not saying it's wrong to change the signal handling, but this is
>>> very much working around a bug in userspace constantly hammering a task
>>> with signals and then is surprised there is a response that the kernel
>>> was interrupted.
>>
>> I'd go further than that. If user space fails to retry the system call
>> in response to -EINTR, that's a user-space bug, period. It can happen
>> anytime for any number of other reasons. (That most system calls happen
>> to get away without it most of the time doesn't make it not a bug)
> 
> So, I tried this - just for fun - and the machine doesn't even boot. I get
> a lot of errors about inability to open particular files on the console.
> 
> Userspace is buggy, according to your definition, regardless of whether
> you like it or not.
> 
> Mikulas
> 
> ---
>   fs/open.c |    3 +++
>   1 file changed, 3 insertions(+)
> 
> Index: linux-2.6/fs/open.c
> ===================================================================
> --- linux-2.6.orig/fs/open.c	2025-12-31 20:10:31.000000000 +0100
> +++ linux-2.6/fs/open.c	2026-01-06 13:28:01.000000000 +0100
> @@ -1419,6 +1419,9 @@ static int do_sys_openat2(int dfd, const
>   	struct filename *tmp __free(putname) = NULL;
>   	int err;
>   
> +	if (current->pid != 1 && !(get_random_u8() & 0x1))
> +		return -EINTR;

Reading the man [1] page user space is only to expect EINTR in case it is
prepared to deal with signals (install signal handlers), no?

There are some exception documented:

        On Linux, even in the absence of signal handlers, certain blocking
        interfaces can fail with the error EINTR after the process is
        stopped by one of the stop signals and then resumed via SIGCONT.
        This behavior is not sanctioned by POSIX.1, and doesn't occur on
        other systems.

        The Linux interfaces that display this behavior are:

        •  "Input" socket interfaces, when a timeout (SO_RCVTIMEO) has
           been set on the socket using setsockopt(2): accept(2), recv(2),
           recvfrom(2), recvmmsg(2) (also with a non-NULL timeout
           argument), and recvmsg(2).

        •  "Output" socket interfaces, when a timeout (SO_RCVTIMEO) has
           been set on the socket using setsockopt(2): connect(2),
           send(2), sendto(2), and sendmsg(2), if a send timeout
           (SO_SNDTIMEO) has been set.

        •  epoll_wait(2), epoll_pwait(2).

        •  semop(2), semtimedop(2).

        •  sigtimedwait(2), sigwaitinfo(2).

        •  Linux 3.7 and earlier: read(2) from an inotify(7) file
           descriptor

        •  Linux 2.6.21 and earlier: futex(2) FUTEX_WAIT,
           sem_timedwait(3), sem_wait(3).

        •  Linux 2.6.8 and earlier: msgrcv(2), msgsnd(2).

        •  Linux 2.4 and earlier: nanosleep(2).


So I would expect that your test code hear breaks user space.


[1] https://man7.org/linux/man-pages/man7/signal.7.html

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-05 20:08   ` Mikulas Patocka
@ 2026-01-06 17:40     ` Liam R. Howlett
  2026-01-06 20:19       ` Mikulas Patocka
  0 siblings, 1 reply; 12+ messages in thread
From: Liam R. Howlett @ 2026-01-06 17:40 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Lorenzo Stoakes, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm,
	Vlastimil Babka, Jann Horn, Pedro Falcato

* Mikulas Patocka <mpatocka@redhat.com> [260105 15:08]:
> 
> 
> On Mon, 5 Jan 2026, Liam R. Howlett wrote:
> 
> > I may be missing context because I didn't get 1/3 of this patch set,
> > nor can I find it in my ML searching.  Nor did I get the cover letter,
> > or find it.  Is this series threaded?
> 
> You can ignore the patch 1/3, it just changes memory management test 
> suite.
> 
> > What you are doing is changing a really horrible loop across all VMAs,
> > that happens 4 times, to a less interruptible method.
> > 
> > I'm not sure I'm okay with this.  Everyone else does seem fine with it,
> > because userspace basically never checks error codes for a retry (or
> > really anything, and sometimes not even for an error at all).
> 
> Everyone else does seem fine because they don't use periodic signals :-)

I meant everyone else on this thread.

> 
> OpenCL is not the first thing that got broken by periodic signals. In the 
> past, I found bugs in Linux/alpha, Linux/hppa, Cygwin, Intel Software 
> Developer's Emulator regarding periodic signals. fork() was also buggy and 
> it was fixed by c3ad2c3b02e953ead2b8d52a0c9e70312930c3d0.
> 
> > But this could potentially have larger consequences for those
> > applications that register signal handlers, couldn't it?  That is, they
> > expect to get a return based on some existing code but now it won't
> > return and the application is forced to wait for all locks to be taken
> > regardless of how long it takes - or to force kill the application?
> 
> Do you have any userspace application that expects open("/dev/kfd") to be 
> interrupted with -EINTR and breaks when it isn't?

Your change doesn't just affect /dev/kfd though, it affects any call
path that leads to looping across all vmas four times in a row.

And since this could take a long time, I'm not sure it's a good idea to
stop responding to all non-fatal signal.

> 
> > We regularly get people requesting the default number of vmas be
> > increased.  This means that processes that approach max_map_count will
> > wait until 4 loops through the vmas before they can be interrupted, or
> > they have to kill the process.
> > 
> > > 
> > > For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> > > probing the OpenCL device fails (strace shows that open("/dev/kfd")
> > > failed with -EINTR). Sometimes we get the message "amdgpu:
> > > init_user_pages: Failed to register MMU notifier: -4" in the syslog.
> > 
> > If you only get the error message sometimes, does that mean there is
> > another signal check that isn't covered by this change - or another call
> > path?
> 
> This call path is also triggered by -EINTR from mm_take_all_locks: 
> "init_user_pages -> amdgpu_hmm_register -> mmu_interval_notifier_insert -> 
> mmu_notifier_register -> __mmu_notifier_register -> mm_take_all_locks -> 
> return -EINTR". I am not expert in the GPU code, so I don't know how much 
> serious it is.

Okay, so the other call paths also end up getting the -EINTR from this
function?  Can you please add that detail to the commit message?

This means that -EINTR can no longer be returned from open(), right?
Otherwise you are just reducing a race condition between open() and a
signal entering from your timer.

> 
> > > The bug can be reproduced with the following program.
> > > 
> > > To run this program, you need AMD graphics card and the package
> > > "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> > > installed, because it redirects the default OpenCL implementation to
> > > itself.
> > 
> > I'm not saying it's wrong to change the signal handling, but this is
> > very much working around a bug in userspace constantly hammering a task
> > with signals and then is surprised there is a response that the kernel
> > was interrupted.
> > 
> > This seems to imply that all signal handling should only happen on fatal
> > signals?
> 
> No - the kernel should do what applications expect. open is (according to 
> the man page) supposed to be interrupted when opening slow devices (for 
> example fifo). I'm wondering whether /dev/kfd should be considered a slow 
> device or not.

I disagree.  The kernel should do what was agreed to in an API,
otherwise our whole deck of cards falls over.  We may not even have the
resources to do what the applications expect.  One could also code an
application that expects to violate the security of the system.

This is documented behaviour from the open() call, so it certainly
sounds like an issue if userspace doesn't expect this return.

Any other -EINTR system call will also cause you problems since you
continuously send signals to your process, so we'll have to change them
all for this to work?

> 
> > ...
> > > 
> > > I'm submitting this patch for the stable kernels, because this bug may
> > > cause random failures in any code that calls mm_take_all_locks.
> > 
> > They aren't random failures, they are a response to a signal sent to the
> > process that may be taking a very long time to do something.
> > 
> > I really don't see how continuously sending signals at a short interval
> > interrupting system calls can be considered random failures, especially
> > when the return is -EINTR which literally means "Interrupted system
> > call".  How else would you interrupt a system call, if not a signal?
> 
> The AMDGPU OpenCL implementation attempts to open /dev/kfd and if it gets 
> -EINTR, it behaves as if OpenCL were unavailable - it won't report itself 
> in clGetPlatformIDs and it will make clGetDeviceIDs fail.

This is the userspace ignoring what the error code means and just
aborting on any error.  This is a change in behaviour on the kernel side
to work around what they are doing.

It also sounds like it can be avoided by userspace not sending signals
during the open process, or even to retry at a higher level if a
recoverable error occurs.

Again, I'm not saying that this change isn't the right thing to do, but
it's not the kernels fault that userspace is aborting on -EINTR.  And
it's not the kernels fault that the userspace application is sending
signals during a system call that will respond to those signals.

This is changing the kernels behaviour to avoid having to deal with two
cascading failures in userspace: 1. The opencl not retrying on
interrupted open calls and 2. sending signals to a process in the middle
of a system call that will respond to signals.

I am saying that calling this a bug fix is wrong.

Even though Vlastimil has pointed out that this locking did not happen
prior to 7906d00cd1f6, the contract to userspace for the open() system
call could still return -EINTR and that change predates amdgpu's kfd in
2015.  So I don't consider the 'kernel broke userspace' argument in this
case.

> 
> So, we are dealing with random failures - any single signal received at 
> wrong time can make OpenCL fail.

I feel like we are working with two different definitions of random.  I
would consider a failure caused by a periodic signal to be an expected
and predictable event.

"random failures in any code that calls mm_take_all_locks" makes it
sound like we're going to hit some uninitialized variable that sometimes
isn't zero and exit, but really we're receiving a signal that may need
to do something within a reasonable amount of time.

> 
> Even if I disabled the periodic timer, the failure could be triggered by 
> other signals, for example SIGWINCH when the user resizes the terminal, or 
> SIGCHLD when a subprocess exits.

Those are also not random, they are expected signals caused by events.

I'm trying to say this git commit message is wrong and misleading.

> 
> > I feel like we are making the code less versatile to work around the
> > fact that userspace didn't realise that system calls could be
> > interrupted without a fatal signal.
> > 
> > And from that view, I consider this a functional change and not a bug
> > fix.
> > 
> > Thanks,
> > Liam
> 
> In practice, I use 10ms timer and I get occasional OpenCL failures. In the 
> test example, I used 50us timer, so that it is reproduced reliably - but 
> decreasing the timer frequency doesn't fix the failure, it just makes it 
> happen less often.

This does not fit my definition of random, bug, or failure.

Thanks,
Liam

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-06 17:40     ` Liam R. Howlett
@ 2026-01-06 20:19       ` Mikulas Patocka
  2026-01-06 21:56         ` Pedro Falcato
  0 siblings, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2026-01-06 20:19 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: Lorenzo Stoakes, Alex Deucher, Christian König,
	Andrew Morton, David Hildenbrand, amd-gfx, linux-mm,
	Vlastimil Babka, Jann Horn, Pedro Falcato

On Tue, 6 Jan 2026, Liam R. Howlett wrote:

> * Mikulas Patocka <mpatocka@redhat.com> [260105 15:08]:
> > 
> > > If you only get the error message sometimes, does that mean there is
> > > another signal check that isn't covered by this change - or another call
> > > path?
> > 
> > This call path is also triggered by -EINTR from mm_take_all_locks: 
> > "init_user_pages -> amdgpu_hmm_register -> mmu_interval_notifier_insert -> 
> > mmu_notifier_register -> __mmu_notifier_register -> mm_take_all_locks -> 
> > return -EINTR". I am not expert in the GPU code, so I don't know how much 
> > serious it is.
> 
> Okay, so the other call paths also end up getting the -EINTR from this
> function?  Can you please add that detail to the commit message?

Yes. I'd like to ask the GPU people to look at it and say how much damage 
this -EINTR could do. I don't know - I just saw the messages "Failed to 
register MMU notifier: -4" in the syslog.

> This means that -EINTR can no longer be returned from open(), right?
> Otherwise you are just reducing a race condition between open() and a
> signal entering from your timer.

EINTR can be returned from open() in cases when it was historically 
behaving this way - such as opening a fifo when there is no matching 
process having it open.

But I think that opening /dev/kfd doesn't fall into this category.

NFS has an "intr" flag that makes the filesystem syscalls interruptible by 
signals. It is off by default, because many programs don't expect EINTR 
when opening, reading or writing plain files on a filesystem.

> Any other -EINTR system call will also cause you problems since you
> continuously send signals to your process, so we'll have to change them
> all for this to work?

I use SA_RESTART for the signals. And I retry all the syscalls on EINTR 
just in case SA_RESTART didn't work. So, I don't experience random 
failures in my code due to the periodic signal.

But there is code that I have no control over - such as the OpenCL shared 
library.

> This is the userspace ignoring what the error code means and just
> aborting on any error.  This is a change in behaviour on the kernel side
> to work around what they are doing.
> 
> It also sounds like it can be avoided by userspace not sending signals
> during the open process, or even to

So far, I worked around this issue by blocking all signals around 
clGetPlatformIDs and clGetDeviceIDs - but this is a hack.

> retry at a higher level if a recoverable error occurs.

If clGetDeviceIDs fails and I call clGetDeviceIDs again, it doesn't even 
attempt to open /dev/kfd again and fails right away. So, I can't work 
around it by retrying it.

> > Even if I disabled the periodic timer, the failure could be triggered by 
> > other signals, for example SIGWINCH when the user resizes the terminal, or 
> > SIGCHLD when a subprocess exits.
> 
> Those are also not random, they are expected signals caused by events.

From the process's point of view, they are random - the process doesn't 
know when the user will drag the corner of the terminal window and resize 
it. If the process spawns a subprocess, it cannot predict when will the 
subprocess exit and SIGCHLD will be delivered.

If we don't change it, we end up with unreliable software stack that can 
fail during rare events, such as dragging the corner of the window.

> I'm trying to say this git commit message is wrong and misleading.

OK, so I'll try to rewrite the commit message and submit version 4 of the 
patch.

Mikulas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
  2026-01-06 20:19       ` Mikulas Patocka
@ 2026-01-06 21:56         ` Pedro Falcato
  0 siblings, 0 replies; 12+ messages in thread
From: Pedro Falcato @ 2026-01-06 21:56 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Liam R. Howlett, Lorenzo Stoakes, Alex Deucher,
	Christian König, Andrew Morton, David Hildenbrand, amd-gfx,
	linux-mm, Vlastimil Babka, Jann Horn

On Tue, Jan 06, 2026 at 09:19:59PM +0100, Mikulas Patocka wrote:
> 
> 
> On Tue, 6 Jan 2026, Liam R. Howlett wrote:
> 
> > * Mikulas Patocka <mpatocka@redhat.com> [260105 15:08]:
> > > 
> > > > If you only get the error message sometimes, does that mean there is
> > > > another signal check that isn't covered by this change - or another call
> > > > path?
> > > 
> > > This call path is also triggered by -EINTR from mm_take_all_locks: 
> > > "init_user_pages -> amdgpu_hmm_register -> mmu_interval_notifier_insert -> 
> > > mmu_notifier_register -> __mmu_notifier_register -> mm_take_all_locks -> 
> > > return -EINTR". I am not expert in the GPU code, so I don't know how much 
> > > serious it is.
> > 
> > Okay, so the other call paths also end up getting the -EINTR from this
> > function?  Can you please add that detail to the commit message?
> 
> Yes. I'd like to ask the GPU people to look at it and say how much damage 
> this -EINTR could do. I don't know - I just saw the messages "Failed to 
> register MMU notifier: -4" in the syslog.
> 
> > This means that -EINTR can no longer be returned from open(), right?
> > Otherwise you are just reducing a race condition between open() and a
> > signal entering from your timer.
> 
> EINTR can be returned from open() in cases when it was historically 
> behaving this way - such as opening a fifo when there is no matching 
> process having it open.
> 
> But I think that opening /dev/kfd doesn't fall into this category.
>

Well, it's a device - opening can and often does have side-effects.
It's not too far-fetched to -EINTR here.

> NFS has an "intr" flag that makes the filesystem syscalls interruptible by 
> signals. It is off by default, because many programs don't expect EINTR 
> when opening, reading or writing plain files on a filesystem.
> 
> > Any other -EINTR system call will also cause you problems since you
> > continuously send signals to your process, so we'll have to change them
> > all for this to work?
> 
> I use SA_RESTART for the signals. And I retry all the syscalls on EINTR 
> just in case SA_RESTART didn't work. So, I don't experience random 
> failures in my code due to the periodic signal.
> 
> But there is code that I have no control over - such as the OpenCL shared 
> library.

Right. So I am wondering if just returning -ERESTARTSYS (whether in
mm_take_all_locks(), or in the AMD driver) would satisfy both parties.

Folks installing and using signals need to pay attention and set
SA_RESTART, but that's already best practice when dealing with third-party
code. open(2) should be transparently restartable.

WDYT?

-- 
Pedro


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-01-06 21:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-04 21:17 [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal Mikulas Patocka
2026-01-05 10:42 ` Vlastimil Babka
2026-01-05 12:15   ` Lorenzo Stoakes
2026-01-05 18:15 ` Liam R. Howlett
2026-01-05 20:08   ` Mikulas Patocka
2026-01-06 17:40     ` Liam R. Howlett
2026-01-06 20:19       ` Mikulas Patocka
2026-01-06 21:56         ` Pedro Falcato
2026-01-06 11:36   ` Michel Dänzer
2026-01-06 12:52     ` Mikulas Patocka
2026-01-06 15:03       ` David Hildenbrand (Red Hat)
2026-01-06 14:57     ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox