From: Vlastimil Babka <vbabka@suse.cz>
To: Mikulas Patocka <mpatocka@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"David Hildenbrand" <david@redhat.com>,
amd-gfx@lists.freedesktop.org, linux-mm@kvack.org,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Jann Horn" <jannh@google.com>,
"Pedro Falcato" <pfalcato@suse.de>,
"Matthew Wilcox" <willy@infradead.org>
Subject: Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
Date: Mon, 5 Jan 2026 11:42:26 +0100
Message-ID: <e435395f-90b3-4603-b305-8a52913cd0e5@suse.cz>
In-Reply-To: <b672e17b-461d-16ae-e7d3-45d3c1aab142@redhat.com>
On 1/4/26 22:17, Mikulas Patocka wrote:
> If a process sets up a timer that periodically sends a signal in short
> intervals and if it executes some kernel code that calls
> mm_take_all_locks, we get random -EINTR failures.
>
> The function mm_take_all_locks fails with -EINTR if there is pending
> signal. The -EINTR is propagated up the call stack to userspace and
> userspace fails if it gets this error.
>
> In order to fix these failures, this commit changes
> signal_pending(current) to fatal_signal_pending(current) in
> mm_take_all_locks, so that it is interrupted only if the signal is
> actually killing the process.
>
> For example, this bug happens when using OpenCL on AMDGPU. Sometimes,
> probing the OpenCL device fails (strace shows that open("/dev/kfd")
> failed with -EINTR). Sometimes we get the message "amdgpu:
> init_user_pages: Failed to register MMU notifier: -4" in the syslog.
>
> The bug can be reproduced with the following program.
>
> To run this program, you need AMD graphics card and the package
> "rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
> installed, because it redirects the default OpenCL implementation to
> itself.
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
> #include <signal.h>
> #include <sys/time.h>
>
> #define CL_TARGET_OPENCL_VERSION 300
> #include <CL/opencl.h>
>
> static void fn(void)
> {
> 	while (1) {
> 		int32_t err;
> 		cl_device_id device;
> 		err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
> 		if (err != CL_SUCCESS) {
> 			fprintf(stderr, "clGetDeviceIDs failed: %d\n", err);
> 			exit(1);
> 		}
> 		write(2, "-", 1);
> 	}
> }
>
> static void alrm(int sig)
> {
> 	write(2, ".", 1);
> }
>
> int main(void)
> {
> 	struct itimerval it;
> 	struct sigaction sa;
> 	memset(&sa, 0, sizeof sa);
> 	sa.sa_handler = alrm;
> 	sa.sa_flags = SA_RESTART;
> 	sigaction(SIGALRM, &sa, NULL);
> 	it.it_interval.tv_sec = 0;
> 	it.it_interval.tv_usec = 50;
> 	it.it_value.tv_sec = 0;
> 	it.it_value.tv_usec = 50;
> 	setitimer(ITIMER_REAL, &it, NULL);
> 	fn();
> 	return 1;
> }
>
> I'm submitting this patch for the stable kernels, because this bug may
> cause random failures in any code that calls mm_take_all_locks.
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-November/133141.html
> Link: https://yhbt.net/lore/linux-mm/6f16b618-26fc-3031-abe8-65c2090262e7@redhat.com/T/#u
> Cc: stable@vger.kernel.org
> Fixes: 7906d00cd1f6 ("mmu-notifiers: add mm_take_all_locks() operation")
Acked-by: Vlastimil Babka <vbabka@suse.cz>

This makes sense to me as a backportable bugfix. But I wonder if, going
forward, we should make all of that locking killable rather than rely on
opportunistic signal checks between individual lock attempts.
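
To make the behavioural difference concrete, here is a minimal userspace
model of the check the patch changes. It is only a sketch: model_task,
model_take_all_locks and MODEL_EINTR are hypothetical names invented for
illustration, not kernel APIs, and the "locks" are just a counter. The
pending and fatal flags stand in for what signal_pending(current) and
fatal_signal_pending(current) would report.

```c
/* Userspace model only: none of these names exist in the kernel. */
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EINTR 4	/* stand-in for the kernel's EINTR value */

struct model_task {
	bool pending;	/* some signal is pending (e.g. a periodic SIGALRM) */
	bool fatal;	/* a fatal signal (e.g. SIGKILL) is pending */
};

/*
 * Walk nr_locks hypothetical locks, checking for signals before each one.
 * fatal_only == false: any pending signal aborts the walk (old behaviour).
 * fatal_only == true:  only a fatal signal aborts the walk (patched).
 * Returns 0 on success or -MODEL_EINTR on abort; *taken reports how many
 * locks were taken before returning.
 */
static int model_take_all_locks(const struct model_task *t, size_t nr_locks,
				bool fatal_only, size_t *taken)
{
	*taken = 0;
	for (size_t i = 0; i < nr_locks; i++) {
		bool stop = fatal_only ? t->fatal : (t->pending || t->fatal);

		if (stop)
			return -MODEL_EINTR;
		(*taken)++;
	}
	return 0;
}
```

In this model, a task with only a non-fatal timer signal pending gets
-MODEL_EINTR from the any-signal variant before taking a single lock,
while the fatal-only variant takes every lock; a fatal signal aborts both.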
>
> ---
> mm/vma.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> Index: mm/mm/vma.c
> ===================================================================
> --- mm.orig/mm/vma.c 2026-01-04 21:19:13.000000000 +0100
> +++ mm/mm/vma.c 2026-01-04 21:19:13.000000000 +0100
> @@ -2166,14 +2166,14 @@ int mm_take_all_locks(struct mm_struct *
>  	 * is reached.
>  	 */
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		vma_start_write(vma);
E.g. here I think we already added a killable variant recently?
>  	}
>
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->vm_file && vma->vm_file->f_mapping &&
>  				is_vm_hugetlb_page(vma))
> @@ -2182,7 +2182,7 @@ int mm_take_all_locks(struct mm_struct *
>
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->vm_file && vma->vm_file->f_mapping &&
>  				!is_vm_hugetlb_page(vma))
> @@ -2191,7 +2191,7 @@ int mm_take_all_locks(struct mm_struct *
>
>  	vma_iter_init(&vmi, mm, 0);
>  	for_each_vma(vmi, vma) {
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))
>  			goto out_unlock;
>  		if (vma->anon_vma)
>  			list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
>