From: Pedro Falcato <pfalcato@suse.de>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"David Hildenbrand" <david@redhat.com>,
amd-gfx@lists.freedesktop.org, linux-mm@kvack.org,
"Vlastimil Babka" <vbabka@suse.cz>,
"Jann Horn" <jannh@google.com>
Subject: Re: [PATCH v3 2/3] mm: only interrupt taking all mm locks on fatal signal
Date: Tue, 6 Jan 2026 21:56:17 +0000
Message-ID: <mfgqbtiqtl7cxzxhvu6ossi5umek2vpb2rag2bcqsof7ommvfz@uz6fqkc2jhik>
In-Reply-To: <6633f8ed-f432-f4c4-3fe2-8c14248cadab@redhat.com>

On Tue, Jan 06, 2026 at 09:19:59PM +0100, Mikulas Patocka wrote:
>
>
> On Tue, 6 Jan 2026, Liam R. Howlett wrote:
>
> > * Mikulas Patocka <mpatocka@redhat.com> [260105 15:08]:
> > >
> > > > If you only get the error message sometimes, does that mean there is
> > > > another signal check that isn't covered by this change - or another call
> > > > path?
> > >
> > > This call path is also triggered by -EINTR from mm_take_all_locks:
> > > "init_user_pages -> amdgpu_hmm_register -> mmu_interval_notifier_insert ->
> > > mmu_notifier_register -> __mmu_notifier_register -> mm_take_all_locks ->
> > > return -EINTR". I am not expert in the GPU code, so I don't know how much
> > > serious it is.
> >
> > Okay, so the other call paths also end up getting the -EINTR from this
> > function? Can you please add that detail to the commit message?
>
> Yes. I'd like to ask the GPU people to look at it and say how much damage
> this -EINTR could do. I don't know - I just saw the messages "Failed to
> register MMU notifier: -4" in the syslog.
>
> > This means that -EINTR can no longer be returned from open(), right?
> > Otherwise you are just reducing a race condition between open() and a
> > signal entering from your timer.
>
> EINTR can be returned from open() in cases when it was historically
> behaving this way - such as opening a fifo when there is no matching
> process having it open.
>
> But I think that opening /dev/kfd doesn't fall into this category.
>

Well, it's a device - opening can and often does have side effects.
It's not too far-fetched to return -EINTR here.

> NFS has an "intr" flag that makes the filesystem syscalls interruptible by
> signals. It is off by default, because many programs don't expect EINTR
> when opening, reading or writing plain files on a filesystem.
>
> > Any other -EINTR system call will also cause you problems since you
> > continuously send signals to your process, so we'll have to change them
> > all for this to work?
>
> I use SA_RESTART for the signals. And I retry all the syscalls on EINTR
> just in case SA_RESTART didn't work. So, I don't experience random
> failures in my code due to the periodic signal.
>
> But there is code that I have no control over - such as the OpenCL shared
> library.

Right. So I am wondering if just returning -ERESTARTSYS (whether in
mm_take_all_locks(), or in the AMD driver) would satisfy both parties.
Folks installing and using signals need to pay attention and set
SA_RESTART, but that's already best practice when dealing with third-party
code. open(2) should be transparently restartable.

WDYT?
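
For illustration, here is a minimal userspace sketch of the two habits
mentioned in this thread: installing the periodic signal's handler with
SA_RESTART, and retrying open(2) on EINTR in case the restart did not
happen. The names on_tick(), install_restartable_handler() and
open_retry() are hypothetical and not taken from the patch, the AMD
driver, or any library; this is only one way it could be written.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical no-op handler for a periodic signal (e.g. a timer tick). */
static void on_tick(int signo)
{
	(void)signo;
}

/*
 * Install on_tick() for signo with SA_RESTART set, so that system calls
 * the kernel marks as restartable are resumed after the handler returns
 * instead of failing with EINTR.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int install_restartable_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_tick;
	sa.sa_flags = SA_RESTART;
	sigemptyset(&sa.sa_mask);
	return sigaction(signo, &sa, NULL);
}

/*
 * Defensive wrapper: retry open(2) while it fails with EINTR, for the
 * cases where the call was not restarted automatically.
 * Returns the file descriptor, or -1 with errno set to a non-EINTR error.
 */
static int open_retry(const char *path, int flags)
{
	int fd;

	do {
		fd = open(path, flags);
	} while (fd < 0 && errno == EINTR);

	return fd;
}
```

A caller would run install_restartable_handler(SIGALRM) once at startup
and use open_retry() wherever it opens files itself. The wrapper does
nothing for third-party code such as a shared library that calls open(2)
directly, which is the case where the kernel-side return value matters.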
--
Pedro