linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* copying from/to user question
@ 2024-09-09  8:50 Christian Brauner
  2024-09-09  9:18 ` Christian Brauner
  2024-09-09 14:55 ` Al Viro
  0 siblings, 2 replies; 6+ messages in thread
From: Christian Brauner @ 2024-09-09  8:50 UTC (permalink / raw)
  To: Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai,
	Mike Rapoport, Vlastimil Babka, Matthew Wilcox
  Cc: linux-mm, linux-fsdevel

Hey,

This is another round of Christian's asking sus questions about kernel
apis. I asked them a few people and generally the answers I got was
"Good question, I don't know." or the reasoning varied a lot. So I take
it I'm not the only one with that question.

I was looking at a potential epoll() bug and it got me thinking about
dos & don'ts for put_user()/copy_from_user() and related helpers as
epoll does acquire the epoll mutex and then goes on to loop over a list
of ready items and calls __put_user() for each item. Granted, it only
puts a __u64 and an integer but still that seems adventurous to me and I
wondered why.

Generally, new vfs apis always try hard to call helpers that copy to or
from userspace without any locks held as my understanding has been that
this is best practice as to avoid risking taking page faults while
holding a mutex or semaphore even though that's supposedly safe.

Is this understanding correct? And aside from best practice is it in
principle safe to copy to or from userspace with sleeping locks held?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: copying from/to user question
  2024-09-09  8:50 copying from/to user question Christian Brauner
@ 2024-09-09  9:18 ` Christian Brauner
  2024-09-09 12:14   ` Arnd Bergmann
  2024-09-09 17:14   ` Linus Torvalds
  2024-09-09 14:55 ` Al Viro
  1 sibling, 2 replies; 6+ messages in thread
From: Christian Brauner @ 2024-09-09  9:18 UTC (permalink / raw)
  To: Thomas Gleixner, Arnd Bergmann, Jan Kara, Linus Torvalds,
	Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka,
	Matthew Wilcox
  Cc: linux-mm, linux-fsdevel

[Forgot to add Thomas and Arnd as Mike pointed out]

On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote:
> Hey,
> 
> This is another round of Christian's asking sus questions about kernel
> apis. I asked them a few people and generally the answers I got was
> "Good question, I don't know." or the reasoning varied a lot. So I take
> it I'm not the only one with that question.
> 
> I was looking at a potential epoll() bug and it got me thinking about
> dos & don'ts for put_user()/copy_from_user() and related helpers as
> epoll does acquire the epoll mutex and then goes on to loop over a list
> of ready items and calls __put_user() for each item. Granted, it only
> puts a __u64 and an integer but still that seems adventurous to me and I
> wondered why.
> 
> Generally, new vfs apis always try hard to call helpers that copy to or
> from userspace without any locks held as my understanding has been that
> this is best practice as to avoid risking taking page faults while
> holding a mutex or semaphore even though that's supposedly safe.
> 
> Is this understanding correct? And aside from best practice is it in
> principle safe to copy to or from userspace with sleeping locks held?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: copying from/to user question
  2024-09-09  9:18 ` Christian Brauner
@ 2024-09-09 12:14   ` Arnd Bergmann
  2024-09-09 14:31     ` Thomas Gleixner
  2024-09-09 17:14   ` Linus Torvalds
  1 sibling, 1 reply; 6+ messages in thread
From: Arnd Bergmann @ 2024-09-09 12:14 UTC (permalink / raw)
  To: Christian Brauner, Thomas Gleixner, Jan Kara, Linus Torvalds,
	Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka,
	Matthew Wilcox
  Cc: linux-mm, linux-fsdevel

On Mon, Sep 9, 2024, at 09:18, Christian Brauner wrote:
> On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote:
>> 
>> This is another round of Christian's asking sus questions about kernel
>> apis. I asked them a few people and generally the answers I got was
>> "Good question, I don't know." or the reasoning varied a lot. So I take
>> it I'm not the only one with that question.
>> 
>> I was looking at a potential epoll() bug and it got me thinking about
>> dos & don'ts for put_user()/copy_from_user() and related helpers as
>> epoll does acquire the epoll mutex and then goes on to loop over a list
>> of ready items and calls __put_user() for each item. Granted, it only
>> puts a __u64 and an integer but still that seems adventurous to me and I
>> wondered why.
>> 
>> Generally, new vfs apis always try hard to call helpers that copy to or
>> from userspace without any locks held as my understanding has been that
>> this is best practice as to avoid risking taking page faults while
>> holding a mutex or semaphore even though that's supposedly safe.
>> 
>> Is this understanding correct? And aside from best practice is it in
>> principle safe to copy to or from userspace with sleeping locks held?

I would be very suspicious if it's an actual __put_user() rather
than the normal put_user() since at least on x86 that skips the
__might_fault() instrumentation.

With the normal put_user() at least I would expect the
might_lock_read(&current->mm->mmap_lock) instrumentation
in __might_fault() to cause a lockdep splat if you are holding
a mutex that is also required during a page fault, which
in turn would deadlock if your __user pointer is paged out.

I have not seen that particular lockdep output, but I imagine
that is why VFS code tends to avoids the put_user() inside
of a mutex, while code that is nowhere near paging gets away
with it.

    Arnd


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: copying from/to user question
  2024-09-09 12:14   ` Arnd Bergmann
@ 2024-09-09 14:31     ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2024-09-09 14:31 UTC (permalink / raw)
  To: Arnd Bergmann, Christian Brauner, Jan Kara, Linus Torvalds,
	Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka,
	Matthew Wilcox
  Cc: linux-mm, linux-fsdevel

On Mon, Sep 09 2024 at 12:14, Arnd Bergmann wrote:
> On Mon, Sep 9, 2024, at 09:18, Christian Brauner wrote:
>> On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote:
>>> 
>>> This is another round of Christian's asking sus questions about kernel
>>> apis. I asked them a few people and generally the answers I got was
>>> "Good question, I don't know." or the reasoning varied a lot. So I take
>>> it I'm not the only one with that question.
>>> 
>>> I was looking at a potential epoll() bug and it got me thinking about
>>> dos & don'ts for put_user()/copy_from_user() and related helpers as
>>> epoll does acquire the epoll mutex and then goes on to loop over a list
>>> of ready items and calls __put_user() for each item. Granted, it only
>>> puts a __u64 and an integer but still that seems adventurous to me and I
>>> wondered why.
>>> 
>>> Generally, new vfs apis always try hard to call helpers that copy to or
>>> from userspace without any locks held as my understanding has been that
>>> this is best practice as to avoid risking taking page faults while
>>> holding a mutex or semaphore even though that's supposedly safe.
>>> 
>>> Is this understanding correct? And aside from best practice is it in
>>> principle safe to copy to or from userspace with sleeping locks held?
>
> I would be very suspicious if it's an actual __put_user() rather
> than the normal put_user() since at least on x86 that skips the
> __might_fault() instrumentation.

epoll_put_uevent() uses __put_user(). __put_user() does neither have
might_fault() nor does it check the destination pointer. It's documented
that the caller needs to have validated it via access_ok(), which
happens in do_epoll_wait().

> With the normal put_user() at least I would expect the
> might_lock_read(&current->mm->mmap_lock) instrumentation
> in __might_fault() to cause a lockdep splat if you are holding
> a mutex that is also required during a page fault, which
> in turn would deadlock if your __user pointer is paged out.

Right. But an actual page fault would still trip over that if there is a
lock dependency chain because pagefaults are enabled.

Coming back to your general question.

It is generally safe to fault with a sleeping lock held when there is no
invers lock chain vs. mmap_lock.

Whether it's a good idea is a different question, which depends on the
context of what the mutex is protecting and what consequences result in
holding it for a extended period of time, e.g. due to a swapped out
page.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: copying from/to user question
  2024-09-09  8:50 copying from/to user question Christian Brauner
  2024-09-09  9:18 ` Christian Brauner
@ 2024-09-09 14:55 ` Al Viro
  1 sibling, 0 replies; 6+ messages in thread
From: Al Viro @ 2024-09-09 14:55 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai,
	Mike Rapoport, Vlastimil Babka, Matthew Wilcox, linux-mm,
	linux-fsdevel

On Mon, Sep 09, 2024 at 10:50:04AM +0200, Christian Brauner wrote:
> Hey,
> 
> This is another round of Christian's asking sus questions about kernel
> apis. I asked them a few people and generally the answers I got was
> "Good question, I don't know." or the reasoning varied a lot. So I take
> it I'm not the only one with that question.
> 
> I was looking at a potential epoll() bug and it got me thinking about
> dos & don'ts for put_user()/copy_from_user() and related helpers as
> epoll does acquire the epoll mutex and then goes on to loop over a list
> of ready items and calls __put_user() for each item. Granted, it only
> puts a __u64 and an integer but still that seems adventurous to me and I
> wondered why.
> 
> Generally, new vfs apis always try hard to call helpers that copy to or
> from userspace without any locks held as my understanding has been that
> this is best practice as to avoid risking taking page faults while
> holding a mutex or semaphore even though that's supposedly safe.
> 
> Is this understanding correct? And aside from best practice is it in
> principle safe to copy to or from userspace with sleeping locks held?

You do realize that e.g. write(2) will copy from userland with a sleeping
lock (->i_rwsem) held, right?  Inevitably so.

It really depends upon the lock in question; sure, milder locking environment
is generally less headache, but that's about it.  You can't do that under
page lock.  You can do that under ->i_rwsem.  As for the epoll mutex...
do we ever have it nested inside anything taken on #PF paths?  I don't
see anything of that sort, but I might've missed something...


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: copying from/to user question
  2024-09-09  9:18 ` Christian Brauner
  2024-09-09 12:14   ` Arnd Bergmann
@ 2024-09-09 17:14   ` Linus Torvalds
  1 sibling, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2024-09-09 17:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Thomas Gleixner, Arnd Bergmann, Jan Kara, Amir Goldstein,
	Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox,
	linux-mm, linux-fsdevel

On Mon, 9 Sept 2024 at 02:18, Christian Brauner <brauner@kernel.org> wrote:
>
> > Generally, new vfs apis always try hard to call helpers that copy to or
> > from userspace without any locks held as my understanding has been that
> > this is best practice as to avoid risking taking page faults while
> > holding a mutex or semaphore even though that's supposedly safe.

It's indeed "best practices" to strive to do user copies without
locks, but it's not always possible to reasonably avoid.

IOW, accessing user space with a lock held *can* cause some nasty
issues, but is not necessarily wrong.

The worst situation is where that lock then may be needed to *deal*
with user space page faults, and that complicates the write() paths in
particular (generic_perform_write() and friends using
copy_folio_from_iter_atomic() and other magical games). But that's
actually fairly unusual.

The much more common situation is just a random lock, and we have user
accesses under them all the time. You still want to be careful,
because if the lock is important enough, it can cause users to be able
to effectively DoS some subsystem and/or just be a huge nuisance (we
used to have that in the tty layer).

And no, the size of the user copy doesn't much matter. A __put_user()
isn't much better than a big copy_from_user() - it may be faster for
the simple case where things are in memory, but it's the "it's paged
out" case that causes issues, and then it's the IO (and possible extra
user-controlled fuse paths in particular) that are an issue, not
whether it's "just one 64-bit word".

Epoll is disgusting. But the real problems with epoll tend to be about
the random file descriptor recursions, not the epoll mutex that only
epoll cares about.

              Linus


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-09-09 17:14 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-09  8:50 copying from/to user question Christian Brauner
2024-09-09  9:18 ` Christian Brauner
2024-09-09 12:14   ` Arnd Bergmann
2024-09-09 14:31     ` Thomas Gleixner
2024-09-09 17:14   ` Linus Torvalds
2024-09-09 14:55 ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox