* copying from/to user question @ 2024-09-09 8:50 Christian Brauner 2024-09-09 9:18 ` Christian Brauner 2024-09-09 14:55 ` Al Viro 0 siblings, 2 replies; 6+ messages in thread From: Christian Brauner @ 2024-09-09 8:50 UTC (permalink / raw) To: Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox Cc: linux-mm, linux-fsdevel Hey, This is another round of Christian's asking sus questions about kernel apis. I asked them a few people and generally the answers I got was "Good question, I don't know." or the reasoning varied a lot. So I take it I'm not the only one with that question. I was looking at a potential epoll() bug and it got me thinking about dos & don'ts for put_user()/copy_from_user() and related helpers as epoll does acquire the epoll mutex and then goes on to loop over a list of ready items and calls __put_user() for each item. Granted, it only puts a __u64 and an integer but still that seems adventurous to me and I wondered why. Generally, new vfs apis always try hard to call helpers that copy to or from userspace without any locks held as my understanding has been that this is best practice as to avoid risking taking page faults while holding a mutex or semaphore even though that's supposedly safe. Is this understanding correct? And aside from best practice is it in principle safe to copy to or from userspace with sleeping locks held? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: copying from/to user question 2024-09-09 8:50 copying from/to user question Christian Brauner @ 2024-09-09 9:18 ` Christian Brauner 2024-09-09 12:14 ` Arnd Bergmann 2024-09-09 17:14 ` Linus Torvalds 2024-09-09 14:55 ` Al Viro 1 sibling, 2 replies; 6+ messages in thread From: Christian Brauner @ 2024-09-09 9:18 UTC (permalink / raw) To: Thomas Gleixner, Arnd Bergmann, Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox Cc: linux-mm, linux-fsdevel [Forgot to add Thomas and Arnd as Mike pointed out] On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote: > Hey, > > This is another round of Christian's asking sus questions about kernel > apis. I asked them a few people and generally the answers I got was > "Good question, I don't know." or the reasoning varied a lot. So I take > it I'm not the only one with that question. > > I was looking at a potential epoll() bug and it got me thinking about > dos & don'ts for put_user()/copy_from_user() and related helpers as > epoll does acquire the epoll mutex and then goes on to loop over a list > of ready items and calls __put_user() for each item. Granted, it only > puts a __u64 and an integer but still that seems adventurous to me and I > wondered why. > > Generally, new vfs apis always try hard to call helpers that copy to or > from userspace without any locks held as my understanding has been that > this is best practice as to avoid risking taking page faults while > holding a mutex or semaphore even though that's supposedly safe. > > Is this understanding correct? And aside from best practice is it in > principle safe to copy to or from userspace with sleeping locks held? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: copying from/to user question 2024-09-09 9:18 ` Christian Brauner @ 2024-09-09 12:14 ` Arnd Bergmann 2024-09-09 14:31 ` Thomas Gleixner 2024-09-09 17:14 ` Linus Torvalds 1 sibling, 1 reply; 6+ messages in thread From: Arnd Bergmann @ 2024-09-09 12:14 UTC (permalink / raw) To: Christian Brauner, Thomas Gleixner, Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox Cc: linux-mm, linux-fsdevel On Mon, Sep 9, 2024, at 09:18, Christian Brauner wrote: > On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote: >> >> This is another round of Christian's asking sus questions about kernel >> apis. I asked them a few people and generally the answers I got was >> "Good question, I don't know." or the reasoning varied a lot. So I take >> it I'm not the only one with that question. >> >> I was looking at a potential epoll() bug and it got me thinking about >> dos & don'ts for put_user()/copy_from_user() and related helpers as >> epoll does acquire the epoll mutex and then goes on to loop over a list >> of ready items and calls __put_user() for each item. Granted, it only >> puts a __u64 and an integer but still that seems adventurous to me and I >> wondered why. >> >> Generally, new vfs apis always try hard to call helpers that copy to or >> from userspace without any locks held as my understanding has been that >> this is best practice as to avoid risking taking page faults while >> holding a mutex or semaphore even though that's supposedly safe. >> >> Is this understanding correct? And aside from best practice is it in >> principle safe to copy to or from userspace with sleeping locks held? I would be very suspicious if it's an actual __put_user() rather than the normal put_user() since at least on x86 that skips the __might_fault() instrumentation. With the normal put_user() at least I would expect the might_lock_read(¤t->mm->mmap_lock) instrumentation in __might_fault() to cause a lockdep splat if you are holding a mutex that is also required during a page fault, which in turn would deadlock if your __user pointer is paged out. I have not seen that particular lockdep output, but I imagine that is why VFS code tends to avoids the put_user() inside of a mutex, while code that is nowhere near paging gets away with it. Arnd ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: copying from/to user question 2024-09-09 12:14 ` Arnd Bergmann @ 2024-09-09 14:31 ` Thomas Gleixner 0 siblings, 0 replies; 6+ messages in thread From: Thomas Gleixner @ 2024-09-09 14:31 UTC (permalink / raw) To: Arnd Bergmann, Christian Brauner, Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox Cc: linux-mm, linux-fsdevel On Mon, Sep 09 2024 at 12:14, Arnd Bergmann wrote: > On Mon, Sep 9, 2024, at 09:18, Christian Brauner wrote: >> On Mon, Sep 09, 2024 at 10:50:10AM GMT, Christian Brauner wrote: >>> >>> This is another round of Christian's asking sus questions about kernel >>> apis. I asked them a few people and generally the answers I got was >>> "Good question, I don't know." or the reasoning varied a lot. So I take >>> it I'm not the only one with that question. >>> >>> I was looking at a potential epoll() bug and it got me thinking about >>> dos & don'ts for put_user()/copy_from_user() and related helpers as >>> epoll does acquire the epoll mutex and then goes on to loop over a list >>> of ready items and calls __put_user() for each item. Granted, it only >>> puts a __u64 and an integer but still that seems adventurous to me and I >>> wondered why. >>> >>> Generally, new vfs apis always try hard to call helpers that copy to or >>> from userspace without any locks held as my understanding has been that >>> this is best practice as to avoid risking taking page faults while >>> holding a mutex or semaphore even though that's supposedly safe. >>> >>> Is this understanding correct? And aside from best practice is it in >>> principle safe to copy to or from userspace with sleeping locks held? > > I would be very suspicious if it's an actual __put_user() rather > than the normal put_user() since at least on x86 that skips the > __might_fault() instrumentation. epoll_put_uevent() uses __put_user(). __put_user() does neither have might_fault() nor does it check the destination pointer. It's documented that the caller needs to have validated it via access_ok(), which happens in do_epoll_wait(). > With the normal put_user() at least I would expect the > might_lock_read(¤t->mm->mmap_lock) instrumentation > in __might_fault() to cause a lockdep splat if you are holding > a mutex that is also required during a page fault, which > in turn would deadlock if your __user pointer is paged out. Right. But an actual page fault would still trip over that if there is a lock dependency chain because pagefaults are enabled. Coming back to your general question. It is generally safe to fault with a sleeping lock held when there is no invers lock chain vs. mmap_lock. Whether it's a good idea is a different question, which depends on the context of what the mutex is protecting and what consequences result in holding it for a extended period of time, e.g. due to a swapped out page. Thanks, tglx ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: copying from/to user question 2024-09-09 9:18 ` Christian Brauner 2024-09-09 12:14 ` Arnd Bergmann @ 2024-09-09 17:14 ` Linus Torvalds 1 sibling, 0 replies; 6+ messages in thread From: Linus Torvalds @ 2024-09-09 17:14 UTC (permalink / raw) To: Christian Brauner Cc: Thomas Gleixner, Arnd Bergmann, Jan Kara, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox, linux-mm, linux-fsdevel On Mon, 9 Sept 2024 at 02:18, Christian Brauner <brauner@kernel.org> wrote: > > > Generally, new vfs apis always try hard to call helpers that copy to or > > from userspace without any locks held as my understanding has been that > > this is best practice as to avoid risking taking page faults while > > holding a mutex or semaphore even though that's supposedly safe. It's indeed "best practices" to strive to do user copies without locks, but it's not always possible to reasonably avoid. IOW, accessing user space with a lock held *can* cause some nasty issues, but is not necessarily wrong. The worst situation is where that lock then may be needed to *deal* with user space page faults, and that complicates the write() paths in particular (generic_perform_write() and friends using copy_folio_from_iter_atomic() and other magical games). But that's actually fairly unusual. The much more common situation is just a random lock, and we have user accesses under them all the time. You still want to be careful, because if the lock is important enough, it can cause users to be able to effectively DoS some subsystem and/or just be a huge nuisance (we used to have that in the tty layer). And no, the size of the user copy doesn't much matter. A __put_user() isn't much better than a big copy_from_user() - it may be faster for the simple case where things are in memory, but it's the "it's paged out" case that causes issues, and then it's the IO (and possible extra user-controlled fuse paths in particular) that are an issue, not whether it's "just one 64-bit word". Epoll is disgusting. But the real problems with epoll tend to be about the random file descriptor recursions, not the epoll mutex that only epoll cares about. Linus ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: copying from/to user question 2024-09-09 8:50 copying from/to user question Christian Brauner 2024-09-09 9:18 ` Christian Brauner @ 2024-09-09 14:55 ` Al Viro 1 sibling, 0 replies; 6+ messages in thread From: Al Viro @ 2024-09-09 14:55 UTC (permalink / raw) To: Christian Brauner Cc: Jan Kara, Linus Torvalds, Amir Goldstein, Aleksa Sarai, Mike Rapoport, Vlastimil Babka, Matthew Wilcox, linux-mm, linux-fsdevel On Mon, Sep 09, 2024 at 10:50:04AM +0200, Christian Brauner wrote: > Hey, > > This is another round of Christian's asking sus questions about kernel > apis. I asked them a few people and generally the answers I got was > "Good question, I don't know." or the reasoning varied a lot. So I take > it I'm not the only one with that question. > > I was looking at a potential epoll() bug and it got me thinking about > dos & don'ts for put_user()/copy_from_user() and related helpers as > epoll does acquire the epoll mutex and then goes on to loop over a list > of ready items and calls __put_user() for each item. Granted, it only > puts a __u64 and an integer but still that seems adventurous to me and I > wondered why. > > Generally, new vfs apis always try hard to call helpers that copy to or > from userspace without any locks held as my understanding has been that > this is best practice as to avoid risking taking page faults while > holding a mutex or semaphore even though that's supposedly safe. > > Is this understanding correct? And aside from best practice is it in > principle safe to copy to or from userspace with sleeping locks held? You do realize that e.g. write(2) will copy from userland with a sleeping lock (->i_rwsem) held, right? Inevitably so. It really depends upon the lock in question; sure, milder locking environment is generally less headache, but that's about it. You can't do that under page lock. You can do that under ->i_rwsem. As for the epoll mutex... do we ever have it nested inside anything taken on #PF paths? I don't see anything of that sort, but I might've missed something... ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-09-09 17:14 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-09-09 8:50 copying from/to user question Christian Brauner 2024-09-09 9:18 ` Christian Brauner 2024-09-09 12:14 ` Arnd Bergmann 2024-09-09 14:31 ` Thomas Gleixner 2024-09-09 17:14 ` Linus Torvalds 2024-09-09 14:55 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox