* Re: [Ksummit-discuss] On the off-chance that my mount() notes are at all useful
  [not found] <20140819173125.GA17432@linux.vnet.ibm.com>
From: Andy Lutomirski @ 2014-08-24 16:59 UTC
To: Paul McKenney; +Cc: ksummit-discuss

On Tue, Aug 19, 2014 at 10:31 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> o Mount based on file descriptor. Generated from openfs()
>   or some such. Ted: Want mount(), remount(), bind(), as separate
>   things.
>
>   Have a mountf() for mounting an openfs()ed filesystem.
>
>   Al: Ouch.
>
>   Andy: Want to distinguish between this mount is read-only
>   and the underlying device will no longer be written to.
>
>   Al: Three piles of garbage, not two. Need to take care about
>   userids and such. Some of the per-superblock flags are not
>   entirely private to a given filesystem, some are visible
>   to the VFS layer.
>
>   Al: First syscall to start mounting could establish an open
>   descriptor. But the descriptor would not be a root directory,
>   but rather a channel for talking to a filesystem driver. Then
>   you can feed the parameters to the filesystem driver as needed,
>   rather than dumping them into the open() system call.
>
>   Al: If you want horrors, look at ncpfs (sp?). This illustrates
>   why just getting the root directory is wrong. Root directory
>   is initially empty, after some operations it suddenly has
>   files in it.
>
>   Al: Given that the syscalls are often followed by one another,
>   why have them separated?
>
>   Al: If we are going to have this FD, then we should keep the
>   FD around for the duration. Closing it would get rid of
>   everything. Use FD to talk to filesystem driver throughout.
>   Don't need a process to hang around.
>
>   Al: Note that unmount operates purely on the namespace. You
>   might still have open files on the unmounted filesystem, so
>   the filesystem is still around.
>
>   Some discussion about getting the FD given a mounted filesystem.
>   Interaction between FD and shutdown.
>
>   Al: But if FD is around, someone might remount filesystem.
>   So some hair if using FD to wait for all files from the
>   filesystem to be closed.
>
>   Mount over symlinks?
>
>   Al: Need to be careful here. Last I looked, this would be
>   extremely painful. Easier to hide a directory with a symlink
>   than vice versa.
>
>   Discussion of an openat() and security holes.
>
>   Ted: Can pass a directory FD across a UNIX-domain socket and
>   then do openat(), so security issue already exists. More
>   fun with mountat().
>
>   Al: Completely insane, greatly increases attack surface.
>
>   Ted: FS fuzzers giving bugs are first-class bugs. But cloud
>   sysadmins might not like the attack surface.
>
>   Serge: Use fuse to mediate security.
>

Here are my notes on features that I want, augmented some by the discussion:

Requirements:

- Syscalls that just affect mount points

- Mount by fd.

- Overmounting / should be useful (e.g. return an fd,
  mount-and-chdir, etc.) Currently, using mount(2) to mount on top of
  '/' is mostly useless, because there is no way to chdir to the new
  mount, to chroot to it, or to get an fd for it.

- Cross-ns bind mount. That is, I want to be able to mount a foreign
  fd into my namespace. This doesn't really need a new API, but it
  would be a lot cleaner if we could use SCM_RIGHTS for this without
  mucking with /proc/self/fd.

- Don't follow symlinks, at least optionally. Al Viro says that
  mounting on top of certain types of objects might be impossible, but
  I'd like to extend the set of possible overmounts.

- Clear separation of superblock flags and mount flags. The
  read-only flag is somewhat special, but I think that it can be managed
  cleanly.

- Explicit set/clear mount flags. Setting the read-only bit
  shouldn't involve reading the old flags with a separate syscall.

- Bind and set/clear flags at the same time. (e.g. create a new
  read-only bind mount atomically.)
- Leave room for unions. I'm not sure what this entails.


Here's a possible piece of a new API:

int mount_bind(int sourcefd, int destdfd, const char *destpath,
               int opflags, int clearflags, int setflags);

opflags include BINDMNT_CHDIR, AT_NOFOLLOW, etc. The setflags are
ORed into the flags from the source, and the clearflags are cleared.
Other flags are left unchanged. If (setflags & clearflags) is nonzero,
-EINVAL is returned.


int mount_changebindflags(int dfd, const char *path, int opflags,
                          int clearflags, int setflags);


Al Viro mentioned that, for a new fs (as opposed to a bind mount), we
want a control fd for a file system, on which we can send commands,
close (i.e. superblock shutdown), and change flags.

--Andy
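The flag rule proposed for mount_bind() above (setflags ORed in,
clearflags cleared, overlap rejected with -EINVAL) can be sketched in
user space. This is only a model of the arithmetic:
apply_mount_flags() and the DEMO_* bits are hypothetical names
invented for illustration, no mount is performed, and unsigned int is
used in place of the proposal's int to keep the bit operations well
defined.

```c
#include <errno.h>

/* Hypothetical flag bits, for illustration only; these are not the
 * kernel's MS_* or MNT_* values. */
#define DEMO_RDONLY 0x1u
#define DEMO_NOSUID 0x2u
#define DEMO_NODEV  0x4u

/*
 * Hypothetical helper modelling the proposed semantics:
 *   - a bit present in both setflags and clearflags is an error (-EINVAL);
 *   - otherwise clearflags bits are removed from the source flags,
 *     setflags bits are ORed in, and all other bits are left unchanged.
 * On success the result is stored in *out and 0 is returned.
 */
static int apply_mount_flags(unsigned int srcflags, unsigned int clearflags,
                             unsigned int setflags, unsigned int *out)
{
    if (setflags & clearflags)
        return -EINVAL;

    *out = (srcflags & ~clearflags) | setflags;
    return 0;
}
```

With these rules, a caller asking for a read-only bind of a
nosuid,nodev source while clearing nodev would end up with
nosuid plus read-only, and never has to read the old flags first.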
* Re: [Ksummit-discuss] On the off-chance that my mount() notes are at all useful
From: Paul E. McKenney @ 2014-08-25 2:55 UTC
To: Andy Lutomirski; +Cc: ksummit-discuss

On Sun, Aug 24, 2014 at 09:59:12AM -0700, Andy Lutomirski wrote:
> On Tue, Aug 19, 2014 at 10:31 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > o Mount based on file descriptor. Generated from openfs()
> >   or some such. Ted: Want mount(), remount(), bind(), as separate
> >   things.
> >
> >   Have a mountf() for mounting an openfs()ed filesystem.
> >
> >   Al: Ouch.
> >
> >   Andy: Want to distinguish between this mount is read-only
> >   and the underlying device will no longer be written to.
> >
> >   Al: Three piles of garbage, not two. Need to take care about
> >   userids and such. Some of the per-superblock flags are not
> >   entirely private to a given filesystem, some are visible
> >   to the VFS layer.
> >
> >   Al: First syscall to start mounting could establish an open
> >   descriptor. But the descriptor would not be a root directory,
> >   but rather a channel for talking to a filesystem driver. Then
> >   you can feed the parameters to the filesystem driver as needed,
> >   rather than dumping them into the open() system call.
> >
> >   Al: If you want horrors, look at ncpfs (sp?). This illustrates
> >   why just getting the root directory is wrong. Root directory
> >   is initially empty, after some operations it suddenly has
> >   files in it.
> >
> >   Al: Given that the syscalls are often followed by one another,
> >   why have them separated?
> >
> >   Al: If we are going to have this FD, then we should keep the
> >   FD around for the duration. Closing it would get rid of
> >   everything. Use FD to talk to filesystem driver throughout.
> >   Don't need a process to hang around.
> >
> >   Al: Note that unmount operates purely on the namespace. You
> >   might still have open files on the unmounted filesystem, so
> >   the filesystem is still around.
> >
> >   Some discussion about getting the FD given a mounted filesystem.
> >   Interaction between FD and shutdown.
> >
> >   Al: But if FD is around, someone might remount filesystem.
> >   So some hair if using FD to wait for all files from the
> >   filesystem to be closed.
> >
> >   Mount over symlinks?
> >
> >   Al: Need to be careful here. Last I looked, this would be
> >   extremely painful. Easier to hide a directory with a symlink
> >   than vice versa.
> >
> >   Discussion of an openat() and security holes.
> >
> >   Ted: Can pass a directory FD across a UNIX-domain socket and
> >   then do openat(), so security issue already exists. More
> >   fun with mountat().
> >
> >   Al: Completely insane, greatly increases attack surface.
> >
> >   Ted: FS fuzzers giving bugs are first-class bugs. But cloud
> >   sysadmins might not like the attack surface.
> >
> >   Serge: Use fuse to mediate security.
> >
>
> Here are my notes on features that I want, augmented some by the discussion:

Good additions! Would you like to send your added notes to Jon Corbet?
He asked for notes for sessions that neither he nor Jake was able to
attend.

Thanx, Paul

> Requirements:
>
> - Syscalls that just affect mount points
>
> - Mount by fd.
>
> - Overmounting / should be useful (e.g. return an fd,
>   mount-and-chdir, etc.) Currently, using mount(2) to mount on top of
>   '/' is mostly useless, because there is no way to chdir to the new
>   mount, to chroot to it, or to get an fd for it.
>
> - Cross-ns bind mount. That is, I want to be able to mount a foreign
>   fd into my namespace. This doesn't really need a new API, but it
>   would be a lot cleaner if we could use SCM_RIGHTS for this without
>   mucking with /proc/self/fd.
>
> - Don't follow symlinks, at least optionally. Al Viro says that
>   mounting on top of certain types of objects might be impossible, but
>   I'd like to extend the set of possible overmounts.
>
> - Clear separation of superblock flags and mount flags. The
>   read-only flag is somewhat special, but I think that it can be managed
>   cleanly.
>
> - Explicit set/clear mount flags. Setting the read-only bit
>   shouldn't involve reading the old flags with a separate syscall.
>
> - Bind and set/clear flags at the same time. (e.g. create a new
>   read-only bind mount atomically.)
>
> - Leave room for unions. I'm not sure what this entails.
>
>
> Here's a possible piece of a new API:
>
> int mount_bind(int sourcefd, int destdfd, const char *destpath,
>                int opflags, int clearflags, int setflags);
>
> opflags include BINDMNT_CHDIR, AT_NOFOLLOW, etc. The setflags are
> ORed into the flags from the source, and the clearflags are cleared.
> Other flags are left unchanged. If (setflags & clearflags) is nonzero,
> -EINVAL is returned.
>
>
> int mount_changebindflags(int dfd, const char *path, int opflags,
>                           int clearflags, int setflags);
>
>
> Al Viro mentioned that, for a new fs (as opposed to a bind mount), we
> want a control fd for a file system, on which we can send commands,
> close (i.e. superblock shutdown), and change flags.
>
> --Andy
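For comparison with the quoted requirement to bind and set flags at
the same time: with the existing mount(2) interface, a read-only bind
mount takes two calls, because flags other than MS_REC are ignored on
the initial MS_BIND call. A minimal sketch, assuming Linux and
sufficient privilege (CAP_SYS_ADMIN); the helper name
ro_bind_mount_two_step() is made up for illustration.

```c
#include <stddef.h>
#include <sys/mount.h>

/*
 * Hypothetical helper showing the existing two-step sequence.
 * Returns 0 on success, -1 on failure with errno set by mount(2).
 * If the second step fails, the bind mount from the first step is
 * left in place; this sketch does not try to undo it.
 */
static int ro_bind_mount_two_step(const char *src, const char *dst)
{
    /* Step 1: create the bind mount. MS_RDONLY passed here would be
     * ignored, so the read-only flag is not applied by this call. */
    if (mount(src, dst, NULL, MS_BIND, NULL) != 0)
        return -1;

    /* Step 2: change the per-mount flags of the new bind mount only,
     * making it read-only. */
    if (mount(NULL, dst, NULL, MS_REMOUNT | MS_BIND | MS_RDONLY, NULL) != 0)
        return -1;

    return 0;
}
```

Between the two calls the bind mount exists without the read-only
flag, which is the window that an atomic bind-and-set-flags operation
would close.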