From: Ackerley Tng <ackerleytng@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: kvm@vger.kernel.org, linux-api@vger.kernel.org,
	linux-arch@vger.kernel.org,  linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, qemu-devel@nongnu.org,  aarcange@redhat.com,
	ak@linux.intel.com, akpm@linux-foundation.org,  arnd@arndb.de,
	bfields@fieldses.org, bp@alien8.de,  chao.p.peng@linux.intel.com,
	corbet@lwn.net, dave.hansen@intel.com,  david@redhat.com,
	ddutile@redhat.com, dhildenb@redhat.com, hpa@zytor.com,
	 hughd@google.com, jlayton@kernel.org, jmattson@google.com,
	joro@8bytes.org,  jun.nakajima@intel.com,
	kirill.shutemov@linux.intel.com, linmiaohe@huawei.com,
	 luto@kernel.org, mail@maciej.szmigiero.name, mhocko@suse.com,
	 michael.roth@amd.com, mingo@redhat.com, naoya.horiguchi@nec.com,
	 pbonzini@redhat.com, qperret@google.com, rppt@kernel.org,
	seanjc@google.com,  shuah@kernel.org, steven.price@arm.com,
	tabba@google.com, tglx@linutronix.de,  vannapurve@google.com,
	vbabka@suse.cz, vkuznets@redhat.com,  wanpengli@tencent.com,
	wei.w.wang@intel.com, x86@kernel.org,
	 yu.c.zhang@linux.intel.com
Subject: Re: [RFC PATCH v2 1/2] mm: restrictedmem: Allow userspace to specify mount for memfd_restricted
Date: Fri, 31 Mar 2023 23:56:10 +0000
Message-ID: <diqzmt3sqxut.fsf@ackerleytng-cloudtop.c.googlers.com>
In-Reply-To: <20230322111951.vfrm2xf4o5kmtte6@wittgenstein> (message from Christian Brauner on Wed, 22 Mar 2023 12:19:51 +0100)

Christian Brauner <brauner@kernel.org> writes:

> On Tue, Mar 21, 2023 at 08:15:32PM +0000, Ackerley Tng wrote:
>> By default, the backing shmem file for a restrictedmem fd is created
>> on shmem's kernel space mount.

>> ...

Thanks for reviewing this patch!


> This looks like you can just pass in some tmpfs fd and you just use it
> to identify the mnt and then you create a restricted memfd area in that
> instance. So if I did:

> mount -t tmpfs tmpfs /mnt
> mknod /mnt/bla c 0 0
> fd = open("/mnt/bla")
> memfd_restricted(fd)

> then it would create a memfd restricted entry in the tmpfs instance
> using the arbitrary dummy device node to infer the tmpfs instance.

> Looking at the older thread briefly and the cover letter. Afaict, the
> new mount api shouldn't figure into the design of this. fsopen() returns
> fds referencing a VFS-internal fs_context object. They can't be used to
> create or lookup files or identify mounts. The mount doesn't exist at
> that time. Not even a superblock might exist at the time before
> fsconfig(FSCONFIG_CMD_CREATE).

> When fsmount() is called after superblock setup then it's similar to any
> other fd from open() or open_tree() or whatever (glossing over some
> details that are irrelevant here). Difference is that open_tree() and
> fsmount() would refer to the root of a mount.

This is correct: memfd_restricted() needs an fd returned from fsmount(),
not one from fsopen(). Usage examples of this new parameter in
memfd_restricted() are available in the selftests.
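
For readers unfamiliar with the new mount API, here is a minimal sketch
of how userspace might obtain such an fd. It is an illustration only,
not code from the patch or its selftests: the helper name
tmpfs_mount_fd() and its optional "huge" argument are assumptions made
for this example, and the call to memfd_restricted() itself is left out
because its final signature is not spelled out in this message.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/mount.h>	/* FSOPEN_CLOEXEC, FSCONFIG_*, FSMOUNT_CLOEXEC */

/* Fallbacks for older libc headers (generic syscall numbering). */
#ifndef SYS_fsopen
#define SYS_fsopen 430
#endif
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif
#ifndef SYS_fsmount
#define SYS_fsmount 432
#endif

/*
 * Hypothetical helper: create a detached tmpfs mount and return an fd
 * referring to its root. Returns -1 with errno set on failure, for
 * example EPERM when the caller lacks CAP_SYS_ADMIN.
 */
static int tmpfs_mount_fd(const char *huge)
{
	int fsfd, mfd, saved;

	/* fsopen() only yields an fs_context fd; no mount exists yet. */
	fsfd = syscall(SYS_fsopen, "tmpfs", FSOPEN_CLOEXEC);
	if (fsfd < 0)
		return -1;

	/* Optional tmpfs option such as huge=always; skipped when NULL. */
	if (huge && syscall(SYS_fsconfig, fsfd, FSCONFIG_SET_STRING,
			    "huge", huge, 0) < 0)
		goto err;

	/* The superblock is created at this point, not before. */
	if (syscall(SYS_fsconfig, fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0)
		goto err;

	/* fsmount() returns an fd that refers to the root of the new mount. */
	mfd = syscall(SYS_fsmount, fsfd, FSMOUNT_CLOEXEC, 0);
	saved = errno;
	close(fsfd);
	errno = saved;
	return mfd;

err:
	saved = errno;
	close(fsfd);
	errno = saved;
	return -1;
}
```

The fd returned by this helper, rather than the intermediate fsopen()
fd, is the kind of fd the new parameter expects.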


> At first I wondered why this doesn't just use standard *at() semantics
> but I guess the restricted memfd is unlinked and doesn't show up in the
> tmpfs instance.

> So if you go down that route then I would suggest to enforce that the
> provided fd refer to the root of a tmpfs mount. IOW, it can't just be an
> arbitrary file descriptor in a tmpfs instance. That seems cleaner to me:

> sb = f_path->mnt->mnt_sb;
> sb->s_magic == TMPFS_MAGIC && f_path->mnt->mnt_root == sb->s_root

> and has much tighter semantics than just allowing any kind of fd.

Thanks for the suggestion. I've tightened the semantics accordingly:
memfd_restricted() now only accepts fds representing the root
of the mount.
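
To show what the tightened rule means from the caller's side, here is a
rough userspace approximation of the check. It is a sketch under stated
assumptions, not the kernel code in the patch: the helper name
is_tmpfs_mount_root() is made up for this example, and
STATX_ATTR_MOUNT_ROOT (Linux 5.8+) is looser than comparing mnt_root
with sb->s_root, since it is also set for a bind mount of a
subdirectory.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/vfs.h>
#include <linux/magic.h>	/* TMPFS_MAGIC */

/* Fallback for headers that predate Linux 5.8. */
#ifndef STATX_ATTR_MOUNT_ROOT
#define STATX_ATTR_MOUNT_ROOT 0x00002000
#endif

/*
 * Hypothetical helper: returns 1 if fd refers to the root of a tmpfs
 * mount, 0 if it does not, and -1 on error or if the kernel cannot
 * report the mount-root attribute.
 */
static int is_tmpfs_mount_root(int fd)
{
	struct statfs sfs;
	struct statx stx;

	if (fstatfs(fd, &sfs) < 0)
		return -1;
	if (sfs.f_type != TMPFS_MAGIC)
		return 0;

	/* AT_EMPTY_PATH makes statx() operate on fd itself. */
	if (statx(fd, "", AT_EMPTY_PATH, 0, &stx) < 0)
		return -1;

	/* Only trust the bit if the kernel says it is supported. */
	if (!(stx.stx_attributes_mask & STATX_ATTR_MOUNT_ROOT))
		return -1;

	return !!(stx.stx_attributes & STATX_ATTR_MOUNT_ROOT);
}
```

With a check of this kind, an fd for an arbitrary file such as the
/mnt/bla device node from the example above would be rejected, while
an fd for the root of the tmpfs mount would be accepted.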


> Another wrinkle I find odd but that's for you to judge is that this
> bypasses the permission model of the tmpfs instance. IOW, as long as you
> have a handle to the root of a tmpfs mount you can just create
> restricted memfds in there. So if I provided a completely sandboxed
> service - running in a user namespace or whatever - with an fd to the
> host's tmpfs instance they can just create restricted memfds in there no
> questions asked.

> Maybe that's fine but it's certainly something to spell out and think
> about the implications.

Thanks for pointing this out! I added a permissions check in RFC v3, and
clarified the permissions model (please see patch 1 of 2):
https://lore.kernel.org/lkml/cover.1680306489.git.ackerleytng@google.com/


