linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Kees Cook <keescook@chromium.org>
Cc: Andrei Vagin <avagin@gmail.com>,
	linux-kernel@vger.kernel.org,
	Dmitry Safonov <0x7f454c46@gmail.com>,
	Florian Weimer <fweimer@redhat.com>,
	linux-mm@kvack.org, Eric Biederman <ebiederm@xmission.com>
Subject: Re: [PATCH 1/2] fs/exec: allow to unshare a time namespace on vfork+exec
Date: Wed, 15 Jun 2022 09:52:19 +0200	[thread overview]
Message-ID: <20220615075219.5cvoc3py3zdm74oo@wittgenstein> (raw)
In-Reply-To: <202206141412.2B0732FF6C@keescook>

On Tue, Jun 14, 2022 at 02:14:35PM -0700, Kees Cook wrote:
> On Sun, Jun 12, 2022 at 11:07:22PM -0700, Andrei Vagin wrote:
> > Right now, a new process can't be forked in another time namespace
> > if it shares mm with its parent. It is prohibited, because each time
> > namespace has its own vvar page that is mapped into a process address
> > space.
> > 
> > When a process calls exec, it gets a new mm and so it could be "legal"
> > to switch time namespace in that case. This was not implemented and
> > now if we want to do this, we need to add another clone flag to not
> > break backward compatibility.
> > 
> > We don't have any user requests to switch times on exec except the
> > vfork+exec combination, so there is no reason to add a new clone flag.
> > As for vfork+exec, this should be safe to allow switching timens with
> > the current clone flag. Right now, vfork (CLONE_VFORK | CLONE_VM) fails
> > if a child is forked into another time namespace. With this change,
> > vfork creates a new process in parent's timens, and the following exec
> > does the actual switch to the target time namespace.
> 
> This seems like a very special case. None of the other namespaces do
> this, do they?
> 
> How is CLONE_NEWTIME supposed to be used today?

Time namespaces are similar to pid namespaces. If a process calls
unshare(CLONE_NEWTIME) it will not change into a new time namespace.
Only the children of the process will.

You can also see this via /proc/<pid>/ns/time and
/proc/<pid>/ns/time_for_children. After an unshare(CLONE_NEWTIME)
/proc/<pid>/ns/time will be unchanged while
/proc/<pid>/ns/time_for_children will reference a new time namespace.

So if the process now calls fork() the child will be placed in a new
time namespace.

As Andrei correctly points out in the commit message each time namespace
gets it's own vvar page mapped into the process address space.
Consequently calls to clone*() with CLONE_VM will need to fail because
it would alter the parent's mm as well. That includes vfork() which is
roughly just CLONE_VM | CLONE_VFORK.

fork() remains unaffected. So anything that implements a process
launcher using vfork() needs to implement a fork() fallback after vfork
failure() in case the original process has unshared a new time
namespace.

As posix spawn is implemented using vfork() we would force glibc to
implement a fork() fallback and enforce the introducing of a lot of
complexity to work around this.

I think the proposal here makes sense and allows us to avoid introducing
yet another clone flag. For vfork() it also makes sense because the
calling process is suspended until exec or exit so the semantics between
fork() and clone*() are sufficiently distinct to justify this
difference. Iow, vfork() is distinctly targeted at enforcing an exec*
call already anyway.

Christian


  reply	other threads:[~2022-06-15  7:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-13  6:07 Andrei Vagin
2022-06-13  6:07 ` [PATCH 2/2] testing/timens: add a test for vfork+exit Andrei Vagin
2022-06-14 21:14 ` [PATCH 1/2] fs/exec: allow to unshare a time namespace on vfork+exec Kees Cook
2022-06-15  7:52   ` Christian Brauner [this message]
2022-06-15  7:53   ` Florian Weimer
2022-06-15  8:00     ` Christian Brauner
2022-06-15  8:14       ` Florian Weimer
2022-06-15  8:53         ` Christian Brauner
2022-06-15  7:37 ` Christian Brauner
2022-06-15 15:01 ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220615075219.5cvoc3py3zdm74oo@wittgenstein \
    --to=brauner@kernel.org \
    --cc=0x7f454c46@gmail.com \
    --cc=avagin@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=fweimer@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox