From: Alok Tiagi <aloktiagi@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Hillf Danton <hdanton@sina.com>,
ebiederm@xmission.com, netdev@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] net: add new socket option SO_SETNETNS
Date: Thu, 2 Feb 2023 23:58:56 +0000 [thread overview]
Message-ID: <Y9xOQPPGDrSN0IBu@ip-172-31-38-16.us-west-2.compute.internal> (raw)
In-Reply-To: <CANn89iLWZb-Uf_9a41ofBtVsHjBwHzbOVn+V_QrksnB9y80m6w@mail.gmail.com>
On Thu, Feb 02, 2023 at 09:10:23PM +0100, Eric Dumazet wrote:
> On Thu, Feb 2, 2023 at 8:55 PM Alok Tiagi <aloktiagi@gmail.com> wrote:
> >
> > On Thu, Feb 02, 2023 at 09:48:10AM +0800, Hillf Danton wrote:
> > > On Wed, 1 Feb 2023 19:22:57 +0000 aloktiagi <aloktiagi@gmail.com>
> > > > @@ -1535,6 +1535,52 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
> > > > WRITE_ONCE(sk->sk_txrehash, (u8)val);
> > > > break;
> > > >
> > > > + case SO_SETNETNS:
> > > > + {
> > > > + struct net *other_ns, *my_ns;
> > > > +
> > > > + if (sk->sk_family != AF_INET && sk->sk_family != AF_INET6) {
> > > > + ret = -EOPNOTSUPP;
> > > > + break;
> > > > + }
> > > > +
> > > > + if (sk->sk_type != SOCK_STREAM && sk->sk_type != SOCK_DGRAM) {
> > > > + ret = -EOPNOTSUPP;
> > > > + break;
> > > > + }
> > > > +
> > > > + other_ns = get_net_ns_by_fd(val);
> > > > + if (IS_ERR(other_ns)) {
> > > > + ret = PTR_ERR(other_ns);
> > > > + break;
> > > > + }
> > > > +
> > > > + if (!ns_capable(other_ns->user_ns, CAP_NET_ADMIN)) {
> > > > + ret = -EPERM;
> > > > + goto out_err;
> > > > + }
> > > > +
> > > > + /* check that the socket has never been connected or recently disconnected */
> > > > + if (sk->sk_state != TCP_CLOSE || sk->sk_shutdown & SHUTDOWN_MASK) {
> > > > + ret = -EOPNOTSUPP;
> > > > + goto out_err;
> > > > + }
> > > > +
> > > > + /* check that the socket is not bound to an interface*/
> > > > + if (sk->sk_bound_dev_if != 0) {
> > > > + ret = -EOPNOTSUPP;
> > > > + goto out_err;
> > > > + }
> > > > +
> > > > + my_ns = sock_net(sk);
> > > > + sock_net_set(sk, other_ns);
> > > > + put_net(my_ns);
> > > > + break;
> > >
> > > cpu 0 cpu 2
> > > --- ---
> > > ns = sock_net(sk);
> > > my_ns = sock_net(sk);
> > > sock_net_set(sk, other_ns);
> > > put_net(my_ns);
> > > ns is invalid ?
> >
> > That is the reason we want the socket to be in an un-connected state. That
> > should help us avoid this situation.
>
> This is not enough....
>
> Another thread might look at sock_net(sk), for example from inet_diag
> or tcp timers
> (which can be fired even in un-connected state)
>
> Even UDP sockets can receive packets while being un-connected,
> and they need to deref the net pointer.
>
> Currently there is no protection about sock_net(sk) being changed on the fly,
> and the struct net could disappear and be freed.
>
> There are ~1500 uses of sock_net(sk) in the kernel, I do not think
> you/we want to audit all
> of them to check what could go wrong...
I agree, auditing all the uses of sock_net(sk) is not a feasible option. From my
exploration of the usage of sock_net(sk) it appeared that it might be safe to
swap a sockets net ns if it had never been connected but I looked at only a
subset of such uses.
Introducing a ref counting logic to every access of sock_net(sk) may help get
around this but invovles a bigger change to increment and decrement the count at
every use of sock_net().
Any suggestions if this could be achieved in another way much close to the
socket creation time or any comments on our workaround for injecting sockets using
seccomp addfd?
next prev parent reply other threads:[~2023-02-02 23:59 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <Y9q8Ec1CJILZz7dj@ip-172-31-38-16.us-west-2.compute.internal>
2023-02-02 1:48 ` Hillf Danton
2023-02-02 19:55 ` Alok Tiagi
2023-02-02 20:10 ` Eric Dumazet
2023-02-02 23:58 ` Alok Tiagi [this message]
2023-02-03 15:09 ` Eric Dumazet
2023-02-03 17:50 ` Alok Tiagi
2023-02-03 21:17 ` Eric W. Biederman
2023-02-04 18:44 ` Alok Tiagi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y9xOQPPGDrSN0IBu@ip-172-31-38-16.us-west-2.compute.internal \
--to=aloktiagi@gmail.com \
--cc=ebiederm@xmission.com \
--cc=edumazet@google.com \
--cc=hdanton@sina.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox