Date: Mon, 7 Sep 2020 16:55:54 +0200
From: Christian Brauner
To: Adalbert Lazăr
Cc: linux-mm@kvack.org, linux-api@vger.kernel.org, Andrew Morton, Alexander Graf, Stefan Hajnoczi, Jerome Glisse, Paolo Bonzini, Mihai Donțu, Mircea Cirjaliu, Andy Lutomirski, Arnd Bergmann, Sargun Dhillon, Aleksa Sarai, Oleg Nesterov, Jann Horn, Kees Cook, Matthew Wilcox, Christian Brauner
Subject: Re: [RESEND RFC PATCH 5/5] pidfd_mem: implemented remote memory mapping system call
Message-ID: <20200907145554.5yxgrdi6b3gkmt5t@wittgenstein>
References: <20200904113116.20648-1-alazar@bitdefender.com> <20200904113116.20648-6-alazar@bitdefender.com>
In-Reply-To: <20200904113116.20648-6-alazar@bitdefender.com>

On Fri, Sep 04, 2020 at 02:31:16PM +0300, Adalbert Lazăr wrote:
> From: Mircea Cirjaliu
>
> This system call returns 2 fds for inspecting the address space of a
> remote process: one for control and one for access. Use according to
> remote mapping specifications.
>
> Cc: Christian Brauner
> Signed-off-by: Mircea Cirjaliu
> Signed-off-by: Adalbert Lazăr
> ---
>  arch/x86/entry/syscalls/syscall_32.tbl |  1 +
>  arch/x86/entry/syscalls/syscall_64.tbl |  1 +
>  include/linux/pid.h                    |  1 +
>  include/linux/syscalls.h               |  1 +
>  include/uapi/asm-generic/unistd.h      |  2 +
>  kernel/exit.c                          |  2 +-
>  kernel/pid.c                           | 55 ++++++++++++++++++++++++++
>  7 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 54581ac671b4..ca1b5a32dbc5 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -440,5 +440,6 @@
>  433 i386 fspick sys_fspick
>  434 i386 pidfd_open sys_pidfd_open
>  435 i386 clone3 sys_clone3
> +436 i386 pidfd_mem sys_pidfd_mem
>  437 i386 openat2 sys_openat2
>  438 i386 pidfd_getfd sys_pidfd_getfd
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 37b844f839bc..6138d3d023f8 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -357,6 +357,7 @@
>  433 common fspick sys_fspick
>  434 common pidfd_open sys_pidfd_open
>  435 common clone3 sys_clone3
> +436 common pidfd_mem sys_pidfd_mem
>  437 common openat2 sys_openat2
>  438 common pidfd_getfd sys_pidfd_getfd
>
> diff --git a/include/linux/pid.h b/include/linux/pid.h
> index cc896f0fc4e3..9ec23ab23fd4 100644
> --- a/include/linux/pid.h
> +++ b/include/linux/pid.h
> @@ -76,6 +76,7 @@ extern const struct file_operations pidfd_fops;
>
>  struct file;
>
> +extern struct pid *pidfd_get_pid(unsigned int fd);
>  extern struct pid *pidfd_pid(const struct file *file);
>
>  static inline struct pid *get_pid(struct pid *pid)
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 1815065d52f3..621f3d52ed4e 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -934,6 +934,7 @@ asmlinkage long sys_clock_adjtime32(clockid_t which_clock,
>  asmlinkage long sys_syncfs(int fd);
>  asmlinkage long sys_setns(int fd, int nstype);
>  asmlinkage long sys_pidfd_open(pid_t pid, unsigned int flags);
> +asmlinkage long sys_pidfd_mem(int pidfd, int __user *fds, unsigned int flags);
>  asmlinkage long sys_sendmmsg(int fd, struct mmsghdr __user *msg,
>  			     unsigned int vlen, unsigned flags);
>  asmlinkage long sys_process_vm_readv(pid_t pid,
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 3a3201e4618e..2663afc03c86 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -850,6 +850,8 @@ __SYSCALL(__NR_pidfd_open, sys_pidfd_open)
>  #define __NR_clone3 435
>  __SYSCALL(__NR_clone3, sys_clone3)
>  #endif
> +#define __NR_pidfd_mem 436
> +__SYSCALL(__NR_pidfd_mem, sys_pidfd_mem)
>
>  #define __NR_openat2 437
>  __SYSCALL(__NR_openat2, sys_openat2)
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 389a88cb3081..37cd8949e606 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1464,7 +1464,7 @@ static long do_wait(struct wait_opts *wo)
>  	return retval;
>  }
>
> -static struct pid *pidfd_get_pid(unsigned int fd)
> +struct pid *pidfd_get_pid(unsigned int fd)
>  {
>  	struct fd f;
>  	struct pid *pid;
> diff --git a/kernel/pid.c b/kernel/pid.c
> index c835b844aca7..c9c49edf4a8a 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -42,6 +42,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  struct pid init_struct_pid = {
>  	.count = REFCOUNT_INIT(1),
> @@ -565,6 +566,60 @@ SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
>  	return fd;
>  }
>
> +/**
> + * pidfd_mem() - Allow access to process address space.
> + *
> + * @pidfd: pid file descriptor for the target process
> + * @fds:   array where the control and access file descriptors are returned
> + * @flags: flags to pass
> + *
> + * This creates a pair of file descriptors used to gain access to the
> + * target process memory. The control fd is used to establish a linear
> + * mapping between an offset range and a userspace address range.
> + * The access fd is used to mmap(offset range) on the client side.
> + *
> + * Return: On success, 0 is returned.
> + *         On error, a negative errno number will be returned.
> + */
> +SYSCALL_DEFINE3(pidfd_mem, int, pidfd, int __user *, fds, unsigned int, flags)
> +{
> +	struct pid *pid;
> +	struct task_struct *task;
> +	int ret_fds[2];
> +	int ret;
> +
> +	if (pidfd < 0)
> +		return -EINVAL;

This needs to be EBADF.

> +	if (!fds)
> +		return -EINVAL;

If this API holds up, I think, similar to what Florian suggested, a
struct would be nicer for userspace. Something like:

	struct rmemfds {
		s32 memfd1;
		s32 memfd2;
	};

and passing it a size

	pidfd_mem(int pidfd, struct rmemfds *fds, size_t size, unsigned int flags)

and then copy_struct_from_user() will take care to do the right thing
for you.

> +	if (flags)
> +		return -EINVAL;
> +
> +	pid = pidfd_get_pid(pidfd);
> +	if (IS_ERR(pid))
> +		return PTR_ERR(pid);
> +
> +	task = get_pid_task(pid, PIDTYPE_PID);
> +	put_pid(pid);
> +	if (IS_ERR(task))
> +		return PTR_ERR(task);
> +
> +	ret = -EPERM;
> +	if (unlikely(task == current) || capable(CAP_SYS_PTRACE))
> +		ret = task_remote_map(task, ret_fds);
> +	put_task_struct(task);
> +	if (IS_ERR_VALUE((long)ret))
> +		return ret;
> +
> +	if (copy_to_user(fds, ret_fds, sizeof(ret_fds))) {
> +		put_unused_fd(ret_fds[0]);
> +		put_unused_fd(ret_fds[1]);
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
>  void __init pid_idr_init(void)
>  {
>  	/* Verify no one has done anything silly: */
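
To make the struct-plus-size suggestion in the review concrete, here is a rough userspace model of the size rule that copy_struct_from_user() applies. It is a sketch, not kernel code: model_copy_struct() is a hypothetical name, int32_t stands in for s32, and the struct layout just follows the rmemfds example from the mail. In the proposed syscall the fds travel from the kernel to userspace, so this only illustrates the size-versioning convention, not the final copy direction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, following the struct sketched in the review. */
struct rmemfds {
	int32_t memfd1;
	int32_t memfd2;
};

static_assert(sizeof(struct rmemfds) == 2 * sizeof(int32_t),
	      "struct rmemfds is expected to have no padding");

/*
 * Userspace model of the copy_struct_from_user() size rule:
 *  - usize <= ksize: copy usize bytes and zero-fill the rest of dst
 *    (an older caller using a smaller struct still works);
 *  - usize >  ksize: accept only if every byte past ksize is zero,
 *    otherwise fail with -E2BIG (a newer caller is using a field
 *    this side does not know about).
 */
static int model_copy_struct(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	const unsigned char *bytes = src;
	size_t size = usize < ksize ? usize : ksize;
	size_t i;

	/* Reject unknown trailing fields that are actually in use. */
	for (i = ksize; i < usize; i++)
		if (bytes[i])
			return -E2BIG;

	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}
```

With this rule a caller built against a shorter struct gets the missing fields zeroed, while a caller built against a longer struct only succeeds if the extra fields are left at zero.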