Date: Mon, 14 Dec 2020 19:13:54 -0800
From: Andrew Morton
To: aarcange@redhat.com, akpm@linux-foundation.org, bigeasy@linutronix.de,
 calin@google.com, corbet@lwn.net, dancol@dancol.org, dancol@google.com,
 ebiggers@kernel.org, hannes@cmpxchg.org, jeffv@google.com,
 jglisse@redhat.com, joel@joelfernandes.org, kaleshsingh@google.com,
 keescook@chromium.org, linux-mm@kvack.org, lokeshgidra@google.com,
 mcgrof@kernel.org, mchehab+huawei@kernel.org, mgorman@techsingularity.net,
 mm-commits@vger.kernel.org, nigupta@nvidia.com, peterx@redhat.com,
 rppt@linux.vnet.ibm.com, shli@fb.com, stephen.smalley.work@gmail.com,
 surenb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz,
 viro@zeniv.linux.org.uk, yzaikin@google.com
Subject: [patch 180/200] userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob
Message-ID: <20201215031354.gUsHJUpKo%akpm@linux-foundation.org>
In-Reply-To: <20201214190237.a17b70ae14f129e2dca3d204@linux-foundation.org>

From: Lokesh Gidra
Subject: userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob

With this change, when the knob is set to 0, it allows unprivileged users
to call userfaultfd, like when it is set to 1, but with the
restriction that only page faults from user mode can be handled.  In this
mode, an unprivileged user (without the CAP_SYS_PTRACE capability) must
pass UFFD_USER_MODE_ONLY to userfaultfd or the call will fail with EPERM.

This enables administrators to reduce the likelihood that an attacker with
access to userfaultfd can delay faulting kernel code to widen timing
windows for other exploits.

The default value of this knob is changed to 0.  This is required for
correct functioning of pipe mutex.  However, it causes postcopy live
migration to fail, although the failure is unnoticeable to the VM guests.
To avoid this, set 'vm.unprivileged_userfaultfd = 1' in /etc/sysctl.conf.

The main reason this change is desirable in the short term is that the
Android userland will behave as with the sysctl set to zero.  So without
this commit, any Linux binary using userfaultfd to manage its memory would
behave differently if run within the Android userland.  For more details,
refer to Andrea's reply [1].

[1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/

Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.com
Signed-off-by: Lokesh Gidra
Reviewed-by: Andrea Arcangeli
Cc: Kees Cook
Cc: Jonathan Corbet
Cc: Peter Xu
Cc: Sebastian Andrzej Siewior
Cc: Alexander Viro
Cc: Stephen Smalley
Cc: Eric Biggers
Cc: Daniel Colascione
Cc: "Joel Fernandes (Google)"
Cc: Kalesh Singh
Cc: Suren Baghdasaryan
Cc: Jeff Vander Stoep
Cc:
Cc: Mike Rapoport
Cc: Shaohua Li
Cc: Jerome Glisse
Cc: Mauro Carvalho Chehab
Cc: Johannes Weiner
Cc: Mel Gorman
Cc: Nitin Gupta
Cc: Vlastimil Babka
Cc: Iurii Zaikin
Cc: Luis Chamberlain
Cc: Daniel Colascione
Signed-off-by: Andrew Morton
---

 Documentation/admin-guide/sysctl/vm.rst |   15 ++++++++++-----
 fs/userfaultfd.c                        |   10 ++++++++--
 2 files changed, 18 insertions(+), 7 deletions(-)

--- a/Documentation/admin-guide/sysctl/vm.rst~add-user-mode-only-option-to-unprivileged_userfaultfd-sysctl-knob
+++ a/Documentation/admin-guide/sysctl/vm.rst
@@ -873,12 +873,17 @@ file-backed
pages is less than the high
 unprivileged_userfaultfd
 ========================
 
-This flag controls whether unprivileged users can use the userfaultfd
-system calls. Set this to 1 to allow unprivileged users to use the
-userfaultfd system calls, or set this to 0 to restrict userfaultfd to only
-privileged users (with SYS_CAP_PTRACE capability).
+This flag controls the mode in which unprivileged users can use the
+userfaultfd system calls. Set this to 0 to restrict unprivileged users
+to handle page faults in user mode only. In this case, users without
+SYS_CAP_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd to
+succeed. Prohibiting use of userfaultfd for handling faults from kernel
+mode may make certain vulnerabilities more difficult to exploit.
 
-The default value is 1.
+Set this to 1 to allow unprivileged users to use the userfaultfd system
+calls without any restrictions.
+
+The default value is 0.
 
 
 user_reserve_kbytes
--- a/fs/userfaultfd.c~add-user-mode-only-option-to-unprivileged_userfaultfd-sysctl-knob
+++ a/fs/userfaultfd.c
@@ -28,7 +28,7 @@
 #include
 #include
 
-int sysctl_unprivileged_userfaultfd __read_mostly = 1;
+int sysctl_unprivileged_userfaultfd __read_mostly;
 
 static struct kmem_cache *userfaultfd_ctx_cachep __read_mostly;
 
@@ -1966,8 +1966,14 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
 	struct userfaultfd_ctx *ctx;
 	int fd;
 
-	if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
+	if (!sysctl_unprivileged_userfaultfd &&
+	    (flags & UFFD_USER_MODE_ONLY) == 0 &&
+	    !capable(CAP_SYS_PTRACE)) {
+		printk_once(KERN_WARNING "uffd: Set unprivileged_userfaultfd "
+			"sysctl knob to 1 if kernel faults must be handled "
+			"without obtaining CAP_SYS_PTRACE capability\n");
 		return -EPERM;
+	}
 
 	BUG_ON(!current->mm);
 
_
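
For readers who want to reason about the new check without booting a
kernel, the condition added to SYSCALL_DEFINE1(userfaultfd, ...) can be
modeled as a pure userspace function.  This is only a sketch under stated
assumptions: the name uffd_permission_check() is hypothetical, the boolean
argument stands in for capable(CAP_SYS_PTRACE), and UFFD_USER_MODE_ONLY is
defined locally to the value 1 that the uapi header is assumed to use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Assumption: mirrors the flag value in include/uapi/linux/userfaultfd.h.
 * Defined locally so the sketch builds without recent kernel headers.
 */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/*
 * Hypothetical userspace model of the permission check in this patch.
 *
 * sysctl_unprivileged: value of the unprivileged_userfaultfd knob (0 or 1)
 * flags:               flags the caller would pass to userfaultfd()
 * has_cap_sys_ptrace:  stand-in for capable(CAP_SYS_PTRACE)
 *
 * Returns 0 if the call would pass the check, -EPERM otherwise.
 */
static int uffd_permission_check(int sysctl_unprivileged, int flags,
				 bool has_cap_sys_ptrace)
{
	/* Refuse only when all three conditions hold at once. */
	if (!sysctl_unprivileged &&
	    (flags & UFFD_USER_MODE_ONLY) == 0 &&
	    !has_cap_sys_ptrace)
		return -EPERM;

	return 0;
}
```

With the knob at its new default of 0, an unprivileged caller that omits
UFFD_USER_MODE_ONLY is the only combination the model rejects; passing the
flag, holding the capability, or setting the knob to 1 each lets the call
through.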