From: Lokesh Gidra
Date: Thu, 19 Nov 2020 19:10:20 -0800
Subject: Re: [PATCH v6 2/2] Add user-mode only option to unprivileged_userfaultfd sysctl knob
To: Kees Cook, Jonathan Corbet, Peter Xu, Andrea Arcangeli, Sebastian Andrzej Siewior, Andrew Morton
Cc: Alexander Viro, Stephen Smalley, Eric Biggers, Daniel Colascione, "Joel Fernandes (Google)", Linux FS Devel, linux-kernel, linux-doc@vger.kernel.org, Kalesh Singh, Calin Juravle, Suren Baghdasaryan, Jeffrey Vander Stoep, "Cc: Android Kernel", Mike Rapoport, Shaohua Li, Jerome Glisse, Mauro Carvalho Chehab, Johannes Weiner, Mel Gorman, Nitin Gupta, Vlastimil Babka, Iurii Zaikin, Luis Chamberlain, "open list:MEMORY MANAGEMENT"
References: <20201120030411.2690816-1-lokeshgidra@google.com> <20201120030411.2690816-3-lokeshgidra@google.com>
In-Reply-To: <20201120030411.2690816-3-lokeshgidra@google.com>

On Thu, Nov 19, 2020 at 7:04 PM Lokesh Gidra wrote:
>
> With this change, when the knob is set to 0, it allows unprivileged
> users to call userfaultfd, like when it is set to 1, but with the
> restriction that only page faults from user mode can be handled.
> In this mode, an unprivileged user (without the CAP_SYS_PTRACE capability)
> must pass UFFD_USER_MODE_ONLY to userfaultfd or the API will fail with
> EPERM.
>
> This enables administrators to reduce the likelihood that an attacker
> with access to userfaultfd can delay faulting kernel code to widen
> timing windows for other exploits.
>
> The default value of this knob is changed to 0. This is required for
> correct functioning of pipe mutex. However, this will fail postcopy
> live migration, which will be unnoticeable to the VM guests.
> To avoid this, set 'vm.unprivileged_userfaultfd = 1' in /etc/sysctl.conf.
>
> The main reason this change is desirable in the short term is that
> the Android userland will behave as with the sysctl set to zero. So
> without this commit, any Linux binary using userfaultfd to manage its
> memory would behave differently if run within the Android userland.
> For more details, refer to Andrea's reply [1].
>
> [1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/
>
> Signed-off-by: Lokesh Gidra
> Reviewed-by: Andrea Arcangeli
> ---
>  Documentation/admin-guide/sysctl/vm.rst | 15 ++++++++++-----
>  fs/userfaultfd.c                        | 10 ++++++++--
>  2 files changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index f455fa00c00f..d06a98b2a4e7 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -873,12 +873,17 @@ file-backed pages is less than the high watermark in a zone.
>  unprivileged_userfaultfd
>  ========================
>
> -This flag controls whether unprivileged users can use the userfaultfd
> -system calls. Set this to 1 to allow unprivileged users to use the
> -userfaultfd system calls, or set this to 0 to restrict userfaultfd to only
> -privileged users (with SYS_CAP_PTRACE capability).
> +This flag controls the mode in which unprivileged users can use the
> +userfaultfd system calls. Set this to 0 to restrict unprivileged users
> +to handle page faults in user mode only. In this case, users without
> +SYS_CAP_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd to
> +succeed. Prohibiting use of userfaultfd for handling faults from kernel
> +mode may make certain vulnerabilities more difficult to exploit.
>
> -The default value is 1.
> +Set this to 1 to allow unprivileged users to use the userfaultfd system
> +calls without any restrictions.
> +
> +The default value is 0.
>
>
>  user_reserve_kbytes
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 605599fde015..894cc28142e7 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -28,7 +28,7 @@
>  #include
>  #include
>
> -int sysctl_unprivileged_userfaultfd __read_mostly = 1;
> +int sysctl_unprivileged_userfaultfd __read_mostly;
>
>  static struct kmem_cache *userfaultfd_ctx_cachep __read_mostly;
>
> @@ -1966,8 +1966,14 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
>  	struct userfaultfd_ctx *ctx;
>  	int fd;
>
> -	if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
> +	if (!sysctl_unprivileged_userfaultfd &&
> +	    (flags & UFFD_USER_MODE_ONLY) == 0 &&
> +	    !capable(CAP_SYS_PTRACE)) {
> +		printk_once(KERN_WARNING "uffd: Set unprivileged_userfaultfd "
> +			"sysctl knob to 1 if kernel faults must be handled "
> +			"without obtaining CAP_SYS_PTRACE capability\n");
>  		return -EPERM;
> +	}
>
>  	BUG_ON(!current->mm);
>
> --
> 2.29.0.rc1.297.gfa9743e501-goog
>

Adding linux-mm@kvack.org list
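
For anyone on linux-mm reading this without the rest of the thread, the permission check in the fs/userfaultfd.c hunk can be restated as a small decision table. Below is a userspace sketch of that logic only, not kernel code: model_uffd_permission(), model_uffd_examples() and MODEL_UFFD_USER_MODE_ONLY are hypothetical names made up for illustration, and the flag's bit value is picked for the model rather than taken from the uapi header.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for UFFD_USER_MODE_ONLY; the bit value is chosen for this model only. */
#define MODEL_UFFD_USER_MODE_ONLY 0x1

/*
 * Hypothetical userspace model of the check in the patched
 * SYSCALL_DEFINE1(userfaultfd).
 *
 * sysctl_value: value of vm.unprivileged_userfaultfd (0 or 1)
 * flags:        flags the caller passes to userfaultfd()
 * has_ptrace:   whether the caller holds CAP_SYS_PTRACE
 *
 * Returns 0 if the call would get past the check, -EPERM otherwise.
 */
int model_uffd_permission(int sysctl_value, int flags, bool has_ptrace)
{
	if (!sysctl_value &&
	    (flags & MODEL_UFFD_USER_MODE_ONLY) == 0 &&
	    !has_ptrace)
		return -EPERM;
	return 0;
}

/* The four cases the commit message describes. */
void model_uffd_examples(void)
{
	/* Knob 0, unprivileged, no user-mode-only flag: rejected. */
	assert(model_uffd_permission(0, 0, false) == -EPERM);
	/* Knob 0, unprivileged, user-mode-only flag passed: allowed. */
	assert(model_uffd_permission(0, MODEL_UFFD_USER_MODE_ONLY, false) == 0);
	/* Knob 0, caller holds CAP_SYS_PTRACE: allowed without the flag. */
	assert(model_uffd_permission(0, 0, true) == 0);
	/* Knob 1: unprivileged callers are not restricted. */
	assert(model_uffd_permission(1, 0, false) == 0);
}
```

The only rejected combination is knob 0, no CAP_SYS_PTRACE, and no UFFD_USER_MODE_ONLY, which is also the one case where the patch prints the printk_once() warning.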