From: Axel Rasmussen
Date: Mon, 13 Jun 2022 15:38:02 -0700
Subject: Re: [PATCH v3 2/6] userfaultfd: add /dev/userfaultfd for fine grained access control
To: Peter Xu
Cc: Andrew Morton, Alexander Viro, Charan Teja Reddy, Dave Hansen,
 "Dmitry V. Levin", Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara,
 Jonathan Corbet, Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit,
 Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi,
 linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, LKML,
 Linux MM, Linuxkselftest
References: <20220601210951.3916598-1-axelrasmussen@google.com>
 <20220601210951.3916598-3-axelrasmussen@google.com>
 <20220613145540.1c9f7750092911bae1332b92@linux-foundation.org>

On Mon, Jun 13, 2022 at 3:29 PM Peter Xu wrote:
>
> On Mon, Jun 13, 2022 at 02:55:40PM -0700, Andrew Morton wrote:
> > On Wed, 1 Jun 2022 14:09:47 -0700 Axel Rasmussen wrote:
> >
> > > To achieve this, add a /dev/userfaultfd misc device. This device
> > > provides an alternative to the userfaultfd(2) syscall for the creation
> > > of new userfaultfds. The idea is, any userfaultfds created this way will
> > > be able to handle kernel faults, without the caller having any special
> > > capabilities. Access to this mechanism is instead restricted using e.g.
> > > standard filesystem permissions.
> >
> > The use of a /dev node isn't pretty. Why can't this be done by
> > tweaking sys_userfaultfd() or by adding a sys_userfaultfd2()?

I think for any approach involving syscalls, we need to be able to
control who can call the syscall. Maybe there's another way I'm not
aware of, but I think today the only mechanism to do this is
capabilities. I proposed adding a CAP_USERFAULTFD for this purpose, but
that approach was rejected [1]. So, I'm not sure of another way besides
using a device node.

One thing that could potentially make this cleaner is, as one LWN
commenter pointed out, we could have open() on /dev/userfaultfd just
return a new userfaultfd directly, instead of this multi-step process
of open /dev/userfaultfd, NEW ioctl, then you get a userfaultfd. When I
wrote this originally it wasn't clear to me how to get that to happen -
open() doesn't directly return the result of our custom open function
pointer, as far as I can tell - but it could be investigated.

[1]: https://lore.kernel.org/lkml/686276b9-4530-2045-6bd8-170e5943abe4@schaufler-ca.com/T/

> >
> > Peter, will you be completing review of this patchset?
>
> Sorry to not have reviewed it proactively..
>
> I think it's because I never had a good picture/understanding of what
> should be the best security model for uffd, meanwhile I am (it seems) just
> seeing more and more ways to "provide a safer uffd" by different people
> using different ways.. and I never had time (and probably capability too..)
> to figure out the correct approach if not to accept all options provided.

Agreed, what we have right now is a bit of a mess of different
approaches. I think the reason for this is, there is no "perfect" way
to control access to features like this, so what we now have is several
different approaches with different tradeoffs.

From my perspective, the existing controls were simpler to implement,
but are not ideal because they require us to grant access to UFFD *plus
more stuff too*. The approach I've proposed is the most granular, so it
doesn't require adding any extra permissions. But, I agree the
interface is sort of overcomplicated. :/

But, from my perspective, security in shared cloud computing
environments where UFFD is used for live migration is critical, so I
prefer this tradeoff - I'll put up with a slightly messier interface,
if the gain is a very minimal set of privileges.

>
> I think I'll just assume the whole thing is acked already from you
> generally, then I'll read at least the implementation before the end of
> tomorrow.
>
> Thanks,
>
> --
> Peter Xu
>
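
For reference, the multi-step flow discussed in this message (open
/dev/userfaultfd, issue the NEW ioctl, get back a userfaultfd) could
look roughly like the sketch below from userspace. This is only an
illustration under stated assumptions: the ioctl number is assumed to
be _IO(0xAA, 0x00) as encoded on x86/arm64, the flags are assumed to be
passed as the ioctl argument, and the helper name new_userfaultfd() is
made up for this example. Check the uapi header of the kernel you are
running before relying on any of it.

```python
import fcntl
import os

# Assumption: ioctl "type" byte and request number for the NEW ioctl.
# The encoding below matches _IO(type, nr) on architectures where
# _IOC_NONE is 0 (e.g. x86, arm64); other architectures encode it differently.
USERFAULTFD_IOC = 0xAA
USERFAULTFD_IOC_NEW = (USERFAULTFD_IOC << 8) | 0x00


def new_userfaultfd(dev_path="/dev/userfaultfd",
                    flags=os.O_CLOEXEC | os.O_NONBLOCK):
    """Hypothetical helper: return a new userfaultfd, or None on failure.

    Access is governed by the permissions on dev_path, so a caller
    without read/write access to the node simply gets None here.
    """
    try:
        # Step 1: open the device node (ordinary filesystem permission check).
        dev_fd = os.open(dev_path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None
    try:
        # Step 2: the NEW ioctl; its return value is the new userfaultfd.
        return fcntl.ioctl(dev_fd, USERFAULTFD_IOC_NEW, flags)
    except OSError:
        return None
    finally:
        # The device fd is only needed to create the userfaultfd.
        os.close(dev_fd)
```

A caller would treat None as "no access via the device node" and
decide whether to fall back to the userfaultfd(2) syscall. The returned
descriptor still needs the usual UFFDIO_API handshake before it can be
used to register memory ranges.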