From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 5 Nov 2019 11:24:24 -0500
From: Andrea Arcangeli
To: Andy Lutomirski
Cc: Daniel Colascione, Mike Rapoport, linux-kernel, Andrew Morton,
	Jann Horn, Linus Torvalds, Lokesh Gidra, Nick Kralevich,
	Nosh Minwalla, Pavel Emelyanov, Tim Murray, Linux API, linux-mm
Subject: Re: [PATCH 1/1] userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
Message-ID: <20191105162424.GH30717@redhat.com>
References: <1572967777-8812-1-git-send-email-rppt@linux.ibm.com>
	<1572967777-8812-2-git-send-email-rppt@linux.ibm.com>

On Tue, Nov 05, 2019 at 08:00:26AM -0800, Andy Lutomirski wrote:
> On Tue, Nov 5, 2019 at 7:55 AM Daniel Colascione wrote:
> >
> > On Tue, Nov 5, 2019 at 7:29 AM Mike Rapoport wrote:
> > >
> > > Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
> > > descriptor table from the read() implementation of uffd, which may have
> > > security implications for unprivileged use of the userfaultfd.
> > >
> > > Limit availability of UFFD_FEATURE_EVENT_FORK only for callers that have
> > > CAP_SYS_PTRACE.
> >
> > Thanks. But shouldn't we be doing the capability check at
> > userfaultfd(2) time (when we do the other permission checks), not
> > later, in the API ioctl?
>
> The ioctl seems reasonable to me. In particular, if there is anyone
> who creates a userfaultfd as root and then drop permissions, a later
> ioctl could unexpectedly enable FORK.
>
> This assumes that the code in question is only reachable through
> ioctl() and not write().

write isn't implemented. Until UFFDIO_API runs, all other implemented
syscalls are disabled (i.e. all other ioctls, poll and read). You can
quickly verify all three blocks by searching for UFFD_STATE_WAIT_API.

UFFDIO_API is the place where the handshake with userland happens:
userland asks for certain features and the kernel implementation
answers yes or no.

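To make the handshake concrete, here is a minimal sketch of the userland
side, assuming a Linux system with <linux/userfaultfd.h> installed. The
helper name uffd_handshake() and its out-parameters are hypothetical and
exist only for illustration; the syscall, the ioctl and struct uffdio_api
are the real uapi.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: create a userfaultfd and run the UFFDIO_API
 * handshake, asking the kernel for `features`.
 *
 * On success returns the fd, stores the feature bits reported by the
 * kernel in *granted and sets *err to 0. On failure returns -1 and
 * stores the errno of the failing step in *err.
 */
static int uffd_handshake(uint64_t features, uint64_t *granted, int *err)
{
	struct uffdio_api api;
	int fd;

	assert(granted != NULL && err != NULL);

	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0) {
		*err = errno;
		return -1;
	}

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = features;	/* what userland asks for */

	if (ioctl(fd, UFFDIO_API, &api)) {
		*err = errno;		/* save before close() can clobber it */
		close(fd);
		return -1;
	}

	*granted = api.features;	/* what the kernel answered */
	*err = 0;
	return fd;
}
```

A caller could pass UFFD_FEATURE_EVENT_FORK as `features` and, if the
handshake fails, look at *err and retry with fewer features on a fresh
descriptor.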
Normally we would only ever return -EINVAL for a request of a feature
that isn't available in the running kernel (equivalent to -ENOSYS if
the syscall is entirely missing on an even older kernel). -EPERM is
more informative, as it tells userland the feature is actually in the
kernel but requires more permissions.

We could have returned -EINVAL too, but it wouldn't have made a
difference to non-privileged CRIU and we're not aware of other users
that could benefit from -EINVAL instead of -EPERM.

This is the relevant CRIU userland:

	if (ioctl(uffd, UFFDIO_API, &uffdio_api)) {
		pr_perror("Failed to get uffd API");
		goto err;
	}

Unfortunately this is an ABI break, but it is preferable to the clean
removal of the feature, because at least it isn't going to break CRIU
deployments running with the PTRACE privilege. The clean removal,
while not ABI-breaking, would have prevented all CRIU users from
running after a kernel upgrade.

The long term plan is to introduce a UFFD_FEATURE_EVENT_FORK2 feature
flag that uses the ioctl to receive the child uffd. It'll consume more
CPU, but it wouldn't require the PTRACE privilege anymore.

Overall, any suid or SCM_RIGHTS fd-receiving app that isn't checking
the retval of open/socket or whatever fd "installing" syscall is not
robust and is prone to break over time, as more people edit the code
or as any library call internally changes behavior. So if there's any
practical issue caused by this, it should be fixed in userland too for
higher robustness. If you stick your userland to std::fs and std::net,
robustness against issues like this is enforced by the language.

Thanks,
Andrea