From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF4F8C77B60 for ; Fri, 31 Mar 2023 20:05:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1E7F46B0072; Fri, 31 Mar 2023 16:05:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 198266B0074; Fri, 31 Mar 2023 16:05:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 038276B0075; Fri, 31 Mar 2023 16:05:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id E8EF06B0072 for ; Fri, 31 Mar 2023 16:05:34 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id BE102A0E7A for ; Fri, 31 Mar 2023 20:05:34 +0000 (UTC) X-FDA: 80630273388.19.2A246E4 Received: from mail-lf1-f45.google.com (mail-lf1-f45.google.com [209.85.167.45]) by imf22.hostedemail.com (Postfix) with ESMTP id CD73BC0029 for ; Fri, 31 Mar 2023 20:05:31 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=CKiFHrTc; spf=pass (imf22.hostedemail.com: domain of axelrasmussen@google.com designates 209.85.167.45 as permitted sender) smtp.mailfrom=axelrasmussen@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680293132; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=HhMkBqMOvsh1hs5NuN3XdIseTUJUxytQaVYnt3fGACo=; b=JyqBc6P50JBcD57j8AcBOKdeHUcwYXjyfeQthkN1vzaFtbNiKSbYYiEGcbtVDeEG9osDCc W0S9vdcgnH+pvISKGJA8hZE+CE/QLOZFKPW5qXmpH/NXOnAxtnNYuZmF3UXlqkRRwc71LW Vc5rnw7kU5BTb5vKOlNau6Ag3iTvfJE= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=CKiFHrTc; spf=pass (imf22.hostedemail.com: domain of axelrasmussen@google.com designates 209.85.167.45 as permitted sender) smtp.mailfrom=axelrasmussen@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680293132; a=rsa-sha256; cv=none; b=Euc93PVqx52sdyv5BnD+Pu7yTwgE4C63JPJATo95jisLou989MjWjgiIIhRcVnKMv8dWBJ LhxcS+ACD53/xFlKf6f2dTgjGNpQi4hzVXQslq0OH4FI9iHRaPkHen6zwBvUWlyG6Slrpy 1xRncqrIGjXNO9xkIVXyPkrk5jPs00Q= Received: by mail-lf1-f45.google.com with SMTP id h11so23405548lfu.8 for ; Fri, 31 Mar 2023 13:05:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1680293130; x=1682885130; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=HhMkBqMOvsh1hs5NuN3XdIseTUJUxytQaVYnt3fGACo=; b=CKiFHrTcdNs+6RXLHaR0N4wjV3XmkpbLlpU2eTJg6eUxdnNesJSQL5gOC2c6nHDTU9 gxpsA2oZSTp7IGANoD0B5g+0ZzZ9iAnw23wzwBevscy/ZndGTbcRlfzNhh/NaabzubS2 s8TRR/mgxdeedBpBoRBHYqwATp78WJeMQ+L9EH82IBSkHdXQeWi3Ojp9WUBoyaeN1/OI b33WYYdIoa+Fs3vGUDO3Elh/Qgx30ZcW0g8oWIpWezYzFKuKSI3yKlf06k+1r37CTvQW V7Z4zh1PST/cFgqSkP7zVqpvLwW6b06orTGI1AfTA+k0QXQZcUsSvMjxGJwQX8FrlbrV J9mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680293130; x=1682885130; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HhMkBqMOvsh1hs5NuN3XdIseTUJUxytQaVYnt3fGACo=; b=Ird7nvZEYHSF63E/VYGdXP52R6Ll1ch/JUWrCVrInfLCnnN4zFlEds8AV7PqMhOIit 3qpKP5LopEOlqWczCwFnYKOjTbll4kEw4zV3yYkbec0aKo1CtPwwxmHh88Y5PIdu1qRr uG6KwCwpvZZzjJqbneZL9729MViNvAuxc2RCTEREMDif18OW4OoSjRH9WWb9qLnXl9sI VLn4+9vOZNPKbZGSQ3c60gz7qTXCJJraOCLlFCpXBZVymtMpqeZnSZxi7mBjegJXd9CT tl/i6AQh9cY28nCT599H8cyKGXQMiNrJMrDuoBquYXv68+ZC9igFUvvn/E9E2eFv6DVC +b4Q== X-Gm-Message-State: AAQBX9dPB46n1k0tj4sz/V+gCCWCQffDdOimKNcABHw5TWJt1EGdG/i7 9q/SPu/u7zCHO7ijP9wdiDN0nOmrwmsh0Si5+cMNBw== X-Google-Smtp-Source: AKy350ZzxL9raDUswX9rEq9TaPaTi0wrfS1d40b4KfnhPiA1zGs8nwTqdPTUKJa2azKoy8XydInoIfJUvcJs1iaagjg= X-Received: by 2002:a05:6512:23a3:b0:4d8:86c2:75ea with SMTP id c35-20020a05651223a300b004d886c275eamr4951980lfv.3.1680293129654; Fri, 31 Mar 2023 13:05:29 -0700 (PDT) MIME-Version: 1.0 References: <20230330155707.3106228-1-peterx@redhat.com> <20230330155707.3106228-2-peterx@redhat.com> In-Reply-To: From: Axel Rasmussen Date: Fri, 31 Mar 2023 13:04:53 -0700 Message-ID: Subject: Re: [PATCH 01/29] Revert "userfaultfd: don't fail on unrecognized features" To: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Peter Xu , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mike Kravetz , Andrew Morton , Andrea Arcangeli , Mike Rapoport , Nadav Amit , Leonardo Bras Soares Passos , David Hildenbrand , linux-stable Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: m3qo6auxf4fg1djmi1pkcaqdstqwbpsh X-Rspamd-Queue-Id: CD73BC0029 X-HE-Tag: 1680293131-297707 X-HE-Meta: U2FsdGVkX18alzwdXVonoV+nqFpw7i9N4I00Q8axAcBsItfmt8unEtyrpJAtHK0IqJ2BIbQPC19geTJqMKQhMbEXFaT1bbvoJ3+eMyQX+q4sB1CLjmpMsZXeYHP9R7m5iea2Cg5A8UF0JD6KmPbSS8tG0mhk11p4TBLsYFabBbWUSvagkRXGFRwLXvaPdIIEmih7oaxwvcFaDsZQCclAW9fkqpQX5OhflAwgaNmVG9wvnUPtlgvZZcaQkbbyCYNdZOgQQGZ39BQak39Hb1pKOiY5fdtRmP62a2DJtyjOn2tmHjqYpEhShUQDtCvz7w9tpHMXfYEKVizNgVDfDk4rqf58AVblT+yLVybWAOl8fKgWGbwZre2/c3XaB1sKTx6aSCgVIUUUIxWMiUwJfHM2vvsg389M19uY9PULwaEBo4V6yaYixRY/4VY4RZYDaXeE+m1pTNW4M0qmDJ4ypG/w2+RRAeDLhFgoQP1Ag74ORiRDtmOOKJeErXIAnsaiUBL3PE9vab7GXr1lmau3Ld/3olcvtiQ5FvRXGtgNMtrdtaR0nzamsFAKfrR8wAU0wpwqrevZtiE2pysEWz62stLgklHfXvhdI9nY9+iEKKASLBBFAwKll4SxnbMHAyOSgNeoP9fPNu7soMIky8l+mM2EqZTrRLsPicX3hB6uG9O2iFdb7VEuwCIcM2u/1HlFP2jo81qmtODUcSv7BLIxOdlV4GqmCorTWlxptuq0Ry9ozYGz408xYOLyKwqs53bYvt5IYZme3nkyMa7g9fVfm4HG9eQsyp6dYTB4gwHbHKGG+4v6p6K7XyJQ/cLVWg+X1KqNMWa0+DLu4wphMwDMNHd8Nk8W/xTBM3X28v6e2pe8GZY+0bnagxQJstQ0VTMsrrcORStLZdSC/mR2Pyc57agHKTmRDkx7x1YLRPoonAN/AqEnhcFhSFUb0eK0sEodvu2N/g6Y6FFI8bQtBRdH5jy 8qEdv+eZ 81hElL9K6/22CLM6cVCLqSYrnTNHM96JZ/snCZFk4Ej6dEweKyVk0mL6NIde2FeMP6Bt7Qchcfxk4AgqUGtUCxT4M5Lb/OKYkP1zp9CDsw7mXVngEGxQ3yag/N9ZVum6D+lAaSSzvJAW94wDvCcPtmJGhPIvtPnPsYgoBpWIE8wj8eNyj71kfblVVh77D5zcMgO9bbE7G0zKR7ajsCYFw2EBaNftR2INbx3kv6URmuXYqZ9sjHWGtHMh7eqJfxzLbEASLOjc4ACHS3MlXgPLjC/Yc+wzytl545nOQzFo525a7yFI7cIlPRAZl0g7N8hvad59+k++dVjNhwaECZ7scGOTKfUfAjIkkF79nyfkVzIb7xCdVySRhG3sf6zkMhX5mL88a5gg7eY75eNYSiTojtLupFZEAV/9diTJuZKQ1hdtNOegyK2EH/3bn1ZLJG+iZ3wC2oqH32400WErStcnIV4eAQ7kxYjAdbPnqzB3pU/O4Vo63LAUXuE+OgdyVgH8S3qmJwmcSUuSbkaOsTB091/8oWqG+KH5N6EdTQLjoenaxWisSs61sbRfnQkNLl8D4rLHywUFDpmtvf4raqw6Eln5nnov7niXBEpBys7HN+7lUqO8BKMw+sTqY6UXsM7DtxEeB X-Bogosity: Ham, tests=bogofilter, spamicity=0.005609, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Mar 31, 2023 at 11:08=E2=80=AFAM Dmitry Safonov <0x7f454c46@gmail.c= om> wrote: > > On Fri, 31 Mar 2023 at 17:52, Axel Rasmussen w= rote: > > > > On Thu, Mar 30, 2023 at 3:27=E2=80=AFPM Peter Xu wr= ote: > > > > > > On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote: > > > > On Thu, Mar 30, 2023 at 8:57=E2=80=AFAM Peter Xu wrote: > > > > > > > > > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > > > > > > > I found this when writting a simple UFFDIO_API test to be the fir= st unit > > > > > test in this set. Two things breaks with the commit: > > > > > > > > > > - UFFDIO_API check was lost and missing. According to man page= , the > > > > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api !=3D 0= xaa. This > > > > > check is needed if the api version will be extended in the futu= re, or > > > > > user app won't be able to identify which is a new kernel. > > > > > > > > > > - Feature flags checks were removed, which means UFFDIO_API wit= h a > > > > > feature that does not exist will also succeed. According to th= e man > > > > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API= ) if > > > > > unknown features passed in. > > If features/flags are not checked in kernel, and the kernel doesn't > return an error on > an unknown flag/error, that makes the syscall non-extendable, meaning > that adding > any new feature may break existing software, which doesn't sanitize > them properly. > https://lwn.net/Articles/588444/ I don't think the same problem applies here. In the case of syscalls, the problem is the only way the kernel can communicate is by the EINVAL return value. Without the check, if a call succeeds the caller can't tell: was the flag supported + applied, or unrecognized + ignored? With UFFDIO_API (we aren't talking about userfaultfd(2) itself), when you pass in a set of flags, we return the subset of flags which were enabled, in addition to the return code. So via that mechanism, one is "able to check whether it is running on a kernel where [userfaultfd] supports [the feature]" as the article describes - the only difference is, the caller must check the returned set of features, instead of checking for an error code. I don't think it's exactly *how* userspace can check that's important, but rather *that* it can check. Another important difference: I have a hard time imagining a case where adding a new feature could break userspace, even with my approach, but let's say for the sake of argument one arises in the future. Unlike normal syscalls, we have the UFFD_API version check, so we have the option of incrementing that to separate users relying on the old behavior, from users willing to deal with the new behavior. (Syscalls can kind of replicate this by adding a new syscall, like clone() vs clone2(), but I think that's messier than the API version check being built-in to the API.) > > See a bunch of painful exercises from syscalls with numbers in the end: > https://lwn.net/Articles/792628/ > To adding an additional setsockopt() because an old one didn't have > sanity checks for flags: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit= /?id=3D8917a777be3b > (not the best example, as the new setsockopt() didn't check flags for > sanity as well (sic!), > but that's near the code I work on now) > > This is even documented nowadays: > https://www.kernel.org/doc/html/latest/process/adding-syscalls.html#desig= ning-the-api-planning-for-extension > > ...and everyone knows what happens when you blame userspace for breaking = by > not doing what you would have expected it to do: > https://lkml.org/lkml/2012/12/23/75 100% agreed. :) > > [..] > > > There's one reason that we may consider keeping the behavior. IMHO i= t is > > > when there're major softwares that uses the "wrong" ABI (let's say so= ; > > > because it's not following the man pages). If you're aware any such = major > > > softwares (especially open sourced) will break due to this revert pat= ch, > > > please shoot. > > > > Well, I did find one example, criu: > > https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L2= 66 > > Mike can speak better than me about uffd, but AFAICS, CRIU correctly dete= cts > features with kerneldat/kdat: > https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/kerndat.c#L= 1235 Ah, right, this is the simplest case where no optional features are asked for. So, it's not a great example; this particular case would look the same regardless of what the kernel does. > > So, doing a sane thing in kernel shouldn't break CRIU (at least here). > > Thanks, > Dmitry