From: Suren Baghdasaryan
Date: Thu, 28 Sep 2023 12:47:21 -0700
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI
To: Peter Xu
Cc: David Hildenbrand, Jann Horn, akpm@linux-foundation.org,
 viro@zeniv.linux.org.uk, brauner@kernel.org, shuah@kernel.org,
 aarcange@redhat.com, lokeshgidra@google.com, hughd@google.com,
 mhocko@suse.com, axelrasmussen@google.com, rppt@kernel.org,
 willy@infradead.org, Liam.Howlett@oracle.com, zhangpeng362@huawei.com,
 bgeffon@google.com, kaleshsingh@google.com, ngeoffray@google.com,
 jdduke@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 kernel-team@android.com
References: <20230923013148.1390521-1-surenb@google.com>
 <20230923013148.1390521-3-surenb@google.com>
 <03f95e90-82bd-6ee2-7c0d-d4dc5d3e15ee@redhat.com>
 <98b21e78-a90d-8b54-3659-e9b890be094f@redhat.com>
 <85e5390c-660c-ef9e-b415-00ee71bc5cbf@redhat.com>

On Thu, Sep 28, 2023 at 11:34 AM Peter Xu wrote:
>
> On Thu, Sep 28, 2023 at 07:51:18PM +0200, David Hildenbrand wrote:
> > On 28.09.23 19:21, Peter Xu wrote:
> > > On Thu, Sep 28, 2023 at 07:05:40PM +0200, David Hildenbrand wrote:
> > > > As described as reply to v1, without fork() and KSM, the PAE bit should
> > > > stick around. If that's not the case, we should investigate why.
> > > >
> > > > If we ever support the post-fork case (which the comment above remap_pages()
> > > > excludes) we'll need good motivation why we'd want to make this
> > > > overly-complicated feature even more complicated.
> > >
> > > The problem is DONTFORK is only a suggestion, but not yet restricted. If
> > > someone reaches on top of some !PAE page on src it'll never gonna proceed
> > > and keep failing, iiuc.
> >
> > Yes. It won't work if you fork() and not use DONTFORK on the src VMA. We
> > should document that as a limitation.
> >
> > For example, you could return an error to the user that can just call
> > UFFDIO_COPY. (or to the UFFDIO_COPY from inside uffd code, but that's
> > probably ugly as well).
>
> We could indeed provide some special errno perhaps upon the PAE check, then
> document it explicitly in the man page and suggest resolutions (like
> DONTFORK) when user hit it.
>
> >
> > >
> > > do_wp_page() doesn't have that issue of accuracy only because one round of
> > > CoW will just allocate a new page with PAE set guaranteed, which is pretty
> > > much self-heal and unnoticed.
> >
> > Yes. But it might have to copy, at which point the whole optimization of
> > remap is gone :)
>
> Right, but that's fine IMHO because it should still be very corner case,
> definitely not expected to be the majority to start impact the performance
> results.
>
> >
> > >
> > > So it'll be great if we can have similar self-heal way for PAE. If not, I
> > > think it's still fine we just always fail on !PAE src pages, but then maybe
> > > we should let the user know what's wrong, e.g., the user can just forgot to
> > > apply DONTFORK then forked. And then the user hits error and don't know
> > > what happened. Probably at least we should document it well in man pages.
> > >
> > Yes, exactly.
> >
> > > Another option can be we keep using folio_mapcount() for pte, and another
> > > helper (perhaps: _nr_pages_mapped==COMPOUND_MAPPED && _entire_mapcount==1)
> > > for thp. But I know that's not ideal either.
> >
> > As long as we only set the pte writable if PAE is set, we're good from a CVE
> > perspective. The other part is just simplicity of avoiding all these
> > mapcount+swapcount games where possible.
> >
> > (one day folio_mapcount() might be faster -- I'm still working on that patch
> > in the bigger picture of handling PTE-mapped THP better)
>
> Sure.
>
> For now as long as we're crystal clear on the possibility of inaccuracy of
> PAE, it never hits besides fork() && !DONTFORK, and properly document it,
> then sounds good here.

Ok, sounds like we have a consensus. I'll prepare manpage changes to
document the DONTFORK requirement for uffd_remap.

>
> Thanks,
>
> --
> Peter Xu
>
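
For illustration only, here is a rough userspace sketch of the DONTFORK
requirement discussed above. It does not call UFFDIO_REMAP (the uABI is
still under review in this series); it only shows marking a source
buffer with MADV_DONTFORK so that a later fork() cannot leave shared,
!PAE pages in it. The helper names alloc_dontfork_src() and
mapped_in_child() are made up for this example and are not part of the
patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an anonymous private buffer meant to be used
 * as a remap source and exclude it from fork() with MADV_DONTFORK.
 * Returns NULL on failure.
 */
static void *alloc_dontfork_src(size_t len)
{
	void *src;

	assert(len > 0);
	src = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (src == MAP_FAILED)
		return NULL;
	if (madvise(src, len, MADV_DONTFORK) != 0) {
		munmap(src, len);
		return NULL;
	}
	return src;
}

/*
 * Hypothetical helper: fork and ask the child whether the page at addr
 * is mapped there. mincore() fails with ENOMEM on an unmapped range.
 * Returns 1 if mapped in the child, 0 if not mapped, 2 on an unexpected
 * mincore() error, and -1 if fork() or waitpid() failed.
 */
static int mapped_in_child(void *addr)
{
	unsigned char vec;
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (mincore(addr, 1, &vec) == 0)
			_exit(1);
		_exit(errno == ENOMEM ? 0 : 2);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With the source range marked this way the child never inherits the
mapping, so the pages are not shared through fork(), which is the case
the PAE check is expected to reject.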