From: Suren Baghdasaryan
Date: Thu, 28 Sep 2023 08:36:31 -0700
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI
To: Jann Horn
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk, brauner@kernel.org,
 shuah@kernel.org, aarcange@redhat.com, lokeshgidra@google.com,
 peterx@redhat.com, david@redhat.com, hughd@google.com, mhocko@suse.com,
 axelrasmussen@google.com, rppt@kernel.org, willy@infradead.org,
 Liam.Howlett@oracle.com, zhangpeng362@huawei.com, bgeffon@google.com,
 kaleshsingh@google.com, ngeoffray@google.com, jdduke@google.com,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 kernel-team@android.com
References: <20230923013148.1390521-1-surenb@google.com>
 <20230923013148.1390521-3-surenb@google.com>

On Wed, Sep 27, 2023 at 3:49 PM Jann Horn wrote:
>
> On Wed, Sep 27, 2023 at 11:08 PM Suren Baghdasaryan wrote:
> > On Wed, Sep 27, 2023 at 1:42 PM Suren Baghdasaryan wrote:
> > >
> > > On Wed, Sep 27, 2023 at 1:04 PM Jann Horn wrote:
> > > >
> > > > On Wed, Sep 27, 2023 at 8:08 PM Suren Baghdasaryan wrote:
> > > > > On Wed, Sep 27, 2023 at 5:47 AM Jann Horn wrote:
> > > > > > On Sat, Sep 23, 2023 at 3:31 AM Suren Baghdasaryan wrote:
> > > > > > > + dst_pmdval = pmdp_get_lockless(dst_pmd);
> > > > > > > + /*
> > > > > > > +  * If the dst_pmd is mapped as THP don't override it and just
> > > > > > > +  * be strict. If dst_pmd changes into TPH after this check, the
> > > > > > > +  * remap_pages_huge_pmd() will detect the change and retry
> > > > > > > +  * while remap_pages_pte() will detect the change and fail.
> > > > > > > +  */
> > > > > > > + if (unlikely(pmd_trans_huge(dst_pmdval))) {
> > > > > > > +         err = -EEXIST;
> > > > > > > +         break;
> > > > > > > + }
> > > > > > > +
> > > > > > > + ptl = pmd_trans_huge_lock(src_pmd, src_vma);
> > > > > > > + if (ptl && !pmd_trans_huge(*src_pmd)) {
> > > > > > > +         spin_unlock(ptl);
> > > > > > > +         ptl = NULL;
> > > > > > > + }
> > > > > >
> > > > > > This still looks wrong - we do still have to split_huge_pmd()
> > > > > > somewhere so that remap_pages_pte() works.
> > > > >
> > > > > Hmm, I guess this extra check is not even needed...
> > > >
> > > > Hm, and instead we'd bail at the pte_offset_map_nolock() in
> > > > remap_pages_pte()? I guess that's unusual but works...
> > >
> > > Yes, that's what I was thinking but I agree, that seems fragile. Maybe
> > > just bail out early if (ptl && !pmd_trans_huge())?
> >
> > No, actually we can still handle is_swap_pmd() case by splitting it
> > and remapping the individual ptes. So, I can bail out only in case of
> > pmd_devmap().
>
> FWIW I only learned today that "real" swap PMDs don't actually exist -
> only migration entries, which are encoded as swap PMDs, exist. You can
> see that when you look through the cases that something like
> __split_huge_pmd() or zap_pmd_range() actually handles.

Ah, good point.

>
> So I think if you wanted to handle all the PMD types properly here
> without splitting, you could do that without _too_ much extra code.
> But idk if it's worth it.

Yeah, I guess I can call pmd_migration_entry_wait() and retry by
returning EAGAIN, similar to how remap_pages_pte() handles PTE
migration. Looks simple enough.

Thanks for all the pointers! I'll start cooking the next version.
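
For readers following the thread, the per-PMD dispatch discussed above can be modelled roughly as below. This is only a user-space sketch, not kernel code and not the code of the next patch version: `enum src_pmd_kind`, `enum remap_action`, `struct remap_decision` and `decide()` are hypothetical names made up for illustration, and the error value used for the pmd_devmap() bail-out is a placeholder, since the thread does not say which errno would be returned. It also ignores the case raised earlier where a huge source PMD would have to be split before falling back to remap_pages_pte().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of what the source PMD can look like; not a kernel type. */
enum src_pmd_kind {
	SRC_PMD_PAGE_TABLE,	/* points to a PTE table: handled PTE by PTE */
	SRC_PMD_TRANS_HUGE,	/* present THP mapping */
	SRC_PMD_MIGRATION,	/* migration entry encoded as a swap PMD */
	SRC_PMD_DEVMAP,		/* device-mapped huge PMD */
};

/* Hypothetical outcome of the dispatch. */
enum remap_action {
	ACT_REMAP_HUGE,		/* stands in for remap_pages_huge_pmd() */
	ACT_REMAP_PTES,		/* stands in for remap_pages_pte() */
	ACT_WAIT_AND_RETRY,	/* stands in for pmd_migration_entry_wait() */
	ACT_FAIL,
};

struct remap_decision {
	enum remap_action action;
	int err;		/* 0 or a negative errno */
};

struct remap_decision decide(bool dst_is_thp, enum src_pmd_kind src)
{
	/* A destination already mapped as THP is never overridden. */
	if (dst_is_thp)
		return (struct remap_decision){ ACT_FAIL, -EEXIST };

	switch (src) {
	case SRC_PMD_TRANS_HUGE:
		return (struct remap_decision){ ACT_REMAP_HUGE, 0 };
	case SRC_PMD_MIGRATION:
		/* Wait for the migration to finish, then let the caller retry. */
		return (struct remap_decision){ ACT_WAIT_AND_RETRY, -EAGAIN };
	case SRC_PMD_DEVMAP:
		/* Placeholder errno: the thread only says "bail out". */
		return (struct remap_decision){ ACT_FAIL, -ENOENT };
	case SRC_PMD_PAGE_TABLE:
	default:
		return (struct remap_decision){ ACT_REMAP_PTES, 0 };
	}
}
```

In this model, a caller would loop on -EAGAIN, which mirrors the retry behaviour described for PTE migration entries in remap_pages_pte().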