From: Jann Horn
Date: Wed, 6 Nov 2024 22:48:43 +0100
Subject: Re: [PATCH v2 1/7] mm: khugepaged: retract_page_tables() use pte_offset_map_rw_nolock()
To: Qi Zheng
Cc: david@redhat.com, hughd@google.com, willy@infradead.org, mgorman@suse.de,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 zokeefe@google.com, rientjes@google.com, peterx@redhat.com,
 catalin.marinas@arm.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 x86@kernel.org
In-Reply-To: <4c3f4aa29f38c013c4529a43bce846a3edd31523.1730360798.git.zhengqi.arch@bytedance.com>
References: <4c3f4aa29f38c013c4529a43bce846a3edd31523.1730360798.git.zhengqi.arch@bytedance.com>

On Thu, Oct 31, 2024 at 9:14 AM Qi Zheng wrote:
> In retract_page_tables(), we may modify the pmd entry after acquiring the
> pml and ptl, so we should also check whether the pmd entry is stable.

Why does taking the PMD lock not guarantee that the PMD entry is stable?

> Using pte_offset_map_rw_nolock() + pmd_same() to do it, and then we can
> also remove the calling of the pte_lockptr().
>
> Signed-off-by: Qi Zheng
> ---
>  mm/khugepaged.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 6f8d46d107b4b..6d76dde64f5fb 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1721,6 +1721,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>                 spinlock_t *pml;
>                 spinlock_t *ptl;
>                 bool skipped_uffd = false;
> +               pte_t *pte;
>
>                 /*
>                  * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
> @@ -1756,11 +1757,25 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>                                         addr, addr + HPAGE_PMD_SIZE);
>                 mmu_notifier_invalidate_range_start(&range);
>
> +               pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pgt_pmd, &ptl);
> +               if (!pte) {
> +                       mmu_notifier_invalidate_range_end(&range);
> +                       continue;
> +               }
> +
>                 pml = pmd_lock(mm, pmd);

I don't understand why you're mapping the page table before locking
the PMD. Doesn't that just mean we need more error checking
afterwards?

> -               ptl = pte_lockptr(mm, pmd);
>                 if (ptl != pml)
>                         spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
>
> +               if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
> +                       pte_unmap_unlock(pte, ptl);
> +                       if (ptl != pml)
> +                               spin_unlock(pml);
> +                       mmu_notifier_invalidate_range_end(&range);
> +                       continue;
> +               }
> +               pte_unmap(pte);
> +
>                 /*
>                  * Huge page lock is still held, so normally the page table
>                  * must remain empty; and we have already skipped anon_vma
> --
> 2.20.1
>
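
The ordering question raised above can be modelled in plain userspace C.
The sketch below is only an analogy under stated assumptions: struct slot,
slot_init(), clear_if_unchanged(), clear_locked() and concurrent_writer()
are hypothetical names, a pthread mutex stands in for the PMD lock, and a
64-bit atomic stands in for the PMD entry. It is not kernel code and says
nothing about which locks real PMD writers take.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one PMD slot; not a kernel type. */
struct slot {
	pthread_mutex_t lock;    /* plays the role of the PMD lock */
	_Atomic uint64_t entry;  /* plays the role of the PMD entry */
};

/* Hook run between the snapshot and the lock to simulate a racing writer. */
typedef void (*race_hook)(struct slot *);

static void slot_init(struct slot *s, uint64_t val)
{
	int rc = pthread_mutex_init(&s->lock, NULL);

	assert(rc == 0);
	(void)rc;
	atomic_init(&s->entry, val);
}

/*
 * Snapshot first, lock second. The value may change in the window before
 * the lock is taken, so a pmd_same()-style comparison is needed afterwards.
 */
static bool clear_if_unchanged(struct slot *s, race_hook between)
{
	uint64_t snap = atomic_load(&s->entry);  /* lockless snapshot */

	if (snap == 0)
		return false;                    /* nothing to clear */
	if (between)
		between(s);                      /* simulated concurrent writer */

	pthread_mutex_lock(&s->lock);
	if (atomic_load(&s->entry) != snap) {    /* snapshot went stale: bail out */
		pthread_mutex_unlock(&s->lock);
		return false;
	}
	atomic_store(&s->entry, 0);
	pthread_mutex_unlock(&s->lock);
	return true;
}

/*
 * Lock first, read second. Assuming every writer takes the same lock,
 * the value read here cannot change until the unlock, so no re-check.
 */
static bool clear_locked(struct slot *s)
{
	bool populated;

	pthread_mutex_lock(&s->lock);
	populated = atomic_load(&s->entry) != 0;
	if (populated)
		atomic_store(&s->entry, 0);
	pthread_mutex_unlock(&s->lock);
	return populated;
}

/* A writer that follows the locking rule and installs a different value. */
static void concurrent_writer(struct slot *s)
{
	pthread_mutex_lock(&s->lock);
	atomic_store(&s->entry, 0x2000);
	pthread_mutex_unlock(&s->lock);
}
```

Calling clear_if_unchanged(&s, concurrent_writer) returns false and leaves
the writer's value in place, which is the extra error path that the
snapshot-then-lock ordering requires. clear_locked() has no such window,
but only under the assumption that every writer takes the same lock.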