From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C199C5B543 for ; Fri, 30 May 2025 16:45:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D9CC06B015A; Fri, 30 May 2025 12:45:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D4CCC6B0164; Fri, 30 May 2025 12:45:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C1B056B016F; Fri, 30 May 2025 12:45:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id A3DE66B015A for ; Fri, 30 May 2025 12:45:41 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 5B55A1D51EB for ; Fri, 30 May 2025 16:45:41 +0000 (UTC) X-FDA: 83500150482.14.0388B8E Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf06.hostedemail.com (Postfix) with ESMTP id 8BB30180002 for ; Fri, 30 May 2025 16:45:39 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=none; spf=pass (imf06.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1748623539; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kHajGx30uWF0TTRuggGVT6JTMMDrZNq6YUbQoVq8GPM=; b=aN3QIHpI0rSVrxTuIiPZdWGXTttZjc49Jt8IxxJojaQLt6yZ38dr1fas5z6OCVvy3de6pQ WV/x1J9B3uCvjFaf98o6CTk/bfSaE1j1L6PE/SazGG2NEOpCT+kupNaqcBXLkrZNtm1ukT zRGvS6XRuuEi2v0f9IiZBJOUe7ye/fU= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=none; spf=pass (imf06.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1748623539; a=rsa-sha256; cv=none; b=ebGP5glHWQA70pTpuZwfBPl1FjTPZIC9uG0dEWrWCtViW3TnMGVcQvjYaXL3tB9Y7E9fv3 FkuhMobES8KD3b4HyYW9NLIdDW331GQCqheUyClkR/IpBHMV7LSrGS8BL6v1oPWVkNddOu KJ3biodxDcyQmDbR30L2z0LoZOCcesE= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 366C71692; Fri, 30 May 2025 09:45:22 -0700 (PDT) Received: from [10.57.95.14] (unknown [10.57.95.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CC2B43F673; Fri, 30 May 2025 09:45:32 -0700 (PDT) Message-ID: Date: Fri, 30 May 2025 17:45:31 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH v1 1/6] fs/proc/task_mmu: Fix pte update and tlb maintenance ordering in pagemap_scan_pmd_entry() Content-Language: en-GB To: Jann Horn Cc: Catalin Marinas , Will Deacon , Madhavan Srinivasan , Michael Ellerman , Nicholas Piggin , Christophe Leroy , "David S. Miller" , Andreas Larsson , Juergen Gross , Ajay Kaher , Alexey Makhalov , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Boris Ostrovsky , "Aneesh Kumar K.V" , Andrew Morton , Peter Zijlstra , Arnd Bergmann , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Alexei Starovoitov , Andrey Ryabinin , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org, virtualization@lists.linux.dev, xen-devel@lists.xenproject.org, linux-mm@kvack.org, Andy Lutomirski References: <20250530140446.2387131-1-ryan.roberts@arm.com> <20250530140446.2387131-2-ryan.roberts@arm.com> From: Ryan Roberts In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 8BB30180002 X-Stat-Signature: z7ez1eka4n43htg16e5fmwftzj6u3ze3 X-Rspam-User: X-HE-Tag: 1748623539-406259 X-HE-Meta: U2FsdGVkX183wBVCocibh6dQnOhUF96B/vTaGs7YSGAC7q8ubqHQVsT8LM2+4loOnKmzYBNuBU+W34B59MdH1Z90QGO7CyGYTzL0pecf16buj76Ditg2znNFttpUfIaM3ZeUV2Gh33gNvxQam9H/Q/FaCsQ/Y/9YznEAj9+RWRSRa8oZ7DujyJpS0DFswv8BVhK79cLKz+NLse+JN1e0DJHvZv9+ReFtOTZ73eFP2IPmDX5rNgmLHb8Y9VVtXWl4jGZiIC/2EA9rVecIskvJ3ddBOP+KFuzlZEklq9hlZHP08m5piDzC3xusunKUp/UHcv8DRHO3piVitvmjwPn6PXitk3+jaIfoJVTLRbZYbSzPsOLsaybt2mxpDEzWJ0UGP+d6ndgCRcxJmkvNwGcSNhN9S+/1LEPVpaobGu8OzPaMA0iHDO9j+YHqbClYAmMAWF494fyC5cHI7er7KmbwSNLSODpoMKAMaZGQzSFrM+CIFi6rWqqR8hXZYUdyNwToQSWPUGLygAvhE63r8tUL2q7mnHKdf6rRykRDn8BfxrKmEnQQkycgeO+0o4A5MGVSGSE/5wFdwmtk/3hTixBly+PktjARl0gRxgbbtwW1/70j59M01PP3bylnPRISV8cADP1CaS+c0F8u5Fh/y1i+7+4kYM6nsLpi02Hlf2LcAigVpQYxVuq7+N/i8l2CWNK/ajeeEYha4ajg3JF/utxoi6hx+zN57UWiaH2bnWXhbiH1lQmWg3aJyw5juDkUh7J45pzoqXPY61TsX/A49zplgKpcC+7KCjUfqkfgyvbaMPuMflVSIV8s+8rEX5UThh05gbyaFKd9Q+BhsBXQNhSmBDzN8pDWSf+5YRXGiRdsR/GVnD0dJe5zYr4dNeovmCTWcUkCRfJhahGnHlDgNhnVr4a7aJil7jw/ggu4bfYZy4OtCnvk96ITTVkyQPcuZuFjwW6+rPs1udaoLu5Nw5H B+x4DlKX wqKcz5sc9RmtIgk6hUmm2ZhBZK81cto5EClVNMf0/pn1BHSvj7Z/lWcmAgNcdNX9jHHtcqqDTr22oV+n5SXct5VlQtpc0MU2QFOJxJn/KQ7u2YI5iVioi9Yab0K6HUm9lJI2hzwR0XK8wPG712mFaEtjBeJDg0HeyNrAi5j+Sh8alaFXdqIwu2aTybNmcTfN9e82IFOlAlEpCzFioFAy6c3Zc1zThZSlJe6fz0BFNuW6SZLCxE3+umdrA3TTaY7Ekg+/JaU4mwoRIe6HNvfRZc2hNBeGQRACpWjkWEfe4EHXaXLgmrdaGxHIwza2kfpqX1+X3GGtSlBD1JWJH3nr+bsJFH5UHYszZCgyz X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 30/05/2025 17:26, Jann Horn wrote: > On Fri, May 30, 2025 at 4:04 PM Ryan Roberts wrote: >> pagemap_scan_pmd_entry() was previously modifying ptes while in lazy mmu >> mode, then performing tlb maintenance for the modified ptes, then >> leaving lazy mmu mode. But any pte modifications during lazy mmu mode >> may be deferred until arch_leave_lazy_mmu_mode(), inverting the required >> ordering between pte modificaiton and tlb maintenance. >> >> Let's fix that by leaving mmu mode, forcing all the pte updates to be >> actioned, before doing the tlb maintenance. >> >> This is a theorectical bug discovered during code review. >> >> Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs") > > Hmm... isn't lazy mmu mode supposed to also delay TLB flushes, and > preserve the ordering of PTE modifications and TLB flushes? > > Looking at the existing implementations of lazy MMU: > > - In Xen PV implementation of lazy MMU, I see that TLB flush > hypercalls are delayed as well (xen_flush_tlb(), > xen_flush_tlb_one_user() and xen_flush_tlb_multi() all use > xen_mc_issue(XEN_LAZY_MMU) which delays issuing if lazymmu is active). > - The sparc version also seems to delay TLB flushes, and sparc's > arch_leave_lazy_mmu_mode() seems to do TLB flushes via > flush_tlb_pending() if necessary. > - powerpc's arch_leave_lazy_mmu_mode() also seems to do TLB flushes. > > Am I missing something? I doubt it. I suspect this was just my misunderstanding then. I hadn't appreciated that lazy mmu is also guarranteed to maintain flush ordering; it's chronically under-documented. Sorry for the noise here. On that basis, I expect the first 2 patches can definitely be dropped. > > If arm64 requires different semantics compared to all existing > implementations and doesn't delay TLB flushes for lazy mmu mode, I > think the "Fixes" tag should point to your addition of lazy mmu > support for arm64. arm64 doesn't require different semantics. arm64 is using lazy mmu in a very limited manner and it can already tolerate the current code. I just spotted this during code review and was trying to be a good citizen. Thanks for setting me straight! Thanks, Ryan > >> Signed-off-by: Ryan Roberts >> --- >> fs/proc/task_mmu.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c >> index 994cde10e3f4..361f3ffd9a0c 100644 >> --- a/fs/proc/task_mmu.c >> +++ b/fs/proc/task_mmu.c >> @@ -2557,10 +2557,9 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start, >> } >> >> flush_and_return: >> + arch_leave_lazy_mmu_mode(); >> if (flush_end) >> flush_tlb_range(vma, start, addr); >> - >> - arch_leave_lazy_mmu_mode(); > > I think this ordering was probably intentional, because doing it this > way around allows Xen PV to avoid one more hypercall, because the TLB > flush can be batched together with the page table changes? > > >> pte_unmap_unlock(start_pte, ptl); >> >> cond_resched(); >> -- >> 2.43.0 >>