From: Roman Gushchin
To: Andrew Morton, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Roman Gushchin,
    Jann Horn, Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Hugh Dickins,
    linux-arch@vger.kernel.org
Subject: [PATCH v5] mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
Date: Tue, 20 May 2025 21:48:13 +0000
Message-ID: <20250520214813.3946964-1-roman.gushchin@linux.dev>

Commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas") added a
forced tlbflush to tlb_end_vma(), which is required to avoid a race between
munmap() and unmap_mapping_range(). However it added some overhead to other
paths where tlb_end_vma() is used, but vmas are not removed, e.g.
madvise(MADV_DONTNEED).

Fix this by moving the tlb flush out of tlb_end_vma() into new
tlb_free_vmas() called from free_pgtables(), somewhat similar to the
stable version of the original commit:
commit 895428ee124a ("mm: Force TLB flush for PFNMAP mappings before
unlink_file_vma()").

Note that if tlb->fullmm is set, no flush is required, as the whole
mm is about to be destroyed.

Signed-off-by: Roman Gushchin
Cc: Jann Horn
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: "Aneesh Kumar K.V"
Cc: Andrew Morton
Cc: Nick Piggin
Cc: Hugh Dickins
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
---
v5:
  - tlb_free_vma() -> tlb_free_vmas() to avoid extra checks

v4:
  - naming/comments update (by Peter Z.)
  - check vma->vm_flags in tlb_free_vma() (by Peter Z.)

v3:
  - added initialization of vma_pfn in __tlb_reset_range() (by Hugh D.)

v2:
  - moved vma_pfn flag handling into tlb.h (by Peter Z.)
  - added comments (by Peter Z.)
  - fixed the vma_pfn flag setting (by Hugh D.)
---
 include/asm-generic/tlb.h | 49 +++++++++++++++++++++++++++++++--------
 mm/memory.c               |  2 ++
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 88a42973fa47..8a8b9535a930 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -58,6 +58,11 @@
  *    Defaults to flushing at tlb_end_vma() to reset the range; helps when
  *    there's large holes between the VMAs.
  *
+ *  - tlb_free_vmas()
+ *
+ *    tlb_free_vmas() marks the start of unlinking of one or more vmas
+ *    and freeing page-tables.
+ *
  *  - tlb_remove_table()
  *
  *    tlb_remove_table() is the basic primitive to free page-table directories
@@ -399,7 +404,10 @@ static inline void __tlb_reset_range(struct mmu_gather *tlb)
 	 * Do not reset mmu_gather::vma_* fields here, we do not
 	 * call into tlb_start_vma() again to set them if there is an
 	 * intermediate flush.
+	 *
+	 * Except for vma_pfn, that only cares if there's pending TLBI.
 	 */
+	tlb->vma_pfn = 0;
 }
 
 #ifdef CONFIG_MMU_GATHER_NO_RANGE
@@ -464,7 +472,12 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
 	 */
 	tlb->vma_huge = is_vm_hugetlb_page(vma);
 	tlb->vma_exec = !!(vma->vm_flags & VM_EXEC);
-	tlb->vma_pfn = !!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP));
+
+	/*
+	 * Track if there's at least one VM_PFNMAP/VM_MIXEDMAP vma
+	 * in the tracked range, see tlb_free_vmas().
+	 */
+	tlb->vma_pfn |= !!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP));
 }
 
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
@@ -547,23 +560,39 @@ static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *
 }
 
 static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (tlb->fullmm || IS_ENABLED(CONFIG_MMU_GATHER_MERGE_VMAS))
+		return;
+
+	/*
+	 * Do a TLB flush and reset the range at VMA boundaries; this avoids
+	 * the ranges growing with the unused space between consecutive VMAs,
+	 * but also the mmu_gather::vma_* flags from tlb_start_vma() rely on
+	 * this.
+	 */
+	tlb_flush_mmu_tlbonly(tlb);
+}
+
+static inline void tlb_free_vmas(struct mmu_gather *tlb)
 {
 	if (tlb->fullmm)
 		return;
 
 	/*
 	 * VM_PFNMAP is more fragile because the core mm will not track the
-	 * page mapcount -- there might not be page-frames for these PFNs after
-	 * all. Force flush TLBs for such ranges to avoid munmap() vs
-	 * unmap_mapping_range() races.
+	 * page mapcount -- there might not be page-frames for these PFNs
+	 * after all.
+	 *
+	 * Specifically, there is a race between munmap() and
+	 * unmap_mapping_range(), where munmap() will unlink the VMA, such
+	 * that unmap_mapping_range() will no longer observe the VMA and
+	 * no-op, without observing the TLBI, returning prematurely.
+	 *
+	 * So if we're about to unlink such a VMA, and we have pending
+	 * TLBI for such a vma, flush things now.
 	 */
-	if (tlb->vma_pfn || !IS_ENABLED(CONFIG_MMU_GATHER_MERGE_VMAS)) {
-		/*
-		 * Do a TLB flush and reset the range at VMA boundaries; this avoids
-		 * the ranges growing with the unused space between consecutive VMAs.
-		 */
+	if (tlb->vma_pfn)
 		tlb_flush_mmu_tlbonly(tlb);
-	}
 }
 
 /*
diff --git a/mm/memory.c b/mm/memory.c
index 5cb48f262ab0..6b71a66cc4fe 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -358,6 +358,8 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
 {
 	struct unlink_vma_file_batch vb;
 
+	tlb_free_vmas(tlb);
+
 	do {
 		unsigned long addr = vma->vm_start;
 		struct vm_area_struct *next;
-- 
2.49.0.1112.g889b7c5bd8-goog
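
For readers who want to reason about the flush placement without a kernel
tree, here is a rough userspace model of the logic in this patch. It is only
a sketch, not kernel code: struct gather_model, the model_*() helpers, the
flag values and the "pending" bit are all hypothetical stand-ins, and
merge_vmas stands in for CONFIG_MMU_GATHER_MERGE_VMAS. It only counts how
many flushes would be forced before the final one at tlb_finish_mmu() time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define VM_PFNMAP   0x1u
#define VM_MIXEDMAP 0x2u

/* Hypothetical, heavily reduced stand-in for struct mmu_gather. */
struct gather_model {
	bool fullmm;		/* whole address space is being torn down */
	bool merge_vmas;	/* stands in for CONFIG_MMU_GATHER_MERGE_VMAS */
	bool pending;		/* PTEs cleared, TLB invalidation not issued yet */
	bool vma_pfn;		/* PFNMAP/MIXEDMAP vma seen since the last flush */
	int flushes;		/* number of modelled TLB flushes */
};

/* Models tlb_flush_mmu_tlbonly() followed by __tlb_reset_range(). */
static void model_flush_tlbonly(struct gather_model *tlb)
{
	if (!tlb->pending)
		return;
	tlb->flushes++;
	tlb->pending = false;
	tlb->vma_pfn = false;	/* the reset this patch adds */
}

/* Models tlb_start_vma(), some unmapping work, then tlb_end_vma(). */
static void model_unmap_vma(struct gather_model *tlb, unsigned int vm_flags)
{
	if (!tlb->fullmm)	/* accumulate with |= instead of overwriting */
		tlb->vma_pfn |= !!(vm_flags & (VM_PFNMAP | VM_MIXEDMAP));
	tlb->pending = true;	/* pretend at least one PTE was cleared */

	if (tlb->fullmm || tlb->merge_vmas)
		return;
	model_flush_tlbonly(tlb);
}

/* Models tlb_free_vmas(): flush only if a PFN-mapped vma is pending. */
static void model_free_vmas(struct gather_model *tlb)
{
	if (tlb->fullmm)
		return;
	if (tlb->vma_pfn)
		model_flush_tlbonly(tlb);
}

/*
 * Unmap n vmas and, if unlink is true, continue into the free_pgtables()
 * step. Returns the flushes issued before any final flush.
 */
static int model_run(bool fullmm, bool merge_vmas,
		     const unsigned int *vm_flags, size_t n, bool unlink)
{
	struct gather_model tlb = { .fullmm = fullmm, .merge_vmas = merge_vmas };

	assert(vm_flags != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		model_unmap_vma(&tlb, vm_flags[i]);
	if (unlink)
		model_free_vmas(&tlb);
	return tlb.flushes;
}
```

In this model, a MADV_DONTNEED-like call (unlink == false) over a PFN-mapped
vma with merge_vmas set forces no early flush, while the munmap()-like call
(unlink == true) over the same vmas forces exactly one, issued from
model_free_vmas() before the vmas would be unlinked.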