Date: Sat, 15 Oct 2022 01:55:54 +0300
From: "Kirill A. Shutemov"
To: Jann Horn
Cc: Andy Lutomirski, Linux-MM, Mel Gorman, Rik van Riel, kernel list,
    Kees Cook, Ingo Molnar, Sasha Levin, Andrew Morton, Will Deacon,
    Peter Zijlstra, Linus Torvalds
Subject: Re: [BUG?] X86 arch_tlbbatch_flush() seems to be lacking mm_tlb_flush_nested() integration
Message-ID: <20221014225554.q6lxvc2ffp5drqvs@box.shutemov.name>
References: <20221014222346.n337tvkbyr33dsdx@box.shutemov.name>

On Sat, Oct 15, 2022 at 12:29:57AM +0200, Jann Horn wrote:
> On Sat, Oct 15, 2022 at 12:23 AM Kirill A. Shutemov wrote:
> > On Fri, Oct 14, 2022 at 08:19:42PM +0200, Jann Horn wrote:
> > > Hi!
> > >
> > > I haven't actually managed to reproduce this behavior, so maybe I'm
> > > just misunderstanding how this works; but I think the
> > > arch_tlbbatch_flush() path for batched TLB flushing in vmscan ought to
> > > have some kind of integration with mm_tlb_flush_nested().
> > >
> > > I think that currently, the following race could happen:
> > >
> > > [initial situation: page P is mapped into a page table of task B, but
> > > the page is not referenced, the PTE's A/D bits are clear]
> > > A: vmscan begins
> > > A: vmscan looks at P and P's PTEs, and concludes that P is not currently in use
> > > B: reads from P through the PTE, setting the Accessed bit and creating
> > > a TLB entry
> > > A: vmscan enters try_to_unmap_one()
> > > A: try_to_unmap_one() calls should_defer_flush(), which returns true
> > > A: try_to_unmap_one() removes the PTE and queues a TLB flush
> > > (arch_tlbbatch_add_mm())
> > > A: try_to_unmap_one() returns, try_to_unmap() returns to shrink_folio_list()
> > > B: calls munmap() on the VMA that mapped P
> > > B: no PTEs are removed, so no TLB flush happens
> > > B: munmap() returns
> >
> > I think here we will serialize against anon_vma/i_mmap lock in
> > __do_munmap() -> unmap_region() -> free_pgtables() that A also holds.
> >
> > So I believe munmap() is safe, but MADV_DONTNEED (and its flavours) is not.
>
> shrink_folio_list() is not in a context that is operating on a
> specific MM; it is operating on a list of pages that might be mapped
> into different processes all over the system.

s/specific MM/specific page/

> So A has temporarily held those locks somewhere inside
> try_to_unmap_one(), but it will drop them before it reaches the point

inside try_to_unmap(), which handles all mappings of the page.

> where it issues the batched TLB flush.
> And this batched TLB flush potentially covers multiple MMs at once; it
> is not targeted towards a specific MM, but towards all of the CPUs on
> which any of the touched MMs might be active.

But, yes, you are right. I thought that try_to_unmap_flush() was called
inside try_to_unmap() under the lock.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov