Date: Tue, 4 Jun 2024 10:53:48 +0900
From: Byungchul Park
To: Dave Hansen
Cc: David Hildenbrand, Byungchul Park, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, kernel_team@skhynix.com, akpm@linux-foundation.org,
    ying.huang@intel.com, vernhao@tencent.com, mgorman@techsingularity.net,
    hughd@google.com, willy@infradead.org, peterz@infradead.org,
    luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, rjgolo@gmail.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped
Message-ID: <20240604015348.GB26609@system.software.com>
References: <20240531092001.30428-1-byungchul@sk.com>
    <20240531092001.30428-10-byungchul@sk.com>
    <26dc4594-430b-483c-a26c-7e68bade74b0@redhat.com>
    <20240603093505.GA12549@system.software.com>

On Mon, Jun 03, 2024 at 06:23:46AM -0700, Dave Hansen wrote:
> On 6/3/24 02:35, Byungchul Park wrote:
> ...
> > In luf's point of view, the points where the deferred flush should be
> > performed are simply:
> >
> >    1. when changing the vma maps, that might be luf'ed.
> >    2. when updating data of the pages, that might be luf'ed.
>
> It's simple, but the devil is in the details as always.

Agree with that.

> > All we need to do is to identify the points:
> >
> >    1. when changing the vma maps, that might be luf'ed.
> >
> >       a) mmap and munmap, i.e. fault handler or unmap_region().
> >       b) permission to writable, i.e. mprotect or fault handler.
> >       c) what I'm missing.
>
> I'd say it even more generally: anything that installs a PTE which is
> inconsistent with the original PTE.
> That, of course, includes writes.
> But it also includes crazy things that we do like uprobes.  Take a look
> at __replace_page().
>
> I think the page_vma_mapped_walk() checks plus the ptl keep LUF at bay
> there.  But it needs some really thorough review.
>
> But the bigger concern is that, if there was a problem, I can't think of
> a systematic way to find it.
>
> >    2. when updating data of the pages, that might be luf'ed.
> >
> >       a) updating files through vfs e.g. file_end_write().
> >       b) updating files through writable maps, i.e. 1-a) or 1-b).
> >       c) what I'm missing.
>
> Filesystems or block devices that change content without a "write" from
> the local system.  Network filesystems and block devices come to mind.

AFAIK, every network filesystem eventually "updates" its connected local
filesystem.  It could still be handled at the point where the local
filesystem gets updated.

> I honestly don't know what all the rules are around these, but they
> could certainly be troublesome.
>
> There appear to be some interactions for NFS between file locking and
> page cache flushing.
>
> But, stepping back ...
>
> I'd honestly be a lot more comfortable if there was even a debugging LUF

I'd better provide a method for better debugging.  Lemme know whatever
it is we need.

> mode that enforced a rule that said:

Why "debugging mode"?  The following rules should always be enforced.

> 1. A LUF'd PTE can't be rewritten until after a luf_flush() occurs

"luf_flush() should be followed when.." is more correct because
"luf_flush() -> another luf -> the pte gets rewritten" can happen.  So
it should be "the pte gets rewritten -> another luf by any chance ->
luf_flush()", which is still safe.

> 2. A LUF'd page's position in the page cache can't be replaced until
>    after a luf_flush()

"luf_flush() should be followed when.." is more correct too.

These two rules are exactly the same as what I described, but more
specific.  I like your way of describing the rules.

        Byungchul

> or *some* other independent set of rules that can tell us when something
> goes wrong.  That uprobes code, for instance, seems like it will work.
> But I can also imagine writing it ten other ways where it would break
> when combined with LUF.
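
For illustration only, the first rule quoted above (a LUF'd PTE must not
be rewritten while its deferred flush is still pending) could be modelled
in userspace roughly as below.  This is a toy sketch under stated
assumptions, not kernel code and not the patch's implementation: the
names model_luf_unmap(), model_luf_flush(), model_pte_rewrite() and the
fixed-size luf_pending[] table are all hypothetical, and the model
assumes one flush covers every pending entry.

```c
/* Toy userspace model of a "debugging LUF mode" check.
 * NOT kernel code; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PTES 8

/* true while a PTE was unmapped with its TLB flush still deferred */
static bool luf_pending[NR_PTES];
/* number of rewrites attempted while a deferred flush was pending */
static unsigned long luf_violations;

/* Unmap a PTE and defer its TLB flush (the PTE becomes "LUF'd"). */
static void model_luf_unmap(size_t idx)
{
	assert(idx < NR_PTES);
	luf_pending[idx] = true;
}

/* Perform the deferred flush; assumed here to cover all pending PTEs. */
static void model_luf_flush(void)
{
	for (size_t i = 0; i < NR_PTES; i++)
		luf_pending[i] = false;
}

/* Rewrite a PTE.  Returns false and records a violation if the PTE
 * is still LUF'd, i.e. no flush has happened since its last unmap. */
static bool model_pte_rewrite(size_t idx)
{
	assert(idx < NR_PTES);
	if (luf_pending[idx]) {
		luf_violations++;
		return false;
	}
	return true;
}
```

Because the check looks at the pending state at rewrite time rather than
at whether some flush ever happened, the sequence "luf_flush() -> another
luf -> the pte gets rewritten" mentioned in the reply is still reported as
a violation in this model.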