Date: Mon, 3 Jun 2024 18:35:05 +0900
From: Byungchul Park
To: David Hildenbrand
Cc: Dave Hansen, Byungchul Park, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, kernel_team@skhynix.com,
	akpm@linux-foundation.org, ying.huang@intel.com, vernhao@tencent.com,
	mgorman@techsingularity.net, hughd@google.com, willy@infradead.org,
	peterz@infradead.org, luto@kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	rjgolo@gmail.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped
Message-ID: <20240603093505.GA12549@system.software.com>
References: <20240531092001.30428-1-byungchul@sk.com>
	<20240531092001.30428-10-byungchul@sk.com>
	<26dc4594-430b-483c-a26c-7e68bade74b0@redhat.com>
In-Reply-To: <26dc4594-430b-483c-a26c-7e68bade74b0@redhat.com>

On Sat, Jun 01, 2024 at 09:22:17AM +0200, David Hildenbrand wrote:
> On 31.05.24 23:46, Dave Hansen wrote:
> > On 5/31/24 11:04, Byungchul Park wrote:
> > ...
> > > I don't believe you do not agree with the concept itself.  Thing is
> > > the current version is not good enough.  I will do my best by doing
> > > what I can do.
> >
> > More performance is good.  I agree with that.
> >
> > But it has to be weighed against the risk and the complexity.  The more
> > I look at this approach, the more I think this is not a good trade off.
> > There's a lot of risk and a lot of complexity and we haven't seen the
> > full complexity picture.  The gaps are being fixed by adding complexity
> > in new subsystems (the VFS in this case).
> >
> > There are going to be winners and losers, and this version for example
> > makes file writes lose performance.
> >
> > Just to be crystal clear: I disagree with the concept of leaving stale
> > TLB entries in place in an attempt to gain performance.
>
> There is the inherent problem that a CPU reading from such (unmapped but not
> flushed yet) memory will not get a page fault, which I think is the most
> controversial part here (besides interaction with other deferred TLB
> flushing, and how this glues into the buddy).
>
> What we used to do so far was limiting the timeframe where that could
> happen, under well-controlled circumstances. On the common unmap/zap path,
> we perform the batched TLB flush before any page faults / VMA changes would
> have been possible and munmap() would have returned with "success". Now that
> time frame could be significantly longer.
>
> So in current code, at the point in time where we would process a page
> fault, mmap()/munmap()/... the TLB would have been flushed already.
>
> To "mimic" the old behavior, we'd essentially have to force any page
> faults/mmap/whatsoever to perform the deferred flush such that the CPU will
> see the "reality" again. Not sure how that could be done in a *consistent*

From luf's point of view, the points where the deferred flush should be
performed are simply:

   1. when changing the vma maps, that might be luf'ed.
   2. when updating data of the pages, that might be luf'ed.

All we need to do is to identify the points:

   1. when changing the vma maps, that might be luf'ed.

      a) mmap and munmap, i.e. fault handler or unmap_region().
      b) permission to writable, i.e. mprotect or fault handler.
      c) what I'm missing.

   2. when updating data of the pages, that might be luf'ed.

      a) updating files through vfs, e.g. file_end_write().
      b) updating files through writable maps, i.e. 1-a) or 1-b).
      c) what I'm missing.

Some of them already perform the necessary tlb flush and the others do
not.
luf has to handle the others, which is what I've been focusing on.  Of
course, there might be points I'm missing, though.

Worth noting again, luf currently works only on *migration* and
*reclaim*.  The question is when to stop the pending flush that luf
initiated from migration or reclaim.

	Byungchul

> way (check whenever we take the mmap/vma lock etc ...) and if there would
> still be a performance win.
>
> --
> Cheers,
>
> David / dhildenb
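
The flush points listed in the reply (1-a, 1-b and 2-a) can be modeled
with a small userspace sketch.  This is only a toy model under stated
assumptions, not code from the patch series: struct luf_state, enum
luf_event and every luf_*() function below are hypothetical names made
up for illustration, and treating a plain read as "no flush needed" is
an assumption of the sketch rather than something the thread settles.

```c
#include <stdbool.h>

/*
 * Hypothetical toy model of a deferred TLB flush.  None of these names
 * come from the patch series or from the kernel.
 */
struct luf_state {
	bool flush_pending;	/* a folio was unmapped, its flush deferred */
	unsigned int flushes;	/* flushes actually performed so far */
};

enum luf_event {
	LUF_EV_MAP_CHANGE,	/* 1-a) mmap/munmap changing the vma maps */
	LUF_EV_MAKE_WRITABLE,	/* 1-b) mprotect or fault making it writable */
	LUF_EV_FILE_WRITE,	/* 2-a) file data updated through the vfs */
	LUF_EV_READ_ACCESS,	/* assumption: a plain read is not a flush point */
};

/* Record that an unmap happened and its TLB flush was deferred. */
void luf_defer(struct luf_state *s)
{
	s->flush_pending = true;
}

/* Is this one of the points where a deferred flush must be completed? */
bool luf_event_needs_flush(enum luf_event ev)
{
	switch (ev) {
	case LUF_EV_MAP_CHANGE:
	case LUF_EV_MAKE_WRITABLE:
	case LUF_EV_FILE_WRITE:
		return true;
	default:
		return false;
	}
}

/*
 * Complete the deferred flush if one is pending and the event requires
 * it.  Returns true only when a flush was performed by this call.
 */
bool luf_handle_event(struct luf_state *s, enum luf_event ev)
{
	if (!s->flush_pending || !luf_event_needs_flush(ev))
		return false;

	s->flush_pending = false;
	s->flushes++;
	return true;
}
```

In this model a caller would invoke luf_defer() where migration or
reclaim unmaps a folio, and luf_handle_event() at each of the listed
points; the "what I'm missing" entries would correspond to events that
the enum does not yet cover.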