Date: Thu, 11 May 2023 15:02:55 +0100
From: Matthew Wilcox
To: Hugh Dickins
Cc: Andrew Morton, Mike Kravetz, Mike Rapoport, "Kirill A. Shutemov",
    David Hildenbrand, Suren Baghdasaryan, Qi Zheng, Russell King,
    Catalin Marinas, Will Deacon, Geert Uytterhoeven, Greg Ungerer,
    Michal Simek, Thomas Bogendoerfer, Helge Deller, John David Anglin,
    "Aneesh Kumar K.V", Michael Ellerman, Alexandre Ghiti, Palmer Dabbelt,
    Heiko Carstens, Christian Borntraeger, Claudio Imbrenda,
    John Paul Adrian Glaubitz, "David S. Miller", Chris Zankel,
    Max Filippov, Peter Zijlstra, x86@kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
    linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
    linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
    linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michel Lespinasse
Subject: Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail
References: <77a5d8c-406b-7068-4f17-23b7ac53bc83@google.com>

On Wed, May 10, 2023 at 09:35:44PM -0700, Hugh Dickins wrote:
> On Wed, 10 May 2023, Matthew Wilcox wrote:
> > On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote:
> > > Two: pte_offset_map() will need to do an rcu_read_lock(), with the
> > > corresponding rcu_read_unlock() in pte_unmap(). But most architectures
> > > never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
> > > after pte_offset_map(), or have used userspace pte_offset_map() where
> > > pte_offset_kernel() is more correct. No problem in the current tree,
> > > but a problem once an rcu_read_unlock() will be needed to keep balance.
> >
> > Hi Hugh,
> >
> > I shall have to spend some time looking at these patches, but at LSFMM
> > just a few hours ago, I proposed and nobody objected to removing
> > CONFIG_HIGHPTE. I don't intend to take action on that consensus
> > immediately, so I can certainly wait until your patches are applied, but
> > if this information simplifies what you're doing, feel free to act on it.
>
> Thanks a lot, Matthew: very considerate, as usual.
>
> Yes, I did see your "Whither Highmem?" (wither highmem!) proposal on the

I'm glad somebody noticed the pun ;-)

> list, and it did make me think, better get these patches and preview out
> soon, before you get to vanish pte_unmap() altogether. HIGHMEM or not,
> HIGHPTE or not, I think pte_offset_map() and pte_unmap() still have an
> important role to play.
>
> I don't really understand why you're going down a remove-CONFIG_HIGHPTE
> route: I thought you were motivated by the awkwardness of kmap on large
> folios; but I don't see how removing HIGHPTE helps with that at all
> (unless you have a "large page tables" effort in mind, but I doubt it).

Quite right, my primary concern is filesystem metadata; primarily
directories as I don't think anybody has ever supported symlinks or
superblocks larger than 4kB.

I was thinking that removing CONFIG_HIGHPTE might simplify the page
fault handling path a little, but now I've looked at it some more, and
I'm not sure there's any simplification to be had. It should probably
use kmap_local instead of kmap_atomic(), though.

> But I've no investment in CONFIG_HIGHPTE if people think now is the
> time to remove it: I disagree, but wouldn't miss it myself - so long
> as you leave pte_offset_map() and pte_unmap() (under whatever names).
>
> I don't think removing CONFIG_HIGHPTE will simplify what I'm doing.
> For a moment it looked like it would: the PAE case is nasty (and our
> data centres have not been on PAE for a long time, so it wasn't a
> problem I had to face before); and knowing pmd_high must be 0 for a
> page table looked like it would help, but now I'm not so sure of that
> (hmm, I'm changing my mind again as I write).
>
> Peter's pmdp_get_lockless() does rely for complete correctness on
> interrupts being disabled, and I suspect that I may be forced in the
> PAE case to do so briefly; but detest that notion. For now I'm just
> deferring it, hoping for a better idea before third series finalized.
>
> I mention this (and Cc Peter) in passing: don't want this arch thread
> to go down into that rabbit hole: we can start a fresh thread on it if
> you wish, but right now my priority is commit messages for the second
> series, rather than solving (or even detailing) the PAE problem.

I infer that what you need is a pte_access_start() and a
pte_access_end() which look like they can be plausibly rcu_read_lock()
and rcu_read_unlock(), but might need to be local_irq_save() and
local_irq_restore() in some configurations?

We also talked about moving x86 to always RCU-free page tables in
order to make accessing /proc/$pid/smaps lockless. I believe Michel
is going to take a swing at this project.
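
To make the pte_access_start()/pte_access_end() idea above a little
more concrete, here is a rough userspace sketch. It is only an
illustration under stated assumptions, not kernel code and not anything
from the series: the two function names are the hypothetical ones from
the question above, the mock_* helpers and counters are invented
stand-ins for rcu_read_lock()/rcu_read_unlock() and
local_irq_save()/local_irq_restore(), and the need_irqs_off flag is an
assumed way of modelling a configuration (PAE, say) that might need
interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock state: counters stand in for real RCU and interrupt state. */
static int rcu_depth;       /* outstanding mock rcu_read_lock() calls */
static int irqs_off_depth;  /* outstanding mock local_irq_save() calls */

static void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);		/* catches an unbalanced unlock */
	rcu_depth--;
}

static void mock_local_irq_save(void)
{
	irqs_off_depth++;
}

static void mock_local_irq_restore(void)
{
	assert(irqs_off_depth > 0);	/* catches an unbalanced restore */
	irqs_off_depth--;
}

/* Hypothetical token remembering which protection start() took. */
struct pte_access {
	bool irqs_off;
};

/* Hypothetical: pick the protection the configuration needs. */
static struct pte_access pte_access_start(bool need_irqs_off)
{
	struct pte_access acc = { .irqs_off = need_irqs_off };

	if (acc.irqs_off)
		mock_local_irq_save();
	else
		mock_rcu_read_lock();
	return acc;
}

/* Hypothetical: undo exactly what the matching start() did. */
static void pte_access_end(struct pte_access acc)
{
	if (acc.irqs_off)
		mock_local_irq_restore();
	else
		mock_rcu_read_unlock();
}

/* True once every start() has been paired with an end(). */
static bool pte_access_balanced(void)
{
	return rcu_depth == 0 && irqs_off_depth == 0;
}
```

The point of the token is that the caller never needs to know which
flavour of protection was taken; a missing pte_access_end() shows up as
pte_access_balanced() returning false, which is the same imbalance
Hugh describes for a pte_offset_map() without its pte_unmap().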