Date: Wed, 7 Feb 2024 14:17:31 +0000
From: Matthew Wilcox
To: Will Deacon
Cc: Nanyong Sun, Catalin Marinas, muchun.song@linux.dev,
 akpm@linux-foundation.org, anshuman.khandual@arm.com,
 wangkefeng.wang@huawei.com, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
References: <20240113094436.2506396-1-sunnanyong@huawei.com>
 <20240207111252.GA22167@willie-the-truck>
 <20240207121125.GA22234@willie-the-truck>
In-Reply-To: <20240207121125.GA22234@willie-the-truck>

On Wed, Feb 07, 2024 at 12:11:25PM +0000, Will Deacon wrote:
> On Wed, Feb 07, 2024 at 11:21:17AM +0000, Matthew Wilcox wrote:
> > The pte lock cannot be taken in irq context (which I think is what
> > you're asking?)
> > While it is not possible to reason about all users of
> > struct page, we are somewhat relieved of that work by noting that this is
> > only for hugetlbfs, so we don't need to reason about slab, page tables,
> > netmem or zsmalloc.
>
> My concern is that an interrupt handler tries to access a 'struct page'
> which faults due to another core splitting a pmd mapping for the vmemmap.
> In this case, I think we'll end up trying to resolve the fault from irq
> context, which will try to take the spinlock.

Yes, this absolutely can happen (with this patch), and this patch should
be dropped for now.

While this array of ~512 pages has been allocated to hugetlbfs, and one
would think that there would be no way that there could still be
references to them, another CPU can have a pointer to this struct page
(e.g. attempting a speculative page cache reference or
get_user_pages_fast()).  That means it will try to call
atomic_add_unless(&page->_refcount, 1, 0);

Actually, I wonder if this isn't a problem on x86 too?  Do we need to
explicitly go through an RCU grace period before freeing the pages
for use by somebody else?
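
To make the speculative reference concrete, here is a rough userspace
model of what atomic_add_unless(&page->_refcount, 1, 0) does.  This is
only a sketch using C11 atomics, not the kernel implementation; struct
fake_page, model_add_unless(), try_get_ref() and speculative_ref_demo()
are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Userspace model of atomic_add_unless(v, a, u): add 'a' to *v unless
 * *v == u.  Returns true if the add was performed.
 */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);	/* this load already touches the page */

	while (old != u) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* Speculative reference: succeeds only while the refcount is non-zero. */
static bool try_get_ref(struct fake_page *page)
{
	return model_add_unless(&page->refcount, 1, 0);
}

/* Returns the final refcount, which is 0 if the model behaves as described. */
int speculative_ref_demo(void)
{
	struct fake_page page;

	atomic_init(&page.refcount, 1);
	assert(try_get_ref(&page));		/* live page: 1 -> 2 */

	atomic_store(&page.refcount, 0);	/* pretend the page was freed */
	assert(!try_get_ref(&page));		/* freed page: stays at 0 */

	return atomic_load(&page.refcount);
}
```

Note that even the attempt that fails has to load the refcount, so the
memory behind the struct page must be mapped at that moment.  That is
the access that can fault while another CPU is splitting the vmemmap
mapping.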