Date: Wed, 7 Feb 2024 12:44:47 +0000
From: Catalin Marinas
To: Nanyong Sun
Cc: will@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev,
    akpm@linux-foundation.org, anshuman.khandual@arm.com,
    willy@infradead.org, wangkefeng.wang@huawei.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
References: <20240113094436.2506396-1-sunnanyong@huawei.com>

On Sat, Jan 27, 2024 at 01:04:15PM +0800, Nanyong Sun wrote:
> On 2024/1/26 2:06, Catalin Marinas wrote:
> > On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
> > > HVO was previously disabled on arm64 [1] due to the lack of necessary
> > > BBM(break-before-make) logic when changing page tables.
> > > This set of patches fix this by adding necessary BBM sequence when
> > > changing page table, and supporting vmemmap page fault handling to
> > > fixup kernel address translation fault if vmemmap is concurrently accessed.
[...]
> > How often is this code path called? I wonder whether a stop_machine()
> > approach would be simpler.
> As long as allocating or releasing hugetlb is called.
> We cannot limit users to only allocate or release hugetlb
> when booting or not running any workload on all other cpus, so if use
> stop_machine(), it will be triggered
> 8 times every 2M and 4096 times every 1G, which is probably too expensive.

I'm hoping this can be batched somehow and not do a stop_machine()
(or 8) for every 2MB huge page.

Just to make sure I understand - is the goal to be able to free struct
pages corresponding to hugetlbfs pages? Can we not leave the vmemmap in
place and just release that memory to the page allocator? The physical
RAM for those struct pages isn't going anywhere, we just have a vmemmap
alias to it (cacheable).

-- 
Catalin
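
For reference, the "8 times every 2M and 4096 times every 1G" figures quoted
above appear to match the number of base pages of vmemmap that describe one
huge page. The sketch below reproduces that arithmetic. It assumes 4 KiB base
pages and a 64-byte struct page, neither of which is stated in this thread,
and the macro and function names are made up for illustration (they are not
kernel symbols).

```c
/*
 * Assumptions (not taken from the thread): 4 KiB base pages and a
 * 64-byte struct page. Names here are illustrative only.
 */
#define BASE_PAGE_SIZE   4096UL
#define STRUCT_PAGE_SIZE 64UL

/* Number of base pages of vmemmap needed to describe one huge page. */
static unsigned long vmemmap_pages_per_hugepage(unsigned long hugepage_size)
{
	/* One struct page per base page covered by the huge page. */
	unsigned long nr_struct_pages = hugepage_size / BASE_PAGE_SIZE;
	/* Total bytes of struct page metadata for that huge page. */
	unsigned long vmemmap_bytes = nr_struct_pages * STRUCT_PAGE_SIZE;

	return vmemmap_bytes / BASE_PAGE_SIZE;
}
```

Under those assumptions, vmemmap_pages_per_hugepage(2UL << 20) gives 8 and
vmemmap_pages_per_hugepage(1UL << 30) gives 4096, so one stop_machine() per
vmemmap page would scale with those counts unless the updates are batched.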