Date: Tue, 26 Mar 2024 12:54:10 +0000
From: Will Deacon
To: Nanyong Sun
Cc: David Rientjes, Catalin Marinas, Matthew Wilcox, muchun.song@linux.dev,
	Andrew Morton, anshuman.khandual@arm.com, wangkefeng.wang@huawei.com,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Yu Zhao, Yosry Ahmed, Sourav Panda
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
Message-ID: <20240326125409.GA9552@willie-the-truck>
References: <20240113094436.2506396-1-sunnanyong@huawei.com>
	<20240207111252.GA22167@willie-the-truck>
	<44075bc2-ac5f-ffcd-0d2f-4093351a6151@huawei.com>
	<20240208131734.GA23428@willie-the-truck>
	<22c14513-af78-0f1d-5647-384ff9cb5993@huawei.com>
In-Reply-To: <22c14513-af78-0f1d-5647-384ff9cb5993@huawei.com>

On Mon, Mar 25, 2024 at 11:24:34PM +0800, Nanyong Sun wrote:
> On 2024/3/14 7:32, David Rientjes wrote:
> > On Thu, 8 Feb 2024, Will Deacon wrote:
> > > > How about take a new lock with irq disabled during BBM, like:
> > > >
> > > > +void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
> > > > +{
> > > > +    (NEW_LOCK);
> > > > +    pte_clear(&init_mm, addr, ptep);
> > > > +    flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > > > +    set_pte_at(&init_mm, addr, ptep, pte);
> > > > +    spin_unlock_irq(NEW_LOCK);
> > > > +}
> > > I really think the only maintainable way to achieve this is to avoid the
> > > possibility of a fault altogether.
> > >
> > Nanyong, are you still actively working on making HVO possible on arm64?
> >
> > This would yield a substantial memory savings on hosts that are largely
> > configured with hugetlbfs. In our case, the size of this hugetlbfs pool
> > is actually never changed after boot, but it sounds from the thread that
> > there was an idea to make HVO conditional on FEAT_BBM. Is this being
> > pursued?
> >
> > If so, any testing help needed?
> I'm afraid that FEAT_BBM may not solve the problem here, because from Arm
> ARM, I see that FEAT_BBM is only used for changing block size. Therefore,
> in this HVO feature, it can work in the split PMD stage, that is, BBM can
> be avoided in vmemmap_split_pmd, but in the subsequent vmemmap_remap_pte,
> the Output address of PTE still needs to be changed. I'm afraid FEAT_BBM
> is not competent for this stage. Perhaps my understanding of ARM FEAT_BBM
> is wrong, and I hope someone can correct me.
> Actually, the solution I first considered was to use the stop_machine
> method, but we have products that rely on
> /proc/sys/vm/nr_overcommit_hugepages to dynamically use hugepages, so I
> have to consider performance issues. If your product does not change the
> amount of huge pages after booting, using stop_machine() may be a feasible
> way.
> So far, I still haven't come up with a good solution.

Oh, I hadn't appreciated that you needed to remap the memmap live. How do
you synchronise the two copies in that case? I think we (i.e. the arch
folks) probably need some more explanation on exactly who can race with
what here, otherwise I don't grok how this can work.

Thanks,

Will