Date: Tue, 24 Dec 2024 14:09:51 +0000
From: Catalin Marinas
To: Zhenhua Huang
Cc: will@kernel.org, ardb@kernel.org, ryan.roberts@arm.com,
	mark.rutland@arm.com, joey.gouly@arm.com, dave.hansen@linux.intel.com,
	akpm@linux-foundation.org, chenfeiyang@loongson.cn, chenhuacai@kernel.org,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Anshuman Khandual
Subject: Re: [PATCH v2 1/2] arm64: mm: vmemmap populate to page level if not section aligned
References: <20241209094227.1529977-1-quic_zhenhuah@quicinc.com>
	<20241209094227.1529977-2-quic_zhenhuah@quicinc.com>

On Tue, Dec 24, 2024 at 05:32:06PM +0800, Zhenhua Huang wrote:
> Thanks Catalin for review!
> Merry Christmas.

Merry Christmas to you too!

> On 2024/12/21 2:30, Catalin Marinas wrote:
> > On Mon, Dec 09, 2024 at 05:42:26PM +0800, Zhenhua Huang wrote:
> > > Fixes: c1cc1552616d ("arm64: MMU initialisation")
> >
> > I wouldn't add a fix for the first commit adding arm64 support, we did
> > not even have memory hotplug at the time (added later in 5.7 by commit
> > bbd6ec605c0f ("arm64/mm: Enable memory hot remove")). IIUC, this hasn't
> > been a problem until commit ba72b4c8cf60 ("mm/sparsemem: support
> > sub-section hotplug"). That commit broke some arm64 assumptions.
>
> Shall we add ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> because it broke arm64 assumptions ?

Yes, I think that would be better. And a cc stable to 5.4 (the above
commit appeared in 5.3).

> > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > index e2739b69e11b..fd59ee44960e 100644
> > > --- a/arch/arm64/mm/mmu.c
> > > +++ b/arch/arm64/mm/mmu.c
> > > @@ -1177,7 +1177,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> > >  {
> > >  	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
> > >
> > > -	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
> > > +	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
> > > +	!IS_ALIGNED(page_to_pfn((struct page *)start), PAGES_PER_SECTION) ||
> > > +	!IS_ALIGNED(page_to_pfn((struct page *)end), PAGES_PER_SECTION))
> > >  		return vmemmap_populate_basepages(start, end, node, altmap);
> > >  	else
> > >  		return vmemmap_populate_hugepages(start, end, node, altmap);
> >
> > An alternative would be to fix unmap_hotplug_pmd_range() etc. to avoid
> > nuking the whole vmemmap pmd section if it's not empty. Not sure how
> > easy that is, whether we have the necessary information (I haven't
> > looked in detail).
> >
> > A potential issue - can we hotplug 128MB of RAM and only unplug 2MB? If
> > that's possible, the problem isn't solved by this patch.
>
> Indeed, seems there is no guarantee that plug size must be equal to unplug
> size...
>
> I have two ideas:
> 1. Completely disable this PMD mapping optimization since there is no
> guarantee we must align 128M memory for hotplug ..

I'd be in favour of this, at least if CONFIG_MEMORY_HOTPLUG is enabled.
I think the only advantage here is that we don't allocate a full 2MB
block for vmemmap when only plugging in a sub-section.

> 2. If we want to take this optimization.
> I propose adding one argument to vmemmap_free to indicate if the entire
> section is freed(based on subsection map). Vmemmap_free is a common function
> and might affect other architectures... The process would be:
> vmemmap_free
> 	unmap_hotplug_range //In unmap_hotplug_pmd_range() as you mentioned:if
> whole section is freed, proceed as usual. Otherwise, *just clear out struct
> page content but do not free*.
> 	free_empty_tables // will be called only if entire section is freed
>
> On the populate side,
> 	else if (vmemmap_check_pmd(pmd, node, addr, next)) //implement this function
> 		continue;	//Buffer still exists, just abort..
>
> Could you please comment further whether #2 is feasible ?

vmemmap_free() already gets start/end, so it could at least check the
alignment and avoid freeing if it's not unplugging a full section. It
does leave a 2MB vmemmap block in place when freeing the last subsection
but it's safer than freeing valid struct page entries. In addition, it
could query the memory hotplug state with something like
find_memory_block() and figure out whether the section is empty.

Anyway, I'll be off until the new year, maybe I get other ideas by then.

-- 
Catalin
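
As a rough illustration of the alignment check suggested above for
vmemmap_free(), the userspace C sketch below tests whether a vmemmap
range covers only whole sections. It is not kernel code. The constants
assume arm64 with 4K pages, 128MB memory sections and a 64-byte struct
page (hence 2MB of vmemmap per section), and the helper name
vmemmap_range_is_full_sections() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for arm64 with 4K pages; adjust for other configurations. */
#define PAGE_SHIFT           12
#define SECTION_SIZE_BITS    27   /* 128MB memory sections */
#define PAGES_PER_SECTION    (1ULL << (SECTION_SIZE_BITS - PAGE_SHIFT))
#define STRUCT_PAGE_SIZE     64ULL /* assumed sizeof(struct page) */
/* vmemmap bytes needed to describe one full section: 32768 * 64 = 2MB */
#define SECTION_VMEMMAP_SIZE (PAGES_PER_SECTION * STRUCT_PAGE_SIZE)

/* True if x is a multiple of a; a must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical helper: returns true only when [start, end) in vmemmap
 * space is non-empty and both ends sit on a section-sized vmemmap
 * boundary, i.e. the range describes whole sections and a PMD-mapped
 * 2MB block could be freed without dropping live struct page entries.
 */
static bool vmemmap_range_is_full_sections(uint64_t start, uint64_t end)
{
	assert(start <= end);
	return start < end &&
	       is_aligned(start, SECTION_VMEMMAP_SIZE) &&
	       is_aligned(end, SECTION_VMEMMAP_SIZE);
}
```

With a 2MB-aligned base address, a range of SECTION_VMEMMAP_SIZE bytes
passes the check, while the 32KB of vmemmap backing a single 2MB
sub-section (512 pages * 64 bytes) does not, so under this sketch the
block would be left in place.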