Message-ID: <35a0dad6-4f3b-f2c3-f835-b13c1e899f8d@huawei.com>
Date: Mon, 24 Jul 2023 09:25:22 +0800
From: mawupeng
Subject: Re: [RFC PATCH] arm64: mm: Fix kernel page tables incorrectly deleted during memory removal
In-Reply-To: <20230721103628.GA12601@willie-the-truck>
References: <20230717115150.1806954-1-mawupeng1@huawei.com> <20230721103628.GA12601@willie-the-truck>

On 2023/7/21 18:36, Will Deacon wrote:
> On Mon, Jul 17, 2023 at 07:51:50PM +0800, Wupeng Ma wrote:
>> From: Ma Wupeng
>>
>> During our test, we found that kernel page table may be unexpectedly
>> cleared with rodata off.
>> The root cause is that the kernel page is
>> initialized with pud size(1G block mapping) while offline is memory
>> block size(MIN_MEMORY_BLOCK_SIZE 128M), eg, if 2G memory is hot-added,
>> when offline a memory block, the call trace is shown below,
>>
>>  offline_and_remove_memory
>>   try_remove_memory
>>    arch_remove_memory
>>     __remove_pgd_mapping
>>      unmap_hotplug_range
>>       unmap_hotplug_p4d_range
>>        unmap_hotplug_pud_range
>>         if (pud_sect(pud))
>>          pud_clear(pudp);
>
> Sorry, but I'm struggling to understand the problem here. If we're adding
> and removing a 2G memory region, why _wouldn't_ we want to use large 1GiB
> mappings?
> Or are you saying that only a subset of the memory is removed,
> but we then accidentally unmap the whole thing?

Yes, we unmap only a subset, but the page table entry for the whole thing
is removed.

>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 95d360805f8a..44c724ce4f70 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -44,6 +44,7 @@
>>  #define NO_BLOCK_MAPPINGS BIT(0)
>>  #define NO_CONT_MAPPINGS BIT(1)
>>  #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
>> +#define NO_PUD_MAPPINGS BIT(3)
>>
>>  int idmap_t0sz __ro_after_init;
>>
>> @@ -344,7 +345,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
>>           */
>>          if (pud_sect_supported() &&
>>             ((addr | next | phys) & ~PUD_MASK) == 0 &&
>> -            (flags & NO_BLOCK_MAPPINGS) == 0) {
>> +            (flags & (NO_BLOCK_MAPPINGS | NO_PUD_MAPPINGS)) == 0) {
>>              pud_set_huge(pudp, phys, prot);
>>
>>              /*
>> @@ -1305,7 +1306,7 @@ struct range arch_get_mappable_range(void)
>>  int arch_add_memory(int nid, u64 start, u64 size,
>>              struct mhp_params *params)
>>  {
>> -    int ret, flags = NO_EXEC_MAPPINGS;
>> +    int ret, flags = NO_EXEC_MAPPINGS | NO_PUD_MAPPINGS;
>
> I think we should allow large mappings here and instead prevent partial
> removal of the block, if that's what is causing the issue.

This could solve this problem. Or we can prevent partial removal? Or
rebuild the page table entries for the part that is not removed?

>
> Will
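
To make the size mismatch concrete, here is a small userspace sketch of the arithmetic, not kernel code. It assumes a 4K-page configuration where a PUD block entry covers 1 GiB and a memory block is 128 MiB, as in the report above. All SKETCH_* constants and sketch_* functions are hypothetical names made up for illustration; they do not exist in arch/arm64/mm/mmu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry (4K pages); illustrative values, not taken from kernel headers. */
#define SKETCH_PUD_SHIFT   30
#define SKETCH_PUD_SIZE    (UINT64_C(1) << SKETCH_PUD_SHIFT)  /* 1 GiB block mapping */
#define SKETCH_PUD_MASK    (~(SKETCH_PUD_SIZE - 1))
#define SKETCH_BLOCK_SIZE  (UINT64_C(128) << 20)              /* 128 MiB memory block */

/*
 * Bytes that lose their linear mapping if every PUD block entry touched by
 * [start, start + size) is cleared whole, which is what the
 * pud_sect()/pud_clear() path in the quoted call trace does.
 */
static uint64_t sketch_bytes_unmapped(uint64_t start, uint64_t size)
{
    assert(start + size >= start);  /* reject ranges that wrap around */

    uint64_t first = start & SKETCH_PUD_MASK;
    uint64_t last = (start + size + SKETCH_PUD_SIZE - 1) & SKETCH_PUD_MASK;

    return last - first;
}

/*
 * True when the range does not consist of whole PUD blocks. One possible
 * form of the "prevent partial removal" check: refuse the removal (or fall
 * back to splitting the mapping) when this returns true.
 */
static bool sketch_is_partial_removal(uint64_t start, uint64_t size)
{
    return ((start | size) & ~SKETCH_PUD_MASK) != 0;
}
```

With a 2 GiB region hot-added at 4 GiB, sketch_bytes_unmapped(0x100000000, SKETCH_BLOCK_SIZE) evaluates to 1 GiB although only 128 MiB was requested, and sketch_is_partial_removal() returns true for that range. Removing the full 2 GiB is not partial and unmaps exactly 2 GiB.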