From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 31 Jul 2023 09:39:37 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.10.1
Subject: Re: [PATCH] mm: disable kernelcore=mirror when no mirror memory
Content-Language: en-US
To: Mike Rapoport
CC: Wupeng Ma , ,
References: <20230728040124.4093229-1-mawupeng1@huawei.com> <20230729081218.GH1901145@kernel.org> <20230730065353.GJ1901145@kernel.org>
From: Kefeng Wang
In-Reply-To: <20230730065353.GJ1901145@kernel.org>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 2023/7/30 14:53, Mike Rapoport wrote:
> On Sat, Jul 29, 2023 at 04:57:17PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2023/7/29 16:12, Mike Rapoport wrote:
>>> On Fri, Jul 28, 2023 at 12:01:24PM +0800, Wupeng Ma wrote:
>>>> From: Ma Wupeng
>>>>
>>>> For system with kernelcore=mirror enabled while no mirrored memory is
>>>> reported by efi. This could lead to kernel OOM during startup since
>>>> all memory beside zone DMA are in the movable zone and this prevents
>>>> the kernel to use it.
>>>>
>>>> Zone DMA/DMA32 initialization is independent of mirrored memory and
>>>> their max pfn is set in zone_sizes_init(). Since kernel can fallback
>>>> to zone DMA/DMA32 if there is no memory in zone Normal, these zones
>>>> are seen as mirrored memory no matter what their memory attributes are.
>>>
>>> Using kernelcore= and movablecore= always come with the risk there will be
>>> too little memory for the kernel to use. Even if EFI reports mirrored memory
>>> it's possible to have OOM with kernelcore=mirror because there could be
>>> just not enough mirrored memory.
>>
>> Yes, this is a big problem, could we add an option to move some
>> ZONE_MOVABLE pages into ZONE_NORMAL(MIGRATE_MOVABLE) when low free
>> memory?
>
>>>> To solve this problem, disable kernelcore=mirror when there is no real
>>>> mirrored memory exists.
>>>>
>>>> Signed-off-by: Ma Wupeng
>>>> ---
>>>>   mm/internal.h | 2 ++
>>>>   mm/memblock.c | 2 +-
>>>>   mm/mm_init.c  | 6 +++++-
>>>>   3 files changed, 8 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/internal.h b/mm/internal.h
>>>> index a7d9e980429a..98a03ac74ca7 100644
>>>> --- a/mm/internal.h
>>>> +++ b/mm/internal.h
>>>> @@ -374,6 +374,8 @@ static inline void clear_zone_contiguous(struct zone *zone)
>>>>   	zone->contiguous = false;
>>>>   }
>>>> +extern bool system_has_some_mirror;
>>>> +
>>>>   extern int __isolate_free_page(struct page *page, unsigned int order);
>>>>   extern void __putback_isolated_page(struct page *page, unsigned int order,
>>>>   				    int mt);
>>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>>> index f9e61e565a53..e7a7a65415fb 100644
>>>> --- a/mm/memblock.c
>>>> +++ b/mm/memblock.c
>>>> @@ -156,10 +156,10 @@ static __refdata struct memblock_type *memblock_memory = &memblock.memory;
>>>>   	} while (0)
>>>>   static int memblock_debug __initdata_memblock;
>>>> -static bool system_has_some_mirror __initdata_memblock;
>>>>   static int memblock_can_resize __initdata_memblock;
>>>>   static int memblock_memory_in_slab __initdata_memblock;
>>>>   static int memblock_reserved_in_slab __initdata_memblock;
>>>> +bool system_has_some_mirror __initdata_memblock;
>>>>   static enum memblock_flags __init_memblock choose_memblock_flags(void)
>>>>   {
>>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>>> index a1963c3322af..6267b9f75927 100644
>>>> --- a/mm/mm_init.c
>>>> +++ b/mm/mm_init.c
>>>> @@ -269,7 +269,11 @@ static int __init cmdline_parse_kernelcore(char *p)
>>>>   {
>>>>   	/* parse kernelcore=mirror */
>>>>   	if (parse_option_str(p, "mirror")) {
>>>> -		mirrored_kernelcore = true;
>>>> +		if (system_has_some_mirror)
>>>> +			mirrored_kernelcore = true;
>>>
>>> On many architectures early parameters are parsed before memblock is setup,
>>> so system_has_some_mirror will always be true.
>>
>> Only x86/arm64 support kernelcore=mirror, system_has_some_mirror is
>> false by default, so it should no issue for now, but it is better to
>> move this check into find_zone_movable_pfns_for_nodes().
>
> Sorry, I meant that system_has_some_mirror is false by default, and both
> x86/arm64 parse early parameters before they set up memblock, so
> system_has_some_mirror will be always false at this point.

Clear, so let's move the check into find_zone_movable_pfns_for_nodes()
(not tested). Is this OK?

diff --git a/mm/internal.h b/mm/internal.h
index a7d9e980429a..1599becc9079 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1005,6 +1005,7 @@ static inline bool gup_must_unshare(struct vm_area_struct *vma,
 }
 
 extern bool mirrored_kernelcore;
+extern bool system_has_some_mirror;
 
 static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)
 {
diff --git a/mm/memblock.c b/mm/memblock.c
index f9e61e565a53..e7a7a65415fb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -156,10 +156,10 @@ static __refdata struct memblock_type *memblock_memory = &memblock.memory;
 	} while (0)
 
 static int memblock_debug __initdata_memblock;
-static bool system_has_some_mirror __initdata_memblock;
 static int memblock_can_resize __initdata_memblock;
 static int memblock_memory_in_slab __initdata_memblock;
 static int memblock_reserved_in_slab __initdata_memblock;
+bool system_has_some_mirror __initdata_memblock;
 
 static enum memblock_flags __init_memblock choose_memblock_flags(void)
 {
diff --git a/mm/mm_init.c b/mm/mm_init.c
index a1963c3322af..c444da6065a6 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -377,6 +377,11 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	if (mirrored_kernelcore) {
 		bool mem_below_4gb_not_mirrored = false;
 
+		if (!system_has_some_mirror) {
+			pr_warn("The system has no mirror memory, ignore kernelcore=mirror.\n");
+			goto out;
+		}
+
 		for_each_mem_region(r) {
 			if (memblock_is_mirror(r))
 				continue;