From: Kefeng Wang
To: Greg KH
Subject: Re: [PATCH v4 2/3] arm64: Support page mapping percpu first chunk allocator
Date: Fri, 17 Sep 2021 15:24:31 +0800
References: <20210910053354.26721-1-wangkefeng.wang@huawei.com> <20210910053354.26721-3-wangkefeng.wang@huawei.com> <9b2e89c4-a821-8657-0ffb-d822aa51936c@huawei.com>

On 2021/9/17
15:04, Greg KH wrote:
> On Fri, Sep 17, 2021 at 02:55:18PM +0800, Kefeng Wang wrote:
>> On 2021/9/17 14:24, Greg KH wrote:
>>> On Fri, Sep 10, 2021 at 01:33:53PM +0800, Kefeng Wang wrote:
>>>> The percpu embedded first chunk allocator is the first option, but it
>>>> can fail on ARM64, e.g.,
>>>>   "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
>>>>   "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
>>>>   "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"
>>>> and then we can hit "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838";
>>>> the system may even fail to boot.
>>>>
>>>> Let's implement the page mapping percpu first chunk allocator as a fallback
>>>> to the embedding allocator to increase the robustness of the system.
>>>>
>>>> Reviewed-by: Catalin Marinas
>>>> Signed-off-by: Kefeng Wang
>>>> ---
>>>>  arch/arm64/Kconfig       |  4 ++
>>>>  drivers/base/arch_numa.c | 82 +++++++++++++++++++++++++++++++++++-----
>>>>  2 files changed, 76 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index 077f2ec4eeb2..04cfe1b4e98b 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -1042,6 +1042,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>>>  	def_bool y
>>>>  	depends on NUMA
>>>> +config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>>> +	def_bool y
>>>> +	depends on NUMA
>>> Why is this a config option at all?
>> The config option was introduced by
>>
>> commit 08fc45806103e59a37418e84719b878f9bb32540
>> Author: Tejun Heo
>> Date:   Fri Aug 14 15:00:49 2009 +0900
>>
>>     percpu: build first chunk allocators selectively
>>
>>     There's no need to build unused first chunk allocators in. Define
>>     CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them
>>     selectively.
>>
>> For now, three architectures support both PER_CPU_EMBED_FIRST_CHUNK
>> and PER_CPU_PAGE_FIRST_CHUNK:
>>
>>   arch/powerpc/Kconfig:config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>   arch/sparc/Kconfig:config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>   arch/x86/Kconfig:config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>
>> and we have a command-line parameter to choose an allocator:
>>
>>   percpu_alloc=   Select which percpu first chunk allocator to use.
>>                   Currently supported values are "embed" and "page".
>>                   Archs may support subset or none of the selections.
>>                   See comments in mm/percpu.c for details on each
>>                   allocator.  This parameter is primarily for debugging
>>                   and performance comparison.
>>
>> The embed percpu first chunk allocator is the first choice, but it can
>> fail with some memory layouts (this does occur on ARM64 too), so the
>> page mapping percpu first chunk allocator serves as a fallback; that is
>> what this patch does.
>>
>>>> +
>>>>  source "kernel/Kconfig.hz"
>>>>  config ARCH_SPARSEMEM_ENABLE
>>>> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
>>>> index 46c503486e96..995dca9f3254 100644
>>>> --- a/drivers/base/arch_numa.c
>>>> +++ b/drivers/base/arch_numa.c
>>>> @@ -14,6 +14,7 @@
>>>>  #include
>>>>  #include
>>>> +#include
>>>>  struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
>>>>  EXPORT_SYMBOL(node_data);
>>>> @@ -168,22 +169,83 @@ static void __init pcpu_fc_free(void *ptr, size_t size)
>>>>  	memblock_free_early(__pa(ptr), size);
>>>>  }
>>>> +#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
>>> Ick, no #ifdef in .c files if at all possible please.
>> drivers/base/arch_numa.c is shared by RISCV and ARM64, so I added this
>> config option so that this part is not built on RISCV.
> Ok, then you need to get reviews from the mm people as I know nothing
> about this at all, sorry.  This file ended up in drivers/base/ for some
> reason to make it easier for others to use cross-arches, not that it had
> much to do with the driver core :(

OK, I have Cc'ed Andrew and the mm list ;)

Hi Catalin and Will, this patchset mostly changes arm64, and the change
itself is not too big. Could you pick it up through the arm64 tree if
there are no more comments? Many thanks.

>
> thanks,
>
> greg k-h
> .
>
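
For readers following the thread, the embed-then-page fallback discussed above can be sketched in plain user-space C. This is only an illustration, not the code from the patch: the enum and both function names are hypothetical, and the 75% threshold is an assumption based on my reading of the check in pcpu_embed_first_chunk() in mm/percpu.c, which prints the "max_distance ... too large for vmalloc space" message quoted in the commit log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of the first chunk setup. All names here are illustrative only. */
enum fc_choice {
	FC_EMBED,      /* embed allocator accepted the layout */
	FC_EMBED_WARN, /* layout too spread out, but no fallback is built in */
	FC_PAGE        /* embed rejected the layout, page allocator used */
};

/*
 * Assumed sanity check: the distance between the lowest and highest
 * per-CPU unit should not exceed 75% of the vmalloc space. Dividing
 * before multiplying keeps the arithmetic clear of 64-bit overflow.
 */
static bool embed_distance_ok(uint64_t max_distance, uint64_t vmalloc_total)
{
	return max_distance <= vmalloc_total / 4 * 3;
}

/*
 * Try embed first. If the distance check fails and the page allocator is
 * built in (the role CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK plays in the
 * thread), fall back to it; otherwise embed is kept with only a warning.
 */
static enum fc_choice choose_first_chunk(uint64_t max_distance,
					 uint64_t vmalloc_total,
					 bool have_page_fallback)
{
	if (embed_distance_ok(max_distance, vmalloc_total))
		return FC_EMBED;

	return have_page_fallback ? FC_PAGE : FC_EMBED_WARN;
}
```

With the first pair of values from the quoted log (max_distance=0x5fcfdc640000, vmalloc space 0x781fefff0000), the check fails, so choose_first_chunk() picks FC_PAGE when the fallback is available and FC_EMBED_WARN when it is not.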