Date: Wed, 26 Feb 2025 02:09:15 +0000
From: Wei Yang
To: Mike Rapoport
Cc: Wei Yang, linux-kernel@vger.kernel.org, Alexander Graf, Andrew Morton,
 Andy Lutomirski, Anthony Yznaga, Arnd Bergmann, Ashish Kalra,
 Benjamin Herrenschmidt, Borislav Petkov, Catalin Marinas, Dave Hansen,
 David Woodhouse, Eric Biederman, Ingo Molnar, James Gowans,
 Jonathan Corbet, Krzysztof Kozlowski, Mark Rutland, Paolo Bonzini,
 Pasha Tatashin, "H. Peter Anvin", Peter Zijlstra, Pratyush Yadav,
 Rob Herring, Rob Herring, Saravana Kannan, Stanislav Kinsburskii,
 Steven Rostedt, Thomas Gleixner, Tom Lendacky, Usama Arif, Will Deacon,
 devicetree@vger.kernel.org, kexec@lists.infradead.org,
 linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH v4 02/14] memblock: add MEMBLOCK_RSRV_KERN flag
Message-ID: <20250226020915.ytxusrrl7rv4g64l@master>
Reply-To: Wei Yang
References: <20250206132754.2596694-1-rppt@kernel.org>
 <20250206132754.2596694-3-rppt@kernel.org>
 <20250218155004.n53fcuj2lrl5rxll@master>
 <20250224013131.fzz552bn7fs64umq@master>

On Tue, Feb 25, 2025 at 09:46:28AM +0200, Mike Rapoport wrote:
>On Mon, Feb 24, 2025 at 01:31:31AM +0000, Wei Yang wrote:
>> On Wed, Feb 19, 2025 at 09:24:31AM +0200, Mike Rapoport wrote:
>> >Hi,
>> >
>> >On Tue, Feb 18, 2025 at 03:50:04PM +0000, Wei Yang wrote:
>> >> On Thu, Feb 06, 2025 at 03:27:42PM +0200, Mike Rapoport wrote:
>> >> >From: "Mike Rapoport (Microsoft)"
>> >> >
>> >> >to denote areas that were reserved for kernel use either directly with
>> >> >memblock_reserve_kern() or via memblock allocations.
>> >> >
>> >> >Signed-off-by: Mike Rapoport (Microsoft)
>> >> >---
>> >> > include/linux/memblock.h | 16 +++++++++++++++-
>> >> > mm/memblock.c            | 32 ++++++++++++++++++++++++--------
>> >> > 2 files changed, 39 insertions(+), 9 deletions(-)
>> >> >
>> >> >diff --git a/include/linux/memblock.h b/include/linux/memblock.h
>> >> >index e79eb6ac516f..65e274550f5d 100644
>> >> >--- a/include/linux/memblock.h
>> >> >+++ b/include/linux/memblock.h
>> >> >@@ -50,6 +50,7 @@ enum memblock_flags {
>> >> >     MEMBLOCK_NOMAP = 0x4, /* don't add to kernel direct mapping */
>> >> >     MEMBLOCK_DRIVER_MANAGED = 0x8, /* always detected via a driver */
>> >> >     MEMBLOCK_RSRV_NOINIT = 0x10, /* don't initialize struct pages */
>> >> >+    MEMBLOCK_RSRV_KERN = 0x20, /* memory reserved for kernel use */
>> >>
>> >> Above memblock_flags, there are comments on explaining those flags.
>> >>
>> >> Seems we miss it for MEMBLOCK_RSRV_KERN.
>> >
>> >Right, thanks!
>> >
>> >> >
>> >> > #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
>> >> >@@ -1459,14 +1460,14 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
>> >> > again:
>> >> >     found = memblock_find_in_range_node(size, align, start, end, nid,
>> >> >                                         flags);
>> >> >-    if (found && !memblock_reserve(found, size))
>> >> >+    if (found && !__memblock_reserve(found, size, nid, MEMBLOCK_RSRV_KERN))
>> >>
>> >> Maybe we could use memblock_reserve_kern() directly. If my understanding is
>> >> correct, the reserved region's nid is not used.
>> >
>> >We use nid of reserved regions in reserve_bootmem_region() (commit
>> >61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")) but KHO needs to
>> >know the distribution of reserved memory among the nodes before
>> >memmap_init_reserved_pages().
>> >
>> >> BTW, one question here. How we handle concurrent memblock allocation? If two
>> >> threads find the same available range and do the reservation, it seems to be a
>> >> problem to me. Or I missed something?
>> >
>> >memblock allocations end before smp_init(), there is no possible concurrency.
>> >
>>
>> Thanks, I still have one question here.
>>
>> Below is a simplified call flow.
>>
>>     mm_core_init()
>>         mem_init()
>>             memblock_free_all()
>>                 free_low_memory_core_early()
>>                     memmap_init_reserved_pages()
>>                         memblock_set_node(..., memblock.reserved, )   --- (1)
>>                     __free_memory_core()
>>         kmem_cache_init()
>>             slab_state = UP;                                          --- (2)
>>
>> And memblock_alloc_range_nid() is not supposed to be called after
>> slab_is_available(). Even if someone does call it, it will get memory from
>> slab instead of from a reserved region in memblock.
>>
>> From the above call flow and background, there are three cases when
>> memblock_alloc_range_nid() would be called:
>>
>>   * If it is called before (1), memblock.reserved's nid would be adjusted
>>     correctly.
>>   * If it is called after (2), we don't touch memblock.reserved.
>>   * If it happens between (1) and (2), it looks like it would break the
>>     consistency of nid information in memblock.reserved, because when we
>>     use memblock_reserve_kern(), NUMA_NO_NODE would be stored in the region.
>>
>> So my question is if the third case happens, would it introduce a bug? If it
>> won't happen, seems we don't need to specify the nid here?
>
>We don't really care about proper assignment of nodes between (1) and (2)
>from one side and the third case does not happen on the other side. Nothing
>should call memblock_alloc() after memblock_free_all().
>

My point is: if nothing calls memblock_alloc() after memblock_free_all(),
and memblock_free_all() already sets the nid in memblock.reserved properly,
then it seems unnecessary for memblock_alloc() to call __memblock_reserve()
with the exact nid, as this patch does with

    __memblock_reserve(found, size, nid, MEMBLOCK_RSRV_KERN)

>But it's easy to make the window between (1) and (2) disappear by replacing
>checks for slab_is_available() in memblock with a variable local to
>memblock.
>
>> -- 
>> Wei Yang
>> Help you, Help me
>
>-- 
>Sincerely yours,
>Mike.
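
To make the three cases from the call flow concrete, here is a minimal
userspace sketch in C. It is not kernel code and not how memblock is
implemented: struct model, model_reserve(), model_set_nodes(),
model_slab_up(), count_no_node(), addr_to_nid() and the PHASE_* names are
all hypothetical stand-ins, and the address-to-node rule is invented. It
only models the ordering argument: a reservation made without a nid is
fixed up at (1), is left as NUMA_NO_NODE if it lands between (1) and (2),
and never reaches the reserved list after (2).

```c
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)  /* same name and value as the kernel constant */
#define MAX_REGIONS  8     /* arbitrary capacity for this toy model */

/* Hypothetical phases matching markers (1) and (2) in the call flow. */
enum boot_phase {
	PHASE_EARLY,    /* before (1) */
	PHASE_WINDOW,   /* between (1) and (2) */
	PHASE_SLAB_UP,  /* after (2) */
};

struct region {
	unsigned long base;
	unsigned long size;
	int nid;
};

/* Toy stand-in for memblock.reserved plus the current boot phase. */
struct model {
	struct region reserved[MAX_REGIONS];
	size_t cnt;
	enum boot_phase phase;
};

/* Invented rule: addresses below 1 MiB belong to node 0, the rest to node 1. */
static int addr_to_nid(unsigned long base)
{
	return base < 0x100000UL ? 0 : 1;
}

/*
 * Models a reservation that records no node (NUMA_NO_NODE).
 * After (2) the request would be served by slab, so the reserved
 * list is left untouched and false is returned.
 */
static bool model_reserve(struct model *m, unsigned long base,
			  unsigned long size)
{
	if (m->phase == PHASE_SLAB_UP || m->cnt == MAX_REGIONS)
		return false;

	m->reserved[m->cnt].base = base;
	m->reserved[m->cnt].size = size;
	m->reserved[m->cnt].nid = NUMA_NO_NODE;
	m->cnt++;
	return true;
}

/* Models step (1): assign a node to every region reserved so far. */
static void model_set_nodes(struct model *m)
{
	for (size_t i = 0; i < m->cnt; i++)
		m->reserved[i].nid = addr_to_nid(m->reserved[i].base);
	m->phase = PHASE_WINDOW;
}

/* Models step (2): slab becomes available. */
static void model_slab_up(struct model *m)
{
	m->phase = PHASE_SLAB_UP;
}

/* Counts reserved regions that still carry NUMA_NO_NODE. */
static size_t count_no_node(const struct model *m)
{
	size_t n = 0;

	for (size_t i = 0; i < m->cnt; i++)
		if (m->reserved[i].nid == NUMA_NO_NODE)
			n++;
	return n;
}
```

In this model, a region reserved in PHASE_EARLY ends up with a node after
model_set_nodes(), while one reserved in PHASE_WINDOW keeps NUMA_NO_NODE.
That is the third case I am asking about; if it cannot happen, the nid
passed at reservation time looks redundant.
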
-- 
Wei Yang
Help you, Help me