Message-ID: <736ca451-8adc-4c5c-b721-6b78eaeb4699@linux.ibm.com>
Date: Fri, 11 Apr 2025 17:06:55 +0530
Subject: Re: [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set
To: Mike Rapoport
Cc: Greg Kroah-Hartman, linux-kernel@vger.kernel.org, David Hildenbrand, Andrew Morton, linux-mm@kvack.org, Ritesh Harjani, rafael@kernel.org, Danilo Krummrich
References: <50142a29010463f436dc5c4feb540e5de3bb09df.1744175097.git.donettom@linux.ibm.com>
From: Donet Tom

On 4/11/25 4:29 PM, Mike Rapoport wrote:
> On Fri, Apr 11, 2025 at 12:27:28AM +0530, Donet Tom wrote:
>> On 4/10/25 1:37 PM, Mike Rapoport wrote:
>>> On Wed, Apr 09, 2025 at 10:57:57AM +0530, Donet Tom wrote:
>>>> In the current implementation, when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
>>>> set, we iterate over all PFNs in the memory block and use
>>>> early_pfn_to_nid to find the NID until a match is found.
>>>>
>>>> In this patch, we use curr_node_memblock_intersect_memory_block() to
>>>> check if the current node's memblock intersects with the memory block
>>>> passed when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. If an intersection
>>>> is found, the memory block is added to the current node.
>>>>
>>>> If CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the existing mechanism
>>>> for finding the NID will continue to be used.
>>> I don't think we really need different mechanisms for different settings of
>>> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>>>
>>> node_dev_init() runs after all struct pages are already initialized and can
>>> always use pfn_to_nid().
>>
>> In the current implementation, if CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, we perform a binary search in the memblock region to
>> determine the pfn's nid. Otherwise, we use pfn_to_nid() to obtain
>> the pfn's nid.
>>
>> Your point is that we could unify this logic and always use
>> pfn_to_nid() to determine the pfn's nid, regardless of whether
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Is that
>> correct?
> Yes, struct pages should be ready by the time node_dev_init() is called
> even when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.

OK. Thanks, Mike.

>
>>> kernel_init_freeable() ->
>>>     page_alloc_init_late(); /* completes initialization of deferred pages */
>>>     ...
>>>     do_basic_setup() ->
>>>         driver_init() ->
>>>             node_dev_init();
>>>
>>> The next step could be refactoring register_mem_block_under_node_early() to
>>> loop over memblock regions rather than over pfns.
>> So in the current implementation
>>
>> node_dev_init()
>>     register_one_node
>>         register_memory_blocks_under_node
>>             walk_memory_blocks()
>>                 register_mem_block_under_node_early
>>                     get_nid_for_pfn
>>
>> We get each node's start and end PFN from the pg_data. Using these
>> values, we determine the memory block's start and end within the
>> current node. To identify the node to which these memory blocks
>> belong, we iterate over each PFN in the range.
>>
>> The problem I am facing is,
>>
>> In my system node4 has memory blocks ranging from memory30351
>> to memory38524, plus memory128433. The memory blocks between
>> memory38524 and memory128433 do not belong to this node.
>>
>> In walk_memory_blocks() we iterate over all memory blocks starting
>> from memory38524 to memory128433.
>> In register_mem_block_under_node_early(), up to memory38524, the
>> first pfn correctly returns the corresponding nid and the function
>> returns from there. But after memory38524 and until memory128433,
>> the loop iterates through each pfn and checks the nid. Since the nid
>> does not match the required nid, the loop continues. This causes
>> the soft lockups.
>>
>> This issue occurs only when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, as a binary search is used to determine the PFN's nid. When
>> this configuration is disabled, pfn_to_nid is faster, and the issue is
>> not seen (faster because the nid is read from the page).
>>
>> To speed up the code when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, I added this function that iterates over all memblock regions
>> for each memory block to determine its nid.
>>
>> "Loop over memblock regions instead of iterating over PFNs" -
>> My question is - in register_one_node, do you mean that we should iterate
>> over all memblock regions, identify the regions belonging to the current
>> node, and then retrieve the corresponding memory blocks to register them
>> under that node?
> I looked more closely at register_mem_block_under_node_early() and
> iteration over memblock regions won't make sense there.
>
> It might make sense to use for_each_mem_range() as top level loop in
> node_dev_init(), but that's a separate topic.

Yes, this makes sense to me as well. So in your opinion, instead of adding
a new memblock search function like the one I added, it's better to use
for_each_mem_range() in node_dev_init(), which would work for all cases,
regardless of whether CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Right?

>
>> Thanks
>> Donet
>>
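
To make sure we mean the same thing by "range loop at the top level", here
is a minimal userspace sketch of the idea. It is not kernel code and not a
proposed patch: struct mem_range, PFNS_PER_BLOCK, register_block_under_node()
and the tiny layout in run_example() are all hypothetical names and values
made up for illustration. In the kernel the outer loop would presumably be a
memblock range iterator in node_dev_init() instead.

```c
#define PFNS_PER_BLOCK 16UL	/* assumed memory block size in PFNs (hypothetical) */
#define MAX_BLOCKS 8
#define MAX_NODES 2

/* Hypothetical stand-in for a memory range with a known node id. */
struct mem_range {
	unsigned long start_pfn;	/* first PFN of the range */
	unsigned long end_pfn;		/* one past the last PFN */
	int nid;			/* node that owns the range */
};

/* registered[b][n] != 0 means block b was registered under node n. */
int registered[MAX_BLOCKS][MAX_NODES];

/* Stand-in for the registration work done by the real code. */
void register_block_under_node(unsigned long block, int nid)
{
	registered[block][nid] = 1;
}

/*
 * Walk the ranges once. Each range already carries its nid, so only the
 * memory blocks a range overlaps are visited and no per-PFN nid lookup
 * is done for blocks that lie between two ranges of the same node.
 */
void register_blocks_by_range(const struct mem_range *r, int n)
{
	for (int i = 0; i < n; i++) {
		unsigned long first = r[i].start_pfn / PFNS_PER_BLOCK;
		unsigned long last = (r[i].end_pfn - 1) / PFNS_PER_BLOCK;

		for (unsigned long b = first; b <= last; b++)
			register_block_under_node(b, r[i].nid);
	}
}

/* Count the blocks registered under nid. */
int blocks_under_node(int nid)
{
	int count = 0;

	for (int b = 0; b < MAX_BLOCKS; b++)
		count += registered[b][nid];
	return count;
}

/* Miniature of the layout in this thread: node 1 owns one far-away block. */
int run_example(int nid)
{
	static const struct mem_range ranges[] = {
		{  0,  32, 1 },		/* blocks 0-1 on node 1 */
		{ 32,  96, 0 },		/* blocks 2-5 on node 0 */
		{ 96, 112, 1 },		/* block 6 on node 1 */
	};

	register_blocks_by_range(ranges, 3);
	return blocks_under_node(nid);
}
```

With this layout run_example(1) counts three blocks under node 1 and
run_example(0) counts four under node 0. The far-away block is reached
without looking at any PFN in the gap. Note that in this sketch a block
that straddles two ranges with different nids ends up registered under
both nodes.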