From: Donet Tom
To: David Hildenbrand, Oscar Salvador
Cc: Mike Rapoport, Zi Yan, Greg Kroah-Hartman, Andrew Morton, rafael@kernel.org, Danilo Krummrich, Ritesh Harjani, Jonathan Cameron, Alison Schofield, Yury Norov, Dave Jiang, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/3] driver/base: Optimize memory block registration to reduce boot time
Date: Mon, 5 May 2025 22:10:47 +0530
Message-ID: <24a9480b-f494-4e55-82b8-e5443c694c9e@linux.ibm.com>
References: <188fbfba-afb4-4db7-bbba-7689a96be931@redhat.com> <74c500dd-8d1c-4177-96c7-ddd51ca77306@redhat.com> <0e568e33-34fa-40f6-a20d-ebf653de123d@redhat.com>

On 5/5/25 6:32 PM, David Hildenbrand wrote:
> On 05.05.25 14:51, Donet Tom wrote:
>>
>> On 5/5/25 4:06 PM, David Hildenbrand wrote:
>>> On 05.05.25 11:36, Oscar Salvador wrote:
>>>> On Mon, May 05, 2025 at 10:12:48AM +0200, David Hildenbrand wrote:
>>>>> Assume you hotplug the second CPU. The node is already
>>>>> registered/online, so who does the register_cpu_under_node() call?
>>>>>
>>>>> It's register_cpu() I guess? But no idea in which order that is
>>>>> called with node onlining.
>>>>>
>>>>> The code has to be cleaned up such that onlining a node does not
>>>>> traverse any cpus / memory.
>>>>>
>>>>> Whoever adds a CPU / memory *after onlining the node* must
>>>>> register the device manually under the *now online* node.
>>>>
>>>> So, I think this is the sequence of events:
>>>>
>>>> - hotplug cpu:
>>>>     acpi_processor_hotadd_init
>>>>      register_cpu
>>>>       register_cpu_under_node
>>>>
>>>>     online_store
>>>>      device_online()->dev_bus_online()
>>>>       cpu_subsys->online()
>>>>        cpu_subsys_online
>>>>         cpu_device_up
>>>>          cpu_up
>>>>           try_online_node  <- brings node online
>>>>            ...
>>>>            register_one_node <- registers cpu under node
>>>>           _cpu_up
>>>
>>> My thinking was, whether we can simply move the
>>> register_cpu_under_node() after the try_online_node(). See below
>>> regarding early.
>>>
>>> And then, remove the !node_online check from register_cpu_under_node().
>>>
>>> But it's all complicated, because for memory, we link a memory block
>>> to the node (+set the node online) when it gets added, not when it
>>> gets onlined.
>>>
>>> For CPUs, we seem to be creating the link + set the node online when
>>> the CPU gets onlined.
>>>
>>>>
>>>> The first time we hotplug a cpu to the node, note that
>>>> register_cpu()->register_cpu_under_node() will bail out as node is
>>>> still offline, so only cpu's sysfs will be created but they will not
>>>> be linked to the node.
>>>> Later, online_store()->...->cpu_subsys_online()->..->cpu_up() will
>>>> take care of 1) onlining the node and 2) register the cpu to the node
>>>> (so, link the sysfs).
>>>
>>>
>>> And only if it actually gets onlined I assume.
>>>
>>>>
>>>> The second time we hotplug a cpu,
>>>> register_cpu()->register_cpu_under_node() will do its job as the
>>>> node is already onlined.
>>>> And we will not be calling register_one_node() from
>>>> __try_online_node() for the same reason.
>>>>
>>>> The thing that bothers me is having register_cpu_under_node() spread
>>>> around.
>>>
>>> Right.
>>>
>>>> I think that ideally, we should only be calling
>>>> register_cpu_under_node() from register_cpu(), but we have this kind
>>>> of (sort of weird?) relation that even if we hotplug the cpu, but we
>>>> do not online it, the numa node will remain offline, and so we cannot
>>>> do the linking part (cpu <-> node), so we could not really only have
>>>> register_cpu_under_node() in register_cpu(), which is the hot-add
>>>> part, but we also need it in the cpu_up()->try_online_node() which is
>>>> the online part.
>>>
>>> Maybe one could handle CPUs similar to how we handle it with memory:
>>> node gets onlined + link created as soon as we add the CPU, not when
>>> we online it.
>>>
>>> But likely there is a reason why we do it like that today ...
>>>
>>>>
>>>> And we also cannot remove the register_cpu_under_node() from
>>>> register_cpu() because it is used in other paths (e.g. at boot time).
>>>
>>> Ah, so in that case we don't call cpu_up ... hm.
>>>
>>> Of course, we can always detect the context (early vs. hotplug).
>>> Maybe, we should split the early vs. hotplug case up much earlier.
>>>
>>> register_cpu_early() / register_cpu_hotplug() ... maybe
>>
>> Hi David and Oscar,
>>
>> I was thinking that __try_online_node(nid, true) being called from
>> try_online_node() might cause issues with this patch. From the
>> discussion above, what I understand is:
>>
>> When try_online_node() is called, there are no memory resources
>> available for the node, so register_memory_blocks_under_node()
>> has no effect. Therefore, our patch should work in all cases.
>
> Right, it's simply unnecessary to perform the lookup ... and confusing.
>
>>
>> Do you think we need to make any changes to this patch?
>
> Probably not to this patch (if it's all working as expected), but we
> should certainly look into cleaning that all up.

Thanks. I'll also check the cleanup part.
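
The ordering described in the quoted discussion can be mimicked with a toy user-space model. This is only a sketch under loud assumptions: it is Python rather than kernel C, the class ToyTopology and all of its fields are made up for illustration, and the methods merely borrow the function names mentioned in the thread without reproducing their real signatures or behaviour. It shows why the first CPU hot-added to an offline node is linked only at online time, while a second CPU is linked directly at hot-add.

```python
class ToyTopology:
    """Hypothetical model of the CPU <-> node linking order; not kernel code."""

    def __init__(self):
        self.online_nodes = set()   # node ids currently online
        self.cpu_node = {}          # cpu id -> node id, recorded at hot-add
        self.links = set()          # (cpu, nid) pairs standing in for sysfs links
        self.online_cpus = set()

    def register_cpu_under_node(self, cpu, nid):
        # Bail out while the node is offline, as described in the thread.
        if nid not in self.online_nodes:
            return False
        self.links.add((cpu, nid))
        return True

    def register_cpu(self, cpu, nid):
        # Hot-add path: the CPU becomes known, linking is attempted.
        self.cpu_node[cpu] = nid
        return self.register_cpu_under_node(cpu, nid)

    def register_one_node(self, nid):
        # Assumption: node registration walks the known CPUs of that node.
        for cpu, cpu_nid in self.cpu_node.items():
            if cpu_nid == nid:
                self.register_cpu_under_node(cpu, nid)

    def try_online_node(self, nid):
        # Only the first call for a node onlines it and registers it.
        if nid in self.online_nodes:
            return
        self.online_nodes.add(nid)
        self.register_one_node(nid)

    def cpu_up(self, cpu):
        # Online path: make sure the node is online, then online the CPU.
        self.try_online_node(self.cpu_node[cpu])
        self.online_cpus.add(cpu)


topo = ToyTopology()

# First CPU on node 1: the node is still offline, so hot-add does not link.
first_linked = topo.register_cpu(cpu=8, nid=1)
links_after_first_add = set(topo.links)

# Onlining that CPU onlines the node and creates the link.
topo.cpu_up(8)

# Second CPU on node 1: the node is already online, hot-add links directly.
second_linked = topo.register_cpu(cpu=9, nid=1)
topo.cpu_up(9)

print(first_linked, second_linked, sorted(topo.links))
```

In this model the traversal in register_one_node() is exactly the part the thread proposes to clean up: if whoever adds a CPU after node onlining registered it under the node directly, onlining a node would not need to walk any CPUs.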