From: "Huang, Ying"
To: Bharata B Rao
Subject: Re: [RFC PATCH v0 0/2] Batch migration for NUMA balancing
In-Reply-To: <20250521080238.209678-1-bharata@amd.com> (Bharata B. Rao's message of "Wed, 21 May 2025 13:32:36 +0530")
References: <20250521080238.209678-1-bharata@amd.com>
Date: Mon, 26 May 2025 16:46:36 +0800
Message-ID: <87sekrbvyr.fsf@DESKTOP-5N7EMDA>

Hi, Bharata,

Bharata B Rao writes:

> Hi,
>
> This is an attempt to convert the NUMA balancing to do batched
> migration instead of migrating one folio at a time. The basic
> idea is to collect (from hint fault handler) the folios to be
> migrated in a list and batch-migrate them from task_work context.
> More details about the specifics are present in patch 2/2.
>
> During LSFMM[1] and subsequent discussions in MM alignment calls[2],
> it was suggested that separate migration threads to handle migration
> or promotion requests may be desirable. Existing NUMA balancing, hot
> page promotion and other future promotion techniques could off-load
> migration part to these threads.

What is the expected benefit of the change?

For code reuse, we can use migrate_misplaced_folio() or
migrate_misplaced_folio_batch() in the various promotion paths.

As for the impact on workload latency, my understanding is that PTE
scanning has a much larger impact than migration. Why not start from
that?

> Or if we manage to have a single
> source of hotness truth like kpromoted[3], then that too can hand
> over migration requests to the migration threads. I am envisaging
> that different hotness sources like kmmscand[4], MGLRU[5], IBS[6]
> and CXL HMU would push hot page info to kpromoted, which would
> then isolate and push the folios to be promoted to the migrator
> thread.
>
> As a first step, this is an attempt to batch and perform NUMAB
> migrations in async manner. Separate migration threads aren't
> yet implemented but I am using Gregory's patch[7] that provides
> migrate_misplaced_folio_batch() API to do batch migration of
> misplaced folios.
>
> Some points for discussion
> --------------------------
> 1. To isolate the misplaced folios or not?
>
> To do batch migration, the misplaced folios need to be stored in
> some manner.
> I thought isolating them and using the folio->lru
> field to link them up would be the most straight-forward way. But
> then there were concerns expressed about folios remaining isolated
> for long until they get migrated.
>
> Or should we just maintain the PFNs instead of folios and
> isolate them only just prior to migrating them?
>
> 2. Managing target_nid for misplaced pages
>
> NUMAB provides the accurate target_nid for each folio that is
> detected as misplaced. However when we don't migrate the folio
> right away, but instead want to batch and do async migration later,
> then where do we keep track of target_nid for each folio?
>
> In this implementation, I am using last_cpupid field as it appeared
> that this field could be reused (with some challenges mentioned
> in 2/2) for isolated folios. This approach may be specific to NUMAB
> but then each sub-system that hands over pages to the migrator thread
> should also provide a target_nid and hence each sub-system should be
> free to maintain and track the target_nid of folios that it has
> isolated/batched for migration in its own specific manner.
>
> 3. How many folios to batch?
>
> Currently I have a fixed threshold for number of folios to batch.
> It could be a sysctl to allow a setting between a min and max. It
> could also be auto-tuned if required.
>
> The state of the patchset
> -------------------------
> * Still raw and very lightly tested
> * Just posted to serve as base for subsequent discussions
>   here and in MM alignment calls.
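For illustration only, points 2 and 3 above (remembering a target node
per pending page, and flushing at a fixed threshold) can be modelled in
user-space C roughly as below. This is a sketch under stated
assumptions, not code from the patchset: struct pending, struct batch,
BATCH_THRESHOLD, MAX_NODES, batch_add() and batch_flush() are all
hypothetical names, pages are tracked by PFN rather than isolated, and
the flush merely counts entries per node where the kernel would perform
the actual batch migration.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_THRESHOLD 4   /* hypothetical fixed batching threshold */
#define MAX_NODES       4   /* hypothetical number of NUMA nodes */

/* One pending migration: the page is remembered by PFN, not isolated. */
struct pending {
    unsigned long pfn;
    int target_nid;         /* target node recorded at hint-fault time */
};

struct batch {
    struct pending entries[BATCH_THRESHOLD];
    size_t count;
    size_t migrated[MAX_NODES]; /* per-node totals; stand-in for migration */
};

/* Stand-in for the batch migration step: account entries per target node. */
void batch_flush(struct batch *b)
{
    for (size_t i = 0; i < b->count; i++)
        b->migrated[b->entries[i].target_nid]++;
    b->count = 0;
}

/*
 * Queue one misplaced page. Returns 1 if the threshold was reached and
 * the batch was flushed, 0 if the page was only queued, -1 if the
 * target node is out of range.
 */
int batch_add(struct batch *b, unsigned long pfn, int target_nid)
{
    if (target_nid < 0 || target_nid >= MAX_NODES)
        return -1;

    assert(b->count < BATCH_THRESHOLD);
    b->entries[b->count].pfn = pfn;
    b->entries[b->count].target_nid = target_nid;
    b->count++;

    if (b->count == BATCH_THRESHOLD) {
        batch_flush(b);
        return 1;
    }
    return 0;
}
```

Keeping the target node next to the PFN in a side structure is only one
option; it avoids reusing a per-folio field, at the cost of extra
memory per pending entry.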
>
> References
> ----------
> [1] LSFMM LWN summary - https://lwn.net/Articles/1016519/
> [2] MM alignment call summary - https://lore.kernel.org/linux-mm/263d7140-c343-e82e-b836-ec85c52b54eb@google.com/
> [3] kpromoted patchset - https://lore.kernel.org/linux-mm/20250306054532.221138-1-bharata@amd.com/
> [4] Kmmscand: PTE A bit scanning - https://lore.kernel.org/linux-mm/20250319193028.29514-1-raghavendra.kt@amd.com/
> [5] MGLRU scanning for page promotion - https://lore.kernel.org/lkml/20250324220301.1273038-1-kinseyho@google.com/
> [6] IBS base hot page promotion - https://lore.kernel.org/linux-mm/20250306054532.221138-4-bharata@amd.com/
> [7] Unmapped page cache folio promotion patchset - https://lore.kernel.org/linux-mm/20250411221111.493193-1-gourry@gourry.net/
>
> Bharata B Rao (1):
>   mm: sched: Batch-migrate misplaced pages
>
> Gregory Price (1):
>   migrate: implement migrate_misplaced_folio_batch
>
>  include/linux/migrate.h |  6 ++++
>  include/linux/sched.h   |  4 +++
>  init/init_task.c        |  2 ++
>  kernel/sched/fair.c     | 64 +++++++++++++++++++++++++++++++++++++++++
>  mm/memory.c             | 44 ++++++++++++++--------------
>  mm/migrate.c            | 31 ++++++++++++++++++++
>  6 files changed, 130 insertions(+), 21 deletions(-)

---
Best Regards,
Huang, Ying