Date: Sun, 29 Dec 2024 21:30:51 -0800 (PST)
From: David Rientjes
To: Shivank Garg
cc: Zi Yan, Aneesh Kumar, David Hildenbrand, John Hubbard,
    Kirill Shutemov, Matthew Wilcox, Mel Gorman,
    "Rao, Bharata Bhasker", Rik van Riel, RaghavendraKT, Wei Xu,
    Suyeon Lee, Lei Chen, "Shukla, Santosh", "Grimm, Jon",
    sj@kernel.org, shy828301@gmail.com, Liam Howlett, Gregory Price,
    linux-mm@kvack.org
Subject: Re: Slow-tier Page Promotion discussion recap and open questions
In-Reply-To: <1c424899-d394-452f-9e13-d8cf77660c4a@amd.com>
References: <6d582bb6-3ba5-1768-92f2-6025340a3cd4@google.com>
    <9093302B-95A9-4133-A0E0-75A47CE4336F@nvidia.com>
    <1c424899-d394-452f-9e13-d8cf77660c4a@amd.com>

On Thu, 19 Dec 2024, Shivank Garg wrote:

> On 12/18/2024 8:20 PM, Zi Yan wrote:
> > On 17 Dec 2024, at 23:19, David Rientjes wrote:
> >
> >> Hi everybody,
> >>
> >> We had a very interactive discussion last week led by RaghavendraKT on
> >> slow-tier page promotion intended for memory tiering platforms, thank
> >> you! Thanks as well to everybody who attended and provided great
> >> questions, suggestions, and feedback.
> >>
> >> The RFC patch series "mm: slowtier page promotion based on PTE A bit"[1]
> >> is a proposal to allow for asynchronous page promotion based on memory
> >> accesses as an alternative to NUMA Balancing based promotions. There was
> >> widespread interest in this topic and the discussion surfaced multiple
> >> use cases and requirements, very focused on CXL use cases.
> >>
> >
> >> ----->o-----
> >> I asked about offloading the migration to a data mover, such as the PSP
> >> for AMD, DMA engine, etc and whether that should be treated entirely
> >> separately as a topic. Bharata said there was a proof-of-concept
> >> available from AMD that does just that but the initial results were not
> >> that encouraging.
> >>
> >> Zi asked if the DMA engine saturated the link between the slow and fast
> >> tiers. If we want to offload to a copy engine, we need to verify that
> >> the throughput is sufficient or we may be better off using idle cpus to
> >> perform the migration for us.
> >
> >
> >>
> >> - we likely want to reconsider the single threaded nature of the kthread
> >>   even if only for NUMA purposes
> >>
> >
> > Related to using DMA engine and/or multi threads for page migration, I had
> > a patchset accelerating page migration[1] back in 2019. It showed good
> > throughput speedup, ~4x using 16 threads to copy multiple 2MB THP. I think
> > it is time to revisit the topic.
> >
> >
> > [1] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/
>
> Hi All,
>
> I wanted to provide some additional context regarding the AMD DMA offloading
> POC mentioned by Bharata:
> https://lore.kernel.org/linux-mm/20240614221525.19170-1-shivankg@amd.com
>
> While the initial results weren't as encouraging as hoped, I plan to improve this
> in next versions of the patchset.
>
> The core idea in my RFC patchset is restructuring the folio move operation
> to better leverage DMA hardware. Instead of the current folio-by-folio approach:
>
> for_each_folio() {
>     copy metadata + content + update PTEs
> }
>
> We batch the operations to minimize overhead:
>
> for_each_folio() {
>     copy metadata
> }
> DMA batch copy all content
> for_each_folio() {
>     update PTEs
> }
>
> My experiment showed that folio copy can consume up to 26.6% of total migration
> cost when moving data between NUMA nodes. This suggests significant room for
> improvement through DMA offloading, particularly for the larger transfers expected
> in CXL scenarios.
>
> It would be interesting work on combining these approaches for optimized page
> promotion.
>

This is very exciting, thanks Shivank and Zi! I brought this topic up
during the session on asynchronous page promotion for memory tiering
because page migration is likely going to become *much* more popular and
will be in the critical path under system-wide memory pressure. Hardware
assist and any software optimizations that can go along with it would
certainly be very interesting to discuss.

Shivank, do you have an estimated timeline for when that patch series will
be refreshed? Any planned integration with TMPM?

Zi, are you looking to refresh your series and continue discussing page
migration offload? We could set up another Linux MM Alignment Session
topic focused exactly on this and get representatives from the vendors
involved.

Thanks!
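
For reference, the restructuring quoted above (per-folio loop versus
metadata pass, one bulk copy, PTE pass) can be sketched as a toy userspace
model. This is only an illustration of the loop shapes, not kernel code
and not the RFC's implementation: Folio, CopyEngine, migrate_per_folio and
migrate_batched are hypothetical names, and the "engine" merely counts how
many times it is asked to start a copy.

```python
from dataclasses import dataclass


@dataclass
class Folio:
    """Toy stand-in for a folio: a payload plus a 'PTEs point here' flag."""
    data: bytes
    mapped: bool = True


@dataclass
class CopyEngine:
    """Hypothetical copy engine; it only counts submissions."""
    submissions: int = 0

    def copy(self, payloads):
        # One submission moves every payload handed over in this call.
        self.submissions += 1
        return [bytes(p) for p in payloads]


def migrate_per_folio(folios, engine):
    """Folio-by-folio shape: metadata, content and PTEs in a single loop."""
    moved = []
    for src in folios:
        dst = Folio(data=b"", mapped=False)     # copy metadata
        dst.data = engine.copy([src.data])[0]   # copy content (one submission each)
        dst.mapped = True                       # update PTEs
        moved.append(dst)
    return moved


def migrate_batched(folios, engine):
    """Batched shape: metadata pass, one bulk copy, then PTE pass."""
    moved = [Folio(data=b"", mapped=False) for _ in folios]   # copy metadata
    payloads = engine.copy([src.data for src in folios])      # batch copy all content
    for dst, data in zip(moved, payloads):
        dst.data = data                                       # content lands in destination
    for dst in moved:
        dst.mapped = True                                     # update PTEs
    return moved
```

Both functions produce the same destination folios; the only difference in
this model is that the batched version kicks the engine once per batch
instead of once per folio, which is where any fixed per-submission setup
cost would be amortized. As a rough bound, taking the quoted 26.6% copy
share at face value, making the copy entirely free would cap the gain at
1 / (1 - 0.266), about 1.36x on total migration cost, so how much batching
saves outside the copy itself matters too.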