From: SeongJae Park
To: Bijan Tabatabai
Cc: SeongJae Park, damon@lists.linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	david@redhat.com, ziy@nvidia.com, matthew.brost@intel.com,
	joshua.hahnjy@gmail.com, rakie.kim@sk.com, byungchul@sk.com,
	gourry@gourry.net, ying.huang@linux.alibaba.com, apopple@nvidia.com,
	bijantabatab@micron.com, venkataravis@micron.com,
	emirakhur@micron.com, ajayjoshi@micron.com, vtavarespetr@micron.com
Subject: Re: [RFC PATCH v2 2/2] mm/damon/paddr: Allow multiple migrate targets
Date: Mon, 23 Jun 2025 17:34:08 -0700
Message-Id: <20250624003408.47807-1-sj@kernel.org>

On Mon, 23 Jun 2025 18:15:00 -0500 Bijan Tabatabai wrote:

[...]
> Hi SeongJae,
>
> I really appreciate your detailed response.
> The quota auto-tuning helps, but I feel like it's still not exactly
> what I want. For example, I think a quota goal that stops migration
> based on the memory usage balance gets quite a bit more complicated
> when instead of interleaving all data, we are just interleaving *hot*
> data. I haven't looked at it extensively, but I imagine it wouldn't be
> easy to identify how much data is hot in the paddr setting,

I don't think so, and I don't see why you think so.  Could you please
elaborate?

> especially
> because the regions can contain a significant amount of unallocated
> data.

In that case, unallocated data shouldn't be accessed at all, so the region
will just look cold to DAMON.

> Also, if the interleave weights changed, for example, from 11:9
> to 10:10, it would be preferable if only 5% of data is migrated;
> however, with the round robin approach, 50% would be. Finally, and I
> forgot to mention this in my last message, the round-robin approach
> does away with any notion of spatial locality, which does help the
> effectiveness of interleaving [1].

We could use probabilistic interleaving, if this is the problem?

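The 11:9 versus 10:10 figures quoted above can be reproduced with a small sketch in Python. This is a simplified model, not kernel code: node_for_offset() is a hypothetical helper that mimics weighted interleaving by page offset, and the round-robin case assumes pages are visited in an order unrelated to their offsets (modeled here with a seeded shuffle).

```python
import random


def node_for_offset(offset, weights):
    """Hypothetical helper: derive the node from the page offset alone."""
    slot = offset % sum(weights)
    for node, weight in enumerate(weights):
        if slot < weight:
            return node
        slot -= weight
    raise ValueError("weights must be non-negative with a positive sum")


def moved_by_offset(num_pages, old_weights, new_weights):
    """Count pages whose offset-derived node changes with the weights."""
    return sum(
        node_for_offset(i, old_weights) != node_for_offset(i, new_weights)
        for i in range(num_pages)
    )


def moved_by_round_robin(num_pages, old_weights, new_weights, seed=0):
    """Count pages moved when the target follows a running counter.

    Assumption: the visiting order is unrelated to the page offsets.
    """
    order = list(range(num_pages))
    random.Random(seed).shuffle(order)
    return sum(
        node_for_offset(page, old_weights)
        != node_for_offset(counter, new_weights)
        for counter, page in enumerate(order)
    )


if __name__ == "__main__":
    n = 20000
    # Only offsets with (offset % 20) == 10 change node: 1 in 20.
    print(moved_by_offset(n, [11, 9], [10, 10]) / n)
    # The counter-derived node is independent of the current node.
    print(moved_by_round_robin(n, [11, 9], [10, 10]) / n)
```

Under these assumptions the offset-based mapping moves exactly 1 page in 20 (5%), while the counter-based placement moves roughly half of the pages.
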
> I don't think anything done with
> quotas can get around that.

I think I'm not getting your points well, sorry.  More elaboration of your
concern would be helpful.

> I wonder if there's an elegant way to
> specify whether to use rmap or not, but my initial feeling is that
> might just add complication to the code and interface for not enough
> benefit.

Agreed.  Please note that I'm open to adding an interface for this behavior if
the benefit is clear.  I'm also thinking that adding non-rmap migration first
(if it shows some benefit), and adding rmap support later once the additional
benefit is confirmed, could also be an option.

>
> Maybe, as you suggest later on, this is an indication that my use case
> is a better fit for a vaddr scheme. I'll get into that more below.
>
> > > Using the VMA offset to determine where a page
> > > should be placed avoids this problem because it gives a folio a single
> > > node it can be in for a given set of interleave weights. This means
> > > that in steady state, no folios will be migrated.
> >
> > This makes sense for this use case. But I don't think this makes same sense
> > for possible other use cases, like memory tiering on systems having multiple
> > NUMA nodes of same tier.
>
> I see where you're coming from. I think the crux of this difference is
> that in my use case, the set of nodes we are monitoring is the same as
> the set of nodes we are migrating to, while in the use case you
> describe, the set of nodes being monitored is disjoint from the set of
> migration target nodes.

I understand and agree with this difference.

> I think this in particular makes ping ponging
> more of a problem for my use case, compared to promotion/demotion
> schemes.

But again I'm failing to understand this, sorry.  Could I ask for more
elaboration?

>
> > If you really need this virtual address space based
> > deterministic behavior, it would make more sense to use virtual address spaces
> > monitoring (damon-vaddr).
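
The steady-state argument in the quoted text (a given offset maps to exactly one node for a given set of weights, so nothing moves once pages are placed) can be illustrated with a small simulation. This is a sketch under assumptions, not DAMON's implementation: node_for_offset() and run_passes() are hypothetical names, every page starts on node 0, and each pass visits all pages in a fresh random order.

```python
import random


def node_for_offset(key, weights):
    """Hypothetical helper: map an integer key to a node by weight."""
    slot = key % sum(weights)
    for node, weight in enumerate(weights):
        if slot < weight:
            return node
        slot -= weight
    raise ValueError("weights must be non-negative with a positive sum")


def run_passes(num_pages, weights, passes, use_offset, seed=0):
    """Return the number of pages migrated in each placement pass."""
    rng = random.Random(seed)
    location = [0] * num_pages  # assumption: all pages start on node 0
    counter = 0                 # running counter for the round-robin case
    migrated_per_pass = []
    for _ in range(passes):
        order = list(range(num_pages))
        rng.shuffle(order)      # assumption: visiting order varies per pass
        migrated = 0
        for page in order:
            # Offset-based: the key is a property of the page itself.
            # Round-robin: the key depends only on when the page is seen.
            key = page if use_offset else counter
            counter += 1
            target = node_for_offset(key, weights)
            if location[page] != target:
                location[page] = target
                migrated += 1
        migrated_per_pass.append(migrated)
    return migrated_per_pass


if __name__ == "__main__":
    print(run_passes(2000, [10, 10], 3, use_offset=True))
    print(run_passes(2000, [10, 10], 3, use_offset=False))
```

In this model the offset-based passes migrate 1000, 0, and 0 pages, while the round-robin passes keep migrating roughly half of the pages on every pass, which is the ping-ponging concern raised in the quoted text.
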
>
> Maybe it does make sense for me to implement vaddr versions of the
> migrate actions for my use case.

Yes, that could also be an option.

> One thing that gives me pause about
> this, is that, from what I understand, it would be harder to have
> vaddr schemes apply to processes that start after damon begins. I
> think to do that, one would have to detect when a process starts, and
> then do a damon tune to upgrade the targets list? It would be nice if,
> say, you could specify a cgroup as a vaddr target and track all
> processes in that cgroup, but that would be a different patchset for
> another day.

I agree that could be a future thing to do.  Note that the DAMON user-space
tool implements[1] a similar feature.

>
> But, using vaddr has other benefits, like the sampling would take into
> account the locality of the accesses. There are also ways to make
> vaddr sampling more efficient by using higher levels of the page
> tables, that I don't think apply to paddr schemes [2]. I believe the
> authors of [2] said they submitted their patches to the kernel, but I
> don't know if it has been upstreamed (sorry about derailing the
> conversation slightly).

Thank you for the reminder.  It was a nice finding and approach[2], but
unfortunately it wasn't upstreamed.  I now realize the monitoring intervals
auto-tuning[3] idea was partly motivated by that nice discussion, though.

[1] https://github.com/damonitor/damo/blob/next/release_note#L33
[2] https://lore.kernel.org/damon/20240318132848.82686-1-aravinda.prasad@intel.com/
[3] https://lkml.kernel.org/r/20250303221726.484227-1-sj@kernel.org


Thanks,
SJ

[...]
>
> [1] https://elixir.bootlin.com/linux/v6.16-rc3/source/mm/mempolicy.c#L213
> [2] https://www.usenix.org/conference/atc24/presentation/nair