From: Bijan Tabatabai
Date: Mon, 23 Jun 2025 18:15:00 -0500
Subject: Re: [RFC PATCH v2 2/2] mm/damon/paddr: Allow multiple migrate targets
To: SeongJae Park
Cc: damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, david@redhat.com, ziy@nvidia.com,
 matthew.brost@intel.com, joshua.hahnjy@gmail.com, rakie.kim@sk.com,
 byungchul@sk.com, gourry@gourry.net, ying.huang@linux.alibaba.com,
 apopple@nvidia.com, bijantabatab@micron.com, venkataravis@micron.com,
 emirakhur@micron.com, ajayjoshi@micron.com, vtavarespetr@micron.com
In-Reply-To: <20250623175204.43917-1-sj@kernel.org>

[...]

> Thank you for walking with me, Bijan.  I understand and agree with your
> concerns.  Actually, this kind of unnecessary ping-pong is a general problem
> for DAMOS.  We hence made a few DAMOS features to avoid this issue.
>
> The first feature is 'age' reset.
> DAMOS sets the 'age' of regions to zero when it applies an action.  Hence,
> if your DAMOS scheme has a minimum 'age' for the target access pattern, the
> region will not be selected as an action target again very soon.
>
> The second feature is the quota.  You can set a speed limit for a DAMOS
> action, to avoid DAMOS being too aggressive.  When DAMOS finds memory regions
> that are eligible for a given action and larger than the given quota, it
> calculates the access temperature of the regions and applies the action only
> to the hottest or coldest regions, up to the quota amount.  Whether to
> prioritize hotter or colder regions depends on the action.
> DAMOS_MIGRATE_HOT prefers hotter ones.  Together with the age reset, this can
> reduce unnecessary ping-pong.
>
> The third feature is quota auto-tuning.  You can ask DAMON to adjust the
> quotas on its own, based on some metrics.  Let me describe an example with a
> memory tiering use case.  Consider two NUMA nodes of different speeds.  Node
> 0 is faster than node 1, equally for every CPU.  Then you can ask DAMON to
> migrate hot pages on node 1 to node 0, aiming for 99% of node 0 memory to be
> allocated, while migrating cold pages on node 0 to node 1, aiming for 1% of
> node 0 memory to be free.  Then, DAMON will adjust the quotas for the two
> schemes based on the current node 0 memory used/free amount.  If less than
> 99% of node 0 memory is used, the hot pages migration scheme will work.  Its
> aggressiveness will be determined by the difference between the current
> memory usage and the target usage.  For example, DAMON will try to migrate
> hot pages faster when node 0 memory usage is 50% than when it is 98%.  The
> cold pages migration scheme will do nothing when less than 99% of node 0
> memory is used, since its goal (1% node 0 free memory ratio) is already
> over-achieved.  When the node 0 memory usage becomes 99% and no more
> allocation is made, DAMON will be quiet.
> Even if a few more allocations happen, DAMON will work at a slow speed, and
> hence make only a reasonable and healthy amount of noise.
>
> Back to your use case, you could set the per-node ideal memory usage of
> interleaving as the quota goal.  For example, for 1:1 ratio interleaving on
> 2 NUMA nodes, you could use two DAMOS schemes, one aiming for 50% node 0
> memused, and the other aiming for 50% node 0 memfree.  Once pages are well
> interleaved, both schemes will stop working, avoiding unnecessary
> ping-ponging.
>
> Note that one of the quota auto-tuning metrics that DAMON supports is
> arbitrary user input.  When this is being used, users can simply feed any
> value as the current value of the goal metric.  For example, you can use the
> application's performance metric, memory bandwidth, or whatever.  You could
> observe the node0-node1 balance from your user-space tool and feed it to
> DAMON quota auto-tuning.  Then, DAMON will do more migration when it is
> imbalanced, and no more migration when it is well balanced.
>
> Finally, you can change DAMON parameters, including schemes, while DAMON is
> running.  You can add and remove schemes whenever you want, while DAMON keeps
> monitoring the access pattern.  Your user-space tool can determine how
> aggressive migration needs to be based on the current memory balance and
> adjust DAMOS quotas online, or even turn DAMOS schemes off/on on demand.
>
> So I think you could avoid the problem using these features.  Does this make
> sense to you?
>
> In the future, we could add more DAMOS self-feedback metrics for this use
> case.  For example, the memory usage balance of nodes.  My self-tuning
> example above was using two schemes since there is no DAMOS quota goal tuning
> metric that can directly be used for your use case.  But I'd say that
> shouldn't be a blocker of this work.

Hi SeongJae,

I really appreciate your detailed response. The quota auto-tuning helps, but I
feel like it's still not exactly what I want.
For example, I think a quota goal that stops migration based on the memory
usage balance gets quite a bit more complicated when, instead of interleaving
all data, we are just interleaving *hot* data. I haven't looked at it
extensively, but I imagine it wouldn't be easy to identify how much data is
hot in the paddr setting, especially because the regions can contain a
significant amount of unallocated data.

Also, if the interleave weights changed, for example, from 11:9 to 10:10, it
would be preferable if only 5% of the data were migrated; however, with the
round-robin approach, 50% would be.

Finally, and I forgot to mention this in my last message, the round-robin
approach does away with any notion of spatial locality, which does help the
effectiveness of interleaving [1]. I don't think anything done with quotas can
get around that.

I wonder if there's an elegant way to specify whether to use rmap or not, but
my initial feeling is that it might just add complication to the code and
interface for not enough benefit.

Maybe, as you suggest later on, this is an indication that my use case is a
better fit for a vaddr scheme. I'll get into that more below.

> > Using the VMA offset to determine where a page
> > should be placed avoids this problem because it gives a folio a single
> > node it can be in for a given set of interleave weights. This means
> > that in steady state, no folios will be migrated.
>
> This makes sense for this use case.  But I don't think this makes the same
> sense for possible other use cases, like memory tiering on systems having
> multiple NUMA nodes of the same tier.

I see where you're coming from. I think the crux of this difference is that in
my use case, the set of nodes we are monitoring is the same as the set of
nodes we are migrating to, while in the use case you describe, the set of
nodes being monitored is disjoint from the set of migration target nodes.
I think this in particular makes ping-ponging more of a problem for my use
case, compared to promotion/demotion schemes.

> If you really need this virtual address space based
> deterministic behavior, it would make more sense to use virtual address spaces
> monitoring (damon-vaddr).

Maybe it does make sense for me to implement vaddr versions of the migrate
actions for my use case. One thing that gives me pause about this is that,
from what I understand, it would be harder to have vaddr schemes apply to
processes that start after DAMON begins. I think to do that, one would have to
detect when a process starts, and then do a damon tune to update the targets
list? It would be nice if, say, you could specify a cgroup as a vaddr target
and track all processes in that cgroup, but that would be a different patchset
for another day.

But using vaddr has other benefits; for example, the sampling would take into
account the locality of the accesses. There are also ways to make vaddr
sampling more efficient by using higher levels of the page tables, which I
don't think apply to paddr schemes [2]. I believe the authors of [2] said they
submitted their patches to the kernel, but I don't know if they have been
upstreamed (sorry about derailing the conversation slightly).

[1] https://elixir.bootlin.com/linux/v6.16-rc3/source/mm/mempolicy.c#L213
[2] https://www.usenix.org/conference/atc24/presentation/nair
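
To make the 11:9 to 10:10 arithmetic above concrete, here is a rough Python
sketch. It is only a toy model, not kernel code: node_for_offset(),
fraction_moved() and expected_round_robin_moved() are hypothetical names, the
offset-based mapping only mimics the general idea of weighted interleaving by
page offset, and the round-robin estimate assumes that the node picked for a
page is statistically independent of the node the page currently sits on.

```python
from bisect import bisect_right
from itertools import accumulate


def node_for_offset(page_index, weights):
    """Toy model: pick a node from a page's offset within its VMA.

    The offset is reduced modulo the sum of the weights, and the node is the
    first one whose cumulative weight exceeds that slot.
    """
    slot = page_index % sum(weights)
    bounds = list(accumulate(weights))  # e.g. [11, 9] -> [11, 20]
    return bisect_right(bounds, slot)


def fraction_moved(old_weights, new_weights, num_pages):
    """Fraction of pages whose offset-derived node changes between weights."""
    moved = sum(
        node_for_offset(i, old_weights) != node_for_offset(i, new_weights)
        for i in range(num_pages)
    )
    return moved / num_pages


def expected_round_robin_moved(current_share, new_weights):
    """Expected fraction moved if targets ignore where a page currently is.

    Assumption: a page on node n is sent to node n with probability
    new_weights[n] / sum(new_weights), independent of its current placement.
    """
    total = sum(new_weights)
    return sum(
        share * (1 - weight / total)
        for share, weight in zip(current_share, new_weights)
    )


# Offset-based placement: only slot 10 of every 20 changes node.
print(fraction_moved([11, 9], [10, 10], 2000))
# Round robin under the independence assumption, starting from a 55%/45% split.
print(expected_round_robin_moved([0.55, 0.45], [10, 10]))
```

Under these assumptions the first call reports that 5% of pages change node,
and the second comes out at roughly half of the pages, which is where the 5%
and 50% figures above come from.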