From: SeongJae Park
To: Bijan Tabatabai
Cc: SeongJae Park, linux-mm@kvack.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, akpm@linux-foundation.org, corbet@lwn.net,
    david@redhat.com, ziy@nvidia.com, matthew.brost@intel.com,
    joshua.hahnjy@gmail.com, rakie.kim@sk.com, byungchul@sk.com,
    gourry@gourry.net, ying.huang@linux.alibaba.com, apopple@nvidia.com,
    bijantabatab@micron.com, venkataravis@micron.com, emirakhur@micron.com,
    ajayjoshi@micron.com, vtavarespetr@micron.com, damon@lists.linux.dev
Subject: Re: [RFC PATCH 0/4] mm/damon: Add DAMOS action to interleave data across nodes
Date: Fri, 13 Jun 2025 10:12:37 -0700
Message-Id: <20250613171237.44776-1-sj@kernel.org>

On Fri, 13 Jun 2025 10:44:17 -0500 Bijan Tabatabai wrote:

> Hi SeongJae,
>
> Thank you for your comments.
>
> On Thu, Jun 12, 2025 at 6:49 PM SeongJae Park wrote:
> >
> > Hi Bijan,
> >
> > On Thu, 12 Jun 2025 13:13:26 -0500 Bijan Tabatabai wrote:
> >
> > > From: Bijan Tabatabai
> > >
[...]
> > What about extending DAMOS_MIGRATE_{HOT,COLD} to support your use case? For
> > example, letting users enter special keyword, say, 'weighted_interleave' to
> > 'target_nid' DAMON sysfs file. In the case, DAMOS_MIGRATE_{HOT,COLD} would
> > work in the way you are implementing DAMOS_INTERLEAVE.
>
> I like this idea. I will do this in the next version of the patch.

Great, looking forward to that!

> I have a couple of questions about how to go about this if you don't mind.

Of course I don't :)

>
> First, should I drop the vaddr implementation or implement
> DAMOS_MIGRATE_{HOT,COLD} in vaddr as well? I am leaning towards the former
> because I believe the paddr version is more important, though the vaddr
> version is useful if the user only cares about one application.

I see no problem with dropping the vaddr implementation. Please do what you
want and need to do at your own pace :)

>
> Second, do you have a preference for how we indicate that we are using the
> mempolicy rather than target_nid in struct damos?
> I was thinking of either setting target_nid to NUMA_NO_NODE or adding a
> boolean to struct damos for this.

I'd prefer adding a boolean to 'struct damos'.

>
> Maybe it would also be a good idea to generalize it some more. I implemented
> this using just weighted interleave because I was targeting the use case where
> the best interleave weights for a workload changes as the bandwidth utilization
> of the system changes, which I will go describe in more detail further down.
> However, we could apply the same logic for any mempolicy instead of just
> filtering for MPOL_WEIGHTED_INTERLEAVE. This might clean up the code a little
> bit because the logic dependent on CONFIG_NUMA would be contained in the
> mempolicy code.

Yes, I agree. Such flexibility sounds useful :)

In the future, I think we could further let users set multiple target nodes for
DAMOS_MIGRATE_{HOT,COLD} with arbitrary weights.

[...]
> > I show the test results on the commit messages of the second and the fourth
> > patches. In the next version, letting readers know that here would be nice.
> > Also adding a short description of what you confirmed with the tests here
> > (e.g., with the test we confirmed this patch functions as expected [and
> > achieves X % Y metric wins]) would be nice.
> >
>
> Noted. I'll include this in the cover letter of the next patch set.

Thank you! :)

[...]
> > I think it would also be nice if you could add more explanation about why you
> > picked DAMON as a way to implement this feature. I assume that's because you
> > found opportunities to utilize this feature in some access-aware way or
> > utilizing DAMOS features. I was actually able to imagine some such usages.
> > For example, we could do the re-interleaving for hot or cold pages of specific
> > NUMA nodes or specific virtual address ranges first to make interleaving
> > effective faster.
>
> Yeah, I'll give more detail on the use case I was targeting, which I will also
> include in the cover letter of the next patch set.
>
> Basically, we have seen that the best interleave weights for a workload can
> change depending on the bandwidth utilization of the system. This was touched
> upon in the discussion in [1]. As a toy example, imagine some application that
> uses 75% of the local bandwidth. Assuming sufficient capacity, when running
> alone, we probably want to keep all of that application's data in local memory.
> However, if a second instance of that application begins, using the same amount
> of bandwidth, it would be best to interleave the data of both processes to
> alleviate the bandwidth pressure from the local node. Likewise, when one of the
> processes ends, the data should be moved back to local memory.
>
> We imagine there would be a userspace application that would monitor system
> performance characteristics, such as bandwidth utilization or memory access
> latency, and uses that information to tune the interleave weights. Others
> seemed to have come to a similar conclusion in previous discussions [2]. We are
> currently working on a userspace program that does this, but it's not quite
> ready to be published yet.

Sounds interesting, looking forward to it!

Note that DAMOS has an internal feedback loop for auto-tuning the
aggressiveness of a given scheme, and the feedback loop accepts system metrics
or arbitrary user inputs. I think the userspace program _might_ be able to
give the arbitrary feedback. We could also think about extending the list of
system metrics that DAMOS accepts as feedback to include memory bandwidth.

>
> After the userspace application adjusts the interleave weights, we need some
> mechanism to migrate the application pages that have already been allocated.
> We think DAMON is the correct venue for this mechanism because we noticed that
> we don't have to migrate all of the application's pages to improve performance,
> we just need to migrate the frequently accessed pages. DAMON's existing hotness
> tracking is very useful for this. Additionally, as Ying pointed out [3], a
> complete solution must also handle when a memory node is at capacity. The
> existing DAMOS_MIGRATE_COLD action can be used in conjunction with the
> functionality in this patch set to provide that complete solution.
>
> [1] https://lore.kernel.org/linux-mm/20250313155705.1943522-1-joshua.hahnjy@gmail.com/
> [2] https://lore.kernel.org/linux-mm/20250314151137.892379-1-joshua.hahnjy@gmail.com/
> [3] https://lore.kernel.org/linux-mm/87frjfx6u4.fsf@DESKTOP-5N7EMDA/

Thank you for this nice and informative description of the use case!

>
> > Also we could apply a sort of speed limit for the interleaving-migration to
> > ensure it doesn't consume memory bandwidth too much. The limit could be
> > arbitrarily user-defined or auto-tuned for specific system metrics value (e.g.,
> > memory bandwidth balance?).
>
> I agree this is a concern, but I figured DAMOS's existing quota mechanism would
> handle it. If you could elaborate on why quotas aren't enough here, that would
> help me come up with a solution.

What I wanted to say is that we could use DAMOS's existing quota mechanism to
handle it. The DAMOS quota feature is just another name for an [auto-tunable]
speed limit. Sorry for confusing you. Anyway, happy to confirm this is yet
another DAMOS feature that could be useful for your case and future ones.

>
> >
> > If you have such use case in your mind or your test setups, sharing those here
> > or on the next versions of this would be very helpful for reviewers.
>
> Answered above. I will include them in the next version.

That was very helpful.
Keeping that in the next version will be helpful for new readers such as
future SJ :)

[1] https://origin.kernel.org/doc/html/latest/mm/damon/design.html#aim-oriented-feedback-driven-auto-tuning


Thanks,
SJ

[...]
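
For readers who want something concrete, the toy example quoted above (one
process using 75% of the local bandwidth, then a second identical one
starting) can be sketched as a tiny userspace policy. This is only an
illustration under stated assumptions: the names local_fraction and to_ratio,
the normalized bandwidth units, and the "spill only the excess" policy are all
hypothetical and are not part of the patch set or of DAMON.

```python
from fractions import Fraction


def local_fraction(demands, local_capacity=1.0):
    """Fraction of hot data to keep on the local node (hypothetical policy).

    demands: bandwidth each process wants, normalized so that 1.0 equals
    the full bandwidth of the local node (an assumption of this sketch).
    """
    total = sum(demands)
    if total <= local_capacity:
        # The local node can serve everything; no need to interleave.
        return 1.0
    # Keep only as much traffic local as the node can serve.
    return local_capacity / total


def to_ratio(fraction, max_denominator=10):
    """Approximate the fraction as a small integer local:remote ratio."""
    approx = Fraction(fraction).limit_denominator(max_denominator)
    return approx.numerator, approx.denominator - approx.numerator


print(to_ratio(local_fraction([0.75])))        # one process: (1, 0)
print(to_ratio(local_fraction([0.75, 0.75])))  # two processes: (2, 1)
```

A (1, 0) result means "keep everything local". How such a ratio would map onto
real weighted interleave weights, and how already-allocated pages get moved
after the weights change, is exactly what the thread above is discussing.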