From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rakie Kim
To: Bijan Tabatabai
Cc: sj@kernel.org, akpm@linux-foundation.org, corbet@lwn.net,
	david@redhat.com, ziy@nvidia.com, matthew.brost@intel.com,
	joshua.hahnjy@gmail.com, rakie.kim@sk.com, byungchul@sk.com,
	gourry@gourry.net, ying.huang@linux.alibaba.com, apopple@nvidia.com,
	bijantabatab@micron.com, venkataravis@micron.com,
	emirakhur@micron.com, ajayjoshi@micron.com, vtavarespetr@micron.com,
	damon@lists.linux.com, linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel_team@skhynix.com
Subject: Re: [RFC PATCH 0/4] mm/damon: Add DAMOS action to interleave data across nodes
Date: Fri, 13 Jun 2025 18:55:17 +0900
Message-ID: <20250613095525.1845-1-rakie.kim@sk.com>
X-Mailer: git-send-email 2.48.1.windows.1
In-Reply-To: <20250612181330.31236-1-bijan311@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Thu, 12 Jun 2025 13:13:26 -0500 Bijan Tabatabai wrote:
> From: Bijan Tabatabai
>
> A recent patch set automatically set the interleave weight for each node
> according to the node's maximum bandwidth [1]. In another thread, the patch
> set's author, Joshua Hahn, wondered if/how these weights should be changed
> if the bandwidth utilization of the system changes [2].
>
> This patch set adds the mechanism for dynamically changing how application
> data is interleaved across nodes while leaving the policy of what the
> interleave weights should be to userspace. It does this by adding a new
> DAMOS action: DAMOS_INTERLEAVE. We implement DAMOS_INTERLEAVE with both
> paddr and vaddr operations sets. Using the paddr version is useful for
> managing page placement globally. Using the vaddr version limits tracking
> to one process per kdamond instance, but the va based tracking better
> captures spatial locality.

Hi Bijan,

Thank you for explaining the motivation and need behind this patch.
I believe it's important to consider the case where a new memory node
is added and the interleave weight values are recalculated.

If a new memory node (say, node2) is added, there are two possible
approaches to consider.

1. Migrating pages to the newly added node2.
   In this case, there is a potential issue where pages may be migrated
   to node2, even though it is not part of the nodemask set by the user.

2. Ignoring the newly added node2 and continuing to use only the existing
   nodemask for migrations.
   However, if the weight values have been updated to account for node2's
   performance, avoiding node2 might reduce the effectiveness of using
   Weighted Interleave.

It would be helpful to consider these two options or explore other
possible solutions to ensure correctness.

Rakie

>
> DAMOS_INTERLEAVE interleaves pages within a region across nodes using the
> interleave weights at /sys/kernel/mm/mempolicy/weighted_interleave/node
> and the page placement algorithm in weighted_interleave_nid via
> policy_nodemask. We chose to reuse the mempolicy weighted interleave
> infrastructure to avoid reimplementing code. However, this has the awkward
> side effect that only pages that are mapped to processes using
> MPOL_WEIGHTED_INTERLEAVE will be migrated according to new interleave
> weights. This might be fine because workloads that want their data to be
> dynamically interleaved will want their newly allocated data to be
> interleaved at the same ratio.
>
> If exposing policy_nodemask is undesirable, we have two alternative methods
> for having DAMON access the interleave weights it should use. We would
> appreciate feedback on which method is preferred.
> 1. Use mpol_misplaced instead
>   pros: mpol_misplaced is already exposed publicly
>   cons: Would require refactoring mpol_misplaced to take a struct vm_area
>   instead of a struct vm_fault, and require refactoring mpol_misplaced and
>   get_vma_policy to take in a struct task_struct rather than just using
>   current. Also requires processes to use MPOL_WEIGHTED_INTERLEAVE.
> 2. Add a new field to struct damos, similar to target_nid for the
> MIGRATE_HOT/COLD schemes.
>   pros: Keeps changes contained inside DAMON. Would not require processes
>   to use MPOL_WEIGHTED_INTERLEAVE.
>   cons: Duplicates page placement code. Requires discussion on the sysfs
>   interface to use for users to pass in the interleave weights.
>
> This patchset was tested on an AMD machine with a NUMA node with CPUs
> attached to DDR memory and a cpu-less NUMA node attached to CXL memory.
> However, this patch set should generalize to other architectures and number
> of NUMA nodes.
>
> Patches Sequence
> ________________
> The first patch exposes policy_nodemask() in include/linux/mempolicy.h to
> let DAMON determine where a page should be placed for interleaving.
> The second patch implements DAMOS_INTERLEAVE as a paddr action.
> The third patch moves the DAMON page migration code to ops-common, allowing
> vaddr actions to use it.
> Finally, the fourth patch implements a vaddr version of DAMOS_INTERLEAVE.
>
> [1] https://lore.kernel.org/linux-mm/20250520141236.2987309-1-joshua.hahnjy@gmail.com/
> [2] https://lore.kernel.org/linux-mm/20250313155705.1943522-1-joshua.hahnjy@gmail.com/
>
> Bijan Tabatabai (4):
>   mm/mempolicy: Expose policy_nodemask() in include/linux/mempolicy.h
>   mm/damon/paddr: Add DAMOS_INTERLEAVE action
>   mm/damon: Move damon_pa_migrate_pages to ops-common
>   mm/damon/vaddr: Add vaddr version of DAMOS_INTERLEAVE
>
>  Documentation/mm/damon/design.rst |   2 +
>  include/linux/damon.h             |   2 +
>  include/linux/mempolicy.h         |   2 +
>  mm/damon/ops-common.c             | 136 ++++++++++++++++++++
>  mm/damon/ops-common.h             |   4 +
>  mm/damon/paddr.c                  | 198 +++++++++++++-----------------
>  mm/damon/sysfs-schemes.c          |   1 +
>  mm/damon/vaddr.c                  | 124 +++++++++++++++++++
>  mm/mempolicy.c                    |   4 +-
>  9 files changed, 360 insertions(+), 113 deletions(-)
>
> --
> 2.43.5
>
>
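
For illustration only, the two options raised in the reply for a newly
added node2 can be compared with a small user-space simulation. This is a
rough sketch, not kernel code: it only loosely models weighted-interleave
placement as "page index modulo the total weight of the nodes in the
nodemask". The weight values, the node IDs, and the helper names
(weighted_interleave_node, placement) are all assumptions made up for this
example.

```python
from collections import Counter


def weighted_interleave_node(index, weights, nodemask):
    """Pick a node for page `index`, using only nodes in `nodemask`.

    Simplified model (an assumption, not the kernel implementation):
    walk the allowed nodes in ID order and consume `index % total_weight`
    against each node's weight.
    """
    nodes = sorted(n for n in nodemask if weights.get(n, 0) > 0)
    if not nodes:
        raise ValueError("nodemask contains no node with a positive weight")
    target = index % sum(weights[n] for n in nodes)
    for node in nodes:
        if target < weights[node]:
            return node
        target -= weights[node]
    raise AssertionError("unreachable: target is always below the total weight")


def placement(num_pages, weights, nodemask):
    """Count how many of `num_pages` pages land on each node."""
    return Counter(
        weighted_interleave_node(i, weights, nodemask) for i in range(num_pages)
    )


# Hypothetical weights after node2 was added and the weights were recalculated.
weights = {0: 3, 1: 1, 2: 2}
# Hypothetical nodemask chosen by the user before node2 existed.
user_nodemask = {0, 1}

# Option 1: migrate to node2 as well, although the user never asked for it.
option1 = placement(12, weights, user_nodemask | {2})
# Option 2: keep honoring the user's nodemask and ignore node2.
option2 = placement(12, weights, user_nodemask)

print("option 1:", dict(option1))
print("option 2:", dict(option2))
```

With these made-up weights, option 1 places 6/2/4 pages on node0/node1/node2,
while option 2 places 9/3/0. Under option 2, node0 and node1 absorb the share
that the recalculated weights intended for node2, which is the reduced
effectiveness mentioned above. Under option 1, pages land on a node outside
the user's nodemask.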