From: Joshua Hahn
To: Jonathan Cameron
Cc: "Huang, Ying", lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, gourry@gourry.net, hyeonggon.yoo@sk.com,
    honggyu.kim@sk.com, kernel-team@meta.com
Subject: Re: [LSF/MM/BPF TOPIC] Weighted interleave auto-tuning
Date: Fri, 14 Mar 2025 08:11:16 -0700
Message-ID: <20250314151137.892379-1-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250314141541.00003fad@huawei.com>
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

On Fri, 14 Mar 2025 14:15:41 +0000 Jonathan Cameron wrote:

> On Fri, 14 Mar 2025 18:08:35 +0800
> "Huang, Ying" wrote:
>
> > Joshua Hahn writes:
> >
> > > On Thu, 9 Jan 2025 13:50:48 -0500 Joshua Hahn wrote:
> > >
> > >> Hello everyone, I hope everyone has had a great start to 2025!
> > >>
> > >> Recently, I have been working on a patch series [1] with Gregory Price
> > >> that provides new default interleave weights, along with dynamic
> > >> re-weighting on hotplug events and a series of UAPIs that allow users
> > >> to configure how they want the defaults to behave.
> > >>
> > >> In introducing these new defaults, discussions have opened up in the
> > >> community regarding how best to create a UAPI that can provide coherent
> > >> and transparent interactions for the user. In particular, consider this
> > >> scenario: when a hotplug event happens and a node comes online with new
> > >> bandwidth information (and therefore changes the bandwidth distribution
> > >> across the system), should user-set weights be overwritten to reflect
> > >> the new distribution? If so, how can we justify overwriting user-set
> > >> values in a sysfs interface? If not, how will users manually adjust the
> > >> node weights to the optimal weights?
> > >>
> > >> I would like to revisit some of the design choices made for this patch,
> > >> including how the defaults were derived, and open the conversation to
> > >> hear what the community believes is a reasonable way to allow users to
> > >> tune weighted interleave weights. More broadly, I hope to gather
> > >> community insight into how people use weighted interleave, and do my
> > >> best to reflect those workflows in the patch.
> > >
> > > Weighted interleave has since moved on to v7 [1], and a v8 is currently
> > > being drafted. Through feedback from reviewers, we have landed on a
> > > coherent UAPI that gives users two options: auto mode, which leaves all
> > > weight-calculation decisions to the system, and manual mode, which
> > > leaves weighted interleave the same as it is without the patch.
> > >
> > > Given that the patch's functionality is mostly concrete and that the
> > > questions I hoped to raise during this slot were answered via patch
> > > feedback, I hope to ask another question during the talk:
> > >
> > > Should the system dynamically change what metrics it uses to weight the
> > > nodes, based on what bottlenecks the system is currently facing?
> > >
> > > In the patch, min(read_bandwidth, write_bandwidth) is used as the
> > > heuristic to determine what a node's weight should be. However, what if
> > > the system is not bottlenecked by bandwidth, but by latency? A system
> > > could also be bottlenecked by read bandwidth, but not by write
> > > bandwidth.
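(As an aside, purely to illustrate the min(read_bandwidth, write_bandwidth)
heuristic mentioned above: the toy userspace sketch below derives integer node
weights from made-up bandwidth numbers by taking each node's minimum of read
and write bandwidth and reducing the set by its greatest common divisor. It is
only a sketch of the idea, not code from the patch series.)

    #include <stdio.h>

    /* Greatest common divisor, used to shrink raw bandwidths into small weights. */
    static unsigned int gcd(unsigned int a, unsigned int b)
    {
            while (b) {
                    unsigned int t = a % b;
                    a = b;
                    b = t;
            }
            return a;
    }

    int main(void)
    {
            /* Hypothetical {read, write} bandwidths in MB/s for two nodes. */
            unsigned int bw[2][2] = { { 120000, 100000 }, { 30000, 25000 } };
            unsigned int min_bw[2], g = 0;

            for (unsigned int i = 0; i < 2; i++) {
                    min_bw[i] = bw[i][0] < bw[i][1] ? bw[i][0] : bw[i][1];
                    g = gcd(g, min_bw[i]);
            }

            /* Prints "node0: weight 4" and "node1: weight 1" for these numbers. */
            for (unsigned int i = 0; i < 2; i++)
                    printf("node%u: weight %u\n", i, min_bw[i] / g);

            return 0;
    }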
> > > Consider a scenario where a system has many memory nodes with varying
> > > latencies and bandwidths. When the system is not bottlenecked by
> > > bandwidth, it might prefer to allocate memory from nodes with lower
> > > latency. Once the system starts feeling pressured by bandwidth, the
> > > weights for high-bandwidth (but also high-latency) nodes would slowly
> > > increase to alleviate pressure from the system. Once the system is back
> > > in a manageable state, weights for low-latency nodes would start
> > > increasing again. Users would not have to be aware of any of this --
> > > they would just see the system take control of the weight changes as
> > > the system's needs continue to change.
> >
> > IIUC, this assumes the capacity of all kinds of memory is large enough.
> > However, this may not be true in some cases. So, another possibility,
> > for a system with DRAM and CXL memory nodes, is:
> >
> > - There is free space on the DRAM node and its bandwidth isn't
> >   saturated: memory is allocated on the DRAM node.
> >
> > - There is no free space on the DRAM node, but its bandwidth isn't
> >   saturated: cold pages are migrated to CXL memory nodes, while hot
> >   pages are migrated to the DRAM node.
> >
> > - The bandwidth of the DRAM node is saturated: hot pages are migrated to
> >   CXL memory nodes.
> >
> > In general, I think that the real situation is complex, and this makes
> > it hard to implement a good policy in the kernel. So, I suspect that
> > it's better to start with experiments in user space.
> >
> > > This proposal also has some concerns that need to be addressed:
> > > - How reactive should the system be, and how aggressively should it
> > >   tune the weights? We don't want the system to overreact to short
> > >   spikes in pressure.
> > > - Does dynamic weight adjustment lead to pages being "misplaced"?
> > >   Should those "misplaced" pages be migrated? (Probably not.)
> > > - Does this need to be in the kernel? A userspace daemon that monitors
> > >   kernel metrics is able to make the changes (via the nodeN interfaces).
>
> If this were done in the kernel, what metrics would make sense to drive it?
> As with hot-page tracking, we may run into contention with PMUs (or similar)
> and their other use cases.

Hello Jonathan, thank you for your interest in this proposal!

Yes, I think you and Ying both bring up great points about how this is
probably something more suitable for a userspace program. Userspace probably
has more information about the characteristics of the workload, and I agree
with your point about contention. If the kernel thread doesn't probe
frequently, then it would be making poor allocation decisions based on stale
data, but if it does probe frequently, it would incur lots of overhead from
the contention (and make other contending threads slower as well). Not to
mention, there is also the overhead of probing itself :-)

I will keep thinking about these questions, and see if I can come up with any
interesting ideas to discuss during LSF/MM/BPF (a rough sketch of what such a
userspace adjustment loop might look like is appended at the end of this
mail). Thank you again for your interest, I hope you have a great day!
Joshua

> > > Thoughts & comments are appreciated! Thank you, and have a great day!
> > > Joshua
> > >
> > > [1] https://lore.kernel.org/all/20250305200506.2529583-1-joshua.hahnjy@gmail.com/
> > >
> > > Sent using hkml (https://github.com/sjp38/hackermail)
> > ---
> > Best Regards,
> > Huang, Ying
> >

Sent using hkml (https://github.com/sjp38/hackermail)
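As mentioned above, here is a rough, untested sketch of what such a userspace
adjustment loop could look like. It assumes the per-node weight files from the
weighted interleave series are exposed at
/sys/kernel/mm/mempolicy/weighted_interleave/nodeN, and it uses fixed
placeholder weights (and hypothetical node numbers) where a real daemon would
derive them from sampled bandwidth or latency pressure:

    #include <stdio.h>
    #include <unistd.h>

    #define WI_DIR "/sys/kernel/mm/mempolicy/weighted_interleave"

    /* Write one node's interleave weight through its sysfs file. */
    static int write_weight(int node, unsigned int weight)
    {
            char path[128];
            FILE *f;

            snprintf(path, sizeof(path), WI_DIR "/node%d", node);
            f = fopen(path, "w");
            if (!f)
                    return -1;
            fprintf(f, "%u\n", weight);
            return fclose(f);
    }

    int main(void)
    {
            for (;;) {
                    /*
                     * Placeholder policy: a real daemon would sample
                     * bandwidth saturation or latency here (e.g. from
                     * PMUs or PSI) and derive the weights from that.
                     */
                    unsigned int dram_weight = 4, cxl_weight = 1;

                    write_weight(0, dram_weight); /* hypothetical DRAM node */
                    write_weight(1, cxl_weight);  /* hypothetical CXL node */

                    sleep(10);
            }
    }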