Date: Tue, 4 Feb 2025 16:50:26 +0900
From: "Harry (Hyeonggon) Yoo" <42.hyeyoo@gmail.com>
To: Joshua Hahn
Cc: gourry@gourry.net, hyeonggon.yoo@sk.com, ying.huang@linux.alibaba.com,
 rafael@kernel.org, lenb@kernel.org, gregkh@linuxfoundation.org,
 akpm@linux-foundation.org, honggyu.kim@sk.com, rakie.kim@sk.com,
 dan.j.williams@intel.com, Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
 horen.chuang@linux.dev, hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
 linux-acpi@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com
Subject: Re: [PATCH v4] Weighted Interleave Auto-tuning
References: <20250128222332.3835931-1-joshua.hahnjy@gmail.com>
In-Reply-To: <20250128222332.3835931-1-joshua.hahnjy@gmail.com>

On Tue, Jan 28, 2025 at 02:23:31PM -0800, Joshua Hahn wrote:
> On machines with multiple memory nodes, interleaving page allocations
> across nodes allows for better utilization of each node's bandwidth.
> Previous work by Gregory Price [1] introduced weighted interleave, which
> allowed for pages to be allocated across nodes according to user-set ratios.
>
> Ideally, these weights should be proportional to their bandwidth, so
> that under bandwidth pressure, each node uses its maximal efficient
> bandwidth and prevents latency from increasing exponentially.
>
> At the same time, we want these weights to be as small as possible.
> Having ratios that involve large co-prime numbers like 7639:1345:7 leads
> to awkward and inefficient allocations, since the node with weight 7
> will remain mostly unused (and despite being proportional to bandwidth,
> will not aid in relieving the bandwidth pressure in the other two nodes).
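
To make the trade-off above concrete, here is a minimal userspace sketch of
one way bandwidth-proportional weights could be squeezed into small numbers.
It is not the patch's code: scale_weights(), gcd_u64() and the "budget"
parameter are hypothetical names chosen for illustration, and the scheme
(integer scaling to a fixed budget, clamping to at least 1, then dividing by
the common divisor) is an assumption about how such a reduction might work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Greatest common divisor (Euclid); gcd_u64(0, x) returns x. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Rescale per-node bandwidths into small interleave weights.
 *
 * "budget" is a hypothetical target for the sum of the scaled weights;
 * it is an illustration parameter, not a value taken from the patch.
 * Assumes bw[i] * budget does not overflow 64 bits.
 */
static void scale_weights(const uint64_t *bw, uint8_t *weights,
			  size_t n, uint64_t budget)
{
	uint64_t sum = 0, g = 0;
	size_t i;

	assert(n > 0 && budget > 0 && budget <= UINT8_MAX);

	for (i = 0; i < n; i++)
		sum += bw[i];
	assert(sum > 0);

	for (i = 0; i < n; i++) {
		/* Share of the budget proportional to this node's bandwidth. */
		uint64_t w = bw[i] * budget / sum;

		/* A node that rounds down to 0 still gets a weight of 1. */
		if (w == 0)
			w = 1;
		weights[i] = (uint8_t)w;
		g = gcd_u64(g, w);
	}

	/* Drop any common factor so the ratio is as small as possible. */
	for (i = 0; i < n; i++)
		weights[i] = (uint8_t)(weights[i] / g);
}
```

With a budget of 32, the 7639:1345:7 example reduces to 27:4:1 under this
sketch, so one interleave round covers 32 pages instead of 8991.
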
>
> This patch introduces an auto-configuration mode for the interleave
> weights that aims to balance the two goals of setting node weights to be
> proportional to their bandwidths and keeping the weight values low.
> In order to perform the weight re-scaling, we use an internal
> "weightiness" value (fixed to 32) that defines interleave aggression.
>
> In this auto configuration mode, node weights are dynamically updated
> every time there is a hotplug event that introduces new bandwidth.
>
> Users can also enter manual mode by writing "N" or "0" to the new "auto"
> sysfs interface. When a user enters manual mode, the system stops
> dynamically updating any of the node weights, even during hotplug events
> that can shift the optimal weight distribution. The system also enters
> manual mode any time a user sets a node's weight directly by using the
> nodeN interface introduced in [1]. On the other hand, auto mode is
> only entered by explicitly writing "Y" or "1" to the auto interface.
>
> There is one functional change that this patch makes to the existing
> weighted_interleave ABI: previously, writing 0 directly to a nodeN
> interface was said to reset the weight to the system default. Before
> this patch, the default for all weights were 1, which meant that writing
> 0 and 1 were functionally equivalent.
>
> This patch introduces "real" defaults, but moves away from letting users
> use 0 as a "set to default" interface. Rather, users who want to use
> system defaults should use auto mode. This patch seems to be the
> appropriate place to make this change, since we would like to remove
> this usage before users begin to rely on the feature in userspace.
> Moreover, users will not be losing any functionality; they can still
> write 1 into a node if they want a weight of 1. Thus, we deprecate the
> "write zero to reset" feature in favor of returning an error, the same
> way we would return an error when the user writes any other invalid
> weight to the interface.
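
As I read the description above, a nodeN write behaves roughly like the
userspace model below. This is only a sketch of the described semantics, not
the kernel implementation: struct iw_state, node_weight_store(), MAX_NODES,
the 255 upper bound and the error code for an out-of-range node are all
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8	/* hypothetical node count for this model */

/* Hypothetical model of the weighted interleave state. */
struct iw_state {
	bool auto_mode;			/* true: weights are auto-tuned */
	uint8_t weight[MAX_NODES];	/* per-node interleave weights */
};

/*
 * Model of a write to the nodeN interface.
 *
 * A weight of 0 is rejected instead of meaning "reset to default".
 * Any accepted write also drops the system into manual mode.
 * The upper bound of 255 is an assumption, not stated in the patch text.
 */
static int node_weight_store(struct iw_state *st, size_t node,
			     unsigned long val)
{
	assert(st != NULL);

	if (node >= MAX_NODES)
		return -EINVAL;	/* assumed error for an unknown node */
	if (val == 0 || val > UINT8_MAX)
		return -EINVAL;

	st->weight[node] = (uint8_t)val;
	st->auto_mode = false;
	return 0;
}
```

Note that a rejected write leaves both the weight and the mode untouched in
this model, so a failed "write 0" does not silently leave auto mode.
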
>
> [1] https://lore.kernel.org/linux-mm/20240202170238.90004-1-gregory.price@memverge.com/
>
> Signed-off-by: Joshua Hahn
> Co-developed-by: Gregory Price
> Signed-off-by: Gregory Price
> ---

Hi Joshua,

I'm glad we're close to finalizing the interface. I believe the author has
successfully addressed major concerns through the revisions.

The interface and the code now look good to me.

Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

With a few nits:

> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> index 0b7972de04e9..c26879f59d5d 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-mempolicy-weighted-interleave
> @@ -20,6 +20,34 @@ Description: Weight configuration interface for nodeN

[...snip...]

> +What:          /sys/kernel/mm/mempolicy/weighted_interleave/auto
> +Date:          January 2025
> +Contact:       Linux memory management mailing list
> +Description:   Auto-weighting configuration interface
> +
> +               Configuration mode for weighted interleave. A 'Y' indicates
> +               that the system is in auto mode, and a 'N' indicates that
> +               the system is in manual mode. All other values are invalid.
> +
> +               In auto mode, all node weights are re-calculated and overwritten
> +               (visible via the nodeN interfaces) whenever new bandwidth data
> +               is made available during either boot or hotplug events.
> +
> +               In manual mode, node weights can only be updated by the user.
> +               If a node is hotplugged while the user is in manual mode,
> +               the node will have a default weight of 1.
> +
> +               Modes can be changed by writing Y, N, 1, or 0 to the interface.
> +               All other strings will be ignored, and -EINVAL will be returned.
> +               If Y or 1 is written to the interface but the recalculation or
> +               updates fail at any point (-ENOMEM or -ENODEV), then the mode
> +               will remain in manual mode.
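
For reference, the accepted inputs in the hunk above can be modeled in
userspace as below. This is a sketch that follows the documented text
literally (only Y, N, 1 and 0, everything else is -EINVAL); parse_auto_mode()
is a hypothetical name, the tolerance for one trailing newline is my
assumption about typical sysfs writes, and the kernel's actual parser may
accept more spellings than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Parse a write to the "auto" interface as the documentation describes it.
 *
 * Returns 0 and stores the requested mode in *auto_mode on success,
 * or -EINVAL (leaving *auto_mode untouched) for any other string.
 */
static int parse_auto_mode(const char *buf, bool *auto_mode)
{
	size_t len;

	assert(buf != NULL && auto_mode != NULL);

	/* Tolerate a single trailing newline, as produced by echo. */
	len = strlen(buf);
	if (len > 0 && buf[len - 1] == '\n')
		len--;
	if (len != 1)
		return -EINVAL;

	switch (buf[0]) {
	case 'Y':
	case '1':
		*auto_mode = true;	/* switch to auto mode */
		return 0;
	case 'N':
	case '0':
		*auto_mode = false;	/* switch to manual mode */
		return 0;
	default:
		return -EINVAL;
	}
}
```

The sketch does not model the failure path where a switch to auto mode is
requested but recalculation fails and the mode stays manual.
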

nit: the commit log says that writing 'N' or '0' switches to manual mode
and writing 'Y' or '1' switches to auto mode, but the Documentation does
not explicitly state what '0' and '1' do.

> +               Writing a new weight to a node directly via the nodeN interface
> +               will also automatically update the system to manual mode.
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index 80a3481c0470..cc94cba112dd 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -20,6 +20,7 @@
>  #include
>  #include
>  #include
> +#include

nit: is this #include directive necessary?

--
Harry