Date: Wed, 8 Jan 2025 10:19:19 +0900
From: Hyeonggon Yoo
To: "Huang, Ying", Gregory Price, Joshua Hahn
Cc: kernel_team@skhynix.com, 42.hyeyoo@gmail.com, rafael@kernel.org,
    lenb@kernel.org, gregkh@linuxfoundation.org,
    akpm@linux-foundation.org, 김홍규(KIM HONGGYU) System SW,
    김락기(KIM RAKIE) System SW, dan.j.williams@intel.com,
    Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
    horen.chuang@linux.dev, hannes@cmpxchg.org,
    linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
    linux-mm@kvack.org, kernel-team@meta.com
Subject: Re: [External Mail] Re: [External Mail] [RFC PATCH] mm/mempolicy: Weighted interleave auto-tuning
Message-ID: <769f98b3-f5e5-448c-966e-4dd5468e5041@sk.com>
In-Reply-To: <874j2lll91.fsf@DESKTOP-5N7EMDA>
References: <20241225093042.7710-1-joshua.hahnjy@gmail.com> <874j2rp6or.fsf@DESKTOP-5N7EMDA> <87cyhdhon1.fsf@DESKTOP-5N7EMDA> <874j2lll91.fsf@DESKTOP-5N7EMDA>

On 2024-12-30 3:48 AM, Huang, Ying wrote:
> Gregory Price writes:
>
>> On Fri, Dec 27, 2024 at 09:59:30AM +0800, Huang, Ying wrote:
>>> Gregory Price writes:
>>>> This still allows 0 to be a manual "reset specific node to default"
>>>> mechanism for a specific node, and gives us a clean override.
>>>
>>> The difficulty is that users don't know the default value when they
>>> reset a node's weight. We don't have an interface to show them. So, I
>>> suggest to disable the functionality: "reset specific node to default".
>>> They can still use "echo 1 > use_defaults" to reset all nodes to
>>> default.
>>>
>>
>> Good point, and agree. So lets just ditch 0. Since that "feature"
>> wasn't even functional in the current code (it just reset it to 1 at
>> this point), it's probably safe to just ditch it. Worst case scenario
>> if someone takes issues, we can just have it revert the weight to 1.
>
> Before implementing the new version, it's better to summarize the user
> space interface design first. So, we can check whether we have some
> flaws.

Hi, hope you all had a nice year-end holiday :)

Let me summarize the points we've discussed:

- A new knob 'weightiness' is unnecessary until it's proven useful.
  Just using an internal default weightiness value will be enough.

- It would be counter-intuitive for the kernel to update a value
  previously set by the user, even if that value is no longer valid
  (e.g. due to CXL memory hot-plug). In that case the user should
  update the weights accordingly, instead of the kernel automatically
  overwriting them.

- Ditch the use of 0 as the 'system default' value, because the user
  won't know what the default will be when setting it anyway.
  A value of 0 now means the kernel won't weight-interleave the node.

- Setting a node weight to the default value (e.g. via the previous
  semantic of '0') could be problematic because it's not atomic:
  the system may be updating default values while the user is trying
  to set a node weight to the default value.

  To deal with that, Huang suggested 'use_defaults' to atomically
  update all the weights to the system defaults.

Please let me know if there's any point we discussed that I am missing.

Additionally, I would like to mention that in an internal discussion
my colleague Honggyu suggested introducing a 'mode' parameter, which
can be either 'manual' or 'auto', instead of 'use_defaults', to provide
a more intuitive interface.

With Honggyu's suggestion and the points we've discussed,
I think the interface could be:

# At booting, the mode is 'auto' where the kernel can automatically
# update any weights.

mode        auto          # User hasn't specified any weight yet.
effective   [2, 1, -, -]  # Using system defaults for node 0-1,
                          # and node 2-3 not populated yet.

# When a new NUMA node is added (e.g. via hotplug) in the 'auto' mode,
# all weights are re-calculated based on the ACPI HMAT table, including
# the weight of the new node.

mode        auto          # User hasn't specified weights yet.
effective   [2, 1, 1, -]  # Using system defaults for node 0-2,
                          # and node 3 not populated yet.

# When the user sets at least one weight value, the mode changes to
# 'manual', where the kernel does not update any weights automatically
# without the user's consent.

mode        manual        # User changed the weight of node 0 to 4,
                          # changing the mode to manual config mode.
effective   [4, 1, 1, -]

# When a new NUMA node is added (e.g. via hotplug) in the manual mode,
# the new node's weight is zero because it's in manual mode and the user
# did not specify the weight for the new node yet.

mode        manual
effective   [4, 1, 1, 0]

# When the user changes the mode to 'auto', all weights are changed to
# system defaults based on the ACPI HMAT table.

mode        auto
effective   [2, 1, 1, 1]  # system defaults

In the example I did not distinguish 'default weights' and 'user
weights' because it's not important where the weight values came from.
What's important to know is 1) what the effective weights are now and
2) whether the kernel can update them.

Any thoughts?

---
Best,
Hyeonggon
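
The mode transitions in the example above can be sketched as a small user-space model. This is only an illustration of the proposed semantics, not kernel code: the class name `InterleaveWeights`, its methods, and the way recomputed defaults are passed in are all hypothetical, and the default weights are hard-coded here rather than derived from the ACPI HMAT table.

```python
from typing import Dict, List, Tuple, Union


class InterleaveWeights:
    """Toy model of the proposed 'mode' / 'effective' interface.

    All names here are hypothetical; defaults are supplied by the caller
    as a stand-in for values the kernel would derive from ACPI HMAT.
    """

    def __init__(self, nr_nodes: int, defaults: Dict[int, int]) -> None:
        self.nr_nodes = nr_nodes
        self.mode = "auto"                 # the mode at boot is 'auto'
        self.defaults = dict(defaults)     # node -> system default weight
        self.weights = dict(defaults)      # node -> effective weight

    def hotplug(self, node: int, recomputed_defaults: Dict[int, int]) -> None:
        """A new node appears; the system defaults are recomputed."""
        self.defaults = dict(recomputed_defaults)
        if self.mode == "auto":
            # 'auto': every weight follows the new system defaults.
            self.weights = dict(self.defaults)
        else:
            # 'manual': existing weights are left alone; the new node
            # gets 0 (not weight-interleaved) until the user sets it.
            self.weights[node] = 0

    def set_weight(self, node: int, weight: int) -> None:
        """Any user write switches the interface to 'manual'."""
        if node not in self.weights:
            raise ValueError(f"node {node} is not populated")
        self.mode = "manual"
        self.weights[node] = weight

    def set_mode(self, mode: str) -> None:
        if mode not in ("auto", "manual"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        if mode == "auto":
            # Returning to 'auto' resets all weights to system defaults.
            self.weights = dict(self.defaults)

    def effective(self) -> List[Union[int, str]]:
        """Effective weights; '-' marks nodes that are not populated."""
        return [self.weights.get(n, "-") for n in range(self.nr_nodes)]


def walkthrough() -> List[Tuple[str, List[Union[int, str]]]]:
    """Replay the five steps of the example and record each state."""
    w = InterleaveWeights(4, {0: 2, 1: 1})
    states = [(w.mode, w.effective())]
    w.hotplug(2, {0: 2, 1: 1, 2: 1})
    states.append((w.mode, w.effective()))
    w.set_weight(0, 4)
    states.append((w.mode, w.effective()))
    w.hotplug(3, {0: 2, 1: 1, 2: 1, 3: 1})
    states.append((w.mode, w.effective()))
    w.set_mode("auto")
    states.append((w.mode, w.effective()))
    return states


if __name__ == "__main__":
    for mode, eff in walkthrough():
        print(f"mode {mode:<7} effective {eff}")
```

The model keeps a single set of effective weights plus the current mode, which mirrors the point that the origin of each weight does not need to be exposed, only the effective values and whether the kernel may update them.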