From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: "Gregory Price" <gourry@gourry.net>,
	hyeonggon.yoo@sk.com, kernel_team@skhynix.com,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"김홍규(KIM HONGGYU) System SW" <honggyu.kim@sk.com>,
	"김락기(KIM RAKIE) System SW" <rakie.kim@sk.com>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"horen.chuang@linux.dev" <horen.chuang@linux.dev>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"kernel-team@meta.com" <kernel-team@meta.com>
Subject: Re: [External Mail] [RFC PATCH] mm/mempolicy: Weighted interleave auto-tuning
Date: Thu, 26 Dec 2024 09:35:32 +0800
Message-ID: <874j2rp6or.fsf@DESKTOP-5N7EMDA>
In-Reply-To: <20241225093042.7710-1-joshua.hahnjy@gmail.com> (Joshua Hahn's message of "Wed, 25 Dec 2024 18:30:42 +0900")

Hi, Joshua,

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> Hi Gregory and Huang,
>
> Sorry for the silence on my end for the past few days. I decided to take
> some time off of the computer, but I should be more responsive now!
>
> On Wed, 25 Dec 2024 08:25:13 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Gregory Price <gourry@gourry.net> writes:
>> 
>> > On Sun, Dec 22, 2024 at 04:29:30PM +0800, Huang, Ying wrote:
>> >> Gregory Price <gourry@gourry.net> writes:
>> >> 
>> >> > On Sat, Dec 21, 2024 at 01:57:58PM +0800, Huang, Ying wrote:
>
> [.....8<.....]
>
>> > We decided when implementing weights that 0 was a special value that
>> > reverts to the system default:
>> >
>> >   Writing an empty string or `0` will reset the weight to the
>> >   system default. The system default may be set by the kernel
>> >   or drivers at boot or during hotplug events.
>> >
>> > I'm ok pulling the default weights in collectively once the first one is
>> > written, but 0 is an invalid value which causes issues.
>> >
>> > We went through that when we initially implemented the feature w/ task-local
>> > weights, which is why the helper function overrides it to 1 if it's ever seen.
>> >
>> > We'll revert back to our initial implementation w/ default_iw_table and
>> > iw_table - where iw_table contains user-defined weights.  Writing a 0 to
>> > iw_table[N] will allow get_il_weight() to retrieve default_iw_table[N]
>> > as the docs imply it should.
>> 
>> So, the suggested behavior becomes the following?
>> 
>> default_values [5,2,-] <- 1 node not set, expected to be hotplugged
>> user_values    [4,2,1] <- user has only set one value, not populated nodes have value 1
>> effective      [4,2,1]
>> 
>> hotplug event
>> default_values [2,1,1] - reweight has occurred
>> user_values    [4,2,1]
>> effective      [4,2,1]
>
> Yes, I think this was the intended effect when we were discussing what
> interface made the most sense.
>
>> Even if so, we still have another issue.  The effective values may be a
>> combination of default_values and user_values and it's hard for users to
>> identify which one is from default_values and subject to change.  For
>> example,
>> 
>> user reset weight of node 0 to default: echo 0 > node0
>> default_values [2,1,1]
>> user_values    [0,2,1]
>> effective      [2,2,1]
>> 
>> change the default again
>> default_values [3,1,1] - reweight again
>> user_values    [0,2,1]
>> effective      [3,2,1]
>
> Agreed. Actually, this confusion was partly what motivated our new
> re-work of the patch in v2, which got rid of the default and user
> layers, and made all internal values transparent to the user as well.
> That way, there would be no confusion as to the true source of the
> value, and the user could be aware that re-weighting would impact
> all values, regardless of whether they were default values or not.
>
> If we are moving away from allowing users to dynamically change the
> weightiness (max_node_weight) parameter however, then I think that there
> may be more merit to using the two-level default & user values system to
> allow for more flexibility.
>  
>> This is still quite confusing.  Another possible solution is to copy the
>> default value instead,
>> 
>> user reset weight of node 0 to default: echo 0 > node0
>> default_values [2,1,1]
>> user_values    [2,2,1] - copy default value when echo 0
>> effective      [2,2,1]
>> 
>> change the default again
>> default_values [3,1,1] - reweight again
>> user_values    [2,2,1]
>> effective      [2,2,1]
>
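
To make the difference between the two reset behaviors above concrete,
here is a minimal sketch in Python. It is only a model of the tables
used in this thread, not kernel code: the function names
effective_fallback() and reset_by_copy() are made up for illustration,
and 0 stands for "not set by the user" as in the examples.

```python
from typing import List


def effective_fallback(default: List[int], user: List[int]) -> List[int]:
    """Two-layer lookup: a user weight of 0 falls through to the default."""
    return [u if u != 0 else d for d, u in zip(default, user)]


def reset_by_copy(default: List[int], user: List[int], node: int) -> List[int]:
    """Copy-on-reset: 'echo 0 > nodeN' snapshots the current default value."""
    updated = list(user)
    updated[node] = default[node]
    return updated


# Fallback model: node 0 silently follows every later reweight.
user = [0, 2, 1]
print(effective_fallback([2, 1, 1], user))  # [2, 2, 1]
print(effective_fallback([3, 1, 1], user))  # [3, 2, 1]

# Copy model: the value is frozen at reset time, so effective == user.
user = reset_by_copy([2, 1, 1], [4, 2, 1], node=0)
print(user)  # [2, 2, 1], unchanged by a later reweight to [3, 1, 1]
```

In the copy model the user table is the only thing that needs to be
shown, which is what keeps the interface single-layered.
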
> This makes a lot of sense to me, I think it lets us keep both the
> transparency of the new one-layered system and all the benefits that
> come with having default values that can adapt to hotplug events. One
> thing we should consider is that the user should probably be able to
> check what the default value is for a given node before deciding to
> copy that value over to the weight table.
>
> Having two files for each node (nodeN, defaultN) seems a bit too
> cluttered for the user perspective. Making the nodeN interfaces serve
> multiple purposes (i.e. echo -1 into the nodes will output the default
> value for that node) also seems a bit too complicated as well, in my
> opinion. Maybe having a file 'weight_tables' that contains a table of
> default/user/effective weights (as have been used in these conversations)
> might be useful for the user? (Or maybe just the defaults)
>
> Then a workflow for the user may be as such:
>
> $ cat /sys/kernel/mm/mempolicy/weighted_interleave/weight_tables
> default values: [4,7,2]
>    user values: [-,-,-]
>      effective: [4,7,2]

AFAIK, this breaks the sysfs attribute format rule (preferably one value
per file); see:

https://docs.kernel.org/filesystems/sysfs.html#attributes

It's hard to use an array sysfs attribute here too, because node IDs
may be non-consecutive, which makes the array hard to read.
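
As a rough illustration of how user space can cope with one value per
file and non-consecutive node IDs, here is a small Python sketch. It
assumes the nodeN files each hold a single integer; read_node_weights()
and demo() are hypothetical helpers, and demo() uses a temporary
directory instead of the real sysfs path so it can run anywhere.

```python
import os
import re
import tempfile
from typing import Dict

NODE_RE = re.compile(r"^node(\d+)$")


def read_node_weights(directory: str) -> Dict[int, int]:
    """Collect {node_id: weight} from per-node files, one value per file."""
    weights = {}
    for name in os.listdir(directory):
        match = NODE_RE.match(name)
        if match is None:
            continue  # skip anything that is not a nodeN attribute
        with open(os.path.join(directory, name)) as f:
            weights[int(match.group(1))] = int(f.read().strip())
    # Sort by node ID so gaps in the numbering stay visible to the reader.
    return dict(sorted(weights.items()))


def demo() -> Dict[int, int]:
    """Build a fake attribute directory with nodes 0, 2 and 5 and read it."""
    with tempfile.TemporaryDirectory() as d:
        for node, weight in ((0, 4), (2, 7), (5, 2)):
            with open(os.path.join(d, f"node{node}"), "w") as f:
                f.write(f"{weight}\n")
        return read_node_weights(d)


print(demo())  # {0: 4, 2: 7, 5: 2}
```

Keying by node ID avoids the ambiguity a positional array like [4,7,2]
would have when nodes 1, 3 and 4 do not exist.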

> $ echo 4 > /sys/kernel/mm/mempolicy/weighted_interleave/node2
> 4
> ...
>
>> The remaining issue is that we cannot revert to default atomically.
>> That is, user_values may become a combination of old and new
>> default_values if users echo 0 to each node one by one when kernel is
>> changing default_values.  To resolve this, we may add another interface
>> to do that, for example, "use_default".
>> 
>> echo 1 > use_default
>> 
>> will use default_values for all nodes.  We can check whether we are
>> using default via
>> 
>> cat use_default
>
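
The atomicity concern above can be sketched as follows, again in Python
and only as a model. The class WeightTable and its methods are
hypothetical. It is also an assumption of this sketch, not something
settled in the thread, that weights keep following later reweights
while use_default is set.

```python
import threading
from typing import List, Tuple


class WeightTable:
    """Toy model: one lock guards the defaults, the weights and the flag."""

    def __init__(self, defaults: List[int]) -> None:
        self._lock = threading.Lock()
        self._default = list(defaults)
        self._user = list(defaults)
        self._using_default = True

    def reweight(self, new_defaults: List[int]) -> None:
        """Kernel-side update of the defaults, e.g. after a hotplug event."""
        with self._lock:
            self._default = list(new_defaults)
            if self._using_default:  # assumption: default mode tracks reweights
                self._user = list(new_defaults)

    def set_node(self, node: int, weight: int) -> None:
        """Models 'echo W > nodeN'; leaves default mode."""
        with self._lock:
            self._user[node] = weight
            self._using_default = False

    def use_default(self) -> None:
        """Models 'echo 1 > use_default'; all nodes switch in one step."""
        with self._lock:
            self._user = list(self._default)
            self._using_default = True

    def snapshot(self) -> Tuple[List[int], bool]:
        """Models reading the nodeN files plus 'cat use_default'."""
        with self._lock:
            return list(self._user), self._using_default


table = WeightTable([2, 1, 1])
table.set_node(0, 4)
table.reweight([3, 1, 1])
print(table.snapshot())  # ([4, 1, 1], False)
table.use_default()
print(table.snapshot())  # ([3, 1, 1], True)
```

Because use_default() copies every default under the same lock that
reweight() takes, the result can never mix old and new default values,
which is what per-node 'echo 0' cannot guarantee.
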
> As mentioned in the previous comments, I think that the "setting one
> value to set all the others" is a good method, especially since the
> more I think about it (in my limited experience), I think there is rarely
> a scenario where a user wants to use a hybrid of manually-set and
> default values and is switching back and forth between default and
> manual values.
>
>> Anyway, I think that we need to think thoroughly about the user space
>> interface, and to document it well, at least in the change log.  It's
>> really hard to get a user space interface right.
>> 
>> I'm open to better user space interface design.
>
> I agree with this, thank you for your feedback. I think there have been
> a lot of great points raised in these conversations, and I will do my
> best to take these comments into consideration when writing better
> documentation. 
>
> Thank you for your input! I hope you have a great day and happy holidays!

Happy holidays!

---
Best Regards,
Huang, Ying

