From: Alistair Popple <apopple@nvidia.com>
To: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Bharata B Rao <bharata@amd.com>,
"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
Wei Xu <weixugc@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Johannes Weiner <hannes@cmpxchg.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Michal Hocko <mhocko@kernel.org>, Yang Shi <shy828301@gmail.com>,
Dave Jiang <dave.jiang@intel.com>,
Rafael J Wysocki <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH -V3 1/4] memory tiering: add abstract distance calculation algorithms management
Date: Tue, 19 Sep 2023 15:13:18 +1000
Message-ID: <877com68zm.fsf@nvdebian.thelocal>
In-Reply-To: <20230912082101.342002-2-ying.huang@intel.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Huang Ying <ying.huang@intel.com> writes:
> The abstract distance may be calculated by various drivers, such as
> ACPI HMAT, CXL CDAT, etc., and it may be used by various code that
> hot-adds memory nodes, such as dax/kmem. To decouple the algorithm
> users from the providers, this patch implements a management
> mechanism for abstract distance calculation algorithms. It provides
> an interface for providers to register their implementations, and
> an interface for users to calculate the abstract distance.
>
> Multiple algorithm implementations can cooperate by calculating the
> abstract distance for different memory nodes. The preference among
> algorithm implementations can be specified via
> priority (notifier_block.priority).
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Tested-by: Bharata B Rao <bharata@amd.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Wei Xu <weixugc@google.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
> ---
> include/linux/memory-tiers.h | 19 ++++++++++++
> mm/memory-tiers.c | 59 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 78 insertions(+)
>
> diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h
> index 437441cdf78f..c8382220cced 100644
> --- a/include/linux/memory-tiers.h
> +++ b/include/linux/memory-tiers.h
> @@ -6,6 +6,7 @@
> #include <linux/nodemask.h>
> #include <linux/kref.h>
> #include <linux/mmzone.h>
> +#include <linux/notifier.h>
> /*
> * Each tier cover a abstrace distance chunk size of 128
> */
> @@ -36,6 +37,9 @@ struct memory_dev_type *alloc_memory_type(int adistance);
> void put_memory_type(struct memory_dev_type *memtype);
> void init_node_memory_type(int node, struct memory_dev_type *default_type);
> void clear_node_memory_type(int node, struct memory_dev_type *memtype);
> +int register_mt_adistance_algorithm(struct notifier_block *nb);
> +int unregister_mt_adistance_algorithm(struct notifier_block *nb);
> +int mt_calc_adistance(int node, int *adist);
> #ifdef CONFIG_MIGRATION
> int next_demotion_node(int node);
> void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets);
> @@ -97,5 +101,20 @@ static inline bool node_is_toptier(int node)
> {
> return true;
> }
> +
> +static inline int register_mt_adistance_algorithm(struct notifier_block *nb)
> +{
> +	return 0;
> +}
> +
> +static inline int unregister_mt_adistance_algorithm(struct notifier_block *nb)
> +{
> +	return 0;
> +}
> +
> +static inline int mt_calc_adistance(int node, int *adist)
> +{
> +	return NOTIFY_DONE;
> +}
> #endif /* CONFIG_NUMA */
> #endif /* _LINUX_MEMORY_TIERS_H */
> diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> index 37a4f59d9585..76c0ad47a5ad 100644
> --- a/mm/memory-tiers.c
> +++ b/mm/memory-tiers.c
> @@ -5,6 +5,7 @@
> #include <linux/kobject.h>
> #include <linux/memory.h>
> #include <linux/memory-tiers.h>
> +#include <linux/notifier.h>
>
> #include "internal.h"
>
> @@ -105,6 +106,8 @@ static int top_tier_adistance;
> static struct demotion_nodes *node_demotion __read_mostly;
> #endif /* CONFIG_MIGRATION */
>
> +static BLOCKING_NOTIFIER_HEAD(mt_adistance_algorithms);
> +
> static inline struct memory_tier *to_memory_tier(struct device *device)
> {
> return container_of(device, struct memory_tier, dev);
> @@ -592,6 +595,62 @@ void clear_node_memory_type(int node, struct memory_dev_type *memtype)
> }
> EXPORT_SYMBOL_GPL(clear_node_memory_type);
>
> +/**
> + * register_mt_adistance_algorithm() - Register memory tiering abstract distance algorithm
> + * @nb: The notifier block which describes the algorithm
> + *
> + * Return: 0 on success, errno on error.
> + *
> + * Every memory tiering abstract distance algorithm provider needs to
> + * register the algorithm with register_mt_adistance_algorithm(). To
> + * calculate the abstract distance for a specified memory node, the
> + * notifier function will be called unless a higher-priority
> + * algorithm has already provided a result. The prototype of the
> + * notifier function is as follows,
> + *
> + *   int (*algorithm_notifier)(struct notifier_block *nb,
> + *                             unsigned long nid, void *data);
> + *
> + * Where "nid" specifies the memory node, "data" is the pointer to the
> + * returned abstract distance (that is, "int *adist"). If the
> + * algorithm provides the result, NOTIFY_STOP should be returned.
> + * Otherwise, return a value with (return_value & %NOTIFY_STOP_MASK) == 0
> + * to allow the next algorithm in the chain to provide the result.
> + */
> +int register_mt_adistance_algorithm(struct notifier_block *nb)
> +{
> +	return blocking_notifier_chain_register(&mt_adistance_algorithms, nb);
> +}
> +EXPORT_SYMBOL_GPL(register_mt_adistance_algorithm);
> +
> +/**
> + * unregister_mt_adistance_algorithm() - Unregister memory tiering abstract distance algorithm
> + * @nb: the notifier block which describes the algorithm
> + *
> + * Return: 0 on success, errno on error.
> + */
> +int unregister_mt_adistance_algorithm(struct notifier_block *nb)
> +{
> +	return blocking_notifier_chain_unregister(&mt_adistance_algorithms, nb);
> +}
> +EXPORT_SYMBOL_GPL(unregister_mt_adistance_algorithm);
> +
> +/**
> + * mt_calc_adistance() - Calculate abstract distance with registered algorithms
> + * @node: the node to calculate abstract distance for
> + * @adist: the returned abstract distance
> + *
> + * Return: if (return_value & %NOTIFY_STOP_MASK) != 0, then some
> + * abstract distance algorithm has provided the result, which is
> + * returned via @adist. Otherwise, no algorithm can provide the result
> + * and @adist is left unchanged.
> + */
> +int mt_calc_adistance(int node, int *adist)
> +{
> +	return blocking_notifier_call_chain(&mt_adistance_algorithms, node, adist);
> +}
> +EXPORT_SYMBOL_GPL(mt_calc_adistance);
> +
> static int __meminit memtier_hotplug_callback(struct notifier_block *self,
> unsigned long action, void *_arg)
> {
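
For anyone reading along, here is a rough userspace model of the
semantics described in the kernel-doc above: providers are kept in
descending priority order, the first one that returns a value with
NOTIFY_STOP_MASK set wins, and @adist is left alone if nobody answers.
This is only a sketch, not the kernel implementation: it has no
locking, the list handling is a simplified stand-in for the blocking
notifier chain, and the two providers (high_prio_calc, low_prio_calc),
their priorities and the distance values 640 and 900 are made up for
illustration. The NOTIFY_* values mirror include/linux/notifier.h.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror include/linux/notifier.h. */
#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long nid, void *data);
	struct notifier_block *next;
	int priority;
};

/* Chain head; the real code uses BLOCKING_NOTIFIER_HEAD() with locking. */
static struct notifier_block *mt_adistance_algorithms;

/* Insert in descending priority order; equal priorities keep arrival order. */
static int register_mt_adistance_algorithm(struct notifier_block *nb)
{
	struct notifier_block **p = &mt_adistance_algorithms;

	assert(nb && nb->notifier_call);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
	return 0;
}

/* Walk the chain until some algorithm sets NOTIFY_STOP_MASK. */
static int mt_calc_adistance(int node, int *adist)
{
	struct notifier_block *nb;
	int ret = NOTIFY_DONE;

	for (nb = mt_adistance_algorithms; nb; nb = nb->next) {
		ret = nb->notifier_call(nb, (unsigned long)node, adist);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Hypothetical provider that only has data for node 1. */
static int high_prio_calc(struct notifier_block *nb,
			  unsigned long nid, void *data)
{
	int *adist = data;

	(void)nb;
	if (nid != 1)
		return NOTIFY_OK;	/* no result, let the next one try */
	*adist = 640;			/* made-up abstract distance */
	return NOTIFY_STOP;
}

/* Hypothetical provider that has data for nodes 1 and 2. */
static int low_prio_calc(struct notifier_block *nb,
			 unsigned long nid, void *data)
{
	int *adist = data;

	(void)nb;
	if (nid != 1 && nid != 2)
		return NOTIFY_DONE;	/* no result */
	*adist = 900;			/* made-up abstract distance */
	return NOTIFY_STOP;
}

static struct notifier_block high_prio_nb = {
	.notifier_call = high_prio_calc,
	.priority = 100,
};

static struct notifier_block low_prio_nb = {
	.notifier_call = low_prio_calc,
	.priority = 0,
};
```

After registering both blocks (in either order), a caller would
initialise adist to its own default, call mt_calc_adistance(node,
&adist), and keep the default whenever the return value does not have
NOTIFY_STOP_MASK set. In this model node 1 is answered by the
higher-priority provider even though both could answer it.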
Thread overview (14+ messages):
2023-09-12 8:20 [PATCH -V3 0/4] memory tiering: calculate abstract distance based on ACPI HMAT Huang Ying
2023-09-12 8:20 ` [PATCH -V3 1/4] memory tiering: add abstract distance calculation algorithms management Huang Ying
2023-09-14 17:29 ` Dave Jiang
2023-09-19 5:13 ` Alistair Popple [this message]
2023-09-12 8:20 ` [PATCH -V3 2/4] acpi, hmat: refactor hmat_register_target_initiators() Huang Ying
2023-09-14 17:30 ` Dave Jiang
2023-09-12 8:21 ` [PATCH -V3 3/4] acpi, hmat: calculate abstract distance with HMAT Huang Ying
2023-09-14 17:31 ` Dave Jiang
2023-09-19 5:14 ` Alistair Popple
2023-09-19 6:11 ` Huang, Ying
2023-09-12 8:21 ` [PATCH -V3 4/4] dax, kmem: calculate abstract distance with general interface Huang Ying
2023-09-14 17:31 ` Dave Jiang
2023-09-19 5:31 ` Alistair Popple
2023-09-19 5:56 ` Huang, Ying