From: "Huang, Ying" <ying.huang@intel.com>
To: Bharata B Rao <bharata@amd.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>,
<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Alistair Popple <apopple@nvidia.com>,
Dan Williams <dan.j.williams@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
"Davidlohr Bueso" <dave@stgolabs.net>,
Hesham Almatary <hesham.almatary@huawei.com>,
Jagdish Gediya <jvgediya.oss@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
"Michal Hocko" <mhocko@kernel.org>,
Tim Chen <tim.c.chen@intel.com>, Wei Xu <weixugc@google.com>,
Yang Shi <shy828301@gmail.com>
Subject: Re: [RFC] memory tiering: use small chunk size and more tiers
Date: Mon, 31 Oct 2022 09:33:49 +0800 [thread overview]
Message-ID: <87leowepz6.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <07912a0d-eb91-a6ef-2b9d-74593805f29e@amd.com> (Bharata B. Rao's message of "Fri, 28 Oct 2022 19:23:33 +0530")

Bharata B Rao <bharata@amd.com> writes:
> On 10/28/2022 2:03 PM, Huang, Ying wrote:
>> Bharata B Rao <bharata@amd.com> writes:
>>
>>> On 10/28/2022 11:16 AM, Huang, Ying wrote:
>>>> If my understanding is correct, you think the latency / bandwidth of
>>>> these NUMA nodes will be near each other, but may be different.
>>>>
>>>> Even if the latency / bandwidth of these NUMA nodes isn't exactly the same,
>>>> we should deal with that in memory types instead of memory tiers.
>>>> There's only one abstract distance for each memory type.
>>>>
>>>> So, I still believe we will not have many memory tiers with my proposal.
>>>>
>>>> I don't care too much about the exact number, but I want to discuss some
>>>> general design choices,
>>>>
>>>> a) Avoid grouping multiple memory types into one memory tier by default
>>>> in most cases.
>>>
>>> Do you expect the abstract distances of two different types to be
>>> close enough in real life (like you showed in your example with
>>> CXL - 5000 and PMEM - 5100) that they will get assigned to the same
>>> tier most of the time?
>>>
>>> Are you foreseeing that abstract distances mapped from sources
>>> like HMAT would run into this issue?
>>
>> Only if we set the abstract distance chunk size large. So, I think that
>> it's better to set the chunk size as small as possible to avoid potential
>> issues. What is the downside of setting the chunk size small?
>
> I don't see anything in particular. However,
>
> - With just two memory types (default_dram_type and dax_slowmem_type
> with adistance values of 576 and 576*5 respectively) defined currently,
> - With no interface yet to set/change the adistance value of a memory type,
> - With no defined way to convert the performance characteristics info
> (bw and latency) from sources like HMAT into an adistance value,
>
> I find it a bit difficult to see how a chunk size of 10, as opposed to
> the existing 128, would be more useful.
OK. Maybe we are paying too much attention to specific numbers. My goal
isn't to push this specific RFC into the kernel; I just want to discuss
the design choices with the community.

My basic idea is NOT to group memory types into memory tiers by
customizing the abstract distance chunk size, because that is hard to
use and to implement. So far, it appears that nobody objects to this.

Then, it's even better to avoid adjusting the abstract distance chunk
size in the kernel as much as possible. That will make life easier for
user-space tools/scripts. One solution is to define more than enough
possible tiers below DRAM (we already have an unlimited number of tiers
above DRAM).
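
To make the grouping rule concrete, here is a minimal userspace sketch
(a model of the idea, not the actual mm/memory-tiers.c code) of how a
memory type's abstract distance maps to a tier for a given chunk size,
using the adistance values mentioned above (DRAM 576, DAX slow memory
576 * 5); the chunk sizes are just the ones discussed in this thread:

  #include <stdio.h>

  /* A tier groups all abstract distances that fall into one chunk. */
  static int tier_id(int adistance, int chunk_size)
  {
          return adistance / chunk_size;
  }

  int main(void)
  {
          int dram = 576;                 /* default_dram_type adistance */
          int dax_slowmem = 576 * 5;      /* dax_slowmem_type adistance */
          int chunks[] = { 128, 100, 10 };        /* existing, proposed, small */

          for (unsigned int i = 0; i < sizeof(chunks) / sizeof(chunks[0]); i++)
                  printf("chunk %3d: DRAM -> tier %2d, DAX slowmem -> tier %3d\n",
                         chunks[i], tier_id(dram, chunks[i]),
                         tier_id(dax_slowmem, chunks[i]));
          return 0;
  }

In this model, with the existing chunk size of 128 the two types land in
tiers 4 and 22; with a chunk size of 10 they land in tiers 57 and 288,
so types whose adistances differ only slightly are much less likely to
end up grouped into the same tier.
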
In the upstream implementation, 4 tiers are possible below DRAM. That's
enough for now, but in the long run it may be better to define more.
100 possible tiers below DRAM may be too extreme. How about defining the
abstract distance of DRAM to be 1050 and the chunk size to be 100? Then
we will have 10 possible tiers below DRAM. That may be more than enough
even in the long run.
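
(Spelling out the arithmetic of that proposal: with a chunk size of 100,
DRAM at abstract distance 1050 falls into chunk 1050 / 100 = 10, so
chunks 0-9, i.e. 10 possible tiers, remain free for memory types faster
than DRAM, while slower types can still occupy as many tiers as needed
above DRAM.)
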
Again, the specific numbers aren't so important to me, so please suggest
your own numbers if necessary.

Best Regards,
Huang, Ying