From: Yang Shi <yang.shi@linux.alibaba.com>
To: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
linux-kernel <linux-kernel@vger.kernel.org>
Cc: mhocko@suse.com, hannes@cmpxchg.org, dan.j.williams@intel.com,
dave.hansen@linux.intel.com, fengguang.wu@intel.com,
Yang Shi <yang.shi@linux.alibaba.com>
Subject: [LSF/MM TOPIC] Use NVDIMM as NUMA node and NUMA API
Date: Wed, 30 Jan 2019 09:26:41 -0800
Message-ID: <f0d66b0c-c9b6-a040-c485-1606041a70a2@linux.alibaba.com>
Hi folks,
I would like to attend the LSF/MM Summit 2019. I'm interested in most MM
topics, particularly the NUMA API topic proposed by Jerome, since it is
related to my proposal below.
I would like to share some of our use cases, needs, and approaches for
using NVDIMM as a NUMA node.
We would like to offer NVDIMM to our cloud customers as low-cost
memory. Virtual machines could run with NVDIMM as backing memory. We
would like the following needs to be met:
* The ratio of DRAM vs NVDIMM is configurable per process, or even
per VMA
* The user VMs always get DRAM first, as long as the ratio is not
reached
* Migrate cold data to NVDIMM and keep hot data in DRAM, dynamically
and throughout the lifetime of the VMs
To meet these needs, we did some in-house implementation work:
* Provide a madvise interface to configure the ratio (see the sketch
below)
* Put NVDIMM into a separate zonelist so that the default allocation
path can't touch it unless it is explicitly requested
* A kernel thread scans for cold pages
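As a rough illustration of what such a madvise-based knob could look
like from userspace, here is a minimal sketch. MADV_MEM_RATIO and the
ratio encoding are hypothetical, purely for illustration; they are not
part of any upstream ABI, and the in-house interface may well differ:

    #include <sys/mman.h>

    /* Hypothetical advice value; not part of the upstream kernel ABI. */
    #define MADV_MEM_RATIO 230

    /* Cap DRAM usage for [addr, addr + len) at dram_percent of the
     * range, spilling the rest to NVDIMM. Packing the percentage into
     * the advice argument is just one conceivable encoding. */
    static int set_dram_ratio(void *addr, size_t len, int dram_percent)
    {
            return madvise(addr, len, MADV_MEM_RATIO | (dram_percent << 8));
    }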
We tried to just use the current NUMA APIs, but realized they can't
meet our needs. For example, if we configure a VMA to use 50% DRAM and
50% NVDIMM, mbind() can set a preferred node policy (DRAM node or
NVDIMM node) for the VMA, but it can't control how much DRAM or NVDIMM
this specific VMA actually uses, so the ratio can't be satisfied.
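Concretely, the closest the existing API gets is something like the
following (assuming, for illustration only, that the DRAM node is node
0; error handling omitted):

    #include <numaif.h>   /* mbind(), MPOL_PREFERRED; link with -lnuma */

    static long prefer_dram(void *addr, unsigned long len)
    {
            /* Assume node 0 is the DRAM node. MPOL_PREFERRED makes it
             * the first choice and falls back to other nodes only under
             * memory pressure; there is no way to express "at most 50%
             * of this VMA from node 0". */
            unsigned long nodemask = 1UL << 0;

            return mbind(addr, len, MPOL_PREFERRED, &nodemask,
                         8 * sizeof(nodemask), 0);
    }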
So, IMHO, we definitely need more fine-grained APIs to control NUMA
behavior.
I'd also like to discuss this topic with:
Dave Hansen
Dan Williams
Fengguang Wu
Besides the above topic, I'd also like to meet other MM developers to
discuss some of our memory cgroup use cases (a hallway conversation may
be good enough). I submitted some RFC patches to the mailing list, and
they did generate some discussion, but we have not reached a solid
conclusion yet:
https://lore.kernel.org/lkml/1547061285-100329-1-git-send-email-yang.shi@linux.alibaba.com/
Thanks,
Yang