From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Feng Tang <feng.tang@intel.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
Ben Widawsky <ben.widawsky@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Michal Hocko <mhocko@kernel.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mgorman@techsingularity.net>,
Mike Kravetz <mike.kravetz@oracle.com>,
Randy Dunlap <rdunlap@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>, Andi Kleen <ak@linux.intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Huang Ying <ying.huang@intel.com>,
linux-api@vger.kernel.org
Subject: Re: [RFC PATCH v2 2/3] mm/mempolicy: add set_mempolicy_home_node syscall
Date: Thu, 21 Oct 2021 14:26:27 +0530
Message-ID: <328c2e3c-6adb-033f-87a0-8f80296f833f@linux.ibm.com>
In-Reply-To: <20211021073206.GA20861@shbuild999.sh.intel.com>

On 10/21/21 13:02, Feng Tang wrote:
> Hi Aneesh,
>
> On Wed, Oct 20, 2021 at 02:54:52PM +0530, Aneesh Kumar K.V wrote:
>> This syscall can be used to set a home node for the MPOL_BIND
>> and MPOL_PREFERRED_MANY memory policy. Users should use this
>> syscall after setting up a memory policy for the specified range
>> as shown below.
>>
>> mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp,
>> new_nodes->size + 1, 0);
>> sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size,
>> home_node, 0);
>>
>> The syscall allows specifying a home node/preferred node from which kernel
>> will fulfill memory allocation requests first.
>>
>> For address range with MPOL_BIND memory policy, if nodemask specifies more
>> than one node, page allocations will come from the node in the nodemask
>> with sufficient free memory that is closest to the home node/preferred node.
>>
>> For MPOL_PREFERRED_MANY if the nodemask specifies more than one node,
>> page allocation will come from the node in the nodemask with sufficient
>> free memory that is closest to the home node/preferred node. If there is
>> not enough memory in all the nodes specified in the nodemask, the allocation
>> will be attempted from the closest numa node to the home node in the system.
>
> I can understand the requirement for MPOL_BIND, and for MPOL_PREFERRED_MANY,
> it provides 3 levels of preference:
> home node --> preferred nodes --> all nodes
> Any real use cases for this? For a platform which may have 3 types of
> memory (HBM, DRAM, PMEM), this may be useful.
The patch grew out of a need to let an application that already uses
MPOL_PREFERRED to hint a preferred node run on a system with
different types of memory (fast and slow).
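
The selection rule from the patch description (the nearest node to the home node within the nodemask, and for MPOL_PREFERRED_MANY a further fallback to the nearest node in the system) can be modelled in user space. This is only an illustrative sketch, not the kernel's allocator path: NR_NODES, struct numa_model, nearest() and pick_node() are hypothetical names, and "sufficient free memory" is reduced to a per-node boolean.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 4 /* hypothetical small system, for illustration only */

enum policy { POLICY_BIND, POLICY_PREFERRED_MANY };

/* Toy description of a NUMA machine (assumption, not a kernel structure). */
struct numa_model {
	int distance[NR_NODES][NR_NODES]; /* SLIT-style node distances */
	bool has_free[NR_NODES];          /* node can satisfy the request */
};

/* Nearest node to 'home' that is in 'allowed' and has free memory, or -1. */
static int nearest(const struct numa_model *m, int home, unsigned int allowed)
{
	int best = -1;
	int best_dist = INT_MAX;

	for (int n = 0; n < NR_NODES; n++) {
		if (!(allowed & (1u << n)) || !m->has_free[n])
			continue;
		if (m->distance[home][n] < best_dist) {
			best = n;
			best_dist = m->distance[home][n];
		}
	}
	return best;
}

/*
 * Model of the described behaviour:
 *  - MPOL_BIND: only nodes in the nodemask are eligible.
 *  - MPOL_PREFERRED_MANY: nodemask first, then any node in the system.
 * Returns the chosen node, or -1 if the allocation would fail.
 */
static int pick_node(const struct numa_model *m, enum policy pol,
		     unsigned int nodemask, int home)
{
	assert(home >= 0 && home < NR_NODES);

	int node = nearest(m, home, nodemask);

	if (node < 0 && pol == POLICY_PREFERRED_MANY)
		node = nearest(m, home, ~0u);
	return node;
}
```

With a nodemask of fast nodes only, POLICY_BIND in this model never returns a slow node, while POLICY_PREFERRED_MANY does so only once every node in the mask is exhausted.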
>
>> This helps applications to hint at a memory allocation preference node
>> and fallback to _only_ a set of nodes if the memory is not available
>> on the preferred node. Fallback allocation is attempted from the node which is
>> nearest to the preferred node.
>>
>> This gives applications control over which NUMA nodes memory is allocated
>> from and avoids the default fallback to slow memory NUMA nodes. For example,
>> on a system with NUMA nodes 1, 2 and 3 of DRAM and nodes 10, 11 and 12 of
>> slow memory:
>>
>> new_nodes = numa_bitmask_alloc(nr_nodes);
>>
>> numa_bitmask_setbit(new_nodes, 1);
>> numa_bitmask_setbit(new_nodes, 2);
>> numa_bitmask_setbit(new_nodes, 3);
>>
>> p = mmap(NULL, nr_pages * page_size, protflag, mapflag, -1, 0);
>> mbind(p, nr_pages * page_size, MPOL_BIND, new_nodes->maskp, new_nodes->size + 1, 0);
>>
>> sys_set_mempolicy_home_node((unsigned long)p, nr_pages * page_size, 2, 0);
>
> For this example, it's 'mbind + sys_set_mempolicy_home_node'. Will the case
> 'set_mempolicy + sys_set_mempolicy_home_node' also be supported?
>
At this point nobody has asked for it. Hence the patch looks up the VMA
policy to set the home node. If there is a need to set a home node for a
task, we can look at adding that. I have kept the flags argument, which
should help us accommodate such a request if we get one in the future.
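
For reference, a user-space caller could reach the syscall through syscall(2) roughly as below. This is a sketch under assumptions: set_home_node() is a hypothetical wrapper name, the fallback number 450 is an assumption for headers that do not define __NR_set_mempolicy_home_node (a kernel built from this RFC may use a different number), and rejecting non-zero flags in user space simply mirrors the idea that flags is reserved for now.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_set_mempolicy_home_node
/* Assumption: not taken from this RFC; adjust to the kernel under test. */
#define __NR_set_mempolicy_home_node 450
#endif

/*
 * Hypothetical wrapper: set the home node for the VMA policy covering
 * [start, start + len). The range is expected to already carry an
 * MPOL_BIND or MPOL_PREFERRED_MANY policy set up via mbind().
 * Returns 0 on success, -1 with errno set on failure.
 */
static long set_home_node(void *start, size_t len,
			  unsigned long home_node, unsigned long flags)
{
	/* flags is reserved for future use; only 0 is passed through. */
	if (flags != 0) {
		errno = EINVAL;
		return -1;
	}
	return syscall(__NR_set_mempolicy_home_node,
		       (unsigned long)start, (unsigned long)len,
		       home_node, flags);
}
```

On a kernel without the syscall the call is expected to fail with ENOSYS, so a caller can treat that as "home node not supported" and keep the plain mbind() policy.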
-aneesh