From: Donet Tom <donettom@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>,
"Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Huang Ying <ying.huang@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Mel Gorman <mgorman@suse.de>,
Ben Widawsky <ben.widawsky@intel.com>,
Feng Tang <feng.tang@intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Rik van Riel <riel@surriel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Mike Kravetz <mike.kravetz@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Dan Williams <dan.j.williams@intel.com>,
Hugh Dickins <hughd@google.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Suren Baghdasaryan <surenb@google.com>
Subject: Re: [PATCH 3/3] mm/numa_balancing:Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
Date: Mon, 26 Feb 2024 18:39:16 +0530
Message-ID: <0e633718-2313-4a0f-9907-b0fa5ffa18bc@linux.ibm.com>
In-Reply-To: <ZdRneVbsts8t3VAW@tiehlicka>
On 2/20/24 14:18, Michal Hocko wrote:
> On Tue 20-02-24 09:27:25, Aneesh Kumar K.V wrote:
> [...]
>> 	case MPOL_PREFERRED_MANY:
>> 		if (pol->flags & MPOL_F_MORON) {
>> 			if (!mpol_preferred_should_numa_migrate(thisnid, curnid, pol))
>> 				goto out;
>> 			break;
>> 		}
>>
>> 		/*
>> 		 * use current page if in policy nodemask,
>> 		 * else select nearest allowed node, if any.
>> 		 * If no allowed nodes, use current [!misplaced].
>> 		 */
>> 		if (node_isset(curnid, pol->nodes))
>> 			goto out;
>> 		z = first_zones_zonelist(
>> 				node_zonelist(thisnid, GFP_HIGHUSER),
>> 				gfp_zone(GFP_HIGHUSER),
>> 				&pol->nodes);
>> 		polnid = zone_to_nid(z->zone);
>> 		break;
>> 	....
>> 	..
>> 	}
>>
>> 	/* Migrate the folio towards the node whose CPU is referencing it */
>> 	if (pol->flags & MPOL_F_MORON) {
>> 		polnid = thisnid;
>>
>> 		if (!should_numa_migrate_memory(current, folio, curnid,
>> 						thiscpu))
>> 			goto out;
>> 	}
>>
>> 	if (curnid != polnid)
>> 		ret = polnid;
>> out:
>> 	mpol_cond_put(pol);
>>
>> 	return ret;
>> }
> Ohh, right this code is confusing as hell. Thanks for the clarification.
> With this in mind. There should be a comment warning about MPOL_F_MOF
> always being unset as the userspace cannot really set it up.
>
> Thanks!
>
Hi Michal,

Sorry for the late reply.

If we set MPOL_F_NUMA_BALANCING from userspace, the kernel sets both the
MPOL_F_MOF and MPOL_F_MORON flags:

/* Basic parameter sanity check used by both mbind() and set_mempolicy() */
static inline int sanitize_mpol_flags(int *mode, unsigned short *flags)
{
	*flags = *mode & MPOL_MODE_FLAGS;
	*mode &= ~MPOL_MODE_FLAGS;

	if ((unsigned int)(*mode) >= MPOL_MAX)
		return -EINVAL;
	if ((*flags & MPOL_F_STATIC_NODES) && (*flags & MPOL_F_RELATIVE_NODES))
		return -EINVAL;
	if (*flags & MPOL_F_NUMA_BALANCING) {
		if (*mode == MPOL_BIND || *mode == MPOL_PREFERRED_MANY)
			*flags |= (MPOL_F_MOF | MPOL_F_MORON);
		else
			return -EINVAL;
	}

In the current kernel this is supported only for MPOL_BIND; this series
adds support for MPOL_PREFERRED_MANY as well.
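
To make the flag handling easy to try outside the kernel, here is a minimal
userspace sketch of the same check. It is only a model: the constant values
are written from memory of include/uapi/linux/mempolicy.h and
include/linux/mempolicy.h and should be verified against your tree, the mode
enum stops at MPOL_PREFERRED_MANY (newer trees may define more modes), the
name toy_sanitize_mpol_flags() is hypothetical, and the trailing "return 0"
is assumed because the excerpt above stops before the end of the function.

```c
#include <errno.h>

/* Assumed values, mirrored from memory of the kernel headers. */
enum {
	MPOL_DEFAULT,
	MPOL_PREFERRED,
	MPOL_BIND,
	MPOL_INTERLEAVE,
	MPOL_LOCAL,
	MPOL_PREFERRED_MANY,
	MPOL_MAX,	/* sentinel; always last in this model */
};

#define MPOL_F_STATIC_NODES	(1 << 15)
#define MPOL_F_RELATIVE_NODES	(1 << 14)
#define MPOL_F_NUMA_BALANCING	(1 << 13)
#define MPOL_MODE_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)

/* Kernel-internal flags: userspace cannot pass these directly. */
#define MPOL_F_MOF	(1 << 3)
#define MPOL_F_MORON	(1 << 4)

/* Hypothetical userspace copy of the check with this series applied. */
static int toy_sanitize_mpol_flags(int *mode, unsigned short *flags)
{
	/* Split the user-supplied value into mode flags and the bare mode. */
	*flags = *mode & MPOL_MODE_FLAGS;
	*mode &= ~MPOL_MODE_FLAGS;

	if ((unsigned int)(*mode) >= MPOL_MAX)
		return -EINVAL;
	if ((*flags & MPOL_F_STATIC_NODES) && (*flags & MPOL_F_RELATIVE_NODES))
		return -EINVAL;
	if (*flags & MPOL_F_NUMA_BALANCING) {
		/* Only these two modes may opt in to NUMA balancing. */
		if (*mode == MPOL_BIND || *mode == MPOL_PREFERRED_MANY)
			*flags |= (MPOL_F_MOF | MPOL_F_MORON);
		else
			return -EINVAL;
	}
	return 0;	/* assumed tail of the function */
}
```

Calling it with mode = MPOL_PREFERRED_MANY | MPOL_F_NUMA_BALANCING leaves
mode as MPOL_PREFERRED_MANY and sets both internal flags, while
MPOL_INTERLEAVE | MPOL_F_NUMA_BALANCING is rejected with -EINVAL.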

Why is the MPOL_F_MOF flag required?
------------------------------------
For NUMA migration, "task_numa_work" periodically unmaps the process's
memory (the ranges become protnone). If the unmapped memory is accessed
again, a NUMA hinting page fault occurs, and the pages are migrated from
the page fault handler.

If MPOL_F_MOF is not set, "task_numa_work" does not unmap the process's
pages, so neither the NUMA hinting page fault nor the migration happens.
This behaviour was introduced by commit fc3147245d193b ("mm: numa: Limit
NUMA scanning to migrate-on-fault VMAs").
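
The dependency on MPOL_F_MOF can be pictured with a toy model. This is not
kernel code: struct toy_vma, toy_numa_scan() and toy_touch() are
hypothetical stand-ins for a VMA, one "task_numa_work" pass, and a later
memory access, and the MPOL_F_MOF value is assumed.

```c
#include <stdbool.h>

#define MPOL_F_MOF	(1 << 3)	/* assumed value of the internal flag */

/* Hypothetical stand-in for a VMA and the policy covering it. */
struct toy_vma {
	unsigned short pol_flags;	/* flags of the effective policy */
	bool protnone;			/* set once the scanner unmapped it */
};

/* Models one scan pass: only migrate-on-fault ranges are unmapped. */
static bool toy_numa_scan(struct toy_vma *vma)
{
	if (!(vma->pol_flags & MPOL_F_MOF))
		return false;		/* skipped, no hinting fault later */
	vma->protnone = true;
	return true;
}

/* Models an access: returns true if it raised a NUMA hinting fault. */
static bool toy_touch(struct toy_vma *vma)
{
	if (!vma->protnone)
		return false;		/* plain access, nothing to migrate */
	vma->protnone = false;		/* fault handler restores access */
	return true;
}
```

With MPOL_F_MOF set, a scan followed by an access yields one hinting fault;
without it, the scan skips the range and the access never faults, so
migration is never considered.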

How the new implementation works
--------------------------------
With this series, MPOL_PREFERRED_MANY can get MPOL_F_MOF and MPOL_F_MORON
set through MPOL_F_NUMA_BALANCING, so NUMA hinting page faults occur. In
mpol_misplaced(), if NUMA migration is allowed, we select the node of the
currently executing CPU as the target node; otherwise we return from the
function with ret = NUMA_NO_NODE.
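
The decision described above, for the MPOL_F_MORON case only, can be
sketched as a small standalone function. This is an illustration based on
the code quoted earlier in this mail, not the kernel implementation:
toy_misplaced_preferred_many() is a hypothetical name, the two bool
parameters stand in for the results of mpol_preferred_should_numa_migrate()
and should_numa_migrate_memory(), and the MPOL_F_MORON value is assumed.

```c
#include <stdbool.h>

#define NUMA_NO_NODE	(-1)
#define MPOL_F_MORON	(1 << 4)	/* assumed value of the internal flag */

/*
 * policy_allows:   stand-in for mpol_preferred_should_numa_migrate()
 * balancer_allows: stand-in for should_numa_migrate_memory()
 * Returns the target node, or NUMA_NO_NODE if the folio stays put.
 */
static int toy_misplaced_preferred_many(unsigned short flags, int curnid,
					int thisnid, bool policy_allows,
					bool balancer_allows)
{
	int ret = NUMA_NO_NODE;
	int polnid;

	/* The nodemask-based path without MPOL_F_MORON is not modelled. */
	if (!(flags & MPOL_F_MORON))
		return ret;

	if (!policy_allows)
		return ret;

	/* Migrate towards the node whose CPU is referencing the folio. */
	polnid = thisnid;
	if (!balancer_allows)
		return ret;

	if (curnid != polnid)
		ret = polnid;
	return ret;
}
```

For example, a folio on node 0 referenced from node 1 returns 1 when both
checks pass, and NUMA_NO_NODE when either check fails or when the folio is
already on node 1.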

Since MPOL_F_MOF can be set from userspace through MPOL_F_NUMA_BALANCING,
there is no need to add this comment, right?

Thanks
Donet Tom