linux-mm.kvack.org archive mirror
From: Fan Du <fan.du@intel.com>
To: akpm@linux-foundation.org, mhocko@suse.com,
	fengguang.wu@intel.com, dan.j.williams@intel.com,
	dave.hansen@intel.com, xishi.qiuxishi@alibaba-inc.com,
	ying.huang@intel.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Fan Du <fan.du@intel.com>
Subject: [RFC PATCH 5/5] mm, page_alloc: Introduce ZONELIST_FALLBACK_SAME_TYPE fallback list
Date: Thu, 25 Apr 2019 09:21:35 +0800
Message-ID: <1556155295-77723-6-git-send-email-fan.du@intel.com>
In-Reply-To: <1556155295-77723-1-git-send-email-fan.du@intel.com>

On a system with heterogeneous memory, reasonable fallback lists would be:
a. No fallback; stick to the current running node.
b. Fall back to other nodes of the same type or of a different type,
   e.g. DRAM node 0 -> DRAM node 1 -> PMEM node 2 -> PMEM node 3
c. Fall back to other nodes of the same type only,
   e.g. DRAM node 0 -> DRAM node 1

a. is already in place, and the previous patch implements b., which
satisfies memory requests on a best-effort basis by default. This patch
builds c., which falls back only to nodes of the same type when the user
specifies GFP_SAME_NODE_TYPE.

Signed-off-by: Fan Du <fan.du@intel.com>
---
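
For illustration only: the three lists a., b. and c. from the changelog,
and the order in which gfp_zonelist() chooses between them, can be
modelled in userspace roughly as below. This is a sketch under stated
assumptions, not the kernel code: node_types[], build_list(),
pick_list() and the LIST_*/FLAG_* names are hypothetical, the node types
are hard-coded rather than taken from is_node_same_type(), and
node_order[] is assumed to start with the local node and to already be
sorted the way patch 4/5 sorts it.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum node_type { NODE_DRAM, NODE_PMEM };
enum list_kind { LIST_NOFALLBACK, LIST_FALLBACK, LIST_FALLBACK_SAME_TYPE };

#define NR_NODES		4
#define FLAG_THISNODE		0x1u	/* stands in for __GFP_THISNODE */
#define FLAG_SAME_NODE_TYPE	0x2u	/* stands in for __GFP_SAME_NODE_TYPE */

/* Assumed topology from the changelog example: two DRAM, two PMEM nodes. */
static const enum node_type node_types[NR_NODES] = {
	NODE_DRAM, NODE_DRAM, NODE_PMEM, NODE_PMEM
};

/*
 * Write the nodes an allocation starting on 'local' may try, in order,
 * into out[] and return the number of entries written.
 */
static int build_list(int local, const int *node_order, int nr_nodes,
		      enum list_kind kind, int *out)
{
	int i, n = 0;

	assert(local >= 0 && local < NR_NODES);

	if (kind == LIST_NOFALLBACK) {
		out[n++] = local;		/* case a. */
		return n;
	}
	for (i = 0; i < nr_nodes; i++) {
		int node = node_order[i];

		/* case c. skips nodes whose type differs from the local one */
		if (kind == LIST_FALLBACK_SAME_TYPE &&
		    node_types[node] != node_types[local])
			continue;
		out[n++] = node;		/* case b. keeps every node */
	}
	return n;
}

/* Same check order as the gfp_zonelist() hunk: THISNODE is tested first. */
static enum list_kind pick_list(unsigned int flags)
{
	if (flags & FLAG_THISNODE)
		return LIST_NOFALLBACK;
	if (flags & FLAG_SAME_NODE_TYPE)
		return LIST_FALLBACK_SAME_TYPE;
	return LIST_FALLBACK;
}
```

With node_order = {0, 1, 2, 3} and local node 0, the model yields node 0
alone for a., nodes 0, 1, 2, 3 for b., and nodes 0, 1 for c. Because the
THISNODE test comes first, a request that sets both flags ends up on the
no-fallback list.
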
 include/linux/gfp.h    |  7 +++++++
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 15 +++++++++++++++
 3 files changed, 23 insertions(+)
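
One point that may be worth checking in the gfp.h hunk: the new bit is
0x1000000u, while __GFP_BITS_SHIFT in the hunk context stays at
(23 + IS_ENABLED(CONFIG_LOCKDEP)). The sketch below models that
arithmetic in userspace. It assumes that the mask is derived as
((1 << __GFP_BITS_SHIFT) - 1), which is not shown in this patch; the
MODEL_* macro and the model_*() and flag_survives_mask() helpers are
hypothetical names used only here.

```c
#include <assert.h>

/* Value of ___GFP_SAME_NODE_TYPE as added by the gfp.h hunk. */
#define MODEL_GFP_SAME_NODE_TYPE	0x1000000u

/* Mirrors __GFP_BITS_SHIFT from the hunk context: 23, or 24 with lockdep. */
static unsigned int model_bits_shift(int lockdep_enabled)
{
	return 23u + (lockdep_enabled ? 1u : 0u);
}

/* Assumption: the mask covers bits 0 .. shift-1, i.e. (1 << shift) - 1. */
static unsigned int model_bits_mask(int lockdep_enabled)
{
	return (1u << model_bits_shift(lockdep_enabled)) - 1u;
}

/* Returns 1 if 'flag' is fully covered by the modelled mask, else 0. */
static int flag_survives_mask(unsigned int flag, int lockdep_enabled)
{
	assert(flag != 0);
	return (flag & model_bits_mask(lockdep_enabled)) == flag;
}
```

Under that assumption the new bit (bit 24) lies outside the mask both
with and without CONFIG_LOCKDEP, so any code that filters flags through
the mask would drop it unless __GFP_BITS_SHIFT is raised as the comment
in the hunk suggests.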

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fdab7de..ca5fdfc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -44,6 +44,8 @@
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
+#define ___GFP_SAME_NODE_TYPE	0x1000000u
+
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */
 
 /*
@@ -215,6 +217,7 @@
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
+#define __GFP_SAME_NODE_TYPE ((__force gfp_t)___GFP_SAME_NODE_TYPE)
 
 /* Room for N __GFP_FOO bits */
 #define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
@@ -301,6 +304,8 @@
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
 
+#define GFP_SAME_NODE_TYPE (__GFP_SAME_NODE_TYPE)
+
 /* Convert GFP flags to their corresponding migrate type */
 #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
 #define GFP_MOVABLE_SHIFT 3
@@ -438,6 +443,8 @@ static inline int gfp_zonelist(gfp_t flags)
 #ifdef CONFIG_NUMA
 	if (unlikely(flags & __GFP_THISNODE))
 		return ZONELIST_NOFALLBACK;
+	if (unlikely(flags & __GFP_SAME_NODE_TYPE))
+		return ZONELIST_FALLBACK_SAME_TYPE;
 #endif
 	return ZONELIST_FALLBACK;
 }
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8c37e1c..2f8603e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -583,6 +583,7 @@ static inline bool zone_intersects(struct zone *zone,
 
 enum {
 	ZONELIST_FALLBACK,	/* zonelist with fallback */
+	ZONELIST_FALLBACK_SAME_TYPE,	/* zonelist with fallback to the same type node */
 #ifdef CONFIG_NUMA
 	/*
 	 * The NUMA zonelists are doubled because we need zonelists that
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a408a91..de797921 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5448,6 +5448,21 @@ static void build_zonelists_in_node_order(pg_data_t *pgdat, int *node_order,
 	}
 	zonerefs->zone = NULL;
 	zonerefs->zone_idx = 0;
+
+	zonerefs = pgdat->node_zonelists[ZONELIST_FALLBACK_SAME_TYPE]._zonerefs;
+
+	for (i = 0; i < nr_nodes; i++) {
+		int nr_zones;
+
+		pg_data_t *node = NODE_DATA(node_order[i]);
+
+		if (!is_node_same_type(node->node_id, pgdat->node_id))
+			continue;
+		nr_zones = build_zonerefs_node(node, zonerefs);
+		zonerefs += nr_zones;
+	}
+	zonerefs->zone = NULL;
+	zonerefs->zone_idx = 0;
 }
 
 /*
-- 
1.8.3.1

