linux-mm.kvack.org archive mirror
From: Tang Chen <tangchen@cn.fujitsu.com>
To: robert.moore@intel.com, lv.zheng@intel.com, rjw@sisk.pl,
	lenb@kernel.org, tglx@linutronix.de, mingo@elte.hu,
	hpa@zytor.com, akpm@linux-foundation.org, tj@kernel.org,
	trenn@suse.de, yinghai@kernel.org, jiang.liu@huawei.com,
	wency@cn.fujitsu.com, laijs@cn.fujitsu.com,
	isimatu.yasuaki@jp.fujitsu.com, izumi.taku@jp.fujitsu.com,
	mgorman@suse.de, minchan@kernel.org, mina86@mina86.com,
	gong.chen@linux.intel.com, vasilis.liaskovitis@profitbricks.com,
	lwoodman@redhat.com, riel@redhat.com, jweiner@redhat.com,
	prarit@redhat.com, zhangyanfei@cn.fujitsu.com,
	yanghy@cn.fujitsu.com
Cc: x86@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-acpi@vger.kernel.org
Subject: [PATCH v3 24/25] mem-hotplug: Introduce movablenode boot option to {en|dis}able using SRAT.
Date: Wed, 7 Aug 2013 18:52:15 +0800
Message-ID: <1375872736-4822-25-git-send-email-tangchen@cn.fujitsu.com>
In-Reply-To: <1375872736-4822-1-git-send-email-tangchen@cn.fujitsu.com>

The Hot-Pluggable field in SRAT specifies which memory is hotpluggable.
As mentioned earlier in this series, memory used by the kernel cannot be
hot-removed. Memory hotplug users may therefore want to place all
hotpluggable memory in ZONE_MOVABLE so that the kernel won't use it.

Memory hotplug users may also configure a node as a movable node, one
that has ZONE_MOVABLE only, so that the whole node can be hot-removed.

But the kernel cannot use memory in ZONE_MOVABLE for its own allocations,
so it cannot use memory on movable nodes at all. This degrades NUMA
performance, and users who don't need memory hotplug may be unhappy.

So we need a way for users to enable and disable this functionality.
This patch introduces the movablenode boot option, which lets users
choose whether hotpluggable memory is reserved and set as ZONE_MOVABLE.

Users can specify "movablenode" on the kernel command line to enable this
functionality. Those who don't use memory hotplug, or who don't want to
lose NUMA performance, simply don't specify it, and the kernel works as
before.
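
As a rough user-space illustration of the opt-in behaviour described
above (not the kernel code itself), the sketch below scans a command-line
string for a bare "movablenode" token and reports whether the option is
present. The helper name parse_movablenode() is an assumption made for
this example, and matching only a bare, space-separated token is a
simplification; the patch itself registers its handler with early_param().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space model of the boot option: returns true only if
 * "movablenode" appears as a whole, space-separated token in cmdline.
 */
static bool parse_movablenode(const char *cmdline)
{
	static const char opt[] = "movablenode";
	const size_t len = sizeof(opt) - 1;
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		/* The match must not be the tail or head of a longer word. */
		bool starts_token = (p == cmdline) || (p[-1] == ' ');
		bool ends_token = (p[len] == '\0') || (p[len] == ' ');

		if (starts_token && ends_token)
			return true;
		p += len;
	}
	return false;	/* option absent: behave as before */
}
```

With this model, a command line such as "root=/dev/sda1 movablenode quiet"
enables the feature, while "root=/dev/sda1 quiet" leaves it off, which
mirrors the default-off behaviour the patch aims for.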

Suggested-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
---
 Documentation/kernel-parameters.txt |   15 +++++++++++++++
 arch/x86/kernel/setup.c             |   10 ++++++++--
 include/linux/memory_hotplug.h      |    3 +++
 mm/memory_hotplug.c                 |   11 +++++++++++
 4 files changed, 37 insertions(+), 2 deletions(-)
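
For readers who want a feel for what the flag gates, here is a minimal
user-space model, assuming a simplified region table. struct mem_region,
mark_hotpluggable() and first_usable() are hypothetical names invented
for this sketch; they are not the memblock API and not the code in this
series. The model only reflects two points from the description: nothing
is set aside unless the option is enabled, and regions the kernel already
occupies stay usable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region; not the kernel's struct. */
struct mem_region {
	unsigned long base;
	unsigned long size;
	bool hotpluggable;	/* as the SRAT Hot-Pluggable bit would report */
	bool holds_kernel;	/* kernel image, initrd, etc. live here */
	bool reserved_movable;	/* set by mark_hotpluggable() */
};

/*
 * Model of the opt-in step: only when the option is enabled are
 * hotpluggable regions set aside, and regions the kernel already
 * occupies are always left usable.
 * Returns the number of regions set aside.
 */
static size_t mark_hotpluggable(struct mem_region *r, size_t n, bool enabled)
{
	size_t marked = 0;

	if (!enabled)
		return 0;	/* default: nothing reserved, as before */

	for (size_t i = 0; i < n; i++) {
		if (r[i].hotpluggable && !r[i].holds_kernel) {
			r[i].reserved_movable = true;
			marked++;
		}
	}
	return marked;
}

/* First region an early kernel allocation may use, or NULL if none. */
static struct mem_region *first_usable(struct mem_region *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!r[i].reserved_movable)
			return &r[i];
	return NULL;
}
```

Calling mark_hotpluggable() with enabled set to false leaves every region
available to the kernel, which corresponds to booting without
"movablenode".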

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 15356ac..7349d1f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1718,6 +1718,21 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			that the amount of memory usable for all allocations
 			is not too small.
 
+	movablenode		[KNL,X86] This parameter makes the kernel
+			arrange the hotpluggable memory ranges recorded in
+			the ACPI SRAT (System Resource Affinity Table) as
+			ZONE_MOVABLE, so that this memory can be hot-removed
+			while the system is up.
+			When this option is specified, all hotpluggable memory
+			is placed in ZONE_MOVABLE, which the kernel cannot use
+			for its own allocations. This degrades NUMA performance;
+			users who care about NUMA performance should not use it.
+			If all the memory ranges in the system are hotpluggable,
+			then the ones used by the kernel early in boot, such as
+			kernel code and data segments, initrd file and so on,
+			won't be set as ZONE_MOVABLE, and won't be hotpluggable.
+			Otherwise the kernel won't have enough memory to boot.
+
 	MTD_Partition=	[MTD]
 			Format: <name>,<region-number>,<size>,<offset>
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 36d7fe8..abdfed7 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1061,14 +1061,20 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	early_acpi_boot_table_init();
 
-#ifdef CONFIG_ACPI_NUMA
+#if defined(CONFIG_ACPI_NUMA) && defined(CONFIG_MOVABLE_NODE)
 	/*
 	 * Linux kernel cannot migrate kernel pages, as a result, memory used
 	 * by the kernel cannot be hot-removed. Find and mark hotpluggable
 	 * memory in memblock to prevent memblock from allocating hotpluggable
 	 * memory for the kernel.
+	 *
+	 * If all the memory in a node is hotpluggable, then the kernel won't
+	 * be able to use memory on that node, which degrades NUMA
+	 * performance. So by default, we don't reserve any hotpluggable memory.
+	 * Users may use the "movablenode" boot option to enable this.
 	 */
-	find_hotpluggable_memory();
+	if (movablenode_enable_srat)
+		find_hotpluggable_memory();
 #endif
 
 	/*
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 463efa9..43eb373 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -33,6 +33,9 @@ enum {
 	ONLINE_MOVABLE,
 };
 
+/* Enable/disable SRAT in movablenode boot option */
+extern bool movablenode_enable_srat;
+
 /*
  * pgdat resizing functions
  */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0a69ceb..64e9f7e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -93,6 +93,17 @@ static void release_memory_resource(struct resource *res)
 }
 
 #ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_MOVABLE_NODE
+bool __initdata movablenode_enable_srat;
+
+static int __init cmdline_parse_movablenode(char *p)
+{
+	movablenode_enable_srat = true;
+	return 0;
+}
+early_param("movablenode", cmdline_parse_movablenode);
+#endif	/* CONFIG_MOVABLE_NODE */
+
 /**
  * kernel_resides_in_range - Check if kernel resides in a memory region.
  * @base: The base address of the memory region.
-- 
1.7.1


Thread overview: 29+ messages
2013-08-07 10:51 [PATCH v3 00/25] Arrange hotpluggable memory as ZONE_MOVABLE Tang Chen
2013-08-07 10:51 ` [PATCH v3 01/25] acpi: Print Hot-Pluggable Field in SRAT Tang Chen
2013-08-07 10:51 ` [PATCH v3 02/25] earlycpio.c: Fix the confusing comment of find_cpio_data() Tang Chen
2013-08-07 10:51 ` [PATCH v3 03/25] acpi: Remove "continue" in macro INVALID_TABLE() Tang Chen
2013-08-07 10:51 ` [PATCH v3 04/25] acpi: Introduce acpi_verify_initrd() to check if a table is invalid Tang Chen
2013-08-07 10:51 ` [PATCH v3 05/25] acpi, acpica: Split acpi_tb_install_table() into two parts Tang Chen
2013-08-07 10:51 ` [PATCH v3 06/25] acpi, acpica: Call two new functions instead of acpi_tb_install_table() in acpi_tb_parse_root_table() Tang Chen
2013-08-07 10:51 ` [PATCH v3 07/25] acpi, acpica: Split acpi_tb_parse_root_table() into two parts Tang Chen
2013-08-07 10:51 ` [PATCH v3 08/25] acpi, acpica: Call two new functions instead of acpi_tb_parse_root_table() in acpi_initialize_tables() Tang Chen
2013-08-07 10:52 ` [PATCH v3 09/25] acpi, acpica: Split acpi_initialize_tables() into two parts Tang Chen
2013-08-07 10:52 ` [PATCH v3 10/25] x86, acpi: Call two new functions instead of acpi_initialize_tables() in acpi_table_init() Tang Chen
2013-08-07 10:52 ` [PATCH v3 11/25] x86, acpi: Split acpi_table_init() into two parts Tang Chen
2013-08-07 10:52 ` [PATCH v3 12/25] x86, acpi: Rename check_multiple_madt() and make it global Tang Chen
2013-08-07 10:52 ` [PATCH v3 13/25] x86, acpi: Split acpi_boot_table_init() into two parts Tang Chen
2013-08-07 10:52 ` [PATCH v3 14/25] x86, acpi: Initialize acpi golbal root table list earlier Tang Chen
2013-08-07 10:52 ` [PATCH v3 15/25] x86: get pg_data_t's memory from other node Tang Chen
2013-08-07 10:52 ` [PATCH v3 16/25] x86: Make get_ramdisk_{image|size}() global Tang Chen
2013-08-07 10:52 ` [PATCH v3 17/25] x86, acpica, acpi: Try to find if SRAT is overrided earlier Tang Chen
2013-08-07 10:52 ` [PATCH v3 18/25] x86, acpica, acpi: Try to find SRAT in firmware earlier Tang Chen
2013-08-07 10:52 ` [PATCH v3 19/25] x86, acpi, numa, mem_hotplug: Find hotpluggable memory in SRAT memory affinities Tang Chen
2013-08-07 10:52 ` [PATCH v3 20/25] x86, numa, mem_hotplug: Skip all the regions the kernel resides in Tang Chen
2013-08-07 10:52 ` [PATCH v3 21/25] memblock, numa: Introduce flag into memblock Tang Chen
2013-08-07 10:52 ` [PATCH v3 22/25] memblock, mem_hotplug: Introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable regions Tang Chen
2013-08-07 10:52 ` [PATCH v3 23/25] memblock, mem_hotplug: Make memblock skip hotpluggable regions by default Tang Chen
2013-08-07 10:52 ` Tang Chen [this message]
2013-08-07 10:52 ` [PATCH v3 25/25] x86, numa, acpi, memory-hotplug: Make movablenode have higher priority Tang Chen
2013-08-07 23:48 ` [PATCH v3 00/25] Arrange hotpluggable memory as ZONE_MOVABLE Rafael J. Wysocki
2013-08-08  3:01   ` Moore, Robert
2013-08-08  3:41     ` Tang Chen
