From: Tang Chen <tangchen@cn.fujitsu.com>
To: Tejun Heo <tj@kernel.org>
Cc: tglx@linutronix.de, mingo@elte.hu, hpa@zytor.com,
akpm@linux-foundation.org, trenn@suse.de, yinghai@kernel.org,
jiang.liu@huawei.com, wency@cn.fujitsu.com, laijs@cn.fujitsu.com,
isimatu.yasuaki@jp.fujitsu.com, izumi.taku@jp.fujitsu.com,
mgorman@suse.de, minchan@kernel.org, mina86@mina86.com,
gong.chen@linux.intel.com, vasilis.liaskovitis@profitbricks.com,
lwoodman@redhat.com, riel@redhat.com, jweiner@redhat.com,
prarit@redhat.com, zhangyanfei@cn.fujitsu.com,
yanghy@cn.fujitsu.com, x86@kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-acpi@vger.kernel.org
Subject: Re: [PATCH 12/21] x86, acpi: Try to find if SRAT is overrided earlier.
Date: Wed, 24 Jul 2013 14:57:33 +0800
Message-ID: <51EF7ADD.6050500@cn.fujitsu.com>
In-Reply-To: <20130723202746.GQ21100@mtj.dyndns.org>
On 07/24/2013 04:27 AM, Tejun Heo wrote:
> On Fri, Jul 19, 2013 at 03:59:25PM +0800, Tang Chen wrote:
>> As we mentioned in previous patches, to prevent the kernel
>
> Prolly best to briefly describe what the overall goal is about.
Sure. Here is the overall picture; I will add it to the patch log.
Linux cannot migrate pages used by the kernel because of the direct mapping
(va = pa + PAGE_OFFSET), so any memory used by the kernel cannot be
hot-removed.
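As a rough illustration of the direct mapping (a userspace sketch, not code from this patch; the toy_ names and the TOY_PAGE_OFFSET value are assumptions made only for the example):

```c
#include <stdint.h>

/* Assumed, illustrative base of the direct mapping; the real PAGE_OFFSET
 * depends on the architecture and kernel configuration. */
#define TOY_PAGE_OFFSET 0xffff880000000000ULL

/* Direct mapping: the virtual address is a fixed offset from the physical one. */
static inline uint64_t toy_phys_to_virt(uint64_t pa)
{
	return pa + TOY_PAGE_OFFSET;
}

/* The inverse is the same fixed offset, subtracted. */
static inline uint64_t toy_virt_to_phys(uint64_t va)
{
	return va - TOY_PAGE_OFFSET;
}
```

Since a kernel pointer encodes the physical address in this way, moving the data to a different physical page would invalidate every pointer to it, which is why such pages cannot be migrated.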
So on a memory hotplug platform, we have to prevent the kernel from using
hotpluggable memory.
The ACPI table SRAT (System Resource Affinity Table) specifies which
memory is hotpluggable. After SRAT is parsed, we know which memory
ranges are hotpluggable.
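A minimal sketch of that check, assuming a simplified view of one SRAT memory affinity entry (the struct and the toy_ names are hypothetical; the two flag bits follow the ACPI specification, which ACPICA exposes as ACPI_SRAT_MEM_ENABLED and ACPI_SRAT_MEM_HOT_PLUGGABLE):

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure (bit 0 and bit 1). */
#define TOY_SRAT_MEM_ENABLED       (1u << 0)
#define TOY_SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* Hypothetical, simplified view of one memory affinity entry. */
struct toy_mem_affinity {
	uint64_t base;             /* start of the physical range */
	uint64_t length;           /* size of the range in bytes */
	uint32_t proximity_domain; /* NUMA proximity domain */
	uint32_t flags;            /* TOY_SRAT_MEM_* bits */
};

/* An entry only counts if it is enabled and marked hot-pluggable. */
static inline bool toy_is_hotpluggable(const struct toy_mem_affinity *ma)
{
	return (ma->flags & TOY_SRAT_MEM_ENABLED) &&
	       (ma->flags & TOY_SRAT_MEM_HOT_PLUGGABLE);
}
```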
Early in boot, SRAT has not been parsed yet, and the boot memory allocator
memblock will allocate any memory to the kernel. So we need SRAT parsed
before memblock starts allocating memory for the kernel.
In this patch, we are going to parse SRAT earlier, right after memblock
is ready.
Generally speaking, SRAT is provided by firmware. But the
ACPI_INITRD_TABLE_OVERRIDE functionality allows users to supply their own
SRAT in the initrd and override the one from firmware. So if we want to
parse SRAT earlier, we also need to do the SRAT override earlier.
First, we introduce early_acpi_override_srat() to check whether SRAT will
be overridden from the initrd.
Second, we introduce reserve_hotpluggable_memory() to reserve hotpluggable
memory. It first calls early_acpi_override_srat() to find out which memory
is hotpluggable in the override SRAT.
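The two steps can be modelled roughly as follows. This is a userspace sketch under stated assumptions, not the patch itself: every toy_ name and type is hypothetical, and falling back to the firmware SRAT when no override exists is an assumption made for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RESERVED 8

/* Hypothetical memory range as described by one SRAT entry. */
struct toy_range {
	uint64_t base;
	uint64_t size;
	int hotpluggable;
};

/* Hypothetical parsed SRAT: just a list of ranges. */
struct toy_srat {
	const struct toy_range *ranges;
	size_t nr;
};

/* Stand-in for the reserved list kept by the boot allocator. */
struct toy_reserved {
	struct toy_range r[TOY_MAX_RESERVED];
	size_t nr;
};

/* Step 1: an override SRAT from the initrd wins over the firmware one. */
static inline const struct toy_srat *
toy_pick_srat(const struct toy_srat *override, const struct toy_srat *firmware)
{
	return override ? override : firmware;
}

/* Step 2: reserve every hotpluggable range so it is not handed out early. */
static inline size_t
toy_reserve_hotpluggable(const struct toy_srat *srat, struct toy_reserved *out)
{
	size_t i;

	out->nr = 0;
	if (!srat)
		return 0;
	for (i = 0; i < srat->nr && out->nr < TOY_MAX_RESERVED; i++)
		if (srat->ranges[i].hotpluggable)
			out->r[out->nr++] = srat->ranges[i];
	return out->nr;
}
```

The point of the ordering is that the choice of table has to be settled before any range is reserved; otherwise the reservation could be based on a firmware SRAT that is later replaced.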
>
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 28d2e60..9717760 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -1078,6 +1078,15 @@ void __init setup_arch(char **cmdline_p)
>> /* Initialize ACPI root table */
>> acpi_root_table_init();
>>
>> +#ifdef CONFIG_ACPI_NUMA
>> + /*
>> + * Linux kernel cannot migrate kernel pages, as a result, memory used
>> + * by the kernel cannot be hot-removed. Reserve hotpluggable memory to
>> + * prevent memblock from allocating hotpluggable memory for the kernel.
>> + */
>> + reserve_hotpluggable_memory();
>> +#endif
>
> Hmmm, so you're gonna reserve all hotpluggable memory areas until
> everything is up and running, which probably is why allocating
> node_data on hotpluggable node doesn't work, right?
Yes, that's right. The node_data of a hotpluggable node is now placed on
another, non-hotpluggable node.
>
......
>> +phys_addr_t __init early_acpi_override_srat(void)
>> +{
>> + int i;
>> + u32 length;
>> + long offset;
>> + void *ramdisk_vaddr;
>> + struct acpi_table_header *table;
>> + unsigned long map_step = NR_FIX_BTMAPS<< PAGE_SHIFT;
>> + phys_addr_t ramdisk_image = get_ramdisk_image();
>> + char cpio_path[32] = "kernel/firmware/acpi/";
>> + struct cpio_data file;
>
> Don't we usually put variable declarations with initializers before
> others? For some reason, the above block is painful to look at.
OK, will do.
Thanks.