From: Bob Liu <lliubbo@gmail.com>
To: Tang Chen <tangchen@cn.fujitsu.com>
Cc: hpa@zytor.com, akpm@linux-foundation.org, rob@landley.net,
	isimatu.yasuaki@jp.fujitsu.com, laijs@cn.fujitsu.com,
	wency@cn.fujitsu.com, linfeng@cn.fujitsu.com,
	jiang.liu@huawei.com, yinghai@kernel.org,
	kosaki.motohiro@jp.fujitsu.com, minchan.kim@gmail.com,
	mgorman@suse.de, rientjes@google.com, rusty@rustcorp.com.au,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, m.szyprowski@samsung.com
Subject: Re: [PATCH v2 0/5] Add movablecore_map boot option
Date: Tue, 27 Nov 2012 20:09:55 +0800
Message-ID: <CAA_GA1ezZJyqVL=Dp5U2zzNw6bkfMKJY_STkt3E7TXkUYcv+jQ@mail.gmail.com>
In-Reply-To: <50B479FA.6010307@cn.fujitsu.com>

On Tue, Nov 27, 2012 at 4:29 PM, Tang Chen <tangchen@cn.fujitsu.com> wrote:
> On 11/27/2012 04:00 PM, Bob Liu wrote:
>>
>> Hi Tang,
>>
>> On Fri, Nov 23, 2012 at 6:44 PM, Tang Chen <tangchen@cn.fujitsu.com> wrote:
>>>
>>> [What we are doing]
>>> This patchset provides a boot option that lets the user specify the
>>> ZONE_MOVABLE memory map for each node in the system.
>>>
>>> movablecore_map=nn[KMG]@ss[KMG]
>>>
>>> This option makes sure the memory range from ss to ss+nn is movable memory.
>>>
>>>
>>> [Why we do this]
>>> If we hot-remove memory, that memory must not contain kernel memory,
>>> because Linux currently cannot migrate kernel memory. Therefore,
>>> we have to guarantee that the hot-removed memory contains only movable
>>> memory.
>>>
>>> Linux has two boot options, kernelcore= and movablecore=, for
>>> creating movable memory. These boot options specify the amount
>>> of memory to use as kernel or movable memory. Using them, we can
>>> create a ZONE_MOVABLE which has only movable memory.
>>>
>>> But this does not fulfill a requirement of memory hot-remove, because
>>> even if we specify these boot options, movable memory is distributed
>>> evenly across the nodes. So when we want to hot-remove the memory
>>> whose range is 0x80000000-0xc0000000, we have no way to specify
>>> that memory as movable memory.
>>>
>>
>> Sorry, I still don't get your idea.
>> Why do you need a specific range to be movable?
>> Could you describe the requirement and situation a bit more?
>> Thank you.
>
>
> Hi Liu,
>
> This feature is used in memory hotplug.
>
> In order to implement whole-node hotplug, we need to make sure the
> node contains no kernel memory, because memory used by the kernel
> cannot be migrated. (Kernel memory is directly mapped,
> VA = PA + __PAGE_OFFSET, so its physical address cannot be changed.)
>
> The user can specify all the memory on a node to be movable, so that the
> node can be hot-removed.
>
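
The direct-mapping constraint quoted above can be illustrated with a minimal Python sketch. The PAGE_OFFSET value below is an assumption (the direct-map base commonly used on x86-64 kernels of that period; it differs by architecture and configuration), and the helper names are hypothetical, not kernel APIs.

```python
# Assumed direct-map base; illustrative only, not taken from the thread.
PAGE_OFFSET = 0xFFFF880000000000


def phys_to_virt(pa):
    """Direct-map virtual address for a physical address: VA = PA + PAGE_OFFSET."""
    return pa + PAGE_OFFSET


def virt_to_phys(va):
    """Inverse of the direct mapping: PA = VA - PAGE_OFFSET."""
    return va - PAGE_OFFSET


old_pa = 0x80000000
va = phys_to_virt(old_pa)

# Moving the data to another physical frame changes its direct-map address,
# so every kernel pointer holding `va` would become stale.
new_pa = 0x40000000
pointer_stays_valid = phys_to_virt(new_pa) == va
```

Because the virtual address is a fixed function of the physical address, the data cannot be moved without invalidating existing kernel pointers, which is why a removable node has to be kept free of kernel allocations.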

Thank you for your explanation. It's reasonable.

But I think it overlaps a bit with CMA. I'm not sure, but maybe we
can combine it with CMA, which is already in mainline?

> Another approach would be something like the following:
> movable_node = 1,3-5,8
> This would make all the memory on the listed nodes movable, and the rest
> of memory would work as usual. But movablecore_map is more flexible.
>
> Thanks. :)
>
>
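
As a rough illustration of the node-list form "1,3-5,8" mentioned above, here is a small Python sketch that expands such a list into node IDs. The syntax is only the one proposed in the quoted text, and parse_node_list is a hypothetical helper, not an existing kernel function.

```python
def parse_node_list(spec):
    """Expand a list such as '1,3-5,8' into a sorted list of node IDs."""
    nodes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A 'lo-hi' entry covers both endpoints.
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError("descending range: %r" % part)
            nodes.update(range(lo, hi + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


movable_nodes = parse_node_list("1,3-5,8")
```

With this form the whole of each listed node becomes movable, whereas movablecore_map works on address ranges and so can also cover part of a node.
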
>>
>>> So we propose a new feature which specifies a memory range to use as
>>> movable memory.
>>>
>>>
>>> [Ways to do this]
>>> There may be 2 ways to specify movable memory.
>>>   1. use firmware information
>>>   2. use boot option
>>>
>>> 1. use firmware information
>>>    According to ACPI spec 5.0, the SRAT table has a memory affinity
>>>    structure, and that structure has a Hot Pluggable field. See "5.2.16.2
>>>    Memory Affinity Structure". If we use this information, we might be
>>>    able to specify movable memory by firmware. For example, if the Hot
>>>    Pluggable field is set, Linux sets the memory as movable memory.
>>>
>>> 2. use boot option
>>>    This is our proposal. A new boot option can specify the memory range
>>>    to use as movable memory.
>>>
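
A minimal Python sketch of the firmware-based approach (way 1) might look like the following. It assumes the flag layout of the SRAT Memory Affinity Structure as I understand it (bit 0 = Enabled, bit 1 = Hot Pluggable); please check the ACPI specification before relying on it. The constant and function names are hypothetical.

```python
# Assumed flag bits of the SRAT Memory Affinity Structure (verify against the spec).
SRAT_MEM_ENABLED = 1 << 0
SRAT_MEM_HOT_PLUGGABLE = 1 << 1


def hotpluggable_ranges(entries):
    """Return [(start, end)] (end exclusive) for enabled, hot-pluggable entries.

    `entries` is an iterable of (base, length, flags) tuples.
    """
    result = []
    for base, length, flags in entries:
        if flags & SRAT_MEM_ENABLED and flags & SRAT_MEM_HOT_PLUGGABLE:
            result.append((base, base + length))
    return result


entries = [
    (0x00000000, 0x80000000, 0x1),  # enabled, not hot-pluggable
    (0x80000000, 0x40000000, 0x3),  # enabled and hot-pluggable
    (0xC0000000, 0x40000000, 0x2),  # hot-pluggable bit set, but entry disabled
]
movable = hotpluggable_ranges(entries)
```

Here only the second entry would be treated as movable memory; the user has no way to override that choice, which is the drawback the proposal raises against this approach.
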
>>>
>>> [How we do this]
>>> We chose the second way, because with the first way users cannot easily
>>> change the memory range to use as movable memory. We think that creating
>>> movable memory may cause a performance regression due to NUMA. In that
>>> case, the user can easily turn the feature off if we provide a boot
>>> option. And with the boot option, the user can easily select which
>>> memory to use as movable memory.
>>>
>>>
>>> [How to use]
>>> Specify the following boot option:
>>> movablecore_map=nn[KMG]@ss[KMG]
>>>
>>> This means the physical address range from ss to ss+nn will be allocated
>>> as ZONE_MOVABLE.
>>>
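
To make the nn[KMG]@ss[KMG] syntax concrete, here is a minimal Python sketch of how such an argument could be turned into an address range. It assumes K, M and G are powers of 1024 and that hexadecimal values carry a 0x prefix; parse_size and parse_movablecore_map are hypothetical helpers, not the code from the patchset.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_SIZE_RE = re.compile(r"^(0[xX][0-9a-fA-F]+|[0-9]+)([KMGkmg]?)$")


def parse_size(text):
    """Parse a number with an optional K/M/G suffix into bytes."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError("bad size: %r" % text)
    number, suffix = match.groups()
    base = 16 if number[:2].lower() == "0x" else 10
    return int(number, base) * _UNITS[suffix.upper()]


def parse_movablecore_map(arg):
    """Return (start, end) for 'nn[KMG]@ss[KMG]'; end = ss + nn, exclusive."""
    size_text, sep, start_text = arg.partition("@")
    if not sep:
        raise ValueError("missing '@' in %r" % arg)
    size = parse_size(size_text)
    start = parse_size(start_text)
    if size == 0:
        raise ValueError("empty range in %r" % arg)
    return start, start + size


start, end = parse_movablecore_map("1G@0x80000000")
```

In this sketch, "1G@0x80000000" yields the range 0x80000000-0xc0000000, the same range used as the hot-remove example earlier in the cover letter.
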
>>> The following points should also be considered.
>>>
>>> 1) If the range lies within a single node, then from ss to the end of
>>>     the node will be ZONE_MOVABLE.
>>> 2) If the range covers two or more nodes, then from ss to the end of
>>>     the node will be ZONE_MOVABLE, and all the other nodes will only
>>>     have ZONE_MOVABLE.
>>> 3) If no range is in the node, then the node will have no ZONE_MOVABLE
>>>     unless kernelcore or movablecore is specified.
>>> 4) This option can be specified at most MAX_NUMNODES times.
>>> 5) If kernelcore or movablecore is also specified, movablecore_map will
>>>     have higher priority to be satisfied.
>>> 6) This option does not conflict with the memmap option.
>>>
>>>
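
Points 1) to 3) above can be sketched as a per-node calculation. The following Python is one reading of those rules, not the patch's implementation: for each node, ZONE_MOVABLE starts at the lowest overlapping range start (clamped to the node start) and runs to the end of the node. The node layout and the function name are assumptions made for illustration.

```python
GIB = 1 << 30


def zone_movable_start(node, ranges):
    """Return where ZONE_MOVABLE would start in `node`, or None if it has none.

    `node` is (start, end) and `ranges` is a list of (start, end),
    all with exclusive ends.
    """
    node_start, node_end = node
    starts = [
        max(r_start, node_start)
        for r_start, r_end in ranges
        if r_start < node_end and r_end > node_start  # range overlaps the node
    ]
    return min(starts) if starts else None


# Hypothetical three-node layout, 4 GiB per node.
nodes = {0: (0, 4 * GIB), 1: (4 * GIB, 8 * GIB), 2: (8 * GIB, 12 * GIB)}

# A range spanning several nodes (point 2).
spanning = {nid: zone_movable_start(n, [(2 * GIB, 9 * GIB)]) for nid, n in nodes.items()}

# A range inside a single node (points 1 and 3).
single = {nid: zone_movable_start(n, [(5 * GIB, 6 * GIB)]) for nid, n in nodes.items()}
```

In the spanning case node 0 is movable from 2 GiB to its end and nodes 1 and 2 are entirely movable; in the single-node case only node 1 gets a ZONE_MOVABLE, starting at 5 GiB and running to the end of the node.
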
>>>
>>> Tang Chen (4):
>>>    page_alloc: add movable_memmap kernel parameter
>>>    page_alloc: Introduce zone_movable_limit[] to keep movable limit for
>>>      nodes
>>>    page_alloc: Make movablecore_map has higher priority
>>>    page_alloc: Bootmem limit with movablecore_map
>>>
>>> Yasuaki Ishimatsu (1):
>>>    x86: get pg_data_t's memory from other node
>>>
>>>   Documentation/kernel-parameters.txt |   17 +++
>>>   arch/x86/mm/numa.c                  |   11 ++-
>>>   include/linux/memblock.h            |    1 +
>>>   include/linux/mm.h                  |   11 ++
>>>   mm/memblock.c                       |   15 +++-
>>>   mm/page_alloc.c                     |  216 ++++++++++++++++++++++++++++++++++-
>>>   6 files changed, 263 insertions(+), 8 deletions(-)
>>>
>>> --
>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>> the body to majordomo@kvack.org.  For more info on Linux MM,
>>> see: http://www.linux-mm.org/ .
>>> Don't email:<a href=mailto:"dont@kvack.org">  email@kvack.org</a>
>>
>>
>
-- 
Regards,
--Bob

