From: Xishi Qiu <qiuxishi@huawei.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	Hanjun Guo <guohanjun@huawei.com>, Xiexiuqi <xiexiuqi@huawei.com>,
	leon@leon.nu, Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Vlastimil Babka <vbabka@suse.cz>, Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations
Date: Mon, 13 Jul 2015 12:56:46 +0800
Message-ID: <55A3450E.6050707@huawei.com>
In-Reply-To: <20150630115353.GB6812@suse.de>

On 2015/6/30 19:53, Mel Gorman wrote:

> On Tue, Jun 30, 2015 at 12:46:54PM +0200, Ingo Molnar wrote:
>>
>> * Mel Gorman <mgorman@suse.de> wrote:
>>
>>> [...]
>>>
>>> Basically, overall I feel this series is the wrong approach but not knowing who
>>> the users are makes it much harder to judge. I strongly suspect that if
>>> mirrored memory is to be properly used then it needs to be available before the 
>>> page allocator is even active. Once active, there needs to be controlled access 
>>> for allocation requests that are really critical to mirror and not just all 
>>> kernel allocations. None of that would use a MIGRATE_TYPE approach. It would be 
>>> alterations to the bootmem allocator and access to an explicit reserve that is 
>>> not accounted for as "free memory" and accessed via an explicit GFP flag.
>>
>> So I think the main goal is to avoid kernel crashes when a #MC memory fault 
>> arrives on a piece of memory that is owned by the kernel.
>>
> 
> Sounds logical. In that case, bootmem awareness would be crucial.
> Enabling support in just the page allocator is too late.
> 
>> In that sense 'protecting' all kernel allocations is natural: we don't know how to 
>> recover from faults that affect kernel memory.
>>
> 
> It potentially uses all mirrored memory on memory that does not need that
> sort of guarantee. For example, if there was a MC on memory backing the
> inode cache then potentially that is recoverable as long as the inodes
> were not dirty. That's a minor detail as the kernel could later protect
> only MIGRATE_UNMOVABLE requests instead of all kernel allocations if fatal
> MC in kernel space could be distinguished from non-fatal checks.
> 
> Bootmem awareness is much more important either way. If that was addressed
> then potentially a MIGRATE_UNMOVABLE_MIRROR type could be created that
> is only used for MIGRATE_UNMOVABLE allocations and never for user-space.
> That misses MIGRATE_RECLAIMABLE so if that is required then we need
> something else that both preserves fragmentation avoidance and avoids
> introducing loads of new migratetypes.
> 
> Reclaim-related issues could be partially avoided by forbidding use from
> userspace and accounting for the size of MIGRATE_UNMOVABLE_MIRROR during
> watermark checks.
> 
>> We do know how to recover from faults that affect user-space memory alone.
>>
>> So if a mechanism is in place that prioritizes 3 groups of allocators:
>>
>>   - non-recoverable memory (kernel allocations mostly)
>>
> 
> So bootmem at the very least, followed by MIGRATE_UNMOVABLE requests whether
> they are accounted for by zones or MIGRATE_TYPES.
> 
>>   - high priority user memory (critical apps that must never fail)
>>
> 
> This one is problematic with a MIGRATE_TYPE-based approach such as the one in
> this series. If a high priority user requires memory and MIGRATE_MIRROR is full
> then some of it must be reclaimed. With a MIGRATE_TYPE approach, the kernel
> may reclaim a lot of unnecessary memory trying to free some MIGRATE_MIRROR
> memory with no guarantee of success. It'll look like unnecessary thrashing
> from userspace but difficult to diagnose as reclaim stats are per-zone based.
> Dealing with this needs either a zone-based approach or a lot of surgery
> to reclaim (similar to what the node-based LRU series does actually when
> it skips pages when the caller requires lowmem pages).
> 

Hi Mel,

Thank you for your comments. Sorry for the late reply. Some of your points
are not entirely clear to me, so I have a few questions.

If fatal memory faults in kernel space can be distinguished from non-fatal
ones, we need only MIGRATE_UNMOVABLE_MIRROR. If they cannot, we need two
mirrored types, one for MIGRATE_RECLAIMABLE and one for MIGRATE_UNMOVABLE,
right?

The reclaim-related issues would be handled much like CMA pages are in
zone_watermark_ok(), right?
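
To make the watermark question concrete, here is a minimal user-space
sketch of the accounting, modelled loosely on the way the watermark check
discounts free CMA pages for requests that cannot use them. It is not
kernel code: struct zone_model, watermark_ok_model() and the
may_use_mirror parameter are hypothetical names invented for this
illustration, and the ALLOC_HIGH/ALLOC_HARDER adjustments and per-order
checks of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	long nr_free;        /* all free pages in the zone */
	long nr_free_mirror; /* free pages that sit in mirrored pageblocks */
	long lowmem_reserve; /* reserve that protects lower zones */
};

/*
 * Return true if the zone still has enough usable free pages above the
 * watermark "mark" for an allocation of the given order.
 *
 * Assumption: a request that may not use mirrored memory must not count
 * free mirrored pages as available, in the same spirit as free CMA pages
 * being discounted for requests that cannot use CMA.
 */
static bool watermark_ok_model(const struct zone_model *z, unsigned int order,
			       long mark, bool may_use_mirror)
{
	long free_pages = z->nr_free;
	long unusable = 0;

	assert(order < 11); /* keep the shift below well defined */

	/* Pretend the request itself has already been taken out. */
	free_pages -= (1L << order) - 1;

	if (!may_use_mirror)
		unusable = z->nr_free_mirror;

	return free_pages - unusable > mark + z->lowmem_reserve;
}
```

With 1000 free pages of which 600 are mirrored, a request that may not
use mirrored memory only sees 400 usable pages, so it fails a watermark
of 500 even though the zone as a whole looks healthy.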

If we also want to protect high priority user memory, a new mirrored zone
may be the better approach, right?

How about using a flag (e.g. GFP_MIRROR) for kernel space allocations?
Could we use it to classify kernel space allocations? User space could
also request it via madvise and mmap.
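
As a rough illustration of the flag idea, the following user-space sketch
models an allocator that serves flagged requests from a separate mirrored
pool. Everything here is an assumption made for the example:
GFP_MIRROR_MODEL, struct mem_model and alloc_pages_model() are
hypothetical names, no such flag exists in the kernel, and falling back
to normal memory when the mirrored pool is exhausted is only one possible
policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; not a real kernel GFP flag. */
#define GFP_MIRROR_MODEL 0x1u

enum pool { POOL_NONE, POOL_NORMAL, POOL_MIRROR };

/* Hypothetical model of the two kinds of free memory, in pages. */
struct mem_model {
	long free_mirror; /* explicit mirrored reserve */
	long free_normal; /* ordinary, non-mirrored memory */
};

/*
 * Take 2^order pages and report which pool they came from.
 *
 * Only requests carrying GFP_MIRROR_MODEL may touch the mirrored
 * reserve. Assumption: if the reserve is too small, the request falls
 * back to normal memory instead of failing or triggering reclaim.
 */
static enum pool alloc_pages_model(struct mem_model *m, unsigned int gfp,
				   unsigned int order)
{
	long nr_pages;

	assert(order < 11); /* keep the shift below well defined */
	nr_pages = 1L << order;

	if ((gfp & GFP_MIRROR_MODEL) && m->free_mirror >= nr_pages) {
		m->free_mirror -= nr_pages;
		return POOL_MIRROR;
	}

	if (m->free_normal >= nr_pages) {
		m->free_normal -= nr_pages;
		return POOL_NORMAL;
	}

	return POOL_NONE;
}
```

Unflagged requests never consume the mirrored reserve in this model, which
is the "controlled access" property: only callers that explicitly ask for
mirrored memory can use it.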

Thanks,
Xishi Qiu



Thread overview: 29+ messages
2015-06-27  2:19 Xishi Qiu
2015-06-27  2:23 ` [RFC v2 PATCH 1/8] mm: add a new config to manage the code Xishi Qiu
2015-06-29  6:50   ` Kamezawa Hiroyuki
2015-06-30  2:52     ` Xishi Qiu
2015-06-27  2:24 ` [RFC v2 PATCH 2/8] mm: introduce MIGRATE_MIRROR to manage the mirrored pages Xishi Qiu
2015-06-29  7:32   ` Kamezawa Hiroyuki
2015-06-30  2:45     ` Xishi Qiu
2015-06-30  7:53       ` Kamezawa Hiroyuki
2015-06-30  9:22         ` Xishi Qiu
2015-06-27  2:24 ` [RFC v2 PATCH 3/8] mm: find mirrored memory in memblock Xishi Qiu
2015-06-27  2:25 ` [RFC v2 PATCH 4/8] mm: add mirrored memory to buddy system Xishi Qiu
2015-06-29  7:39   ` Kamezawa Hiroyuki
2015-06-27  2:26 ` [RFC v2 PATCH 5/8] mm: introduce a new zone_stat_item NR_FREE_MIRROR_PAGES Xishi Qiu
2015-06-27  2:27 ` [RFC v2 PATCH 6/8] mm: add free mirrored pages info Xishi Qiu
2015-06-27  2:27 ` [RFC v2 PATCH 7/8] mm: add the buddy system interface Xishi Qiu
2015-06-29 23:11   ` Luck, Tony
2015-06-30  1:01     ` Kamezawa Hiroyuki
2015-06-30  1:31       ` Xishi Qiu
2015-06-30  2:01         ` Kamezawa Hiroyuki
2015-06-27  2:28 ` [RFC v2 PATCH 8/8] mm: add the PCP interface Xishi Qiu
2015-06-29 15:19 ` [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations Dave Hansen
2015-06-30  1:26   ` Xishi Qiu
2015-06-30  1:52     ` Dave Hansen
2015-06-30  2:48       ` Xishi Qiu
2015-06-30  9:41 ` Mel Gorman
2015-06-30 10:46   ` Ingo Molnar
2015-06-30 11:53     ` Mel Gorman
2015-06-30 18:12       ` Luck, Tony
2015-07-13  4:56       ` Xishi Qiu [this message]
