From: Marcelo Tosatti <mtosatti@redhat.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Tejun Heo <tj@kernel.org>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] page_alloc: skip cpuset enforcement for lower zone allocations
Date: Tue, 27 May 2014 11:53:52 -0300
Message-ID: <20140527145352.GB3765@amt.cnet>
In-Reply-To: <alpine.DEB.2.10.1405270917510.13999@gentwo.org>

On Tue, May 27, 2014 at 09:21:32AM -0500, Christoph Lameter wrote:
> On Fri, 23 May 2014, Marcelo Tosatti wrote:
>
> > Zone specific allocations, such as GFP_DMA32, should not be restricted
> > to cpusets allowed node list: the zones which such allocations demand
> > might be contained in particular nodes outside the cpuset node list.
> >
> > The alternative would be to not perform such allocations from
> > applications which are cpuset restricted, which is unrealistic.
> >
> > Fixes KVM's alloc_page(gfp_mask=GFP_DMA32) with cpuset as explained.
>
> Memory policies are only applied to a specific zone so this is not
> unprecedented. However, if a user wants to limit allocation to a specific
> node and there is no DMA memory there then may be that is a operator
> error? After all the application will be using memory from a node that the
> operator explicitly wanted not to be used.

Ok, here is the use case:

- The machine contains a driver which requires zone-specific memory (such as
  KVM, which requires its root page table at paddr < 4GB).
- The user wants to limit the application's allocations to nodeX, and nodeX
  has no memory < 4GB.

How would you solve that? Options:

1) Force the admin to allow allocation from the node(s) which contain the
   0-4GB range, which unfortunately would allow every allocation, including
   ones which are not restricted to particular nodes, to be performed
   there.

or

2) Allow zone-specific allocations to bypass memory policies.

It seems 2) is the best option (and there is precedent for it).

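To make the trade-off between the two options concrete, here is a minimal userspace sketch. It is a toy model, not the patch and not kernel code: `model_node`, `MODEL_GFP_DMA32`, `model_pick_node()` and the `bypass_for_low_zone` switch are all hypothetical names invented for illustration. Node 0 stands for the node holding the 0-4GB range and node 1 for nodeX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag standing in for a zone-restricted request such as GFP_DMA32. */
#define MODEL_GFP_DMA32 0x1u

/* Toy description of a NUMA node: does it hold any memory below 4GB? */
struct model_node {
    bool has_low_mem;
};

/*
 * Return the first node that may satisfy the request, or -1 if none can.
 * mems_allowed is a bitmask of the nodes the cpuset permits.
 * bypass_for_low_zone models option 2: zone-restricted requests ignore
 * the cpuset node list, all other requests still honour it.
 */
int model_pick_node(const struct model_node *nodes, int nr_nodes,
                    uint32_t mems_allowed, unsigned int gfp,
                    bool bypass_for_low_zone)
{
    bool low = (gfp & MODEL_GFP_DMA32) != 0;

    for (int n = 0; n < nr_nodes; n++) {
        bool in_cpuset = (mems_allowed & (1u << n)) != 0;

        /* A low-zone request can only be served where low memory exists. */
        if (low && !nodes[n].has_low_mem)
            continue;
        /* Enforce the cpuset unless this is a low-zone request under option 2. */
        if (!in_cpuset && !(low && bypass_for_low_zone))
            continue;
        return n;
    }
    return -1;
}
```

With the cpuset restricted to node 1, the low-zone request fails without the bypass and lands on node 0 with it, while ordinary requests stay on node 1 either way. Widening the cpuset to both nodes (option 1) makes the low-zone request succeed too, but ordinary requests may then also be served from node 0.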
> There is also the hardwall flag. I think its ok to allocate outside of the
> cpuset if that flag is not set. However, if it is set then any attempt to
> alloc outside of the cpuset should fail.

GFP_ATOMIC bypasses hardwall:

 * The second pass through get_page_from_freelist() doesn't even call
 * here for GFP_ATOMIC calls.  For those calls, the __alloc_pages()
 * variable 'wait' is not set, and the bit ALLOC_CPUSET is not set
 * in alloc_flags.  That logic and the checks below have the combined
 * affect that:
 *      in_interrupt - any node ok (current task context irrelevant)
 *      GFP_ATOMIC   - any node ok
 *      TIF_MEMDIE   - any node ok
 *      GFP_KERNEL   - any node in enclosing hardwalled cpuset ok
 *      GFP_USER     - only nodes in current tasks mems allowed ok.
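
The table in that comment can be read as a small decision function. The sketch below is only a userspace paraphrase of the five rows, not the kernel's implementation: the enum, the function name and the two boolean inputs are hypothetical, and it assumes the caller has already worked out whether the node is in the task's mems_allowed and whether it is in the enclosing hardwalled cpuset.

```c
#include <stdbool.h>

/* Hypothetical labels for the five rows of the quoted table. */
enum model_ctx {
    CTX_IN_INTERRUPT,
    CTX_GFP_ATOMIC,
    CTX_TIF_MEMDIE,
    CTX_GFP_KERNEL,
    CTX_GFP_USER,
};

/*
 * in_mems_allowed:      node is in the current task's mems_allowed
 * in_hardwall_ancestor: node is in the enclosing hardwalled cpuset
 */
bool model_node_ok(enum model_ctx ctx, bool in_mems_allowed,
                   bool in_hardwall_ancestor)
{
    switch (ctx) {
    case CTX_IN_INTERRUPT:
    case CTX_GFP_ATOMIC:
    case CTX_TIF_MEMDIE:
        return true;                 /* any node ok */
    case CTX_GFP_KERNEL:
        return in_mems_allowed || in_hardwall_ancestor;
    case CTX_GFP_USER:
        return in_mems_allowed;      /* strictest case */
    }
    return false;
}
```

Read this way, a hardwall already constrains only the GFP_KERNEL and GFP_USER rows; the first three rows ignore the cpuset entirely.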