From: Christoph Lameter <cl@linux.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Vince Weaver <vincent.weaver@maine.edu>,
linux-kernel@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
trinity@vger.kernel.org, akpm@linux-foundation.org,
torvalds@linux-foundation.org, roland@kernel.org,
infinipath@qlogic.com, linux-mm@kvack.org,
linux-rdma@vger.kernel.org, Or Gerlitz <or.gerlitz@gmail.com>
Subject: Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK
Date: Fri, 24 May 2013 15:40:26 +0000
Message-ID: <0000013ed732b615-748f574f-ccb8-4de7-bbe4-d85d1cbf0c9d-000000@email.amazonses.com>
In-Reply-To: <20130524140114.GK23650@twins.programming.kicks-ass.net>
On Fri, 24 May 2013, Peter Zijlstra wrote:
> Patch bc3e53f682 ("mm: distinguish between mlocked and pinned pages")
> broke RLIMIT_MEMLOCK.
Nope, the patch fixed a problem with double accounting.
The problem we seem to have is defining what mlocked and pinned mean
and how they relate to RLIMIT_MEMLOCK.
mlocked pages are pages that are movable (not pinned!!!) and that are
marked in some way by user space actions as mlocked (POSIX semantics).
They are marked with a special page flag (PG_mlocked).
Pinned pages are pages that have an elevated refcount because the hardware
needs to use these pages for I/O. The elevated refcount may be temporary
(then we don't care about this) or for a longer time (such as the memory
registration of the IB subsystem). That is when we account the memory as
pinned. The elevated refcount stops page migration and other things from
trying to move that memory.
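
To make the distinction concrete, here is a toy userspace model in C. It is
only a sketch: struct toy_page and the toy_* helpers are invented for
illustration and are not kernel code. It assumes that migration is allowed
only when the reference count matches what the VM itself expects, which is
the effect described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page. This is NOT the kernel's struct page. */
struct toy_page {
    int refcount;      /* all references currently held on the page */
    int expected_refs; /* references the VM itself accounts for */
    bool mlocked;      /* stands in for the PG_mlocked page flag */
};

/* mlock only marks the page; it takes no extra reference. */
void toy_mlock(struct toy_page *p)
{
    p->mlocked = true;
}

/* A long-term pin (e.g. a memory registration) takes an extra reference. */
void toy_pin(struct toy_page *p)
{
    p->refcount++;
}

/* Dropping a pin releases that reference; unpinning an unpinned page is a bug. */
void toy_unpin(struct toy_page *p)
{
    assert(p->refcount > p->expected_refs);
    p->refcount--;
}

/* The page can be moved only if nobody holds an unexpected reference. */
bool toy_can_migrate(const struct toy_page *p)
{
    return p->refcount == p->expected_refs;
}
```

In this model a page that is only mlocked still passes toy_can_migrate(),
while the same page stops being movable as soon as toy_pin() is called on it.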
Pages can be both pinned and mlocked. Before my patch those two issues
were conflated: the same counter was used for both, and therefore such
pages were counted twice. If an RDMA application was running with
mlockall() and was performing large-scale I/O then the counters could show
extraordinarily large numbers and the VM would start to behave erratically.
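
The double counting can be sketched with a small C model. This is an
illustration under assumptions, not the kernel's accounting code: struct
toy_mm and both helper functions are hypothetical, and only the field names
locked_vm and pinned_vm are borrowed from the kernel's mm_struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-process counters (not the kernel's mm_struct). */
struct toy_mm {
    unsigned long locked_vm; /* pages mlocked by user space */
    unsigned long pinned_vm; /* pages pinned long-term for I/O */
};

/* Conflated scheme: mlocking and pinning both charge the same counter. */
void conflated_account(struct toy_mm *mm, unsigned long pages,
                       bool mlocked, bool pinned)
{
    assert(mm != NULL);
    if (mlocked)
        mm->locked_vm += pages;
    if (pinned)
        mm->locked_vm += pages; /* the same pages are charged a second time */
}

/* Split scheme: each property is charged to its own counter. */
void split_account(struct toy_mm *mm, unsigned long pages,
                   bool mlocked, bool pinned)
{
    assert(mm != NULL);
    if (mlocked)
        mm->locked_vm += pages;
    if (pinned)
        mm->pinned_vm += pages;
}
```

With 1000 pages that are both mlocked and pinned, conflated_account() leaves
locked_vm at 2000, while split_account() leaves locked_vm and pinned_vm at
1000 each.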
It is important for the VM to know which pages cannot be evicted, but that
involves many more pages due to dirty pages etc.
So far the assumption has been that RLIMIT_MEMLOCK is a limit on the pages
that userspace has mlocked.
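
Under that reading, the check looks roughly like the C sketch below. It uses
the real getrlimit(2) call with RLIMIT_MEMLOCK on Linux; the helper names
memlock_limit_bytes() and fits_memlock() are hypothetical, and the sketch
deliberately counts only mlocked bytes and ignores privileged callers
(CAP_IPC_LOCK).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/resource.h>

/* Return the soft RLIMIT_MEMLOCK of the calling process, in bytes.
 * RLIM_INFINITY means there is no limit. */
rlim_t memlock_limit_bytes(void)
{
    struct rlimit rl;
    int rc = getrlimit(RLIMIT_MEMLOCK, &rl);

    assert(rc == 0);
    (void)rc; /* keep rc "used" when assertions are compiled out */
    return rl.rlim_cur;
}

/* Would mlocking request_bytes more stay within the limit, given that
 * locked_bytes are already mlocked? Pinned memory is not counted here. */
bool fits_memlock(rlim_t limit, rlim_t locked_bytes, rlim_t request_bytes)
{
    if (limit == RLIM_INFINITY)
        return true;
    if (locked_bytes > limit)
        return false;
    /* Compare against the remaining headroom to avoid overflow. */
    return request_bytes <= limit - locked_bytes;
}
```

For example, with a 64 KiB limit and nothing locked yet, a 64 KiB request
fits; once 4 KiB are already mlocked, the same request does not.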
You want the counter to mean something different, it seems. What is it?
I think we first need to be clear on what we want to accomplish and what
these counters should actually count before changing things.
Certainly would appreciate improvements in this area but resurrecting the
conflation between mlocked and pinned pages is not the way to go.
> This patch proposes to properly fix the problem by introducing
> VM_PINNED. This also provides the groundwork for a possible mpin()
> syscall or MADV_PIN -- although these are not included.
Maybe add a new PIN page flag? Pages are not pinned per vma as the patch
seems to assume.
> It recognises that pinned page semantics are a strict super-set of
> locked page semantics -- a pinned page will not generate major faults
> (and thus satisfies mlock() requirements).
Not exactly true. Pinned pages may not have the mlocked flag set and they
are not managed on the unevictable LRU lists of the MM.
> If people find this approach unworkable, I request we revert the above
> mentioned patch to at least restore RLIMIT_MEMLOCK to a usable state
> again.
Cannot do that. This will cause the breakage that the patch was fixing to
resurface.