From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
npiggin@suse.de,
"hugh.dickins@tiscali.co.uk" <hugh.dickins@tiscali.co.uk>,
avi@redhat.com,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
torvalds@linux-foundation.org, aarcange@redhat.com
Subject: Re: [PATCH 0/2] ZERO PAGE again v3.
Date: Mon, 13 Jul 2009 14:45:50 +0900
Message-ID: <20090713144550.764c8f82.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20090709122428.8c2d4232.kamezawa.hiroyu@jp.fujitsu.com>
Do you think this kind of document is necessary for v4?
Any comments are welcome.
Many people are probably busy in Montreal, so I'm not in a hurry ;)
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Add documentation about the zero page as part of re-introducing it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Documentation/vm/zeropage.txt | 77 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
Index: zeropage-trialv4/Documentation/vm/zeropage.txt
===================================================================
--- /dev/null
+++ zeropage-trialv4/Documentation/vm/zeropage.txt
@@ -0,0 +1,77 @@
+Zero Page.
+
+The ZERO Page is a page filled with zeros that is never modified (write-protected).
+Each arch has its own ZERO_PAGE in the kernel, and the macro ZERO_PAGE(addr) is
+provided. At present, usage of ZERO_PAGE() is limited.
+
+This document explains ZERO_PAGE() for private anonymous mappings.
+
+If CONFIG_SUPPORT_ANON_ZERO_PAGE=y, ZERO_PAGE is used for private anonymous
+mappings. When a read fault occurs on an anonymous private mapping, ZERO_PAGE
+is mapped at the faulting address instead of a usual anonymous page. The mapped
+ZERO_PAGE is write-protected, so a write by the user process triggers
+copy-on-write. ZERO_PAGE is used only when the vma is a PRIVATE mapping and
+has no vm_ops.
+
+Implementation Details
+ - ZERO_PAGE is implemented with pte_special(), so an arch has to support
+ pte_special() to support ZERO_PAGE for anon.
+ - ZERO_PAGE for anon does no reference count manipulation at map/unmap.
+ - When get_user_pages() finds ZERO_PAGE, its page->count is taken and put.
+ - By passing the special flag FOLL_NOZERO, the caller can ignore zero pages.
+ - Because ZERO_PAGE is used only on a read fault to a MAP_PRIVATE anonymous
+ mapping, MAP_POPULATE could map ZERO_PAGE when it handles a read-only PRIVATE
+ anonymous mapping. Usual anonymous pages are used in that case.
+ - At coredump, ZERO_PAGE is used for memory that does not exist.
+
+For User Applications.
+
+ZERO Page is not the best solution for applications in many cases. It tends
+to be the second-best option if you have enough time to improve your application.
+
+Pros. of ZERO Page
+ - It does not consume extra memory.
+ - CPU cache overhead is small (if your cache is physically tagged).
+ - The page's reference count overhead is hidden. This is good for processes
+ that call fork()/exec().
+
+Cons. of ZERO Page
+ - It is available only for read-faulted anonymous private mappings.
+ - An application that depends on ZERO_PAGE consumes extra TLB entries.
+ - It reduces memory usage only for read-faulted pages.
+
+ZERO Page is helpful in some cases, but you can also use other techniques.
+The following are typical ways to avoid ZERO Pages. Please note that there
+are always trade-offs among designs.
+
+ => Avoid a large contiguous mapping and use small mmaps.
+ If the number of mmaps doesn't increase very much, this is good because your
+ application avoids TLB pollution by ZERO Page and never makes unnecessary
+ accesses.
+
+ => Use a large contiguous mapping and check /proc/<pid>/pagemap.
+ You can find out which ptes are valid by reading /proc/<pid>/pagemap
+ and avoid unnecessary faults when scanning a memory range. But reading
+ /proc/<pid>/pagemap is not cheap, so the benefit of this technique
+ depends on usage.
+
+ => Use KSM (to be implemented).
+ KSM (kernel shared memory) can merge your anonymous mapped pages with pages
+ of the same contents. Zero-filled pages will then be merged, along with other
+ identical pages. But in a bad case, pages are heavily shared, which may affect
+ the performance of fork/exit/exec. Behavior depends on the latest KSM
+ implementation, so please check it.
+
+For kernel developers.
+ Your arch has to support pte_special() and set ARCH_SUPPORT_ANON_ZERO_PAGE=y
+ to use ZERO PAGE. If your arch's CPU cache is virtually tagged, it's
+ recommended to turn off this feature. To test this, the following cases should
+ be checked:
+ - mmap/munmap/fork/exit/exec while touching anonymous private pages by READ.
+ - MAP_POPULATE in the above test.
+ - mlock()
+ - coredump
+ - /dev/zero PRIVATE mapping
+
+
+
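As an illustration of the /proc/<pid>/pagemap technique listed in the document
above, here is a minimal userspace sketch. It is not part of the patch. It
assumes the pagemap layout described in the kernel's pagemap documentation (one
64-bit entry per virtual page, bit 63 = present, bit 62 = swapped) and a
single-threaded caller. The function names page_is_populated(),
sum_populated_pages() and demo_sparse_sum() are hypothetical and exist only
for this example.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if the page at addr is populated (present in RAM or swapped out),
 * 0 if it has never been faulted in, and -1 if pagemap could not be read.
 */
static int page_is_populated(int pagemap_fd, const void *addr, long page_size)
{
	uint64_t entry;
	uint64_t vpn = (uint64_t)(uintptr_t)addr / (uint64_t)page_size;

	if (pread(pagemap_fd, &entry, sizeof(entry),
		  (off_t)(vpn * sizeof(entry))) != (ssize_t)sizeof(entry))
		return -1;
	/* bit 63 = present, bit 62 = swapped */
	return ((entry >> 62) & 3) ? 1 : 0;
}

/*
 * Sum all bytes in npages pages starting at base. Pages that pagemap reports
 * as never faulted in are skipped: an untouched anonymous page reads as zeros,
 * so skipping it does not change the sum and avoids a read fault.
 * If pagemap is unavailable, every page is read as a fallback.
 */
static unsigned long sum_populated_pages(const unsigned char *base,
					 size_t npages)
{
	long page_size = sysconf(_SC_PAGESIZE);
	int fd = open("/proc/self/pagemap", O_RDONLY);
	unsigned long sum = 0;
	size_t i;
	long j;

	for (i = 0; i < npages; i++) {
		const unsigned char *page = base + i * (size_t)page_size;

		if (fd >= 0 && page_is_populated(fd, page, page_size) == 0)
			continue;
		for (j = 0; j < page_size; j++)
			sum += page[j];
	}
	if (fd >= 0)
		close(fd);
	return sum;
}

/* Demo: map 8 anonymous private pages, write to two of them, sum the range. */
static unsigned long demo_sparse_sum(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t npages = 8;
	size_t len = npages * (size_t)page_size;
	unsigned long sum;
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 0;
	p[2 * page_size + 5] = 7;
	p[6 * page_size] = 3;
	sum = sum_populated_pages(p, npages);
	munmap(p, len);
	return sum;
}
```

When the mmap succeeds, demo_sparse_sum() returns 10 (7 + 3) whether or not
pagemap is readable, since the fallback path simply reads every page. As the
document notes, reading pagemap has its own cost, so whether skipping pages
this way pays off depends on how sparse the range is.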