From: Hugh Dickins <hughd@google.com>
To: Nai Xia <nai.xia@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Chris Wright <chrisw@sous-sol.org>,
Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-janitors@vger.kernel.org
Subject: Re: [PATCH] ksm: add vm_stat and meminfo entry to reflect pte mapping to ksm pages
Date: Fri, 18 Mar 2011 15:40:00 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.00.1103181448100.2092@sister.anvils> (raw)
In-Reply-To: <201103181529.43659.nai.xia@gmail.com>
On Fri, 18 Mar 2011, Nai Xia wrote:
> >On Thursday 03 March 2011, at 06:31:42, <Andrew Morton <akpm@linux-foundation.org>> wrote
> > This patch obviously wasn't tested with CONFIG_KSM=n, which was a
> > pretty basic patch-testing failure :(
>
> Oops, I will be careful to avoid similar mistakes next time.
>
> >
> > I fixed up my tree with the below, but really the amount of ifdeffing
> > is unacceptable - please find a cleaner way to fix up this patch.
>
> Ok, I will have a try in my next patch submit.
A couple of notes on that.
akpm's fixup introduced an #ifdef CONFIG_KSM in mm/ksm.c: that should
be, er, unnecessary - since ksm.c is only compiled when CONFIG_KSM=y.
And PageKsm(page) evaluates to 0 when CONFIG_KSM is not set, so the
optimizer should eliminate code from most places without #ifdef:
though you need to keep the #ifdef around display in /proc/meminfo
itself, so as not to annoy non-KSM people with an always 0kB line.
But I am uncomfortable with the whole patch.
Can you make a stronger case for it? KSM is designed to have its own
cycle, and to keep out of the way of the rest of mm as much as possible
(not as much as originally hoped, I admit). Do we really want to show
its statistics in /proc/meminfo now? And do we really care that they
don't keep up with exiting processes when the scan rate is low?
I am not asserting that we don't, nor am I nacking your patch:
but I would like to hear more support for it, before it adds
yet another line to our user interface in /proc/meminfo.
And there is an awkward little bug in your patch, which amplifies
a more significant and shameful pair of bugs of mine in KSM itself -
no wonder that I'm anxious about your patch going in!
Your bug is precisely where akpm added the #ifdef in ksm.c. The
problem is that page_mapcount() is maintained atomically, generally
without spinlock or pagelock: so the value of mapcount there, unless
it is 1, can go up or down racily (as other processes sharing that
anonymous page fork or unmap at the same time).
I could hardly complain about that, while suggesting above that more
approximate numbers are good enough! Except that, when KSM is turned
off, there's a chance that you'd be left showing a non-0 kB in
/proc/meminfo. Then people will want a fix, and I don't yet know
what that fix will be.
My first bug is in the break_cow() technique used to get back to
normal, when merging into a KSM page fails for one reason or another:
that technique misses other mappings of the page. I did have a patch
in progress to fix that a few months ago, but it wasn't quite working,
and then I realized the second bug: that even when successful, if
VM_UNMERGEABLE has been used in forked processes, then we could end up
with a KSM page in a VM_UNMERGEABLE area, which is against the spec.
A solution to all three problems would be to revert to allocating a
separate KSM page, instead of using one of the pages already there.
But that feels like a regression, and I don't think anybody is really
hurting from the current situation, so I've not jumped to fix it yet.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-03-18 22:40 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-26 14:56 Nai Xia
2011-03-01 2:21 ` Dave Hansen
2011-03-18 6:44 ` xianai
2011-03-01 23:41 ` Andrew Morton
2011-03-18 7:16 ` Nai Xia
2011-03-02 22:31 ` Andrew Morton
2011-03-18 7:29 ` Nai Xia
2011-03-18 22:40 ` Hugh Dickins [this message]
2011-03-19 14:55 ` Nai Xia
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.00.1103181448100.2092@sister.anvils \
--to=hughd@google.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=chrisw@sous-sol.org \
--cc=kernel-janitors@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nai.xia@gmail.com \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox