From: Hugh Dickins <hughd@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>,
Chintan Pandya <cpandya@codeaurora.org>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
John Stultz <john.stultz@linaro.org>,
Ingo Molnar <mingo@redhat.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH v4 2/2] ksm: provide support to use deferrable timers for scanner thread
Date: Tue, 9 Sep 2014 13:14:50 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.11.1409091225310.8432@eggly.anvils> (raw)
In-Reply-To: <20140908093949.GZ6758@twins.programming.kicks-ass.net>
On Mon, 8 Sep 2014, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 01:25:36AM -0700, Hugh Dickins wrote:
> >
> > --- 3.17-rc4/include/linux/ksm.h 2014-03-30 20:40:15.000000000 -0700
> > +++ linux/include/linux/ksm.h 2014-09-07 11:54:41.528003316 -0700
>
> > @@ -87,6 +96,11 @@ static inline void ksm_exit(struct mm_st
> > {
> > }
> >
> > +static inline wait_queue_head_t *ksm_switch(struct mm_struct *mm)
>
> s/ksm_switch/__&/
I don't think so (and I did try building with CONFIG_KSM off too).
>
> > +{
> > + return NULL;
> > +}
> > +
> > static inline int PageKsm(struct page *page)
> > {
> > return 0;
>
> > --- 3.17-rc4/kernel/sched/core.c 2014-08-16 16:00:54.062189063 -0700
> > +++ linux/kernel/sched/core.c 2014-09-07 11:54:41.528003316 -0700
>
> > @@ -2304,6 +2305,7 @@ context_switch(struct rq *rq, struct tas
> > struct task_struct *next)
> > {
> > struct mm_struct *mm, *oldmm;
> > + wait_queue_head_t *wake_ksm = NULL;
> >
> > prepare_task_switch(rq, prev, next);
> >
> > @@ -2320,8 +2322,10 @@ context_switch(struct rq *rq, struct tas
> > next->active_mm = oldmm;
> > atomic_inc(&oldmm->mm_count);
> > enter_lazy_tlb(oldmm, next);
> > - } else
> > + } else {
> > switch_mm(oldmm, mm, next);
> > + wake_ksm = ksm_switch(mm);
>
> Is this the right mm?
It's next->mm, that's the one I intended (though the patch might
be equally workable using prev->mm instead: given free rein, I'd
have opted for hooking into both prev and next, but free rein is
definitely not what should be granted around here!).
> We've just switched the stack,
I thought that came in switch_to() a few lines further down,
but don't think it matters for this.
> so we're looing at next->mm when we switched away from current.
> That might not exist anymore.
I fail to see how that can be. Looking at the x86 switch_mm(),
I can see it referencing (unsurprisingly!) both old and new mms
at this point, and no reference to an mm is dropped before the
ksm_switch(). oldmm (there called mm) is mmdropped later in
finish_task_switch().
>
> > + }
> >
> > if (!prev->mm) {
> > prev->active_mm = NULL;
> > @@ -2348,6 +2352,9 @@ context_switch(struct rq *rq, struct tas
> > * frame will be invalid.
> > */
> > finish_task_switch(this_rq(), prev);
> > +
> > + if (wake_ksm)
> > + wake_up_interruptible(wake_ksm);
> > }
>
> Quite horrible for sure. I really hate seeing KSM cruft all the way down
Yes, I expected that, and I would certainly feel the same way.
And even worse, imagine if this were successful, we might come along
and ask to do something similar for khugepaged. Though if it comes to
that, I'm sure we would generalize into one hook which does not say
"ksm" or "khugepaged" on it, but would still a present a single unlikely
flag to be tested at this level. Maybe you would even prefer the
generalized version, but I don't want to complicate the prototype yet.
If it weren't for the "we already have the mm cachelines here" argument,
I by now feel fairly sure that I would be going for hooking into timer
tick instead (where Thomas could then express his horror!).
Do you think I should just forget about cacheline micro-optimizations
and go in that direction instead?
> here. Can't we create a new (timer) infrastructure that does the right
> thing? Surely this isn't the only such case.
A sleep-walking timer, that goes to sleep in one bed, but may wake in
another; and defers while beds are empty? I'd be happy to try using
that for KSM if it already existed, and no doubt Chintan would too
But I don't think KSM presents a very good case for developing it.
I think KSM's use of a sleep_millisecs timer is really just an apology
for the amount of often wasted work that it does, and dates from before
we niced it down 5. I prefer the idea of a KSM which waits on activity
amongst the restricted set of tasks it is tracking: as this patch tries.
But my preference may be naive: doing lots of unnecessary work doesn't
matter as much as waking cpus from deep sleep.
>
> I know both RCU and some NOHZ_FULL muck already track when the system is
> completely idle. This is yet another case of that.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-09-09 20:16 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-20 12:10 [PATCH v4 1/2] timer: provide an api for deferrable timeout Chintan Pandya
2014-08-20 12:10 ` [PATCH v4 2/2] ksm: provide support to use deferrable timers for scanner thread Chintan Pandya
2014-08-28 6:02 ` Hugh Dickins
2014-09-03 9:58 ` Peter Zijlstra
2014-09-03 10:32 ` Thomas Gleixner
2014-09-08 8:25 ` Hugh Dickins
2014-09-08 9:39 ` Peter Zijlstra
2014-09-09 14:52 ` Chintan Pandya
2014-09-09 20:37 ` Hugh Dickins
2014-09-09 20:14 ` Hugh Dickins [this message]
2014-09-10 8:10 ` Peter Zijlstra
2014-09-11 12:27 ` Hugh Dickins
2014-09-10 8:27 ` Peter Zijlstra
2014-09-11 12:59 ` Hugh Dickins
2014-09-11 8:01 ` Chintan Pandya
2014-09-11 13:25 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1409091225310.8432@eggly.anvils \
--to=hughd@google.com \
--cc=akpm@linux-foundation.org \
--cc=cpandya@codeaurora.org \
--cc=fweisbec@gmail.com \
--cc=john.stultz@linaro.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox