From: Mike Galbraith <efault@gmx.de>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>,
	lwoodman@redhat.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	minchan.kim@gmail.com
Subject: Re: [PATCH 4/8] Use prepare_to_wait_exclusive() instead prepare_to_wait()
Date: Tue, 15 Dec 2009 09:28:59 +0100
Message-ID: <1260865739.30062.16.camel@marge.simson.net>
In-Reply-To: <1260855146.6126.30.camel@marge.simson.net>

On Tue, 2009-12-15 at 06:32 +0100, Mike Galbraith wrote:
> On Tue, 2009-12-15 at 09:45 +0900, KOSAKI Motohiro wrote:
> > > On 12/14/2009 07:30 AM, KOSAKI Motohiro wrote:
> > > > if we don't use exclusive queue, wake_up() function wake _all_ waited
> > > > task. This is simply cpu wasting.
> > > >
> > > > Signed-off-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
> > > 
> > > >   		if (zone_watermark_ok(zone, sc->order, low_wmark_pages(zone),
> > > >   					0, 0)) {
> > > > -			wake_up(wq);
> > > > +			wake_up_all(wq);
> > > >   			finish_wait(wq,&wait);
> > > >   			sc->nr_reclaimed += sc->nr_to_reclaim;
> > > >   			return -ERESTARTSYS;
> > > 
> > > I believe we want to wake the processes up one at a time
> > > here.  If the queue of waiting processes is very large
> > > and the amount of excess free memory is fairly low, the
> > > first processes that wake up can take the amount of free
> > > memory back down below the threshold.  The rest of the
> > > waiters should stay asleep when this happens.
> > 
> > OK.
> > 
> > Actually, wake_up() and wake_up_all() aren't different so much.
> > Although we use wake_up(), the task wake up next task before
> > try to alloate memory. then, it's similar to wake_up_all().
> 
> What happens to waiters should running tasks not allocate for a while?
> 
> > However, there are few difference. recent scheduler latency improvement
> > effort reduce default scheduler latency target. it mean, if we have
> > lots tasks of running state, the task have very few time slice. too
> > frequently context switch decrease VM efficiency.
> > Thank you, Rik. I didn't notice wake_up() makes better performance than
> > wake_up_all() on current kernel.
> 
> Perhaps this is a spot where an explicit wake_up_all_nopreempt() would
> be handy....

Maybe something like the patch below.  I can also imagine that under
_heavy_ VM pressure it would be good for throughput to skip sleeper
fairness for these wakeups as well, since sleeper fairness increases
vruntime spread and thus increases preemption, with no benefit in sight.

---
 include/linux/sched.h |    1 +
 include/linux/wait.h  |    3 +++
 kernel/sched.c        |   21 +++++++++++++++++++++
 kernel/sched_fair.c   |    2 +-
 4 files changed, 26 insertions(+), 1 deletion(-)

Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -1065,6 +1065,7 @@ struct sched_domain;
  */
 #define WF_SYNC		0x01		/* waker goes to sleep after wakup */
 #define WF_FORK		0x02		/* child wakeup after fork */
+#define WF_NOPREEMPT	0x04		/* wakeup is not preemptive */
 
 struct sched_class {
 	const struct sched_class *next;
Index: linux-2.6/include/linux/wait.h
===================================================================
--- linux-2.6.orig/include/linux/wait.h
+++ linux-2.6/include/linux/wait.h
@@ -140,6 +140,7 @@ static inline void __remove_wait_queue(w
 }
 
 void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
+void __wake_up_nopreempt(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
 void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key);
 void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode, int nr,
 			void *key);
@@ -154,8 +155,10 @@ int out_of_line_wait_on_bit_lock(void *,
 wait_queue_head_t *bit_waitqueue(void *, int);
 
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
+#define wake_up_nopreempt(x)		__wake_up_nopreempt(x, TASK_NORMAL, 1, NULL)
 #define wake_up_nr(x, nr)		__wake_up(x, TASK_NORMAL, nr, NULL)
 #define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
+#define wake_up_all_nopreempt(x)	__wake_up_nopreempt(x, TASK_NORMAL, 0, NULL)
 #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL)
 
 #define wake_up_interruptible(x)	__wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -5682,6 +5682,27 @@ void __wake_up(wait_queue_head_t *q, uns
 }
 EXPORT_SYMBOL(__wake_up);
 
+/**
+ * __wake_up_nopreempt - wake up threads on a waitqueue with WF_NOPREEMPT set.
+ * @q: the waitqueue
+ * @mode: which threads
+ * @nr_exclusive: how many wake-one or wake-many threads to wake up
+ * @key: is directly passed to the wakeup function
+ *
+ * It may be assumed that this function implies a write memory barrier before
+ * changing the task state if and only if any tasks are woken up.
+ */
+void __wake_up_nopreempt(wait_queue_head_t *q, unsigned int mode,
+			int nr_exclusive, void *key)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->lock, flags);
+	__wake_up_common(q, mode, nr_exclusive, WF_NOPREEMPT, key);
+	spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(__wake_up_nopreempt);
+
 /*
  * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
  */
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -1709,7 +1709,7 @@ static void check_preempt_wakeup(struct
 			pse->avg_overlap < sysctl_sched_migration_cost)
 		goto preempt;
 
-	if (!sched_feat(WAKEUP_PREEMPT))
+	if (!sched_feat(WAKEUP_PREEMPT) || (wake_flags & WF_NOPREEMPT))
 		return;
 
 	update_curr(cfs_rq);

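For anyone who wants to poke at the intended semantics outside the kernel,
here is a minimal userspace sketch of what the patch changes.  It is a
model under stated assumptions, not kernel code: struct model_waiter,
model_wake_up_common() and model_reaches_later_checks() are hypothetical
stand-ins, the WQ_FLAG_EXCLUSIVE value is assumed, and every waiter's wake
function is assumed to succeed.  Only the WF_* values and the shape of the
new test in check_preempt_wakeup() are taken from the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Wake flag values as defined in the patch. */
#define WF_SYNC		0x01
#define WF_FORK		0x02
#define WF_NOPREEMPT	0x04

/* Assumed value, for the model only. */
#define WQ_FLAG_EXCLUSIVE	0x01

/* Hypothetical stand-in for a wait queue entry. */
struct model_waiter {
	unsigned int flags;	/* WQ_FLAG_EXCLUSIVE or 0 */
	bool woken;		/* set once the waiter has been woken */
	int seen_wake_flags;	/* wake_flags handed to this waiter */
};

/*
 * Models the wake loop: every waiter visited is woken and sees
 * wake_flags.  The walk stops once nr_exclusive exclusive waiters
 * have been woken.  With nr_exclusive == 0 the counter never hits
 * zero, so the whole queue is woken (the wake_up_all*() case).
 * Returns the number of waiters woken by this call.
 */
static int model_wake_up_common(struct model_waiter *q, size_t n,
				int nr_exclusive, int wake_flags)
{
	int woken = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		q[i].woken = true;
		q[i].seen_wake_flags = wake_flags;
		woken++;
		if ((q[i].flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
	return woken;
}

/*
 * Models only the test touched in check_preempt_wakeup().  Returning
 * true means the function would go on to its later checks, not that
 * preemption happens.  The "goto preempt" paths that sit before this
 * test in the real function are not modelled and are not affected by
 * WF_NOPREEMPT.
 */
static bool model_reaches_later_checks(bool wakeup_preempt_feature,
				       int wake_flags)
{
	if (!wakeup_preempt_feature || (wake_flags & WF_NOPREEMPT))
		return false;
	return true;
}
```

In this model, wake_up_nopreempt() corresponds to nr_exclusive == 1 and
wake_up_all_nopreempt() to nr_exclusive == 0, both with WF_NOPREEMPT
passed through to each woken waiter.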
