From: Andrew Morton <akpm@linux-foundation.org>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>,
Andi Kleen <andi@firstfloor.org>, Rik van Riel <riel@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, Hugh Dickins <hugh@veritas.com>,
Ingo Molnar <mingo@elte.hu>
Subject: preemption and rwsems (was: Re: missing madvise functionality)
Date: Wed, 4 Apr 2007 16:00:06 -0700 [thread overview]
Message-ID: <20070404160006.8d81a533.akpm@linux-foundation.org> (raw)
In-Reply-To: <20070403202937.GE355@devserv.devel.redhat.com>
On Tue, 3 Apr 2007 16:29:37 -0400
Jakub Jelinek <jakub@redhat.com> wrote:
> #include <pthread.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> void *
> tf (void *arg)
> {
> (void) arg;
> size_t ps = sysconf (_SC_PAGE_SIZE);
> void *p = mmap (NULL, 128 * ps, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> if (p == MAP_FAILED)
> exit (1);
> int i;
> for (i = 0; i < 100000; i++)
> {
> /* Pretend to use the buffer. */
> char *q, *r = (char *) p + 128 * ps;
> size_t s;
> for (q = (char *) p; q < r; q += ps)
> *q = 1;
> for (s = 0, q = (char *) p; q < r; q += ps)
> s += *q;
> /* Free it. Replace this mmap with
> madvise (p, 128 * ps, MADV_THROWAWAY) when implemented. */
> if (mmap (p, 128 * ps, PROT_NONE,
> MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != p)
> exit (2);
> /* And immediately malloc again. This would then be deleted. */
> if (mprotect (p, 128 * ps, PROT_READ | PROT_WRITE))
> exit (3);
> }
> return NULL;
> }
>
> int
> main (void)
> {
> pthread_t th[32];
> int i;
> for (i = 0; i < 32; i++)
> if (pthread_create (&th[i], NULL, tf, NULL))
> exit (4);
> for (i = 0; i < 32; i++)
> pthread_join (th[i], NULL);
> return 0;
> }
This little test app is fun.
I run it all on a single CPU under `taskset -c 0' on the 8-way and it still
causes 160,000 context switches per second and takes 9.5 seconds (after
s/100000/1000).
The kernel has
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
and when I switch that to
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
the context switch rate falls to zilch and total runtime falls to 6.4
seconds.
Presumably the same problem will occur with CONFIG_PREEMPT_VOLUNTARY on
uniprocessor kernels.
<thinks>
What we effectively have is 32 threads on a single CPU all doing
for (ever) {
down_write()
up_write()
down_read()
up_read();
}
and rwsems are "fair". So
thread A thread B
down_write();
cond_resched()
->schedule()
down_read() -> blocks
up_write()
down_read()
up_read()
down_write() -> there's a reader: block
down_read() -> succeeds
up_read()
down_write() -> there's another down_writer: block
down_write() -> succeeds
up_write()
down_read() -> there's a down_writer: block
down_write() succeeds
up_write()
down_read() -> succeeds
up_read()
down_write() -> there's a down_reader: block
down_read() succeeds
ad nauseum.
If that cond_resched() was not there, none of this would ever happen - each
thread merrily chugs away doing its ups and downs until it expires its
timeslice. Interesting, in a sad sort of way.
Setting CONFIG_PREEMPT_NONE doesn't appear to make any difference to
context switch rate or runtime when all eight CPUs are used, so this
phenomenon is unlikely to be involved in the mysql problem.
I wonder why a similar thing doesn't happen when more than one CPU is used.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-04-04 23:00 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <46128051.9000609@redhat.com>
[not found] ` <p73648dz5oa.fsf@bingen.suse.de>
[not found] ` <46128CC2.9090809@redhat.com>
[not found] ` <20070403172841.GB23689@one.firstfloor.org>
2007-04-03 19:59 ` missing madvise functionality Andrew Morton
2007-04-03 20:09 ` Andi Kleen
2007-04-03 20:17 ` Ulrich Drepper
2007-04-03 20:29 ` Jakub Jelinek
2007-04-03 20:38 ` Rik van Riel
2007-04-03 21:49 ` Andrew Morton
2007-04-03 23:01 ` Eric Dumazet
2007-04-04 2:22 ` Nick Piggin
2007-04-04 5:41 ` Eric Dumazet
2007-04-04 6:09 ` [patches] threaded vma patches (was Re: missing madvise functionality) Nick Piggin
2007-04-04 6:26 ` Andrew Morton
2007-04-04 6:38 ` Nick Piggin
2007-04-04 6:42 ` Ulrich Drepper
2007-04-04 6:44 ` Nick Piggin
2007-04-04 6:50 ` Eric Dumazet
2007-04-04 6:54 ` Ulrich Drepper
2007-04-04 7:33 ` Eric Dumazet
2007-04-04 8:25 ` missing madvise functionality Peter Zijlstra
2007-04-04 8:55 ` Nick Piggin
2007-04-04 9:12 ` William Lee Irwin III
2007-04-04 9:23 ` Nick Piggin
2007-04-04 9:34 ` Eric Dumazet
2007-04-04 9:45 ` Nick Piggin
2007-04-04 10:05 ` Nick Piggin
2007-04-04 11:54 ` Eric Dumazet
2007-04-05 2:01 ` Nick Piggin
2007-04-05 6:09 ` Eric Dumazet
2007-04-05 6:19 ` Ulrich Drepper
2007-04-05 6:54 ` Eric Dumazet
2007-04-03 23:02 ` Andrew Morton
2007-04-04 9:15 ` Hugh Dickins
2007-04-04 14:55 ` Rik van Riel
2007-04-04 15:25 ` Hugh Dickins
2007-04-05 1:44 ` Nick Piggin
2007-04-04 18:04 ` Andrew Morton
2007-04-04 18:08 ` Rik van Riel
2007-04-04 20:56 ` Andrew Morton
2007-04-04 18:39 ` Hugh Dickins
2007-04-03 23:44 ` Andrew Morton
2007-04-04 13:09 ` William Lee Irwin III
2007-04-04 13:38 ` William Lee Irwin III
2007-04-04 18:51 ` Andrew Morton
2007-04-05 4:14 ` William Lee Irwin III
2007-04-04 23:00 ` Andrew Morton [this message]
2007-04-05 7:31 ` Rik van Riel
2007-04-05 7:39 ` Rik van Riel
2007-04-05 8:32 ` Andrew Morton
2007-04-05 15:47 ` Rik van Riel
2007-04-05 8:08 ` Eric Dumazet
2007-04-05 8:31 ` Rik van Riel
2007-04-05 9:06 ` Eric Dumazet
2007-04-05 9:45 ` Jakub Jelinek
2007-04-05 16:15 ` Rik van Riel
2007-04-05 16:10 ` Ulrich Drepper
2007-04-06 2:28 ` Nick Piggin
2007-04-06 2:52 ` Ulrich Drepper
2007-04-06 2:59 ` Nick Piggin
2007-04-05 12:48 ` preemption and rwsems (was: Re: missing madvise functionality) David Howells
2007-04-05 19:11 ` Ingo Molnar
2007-04-05 20:37 ` Andrew Morton
2007-04-06 9:08 ` Ingo Molnar
2007-04-06 19:30 ` Andrew Morton
2007-04-06 19:40 ` Ingo Molnar
2007-04-05 19:27 ` Andrew Morton
2007-04-03 20:51 ` missing madvise functionality Andrew Morton
2007-04-03 20:57 ` Ulrich Drepper
2007-04-03 21:00 ` Rik van Riel
2007-04-03 21:10 ` Eric Dumazet
2007-04-03 21:12 ` Jörn Engel
2007-04-03 21:15 ` Rik van Riel
2007-04-03 21:30 ` Eric Dumazet
2007-04-03 21:22 ` Jeremy Fitzhardinge
2007-04-03 21:29 ` Rik van Riel
2007-04-03 21:46 ` Ulrich Drepper
2007-04-03 22:51 ` Andi Kleen
2007-04-03 23:07 ` Ulrich Drepper
2007-04-03 21:16 ` Andrew Morton
2007-04-04 18:49 ` Anton Blanchard
2007-04-04 7:46 ` Nick Piggin
2007-04-04 8:04 ` Nick Piggin
2007-04-04 8:20 ` Jakub Jelinek
2007-04-04 8:47 ` Nick Piggin
2007-04-05 4:23 ` Nick Piggin
2007-04-05 18:38 ` Rik van Riel
2007-04-05 21:07 ` Andrew Morton
2007-04-05 21:39 ` Rik van Riel
2007-04-06 1:28 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070404160006.8d81a533.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=drepper@redhat.com \
--cc=hugh@veritas.com \
--cc=jakub@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox