From: Eric Dumazet <dada1@cosmosbay.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jakub Jelinek <jakub@redhat.com>,
Ulrich Drepper <drepper@redhat.com>,
Andi Kleen <andi@firstfloor.org>, Rik van Riel <riel@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, Hugh Dickins <hugh@veritas.com>
Subject: Re: missing madvise functionality
Date: Wed, 04 Apr 2007 01:01:26 +0200
Message-ID: <4612DCC6.7000504@cosmosbay.com>
In-Reply-To: <20070403144948.fe8eede6.akpm@linux-foundation.org>
Andrew Morton wrote:
> On Tue, 3 Apr 2007 16:29:37 -0400
> Jakub Jelinek <jakub@redhat.com> wrote:
>
>> On Tue, Apr 03, 2007 at 01:17:09PM -0700, Ulrich Drepper wrote:
>>> Andrew Morton wrote:
>>>> Ulrich, could you suggest a little test app which would demonstrate this
>>>> behaviour?
>>> It's not really reliably possible to demonstrate this with a small
>>> program using malloc. You'd need something like this mysql test case
>>> which Rik said is not hard to run by yourself.
>>>
>>> If somebody adds a kernel interface I can easily produce a glibc patch
>>> so that the test can be run in the new environment.
>>>
>>> But it's of course easy enough to simulate the specific problem in a
>>> micro benchmark. If you want that let me know.
>> I think something like following testcase which simulates what free
>> and malloc do when trimming/growing a non-main arena.
>>
>> My guess is that all the page zeroing is pretty expensive as well and
>> takes significant time, but I haven't profiled it.
>>
>> #include <pthread.h>
>> #include <stdlib.h>
>> #include <sys/mman.h>
>> #include <unistd.h>
>>
>> void *
>> tf (void *arg)
>> {
>>   (void) arg;
>>   size_t ps = sysconf (_SC_PAGE_SIZE);
>>   void *p = mmap (NULL, 128 * ps, PROT_READ | PROT_WRITE,
>>                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>   if (p == MAP_FAILED)
>>     exit (1);
>>   int i;
>>   for (i = 0; i < 100000; i++)
>>     {
>>       /* Pretend to use the buffer. */
>>       char *q, *r = (char *) p + 128 * ps;
>>       size_t s;
>>       for (q = (char *) p; q < r; q += ps)
>>         *q = 1;
>>       for (s = 0, q = (char *) p; q < r; q += ps)
>>         s += *q;
>>       /* Free it. Replace this mmap with
>>          madvise (p, 128 * ps, MADV_THROWAWAY) when implemented. */
>>       if (mmap (p, 128 * ps, PROT_NONE,
>>                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != p)
>>         exit (2);
>>       /* And immediately malloc again. This would then be deleted. */
>>       if (mprotect (p, 128 * ps, PROT_READ | PROT_WRITE))
>>         exit (3);
>>     }
>>   return NULL;
>> }
>>
>> int
>> main (void)
>> {
>>   pthread_t th[32];
>>   int i;
>>   for (i = 0; i < 32; i++)
>>     if (pthread_create (&th[i], NULL, tf, NULL))
>>       exit (4);
>>   for (i = 0; i < 32; i++)
>>     pthread_join (th[i], NULL);
>>   return 0;
>> }
>>
>
> whee. 135,000 context switches/sec on a slow 2-way. mmap_sem, most
> likely. That is ungood.
>
> Did anyone monitor the context switch rate with the mysql test?
>
> Interestingly, your test app (with s/100000/1000) runs to completion in 13
> seconds on the slow 2-way. On a fast 8-way, it took 52 seconds and
> sustained 40,000 context switches/sec. That's a bit unexpected.
>
> Both machines show ~8% idle time, too :(

Yes... then add some futex work to this, and you get the picture.

I do think such workloads might benefit from a vma_cache that is not shared
by all threads but private to each thread. A sequence number could invalidate
the cache(s): i.e. instead of a single mm->mmap_cache, have an mm->sequence,
with each thread keeping its own current->mmap_cache and current->mm_sequence.
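
To make the idea concrete, here is a rough user-space model of it. This is
only a sketch, not kernel code: struct vma, struct mm and struct thread are
simplified stand-ins, and the helper names (mm_invalidate_caches, mm_add_vma,
find_vma_slow, find_vma_cached) and the slow_lookups counter are made up for
illustration. The lookup matches on containment, which is simpler than what
the kernel's find_vma does, and the model is single-threaded, so it ignores
the locking and memory ordering a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's real structures. */
struct vma {
    unsigned long start;            /* inclusive */
    unsigned long end;              /* exclusive */
};

#define MAX_VMAS 16

struct mm {
    struct vma vmas[MAX_VMAS];
    size_t nr_vmas;
    unsigned long sequence;         /* bumped on every change to the vma set */
    unsigned long slow_lookups;     /* instrumentation for this sketch only */
};

struct thread {
    struct mm *mm;
    struct vma *mmap_cache;         /* private to this thread */
    unsigned long mm_sequence;      /* mm->sequence when the cache was filled */
};

/* Any operation that changes the vma set invalidates every thread's cache
 * at once, without touching the threads themselves. */
static void mm_invalidate_caches(struct mm *mm)
{
    mm->sequence++;
}

static int mm_add_vma(struct mm *mm, unsigned long start, unsigned long end)
{
    if (mm->nr_vmas == MAX_VMAS || start >= end)
        return -1;
    mm->vmas[mm->nr_vmas].start = start;
    mm->vmas[mm->nr_vmas].end = end;
    mm->nr_vmas++;
    mm_invalidate_caches(mm);
    return 0;
}

/* Full lookup: a linear scan here, standing in for the real tree walk. */
static struct vma *find_vma_slow(struct mm *mm, unsigned long addr)
{
    mm->slow_lookups++;
    for (size_t i = 0; i < mm->nr_vmas; i++)
        if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
            return &mm->vmas[i];
    return NULL;
}

/* Use the thread-private cache only if the sequence still matches. */
static struct vma *find_vma_cached(struct thread *t, unsigned long addr)
{
    struct vma *v = t->mmap_cache;

    assert(t->mm != NULL);
    if (v && t->mm_sequence == t->mm->sequence &&
        addr >= v->start && addr < v->end)
        return v;                   /* hit: no shared cache line is written */

    v = find_vma_slow(t->mm, addr);
    t->mmap_cache = v;
    t->mm_sequence = t->mm->sequence;
    return v;
}
```

With two threads working in different areas, each one keeps hitting its own
cached entry instead of overwriting a single shared mm->mmap_cache, and any
change to the address space costs one increment of mm->sequence, after which
every thread falls back to the slow lookup once.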