linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jakub Jelinek <jakub@redhat.com>,
	Ulrich Drepper <drepper@redhat.com>,
	Andi Kleen <andi@firstfloor.org>, Rik van Riel <riel@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Hugh Dickins <hugh@veritas.com>
Subject: Re: missing madvise functionality
Date: Wed, 04 Apr 2007 12:22:00 +1000	[thread overview]
Message-ID: <46130BC8.9050905@yahoo.com.au> (raw)
In-Reply-To: <4612DCC6.7000504@cosmosbay.com>

Eric Dumazet wrote:
> Andrew Morton a ecrit :
> 
>> On Tue, 3 Apr 2007 16:29:37 -0400
>> Jakub Jelinek <jakub@redhat.com> wrote:
>>
>>> On Tue, Apr 03, 2007 at 01:17:09PM -0700, Ulrich Drepper wrote:
>>>
>>>> Andrew Morton wrote:
>>>>
>>>>> Ulrich, could you suggest a little test app which would demonstrate 
>>>>> this
>>>>> behaviour?
>>>>
>>>> It's not really reliably possible to demonstrate this with a small
>>>> program using malloc.  You'd need something like this mysql test case
>>>> which Rik said is not hard to run by yourself.
>>>>
>>>> If somebody adds a kernel interface I can easily produce a glibc patch
>>>> so that the test can be run in the new environment.
>>>>
>>>> But it's of course easy enough to simulate the specific problem in a
>>>> micro benchmark.  If you want that let me know.
>>>
>>> I think something like following testcase which simulates what free
>>> and malloc do when trimming/growing a non-main arena.
>>>
>>> My guess is that all the page zeroing is pretty expensive as well and
>>> takes significant time, but I haven't profiled it.
>>>
>>> #include <pthread.h>
>>> #include <stdlib.h>
>>> #include <sys/mman.h>
>>> #include <unistd.h>
>>>
>>> void *
>>> tf (void *arg)
>>> {
>>>   (void) arg;
>>>   size_t ps = sysconf (_SC_PAGE_SIZE);
>>>   void *p = mmap (NULL, 128 * ps, PROT_READ | PROT_WRITE,
>>>                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>>   if (p == MAP_FAILED)
>>>     exit (1);
>>>   int i;
>>>   for (i = 0; i < 100000; i++)
>>>     {
>>>       /* Pretend to use the buffer.  */
>>>       char *q, *r = (char *) p + 128 * ps;
>>>       size_t s;
>>>       for (q = (char *) p; q < r; q += ps)
>>>         *q = 1;
>>>       for (s = 0, q = (char *) p; q < r; q += ps)
>>>         s += *q;
>>>       /* Free it.  Replace this mmap with
>>>          madvise (p, 128 * ps, MADV_THROWAWAY) when implemented.  */
>>>       if (mmap (p, 128 * ps, PROT_NONE,
>>>                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != p)
>>>         exit (2);
>>>       /* And immediately malloc again.  This would then be deleted.  */
>>>       if (mprotect (p, 128 * ps, PROT_READ | PROT_WRITE))
>>>         exit (3);
>>>     }
>>>   return NULL;
>>> }
>>>
>>> int
>>> main (void)
>>> {
>>>   pthread_t th[32];
>>>   int i;
>>>   for (i = 0; i < 32; i++)
>>>     if (pthread_create (&th[i], NULL, tf, NULL))
>>>       exit (4);
>>>   for (i = 0; i < 32; i++)
>>>     pthread_join (th[i], NULL);
>>>   return 0;
>>> }
>>>
>>
>> whee.  135,000 context switches/sec on a slow 2-way.  mmap_sem, most
>> likely.  That is ungood.
>>
>> Did anyone monitor the context switch rate with the mysql test?
>>
>> Interestingly, your test app (with s/100000/1000) runs to completion 
>> in 13
>> seocnd on the slow 2-way.  On a fast 8-way, it took 52 seconds and
>> sustained 40,000 context switches/sec.  That's a bit unexpected.
>>
>> Both machines show ~8% idle time, too :(
> 
> 
> Yes... then add to this some futex work, and you get the picture.
> 
> I do think such workloads might benefit from a vma_cache not shared by 
> all threads but private to each thread. A sequence could invalidate the 
> cache(s).
> 
> ie instead of a mm->mmap_cache, having a mm->sequence, and each thread 
> having a current->mmap_cache and current->mm_sequence

I have a patchset to do exactly this, btw.

Anyway what is the status of the private futex work. I don't think that
is very intrusive or complicated, so it should get merged ASAP (so then
at least we have the interface there).

-- 
SUSE Labs, Novell Inc.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-04-04  2:22 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <46128051.9000609@redhat.com>
     [not found] ` <p73648dz5oa.fsf@bingen.suse.de>
     [not found]   ` <46128CC2.9090809@redhat.com>
     [not found]     ` <20070403172841.GB23689@one.firstfloor.org>
2007-04-03 19:59       ` Andrew Morton
2007-04-03 20:09         ` Andi Kleen
2007-04-03 20:17         ` Ulrich Drepper
2007-04-03 20:29           ` Jakub Jelinek
2007-04-03 20:38             ` Rik van Riel
2007-04-03 21:49             ` Andrew Morton
2007-04-03 23:01               ` Eric Dumazet
2007-04-04  2:22                 ` Nick Piggin [this message]
2007-04-04  5:41                   ` Eric Dumazet
2007-04-04  6:09                     ` [patches] threaded vma patches (was Re: missing madvise functionality) Nick Piggin
2007-04-04  6:26                       ` Andrew Morton
2007-04-04  6:38                         ` Nick Piggin
2007-04-04  6:42                       ` Ulrich Drepper
2007-04-04  6:44                         ` Nick Piggin
2007-04-04  6:50                         ` Eric Dumazet
2007-04-04  6:54                           ` Ulrich Drepper
2007-04-04  7:33                             ` Eric Dumazet
2007-04-04  8:25                   ` missing madvise functionality Peter Zijlstra
2007-04-04  8:55                     ` Nick Piggin
2007-04-04  9:12                       ` William Lee Irwin III
2007-04-04  9:23                         ` Nick Piggin
2007-04-04  9:34                       ` Eric Dumazet
2007-04-04  9:45                         ` Nick Piggin
2007-04-04 10:05                         ` Nick Piggin
2007-04-04 11:54                           ` Eric Dumazet
2007-04-05  2:01                             ` Nick Piggin
2007-04-05  6:09                               ` Eric Dumazet
2007-04-05  6:19                                 ` Ulrich Drepper
2007-04-05  6:54                                   ` Eric Dumazet
2007-04-03 23:02               ` Andrew Morton
2007-04-04  9:15                 ` Hugh Dickins
2007-04-04 14:55                   ` Rik van Riel
2007-04-04 15:25                     ` Hugh Dickins
2007-04-05  1:44                       ` Nick Piggin
2007-04-04 18:04                   ` Andrew Morton
2007-04-04 18:08                     ` Rik van Riel
2007-04-04 20:56                       ` Andrew Morton
2007-04-04 18:39                     ` Hugh Dickins
2007-04-03 23:44               ` Andrew Morton
2007-04-04 13:09             ` William Lee Irwin III
2007-04-04 13:38               ` William Lee Irwin III
2007-04-04 18:51               ` Andrew Morton
2007-04-05  4:14                 ` William Lee Irwin III
2007-04-04 23:00             ` preemption and rwsems (was: Re: missing madvise functionality) Andrew Morton
2007-04-05  7:31             ` missing madvise functionality Rik van Riel
2007-04-05  7:39               ` Rik van Riel
2007-04-05  8:32                 ` Andrew Morton
2007-04-05 15:47                   ` Rik van Riel
2007-04-05  8:08               ` Eric Dumazet
2007-04-05  8:31                 ` Rik van Riel
2007-04-05  9:06                   ` Eric Dumazet
2007-04-05  9:45               ` Jakub Jelinek
2007-04-05 16:15                 ` Rik van Riel
2007-04-05 16:10               ` Ulrich Drepper
2007-04-06  2:28                 ` Nick Piggin
2007-04-06  2:52                   ` Ulrich Drepper
2007-04-06  2:59                     ` Nick Piggin
2007-04-05 12:48             ` preemption and rwsems (was: Re: missing madvise functionality) David Howells
2007-04-05 19:11               ` Ingo Molnar
2007-04-05 20:37                 ` Andrew Morton
2007-04-06  9:08                   ` Ingo Molnar
2007-04-06 19:30                     ` Andrew Morton
2007-04-06 19:40                       ` Ingo Molnar
2007-04-05 19:27               ` Andrew Morton
2007-04-03 20:51           ` missing madvise functionality Andrew Morton
2007-04-03 20:57             ` Ulrich Drepper
2007-04-03 21:00             ` Rik van Riel
2007-04-03 21:10               ` Eric Dumazet
2007-04-03 21:12                 ` Jörn Engel
2007-04-03 21:15                 ` Rik van Riel
2007-04-03 21:30                   ` Eric Dumazet
2007-04-03 21:22                 ` Jeremy Fitzhardinge
2007-04-03 21:29                   ` Rik van Riel
2007-04-03 21:46                 ` Ulrich Drepper
2007-04-03 22:51                   ` Andi Kleen
2007-04-03 23:07                     ` Ulrich Drepper
2007-04-03 21:16               ` Andrew Morton
2007-04-04 18:49             ` Anton Blanchard
2007-04-04  7:46 ` Nick Piggin
2007-04-04  8:04   ` Nick Piggin
2007-04-04  8:20   ` Jakub Jelinek
2007-04-04  8:47     ` Nick Piggin
2007-04-05  4:23       ` Nick Piggin
2007-04-05 18:38   ` Rik van Riel
2007-04-05 21:07     ` Andrew Morton
2007-04-05 21:39       ` Rik van Riel
2007-04-06  1:28     ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46130BC8.9050905@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=dada1@cosmosbay.com \
    --cc=drepper@redhat.com \
    --cc=hugh@veritas.com \
    --cc=jakub@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox