From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ulrich Drepper <drepper@redhat.com>,
Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Jakub Jelinek <jakub@redhat.com>,
Linux Memory Management <linux-mm@kvack.org>
Subject: Re: missing madvise functionality
Date: Wed, 04 Apr 2007 18:04:30 +1000 [thread overview]
Message-ID: <46135C0E.5070803@yahoo.com.au> (raw)
In-Reply-To: <461357C4.4010403@yahoo.com.au>
Nick Piggin wrote:
> Ulrich Drepper wrote:
>
>> People might remember the thread about mysql not scaling and pointing
>> the finger quite happily at glibc. Well, the situation is not like that.
>>
>> The problem is glibc has to work around kernel limitations. If the
>> malloc implementation detects that a large chunk of previously allocated
>> memory is now free and unused it wants to return the memory to the
>> system. What we currently have to do is this:
>>
>> to free: mmap(PROT_NONE) over the area
>> to reuse: mprotect(PROT_READ|PROT_WRITE)
>>
>> Yep, that's expensive, both operations need to get locks preventing
>> other threads from doing the same.
>>
>> Some people were quick to suggest that we simply avoid the freeing in
>> many situations (that's what the patch submitted by Yanmin Zhang
>> basically does). That's no solution. One of the very good properties
>> of the current allocator is that it does not use much memory.
>
>
> Does mmap(PROT_NONE) actually free the memory?
>
>
>> A solution for this problem is a madvise() operation with the following
>> property:
>>
>> - the content of the address range can be discarded
>>
>> - if an access to a page in the range happens in the future it must
>> succeed. The old page content can be provided or a new, empty page
>> can be provided
>>
>> That's it. The current MADV_DONTNEED doesn't cut it because it zaps the
>> pages, causing *all* future reuses to create page faults. This is what
>> I guess happens in the mysql test case where the pages where unused and
>> freed but then almost immediately reused. The page faults erased all
>> the benefits of using one mprotect() call vs a pair of mmap()/mprotect()
>> calls.
>
>
> Two questions.
>
> In the case of pages being unused then almost immediately reused, why is
> it a bad solution to avoid freeing? Is it that you want to avoid
> heuristics because in some cases they could fail and end up using memory?
>
> Secondly, why is MADV_DONTNEED bad? How much more expensive is a pagefault
> than a syscall? (including the cost of the TLB fill for the memory access
> after the syscall, of course).
>
> zapping the pages puts them on a nice LIFO cache hot list of pages that
> can be quickly used when the next fault comes in, or used for any other
> allocation in the kernel. Putting them on some sort of reclaim list seems
> a bit pointless.
>
> Oh, also: something like this patch would help out MADV_DONTNEED, as it
> means it can run concurrently with page faults. I think the locking will
> work (but needs forward porting).
BTW. and this way it becomes much more attractive than using mmap/mprotect
can ever be, because they must take mmap_sem for writing always.
You don't actually need to protect the ranges unless running with use after
free debugging turned on, do you?
--
SUSE Labs, Novell Inc.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-04-04 8:04 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <46128051.9000609@redhat.com>
[not found] ` <p73648dz5oa.fsf@bingen.suse.de>
[not found] ` <46128CC2.9090809@redhat.com>
[not found] ` <20070403172841.GB23689@one.firstfloor.org>
2007-04-03 19:59 ` Andrew Morton
2007-04-03 20:09 ` Andi Kleen
2007-04-03 20:17 ` Ulrich Drepper
2007-04-03 20:29 ` Jakub Jelinek
2007-04-03 20:38 ` Rik van Riel
2007-04-03 21:49 ` Andrew Morton
2007-04-03 23:01 ` Eric Dumazet
2007-04-04 2:22 ` Nick Piggin
2007-04-04 5:41 ` Eric Dumazet
2007-04-04 6:09 ` [patches] threaded vma patches (was Re: missing madvise functionality) Nick Piggin
2007-04-04 6:26 ` Andrew Morton
2007-04-04 6:38 ` Nick Piggin
2007-04-04 6:42 ` Ulrich Drepper
2007-04-04 6:44 ` Nick Piggin
2007-04-04 6:50 ` Eric Dumazet
2007-04-04 6:54 ` Ulrich Drepper
2007-04-04 7:33 ` Eric Dumazet
2007-04-04 8:25 ` missing madvise functionality Peter Zijlstra
2007-04-04 8:55 ` Nick Piggin
2007-04-04 9:12 ` William Lee Irwin III
2007-04-04 9:23 ` Nick Piggin
2007-04-04 9:34 ` Eric Dumazet
2007-04-04 9:45 ` Nick Piggin
2007-04-04 10:05 ` Nick Piggin
2007-04-04 11:54 ` Eric Dumazet
2007-04-05 2:01 ` Nick Piggin
2007-04-05 6:09 ` Eric Dumazet
2007-04-05 6:19 ` Ulrich Drepper
2007-04-05 6:54 ` Eric Dumazet
2007-04-03 23:02 ` Andrew Morton
2007-04-04 9:15 ` Hugh Dickins
2007-04-04 14:55 ` Rik van Riel
2007-04-04 15:25 ` Hugh Dickins
2007-04-05 1:44 ` Nick Piggin
2007-04-04 18:04 ` Andrew Morton
2007-04-04 18:08 ` Rik van Riel
2007-04-04 20:56 ` Andrew Morton
2007-04-04 18:39 ` Hugh Dickins
2007-04-03 23:44 ` Andrew Morton
2007-04-04 13:09 ` William Lee Irwin III
2007-04-04 13:38 ` William Lee Irwin III
2007-04-04 18:51 ` Andrew Morton
2007-04-05 4:14 ` William Lee Irwin III
2007-04-04 23:00 ` preemption and rwsems (was: Re: missing madvise functionality) Andrew Morton
2007-04-05 7:31 ` missing madvise functionality Rik van Riel
2007-04-05 7:39 ` Rik van Riel
2007-04-05 8:32 ` Andrew Morton
2007-04-05 15:47 ` Rik van Riel
2007-04-05 8:08 ` Eric Dumazet
2007-04-05 8:31 ` Rik van Riel
2007-04-05 9:06 ` Eric Dumazet
2007-04-05 9:45 ` Jakub Jelinek
2007-04-05 16:15 ` Rik van Riel
2007-04-05 16:10 ` Ulrich Drepper
2007-04-06 2:28 ` Nick Piggin
2007-04-06 2:52 ` Ulrich Drepper
2007-04-06 2:59 ` Nick Piggin
2007-04-05 12:48 ` preemption and rwsems (was: Re: missing madvise functionality) David Howells
2007-04-05 19:11 ` Ingo Molnar
2007-04-05 20:37 ` Andrew Morton
2007-04-06 9:08 ` Ingo Molnar
2007-04-06 19:30 ` Andrew Morton
2007-04-06 19:40 ` Ingo Molnar
2007-04-05 19:27 ` Andrew Morton
2007-04-03 20:51 ` missing madvise functionality Andrew Morton
2007-04-03 20:57 ` Ulrich Drepper
2007-04-03 21:00 ` Rik van Riel
2007-04-03 21:10 ` Eric Dumazet
2007-04-03 21:12 ` Jörn Engel
2007-04-03 21:15 ` Rik van Riel
2007-04-03 21:30 ` Eric Dumazet
2007-04-03 21:22 ` Jeremy Fitzhardinge
2007-04-03 21:29 ` Rik van Riel
2007-04-03 21:46 ` Ulrich Drepper
2007-04-03 22:51 ` Andi Kleen
2007-04-03 23:07 ` Ulrich Drepper
2007-04-03 21:16 ` Andrew Morton
2007-04-04 18:49 ` Anton Blanchard
2007-04-04 7:46 ` Nick Piggin
2007-04-04 8:04 ` Nick Piggin [this message]
2007-04-04 8:20 ` Jakub Jelinek
2007-04-04 8:47 ` Nick Piggin
2007-04-05 4:23 ` Nick Piggin
2007-04-05 18:38 ` Rik van Riel
2007-04-05 21:07 ` Andrew Morton
2007-04-05 21:39 ` Rik van Riel
2007-04-06 1:28 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46135C0E.5070803@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@linux-foundation.org \
--cc=drepper@redhat.com \
--cc=jakub@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox