From: Andrew Morton <akpm@linux-foundation.org>
To: Andi Kleen <andi@firstfloor.org>
Cc: Ulrich Drepper <drepper@redhat.com>,
Rik van Riel <riel@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Jakub Jelinek <jakub@redhat.com>,
linux-mm@kvack.org, Hugh Dickins <hugh@veritas.com>
Subject: Re: missing madvise functionality
Date: Tue, 3 Apr 2007 12:59:03 -0700
Message-ID: <20070403125903.3e8577f4.akpm@linux-foundation.org>
In-Reply-To: <20070403172841.GB23689@one.firstfloor.org>

On Tue, 3 Apr 2007 19:28:41 +0200
Andi Kleen <andi@firstfloor.org> wrote:

> On Tue, Apr 03, 2007 at 10:20:02AM -0700, Ulrich Drepper wrote:
> > Andi Kleen wrote:
> > > Why do you need a lock for that? I don't see any problem with
> > > two threads doing that in parallel. The kernel would
> > > serialize it internally and one would fail, but that shouldn't
> > > be a problem.
> >
> > There is no lock at all at userlevel. I'm talking about locks in the
> > kernel.
>
> mmap_sem? Your new operation wouldn't solve that either.

It might, a bit. Both mmap() and mprotect() currently take mmap_sem for
writing. If we're careful, we could probably arrange for MADV_ULRICH to
take it for reading, which will help a little bit, hopefully.

It's a little sad that mprotect() takes mmap_sem for writing, really. I think
the only reason for doing that is because we might do a vma_merge() as a
result. Perhaps this is on the wrong side of the speed/space tradeoff.

otoh, converting a down_write() to a down_read() may well not have much
effect.

Ulrich, could you suggest a little test app which would demonstrate this
behaviour?
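
Something along these lines might do as a starting point. It is only a
sketch and an assumption about the workload, not Ulrich's actual program:
each thread maps its own anonymous region and flips its protection with
mprotect(), so every call takes mmap_sem for writing on the shared mm.
The names time_mprotect_threads() and worker() are made up for
illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS */
#endif
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

struct worker_arg {
    int iters;                  /* mprotect() round trips per thread */
    int failed;                 /* set to 1 if any syscall fails */
};

/* Each thread hammers mprotect() on a private region of its own. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;
    size_t len = (size_t)sysconf(_SC_PAGESIZE) * 16;
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (buf == MAP_FAILED) {
        arg->failed = 1;
        return NULL;
    }
    for (int i = 0; i < arg->iters; i++) {
        if (mprotect(buf, len, PROT_NONE) != 0 ||
            mprotect(buf, len, PROT_READ | PROT_WRITE) != 0) {
            arg->failed = 1;
            break;
        }
        buf[0] = (char)i;       /* touch the page so it gets faulted in */
    }
    munmap(buf, len);
    return NULL;
}

/* Returns elapsed wall-clock seconds, or -1.0 on any failure. */
double time_mprotect_threads(int nthreads, int iters)
{
    struct timespec t0, t1;
    pthread_t *tids;
    struct worker_arg *args;
    int started = 0, failed = 0;

    if (nthreads <= 0)
        return -1.0;
    tids = calloc((size_t)nthreads, sizeof(*tids));
    args = calloc((size_t)nthreads, sizeof(*args));
    if (!tids || !args) {
        free(tids);
        free(args);
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (; started < nthreads; started++) {
        args[started].iters = iters;
        if (pthread_create(&tids[started], NULL, worker,
                           &args[started]) != 0) {
            failed = 1;
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        failed |= args[i].failed;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(tids);
    free(args);
    if (failed)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing the elapsed time for 1 thread and for N threads with the same
per-thread iteration count should give a rough idea of how badly the
threads serialize on mmap_sem. Build with -pthread.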

> There were some proposals to fix mmap_sem (it's a big issue
> for futexes too) but they are quite involved.

yup.

Question:

> - if an access to a page in the range happens in the future it must
>   succeed. The old page content can be provided or a new, empty page
>   can be provided

How important is this "use the old page if it is available" feature? If we
were to simply implement a fast unconditional-free-the-page, so that
subsequent accesses always returned a new, zeroed page, do we expect that
this will be a 90%-good-enough thing, or will it be significantly
inefficient?
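
For reference, the always-return-a-zeroed-page behaviour is what
MADV_DONTNEED already gives userspace on Linux for private anonymous
memory. The sketch below only demonstrates those user-visible semantics.
It says nothing about how cheap the operation is, and whether a new,
faster operation would look the same from userspace is an assumption.
dontneed_gives_zero_page() is a hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MADV_DONTNEED */
#endif
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the page reads back as all zeros after MADV_DONTNEED,
 * 0 if any old content survived, -1 if a syscall failed.
 */
int dontneed_gives_zero_page(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int result = 1;

    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);       /* dirty the page */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The next access faults in a fresh zero-filled page. */
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            result = 0;
            break;
        }
    }
    munmap(p, len);
    return result;
}
```

The access after madvise() always succeeds, which satisfies the quoted
requirement. What is lost is any chance of getting the old content back.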

If we do implement this retain-the-old-page-if-possible feature, I'm
thinking that we can possibly reuse swapcache concepts. Such a page is
very similar to a clean, unmapped swapcache page, only it doesn't actually
have a swap mapping (well, it might have a swap mapping, in which case we
don't need to do anything at all, except deactivate it).

So perhaps we can do something like chop swapper_space in half: the lower
50% represent offsets which have a swap mapping and the upper 50% are fake
swapcache pages which don't actually consume swapspace. These pages are
unmapped from pagetables, marked clean, added to the fake part of
swapper_space and are deactivated. Teach the low-level swap code to ignore
the request to free physical swapspace when these pages are released.
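
To make the split concrete, here is a rough sketch of the index
arithmetic. Everything in it is hypothetical: SWAP_INDEX_BITS,
FAKE_SWAP_BIT and the helper names are invented for illustration and are
not existing kernel symbols, and the 27-bit index width is simply assumed
for the sake of the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical width of a swapcache index (assumption, see above). */
#define SWAP_INDEX_BITS 27u

/* The top index bit selects the "fake" upper half of the space. */
#define FAKE_SWAP_BIT   (UINT32_C(1) << (SWAP_INDEX_BITS - 1))

/* True if the index lies in the upper half, i.e. has no real swap slot. */
static inline bool index_is_fake(uint32_t idx)
{
    return (idx & FAKE_SWAP_BIT) != 0;
}

/* Map a slot number into the fake upper half of the index space. */
static inline uint32_t make_fake_index(uint32_t slot)
{
    return (slot & (FAKE_SWAP_BIT - 1)) | FAKE_SWAP_BIT;
}

/*
 * What the low-level swap code would check on release: only indices in
 * the lower half have physical swapspace to give back.
 */
static inline bool should_free_swap_slot(uint32_t idx)
{
    return !index_is_fake(idx);
}
```

A single bit test keeps the check on the release path cheap, at the cost
of giving up half of the index space.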

Or, if that's all too hacky, create a new address_space for these pages and
burn a new page flag. But I suspect we'd end up duplicating so much
swapcache handling that this will end up looking silly.

This would all halve the maximum amount of swap which can be used. iirc
i386 supports 27 bits of swapcache indexing, and 26 bits of 4KB pages is
256GiB (about 274GB), which is hopefully enough.
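
The arithmetic behind that figure, assuming 4KB pages (the usual i386
page size), can be checked with a few lines of C. The function name
swap_bytes_for_index_bits() is made up for this example.

```c
#include <stdint.h>

/*
 * Bytes of swap addressable with `bits` bits of page index and the given
 * page size. Assumes bits < 64 and that the product fits in 64 bits.
 */
uint64_t swap_bytes_for_index_bits(unsigned bits, uint64_t page_size)
{
    return (UINT64_C(1) << bits) * page_size;
}
```

With 26 bits and 4096-byte pages this gives 2^38 = 274877906944 bytes,
which is 256GiB, or roughly 274GB in decimal units. The full 27 bits
would give twice that.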