From: Linus Torvalds <torvalds@transmeta.com>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: Rik van Riel <riel@conectiva.com.br>,
Chris Wedgwood <cw@f00f.org>,
linux-mm@kvack.org, linux-kernel@vger.rutgers.edu
Subject: Re: RFC: design for new VM
Date: Fri, 4 Aug 2000 17:03:43 -0700 (PDT)
Message-ID: <Pine.LNX.4.10.10008041655420.11340-100000@penguin.transmeta.com>
In-Reply-To: <200008042351.QAA89101@apollo.backplane.com>

On Fri, 4 Aug 2000, Matthew Dillon wrote:
> :
> :There are architecture-specific special cases, of course. On ia64, the
> :..
>
> I spent a weekend a few months ago trying to implement page table
> sharing in FreeBSD -- and gave up, but it left me with the feeling
> that it should be possible to do without polluting the general VM
> architecture.
>
> For IA32, what it comes down to is that the page table generated by
> any segment-aligned mmap() (segment == 4MB) made by two processes
> should be shareable, simply by sharing the page directory entry (and thus
> the physical page representing 4MB worth of mappings). This would be
> restricted to MAP_SHARED mappings with the same protections, but the two
> processes would not have to map the segments at the same VM address, they
> need only be segment-aligned.
I agree that from a page table standpoint you should be correct.
I don't think that the other issues are as easily resolved, though.
Especially with address space IDs on other architectures, it can get
_really_ interesting to do TLB invalidates correctly on other CPUs
(you need to keep track of who shares parts of your page tables, etc.).
> This would be a transparent optimization wholly invisible to the process,
> something that would be optionally implemented in the machine-dependent
> part of the VM code (with general support in the machine-independent
> part for the concept). If the process did anything to create a mapping
> mismatch, such as call mprotect(), the shared page table would be split.
Right. But what about the TLB?
It's not a problem on the x86, because the x86 doesn't have ASNs anyway.
But for it to be a valid notion, I feel that it should be able to be
portable too.
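
A rough sketch of the bookkeeping this implies, assuming a user-space model rather than real kernel code: the shared page table records the ASN of every sharer, a split (as on mprotect()) hands one sharer a private copy, and any PTE change reports which ASNs would still need a TLB invalidate. All names here (`struct shared_pt`, `pt_attach`, `pt_split`, `pt_set_pte`) and the fixed `MAX_SHARERS` bound are hypothetical. Locking is deliberately left out, so this is single-threaded only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PTES_PER_TABLE 1024  /* one IA32 page table page maps 4MB */
#define MAX_SHARERS    8     /* arbitrary bound for this sketch */

/* Hypothetical shared page table; names are illustrative only. */
struct shared_pt {
    uint32_t pte[PTES_PER_TABLE];
    unsigned nsharers;
    unsigned asn[MAX_SHARERS];  /* address space number of each sharer */
};

/* Record a new sharer. Returns 0 on success, -1 if the list is full. */
static int pt_attach(struct shared_pt *pt, unsigned asn)
{
    if (pt->nsharers == MAX_SHARERS)
        return -1;
    pt->asn[pt->nsharers++] = asn;
    return 0;
}

/* Remove one sharer. Returns 0 on success, -1 if 'asn' was not a sharer. */
static int pt_detach(struct shared_pt *pt, unsigned asn)
{
    for (unsigned i = 0; i < pt->nsharers; i++) {
        if (pt->asn[i] == asn) {
            pt->asn[i] = pt->asn[--pt->nsharers];
            return 0;
        }
    }
    return -1;
}

/*
 * Split on a mapping mismatch: give 'asn' a private copy of the table and
 * drop it from the sharer list. Returns NULL if allocation fails or if
 * 'asn' was not sharing the table.
 */
static struct shared_pt *pt_split(struct shared_pt *pt, unsigned asn)
{
    struct shared_pt *priv = malloc(sizeof(*priv));
    if (priv == NULL)
        return NULL;
    if (pt_detach(pt, asn) != 0) {
        free(priv);
        return NULL;
    }
    memcpy(priv->pte, pt->pte, sizeof(priv->pte));
    priv->nsharers = 1;
    priv->asn[0] = asn;
    return priv;
}

/*
 * Change one PTE. Every remaining sharer may hold a stale TLB entry, so
 * the ASNs that need an invalidate are copied to out[]; returns the count.
 */
static unsigned pt_set_pte(struct shared_pt *pt, unsigned idx, uint32_t val,
                           unsigned out[MAX_SHARERS])
{
    assert(idx < PTES_PER_TABLE);
    pt->pte[idx] = val;
    memcpy(out, pt->asn, pt->nsharers * sizeof(out[0]));
    return pt->nsharers;
}
```

Even in this toy form, a single PTE update fans out to every sharer. On hardware with ASNs, that list is what would have to be walked to invalidate correctly on other CPUs.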
You have to have some page table locking mechanism for SMP eventually: I
think you miss some of the problems because the current FreeBSD SMP stuff
is mostly still "big kernel lock" (outdated info?), and you'll end up
kicking yourself in a big way when you have the 300 processes sharing the
same lock for that region..
(Not that I think you'd necessarily have much contention on the lock - the
problem tends to be more in the logistics of keeping track of the locks of
partial VM regions etc).
> (Linux falls on its face for other reasons, mainly the fact that it
> maps all of physical memory into KVM in order to manage it).
Not true any more.. Trying to map 64GB of RAM convinced us otherwise ;)
> I think the loss of MP locking for this situation is outweighed by the
> benefit of a huge reduction in page faults -- rather than see 300
> processes each take a page fault on the same page, only the first process
> would and the pte would already be in place when the others got to it.
> When it comes right down to it, page faults on shared data sets are not
> really an issue for MP scalability.
I think you'll find that there are all these small details that just
cannot be solved cleanly. Do you want to be stuck with an x86-only
solution?
That said, I cannot honestly say that I have tried very hard to come up
with solutions. I just have this feeling that it's a dark ugly hole that I
wouldn't want to go down..
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/