linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@transmeta.com>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: Rik van Riel <riel@conectiva.com.br>,
	Chris Wedgwood <cw@f00f.org>,
	linux-mm@kvack.org, linux-kernel@vger.rutgers.edu
Subject: Re: RFC: design for new VM
Date: Fri, 4 Aug 2000 10:49:24 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.10.10008041033230.813-100000@penguin.transmeta.com> (raw)
In-Reply-To: <200008041541.IAA88364@apollo.backplane.com>


On Fri, 4 Aug 2000, Matthew Dillon wrote:
> 
>   							  The second eyesore
>     is the lack of physically shared page table segments for 'standard'
>     processes.  At the moment, it's an all (rfork/RFMEM/clone) or nothing
>     (fork) deal.  Physical segment sharing outside of clone is something
>     Linux could use to, I don't think it does it either.  It's not easy to
>     do right.

It's probably impossible to do right. Basically, if you do it, you do it
wrong.

As far as I can tell, you basically screw yourself on the TLB and locking
if you ever try to implement this. And frankly I don't see how you could
avoid getting screwed.

There are architecture-specific special cases, of course. On ia64, the
page table is not really one page table, it's a number of pretty much
independent page tables, and it would be possible to extend the notion of
fork vs clone to be a per-page-table thing (ie the single-bit thing would
become a multi-bit thing, and the single "struct mm_struct" would become
an array of independent mm's).

You could do similar tricks on x86 by virtually splitting up the page
directory into independent (fixed-size) pieces - this is similar to what
the PAE stuff does in hardware, after all. So you could have (for example)
each process be quartered up into four address spaces with the top two
address bits being the address space sub-ID.

Quite frankly, it tends to be a nightmare to do that. It's also
unportable: it works on architectures that either support it natively
(like the ia64 that has the split page tables because of how it covers
large VM areas) or by "faking" the split on regular page tables. But it
does _not_ work very well at all on CPU's where the native page table is
actually a hash (old sparc, ppc, and the "other mode" in IA64). Unless the
hash happens to have some of the high bits map into a VM ID (which is
common, but not really something you can depend on).

And even when it "works" by emulation, you can't share the TLB contents
anyway. Again, it can be possible on a per-architecture basis (if the
different regions can have different ASI's - ia64 again does this, and I
think it originally comes from the 64-bit PA-RISC VM stuff). But it's one
of those bad ideas that if people start depending on it, it simply won't
work that well on some architectures. And one of the beauties of UNIX is
that it truly is fairly architecture-neutral.

And that's just the page table handling. The SMP locking for all this
looks even worse - you can't share a per-mm lock like with the clone()
thing, so you have to create some other locking mechanism. 

I'd be interested to hear if you have some great idea (ie "oh, if you look
at it _this_ way all your concerns go away"), but I suspect you have only
looked at it from 10,000 feet and thought "that would be a cool thing".
And I suspect it ends up being anything _but_ cool once actually
implemented.

			Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

  reply	other threads:[~2000-08-04 17:49 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-08-02 22:08 Rik van Riel
2000-08-03  7:19 ` Chris Wedgwood
2000-08-03 16:01   ` Rik van Riel
2000-08-04 15:41     ` Matthew Dillon
2000-08-04 17:49       ` Linus Torvalds [this message]
2000-08-04 23:51         ` Matthew Dillon
2000-08-05  0:03           ` Linus Torvalds
2000-08-05  1:52             ` Matthew Dillon
2000-08-05  1:09               ` Matthew Wilcox
2000-08-05  2:05               ` Linus Torvalds
2000-08-05  2:17               ` Alexander Viro
2000-08-07 17:55                 ` Matthew Dillon
2000-08-05 22:48     ` Theodore Y. Ts'o
2000-08-03 18:27   ` lamont
2000-08-03 18:34     ` Linus Torvalds
2000-08-03 19:11       ` Chris Wedgwood
2000-08-03 21:04         ` Benjamin C.R. LaHaise
2000-08-03 19:32       ` Rik van Riel
2000-08-03 18:05 ` Linus Torvalds
2000-08-03 18:50   ` Rik van Riel
2000-08-03 20:22     ` Linus Torvalds
2000-08-03 22:05       ` Rik van Riel
2000-08-03 22:19         ` Linus Torvalds
2000-08-03 19:00   ` Richard B. Johnson
2000-08-03 19:29     ` Rik van Riel
2000-08-03 20:23     ` Linus Torvalds
2000-08-03 19:37   ` Ingo Oeser
2000-08-03 20:40     ` Linus Torvalds
2000-08-03 21:56       ` Ingo Oeser
2000-08-03 22:12         ` Linus Torvalds
2000-08-04  2:33   ` David Gould
2000-08-16 15:10   ` Stephen C. Tweedie
2000-08-03 19:26 ` Roger Larsson
2000-08-03 21:50   ` Rik van Riel
2000-08-03 22:28     ` Roger Larsson
2000-08-04 13:52 Mark_H_Johnson
     [not found] <8725692F.0079E22B.00@d53mta03h.boulder.ibm.com>
2000-08-07 17:40 ` Gerrit.Huizenga
2000-08-07 18:37   ` Matthew Wilcox
2000-08-07 20:55   ` Chuck Lever
2000-08-07 21:59     ` Rik van Riel
2000-08-08  3:26   ` David Gould
2000-08-08  5:54     ` Kanoj Sarcar
2000-08-08  7:15       ` David Gould
     [not found] <87256934.0072FA16.00@d53mta04h.boulder.ibm.com>
2000-08-08  0:36 ` Gerrit.Huizenga
     [not found] <87256934.0078DADB.00@d53mta03h.boulder.ibm.com>
2000-08-08  0:48 ` Gerrit.Huizenga
2000-08-08 15:21   ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.10.10008041033230.813-100000@penguin.transmeta.com \
    --to=torvalds@transmeta.com \
    --cc=cw@f00f.org \
    --cc=dillon@apollo.backplane.com \
    --cc=linux-kernel@vger.rutgers.edu \
    --cc=linux-mm@kvack.org \
    --cc=riel@conectiva.com.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox