From: ebiederm@xmission.com (Eric W. Biederman)
To: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Daniel Phillips <phillips@bonn-fries.net>,
Rob Fuller <rfuller@nsisoftware.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: broken VM in 2.4.10-pre9
Date: 19 Sep 2001 15:03:21 -0600
Message-ID: <m1iteegag6.fsf@frodo.biederman.org>
In-Reply-To: <E15jnIB-0003gh-00@the-village.bc.nu>
Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> > On September 17, 2001 06:03 pm, Eric W. Biederman wrote:
> > > In linux we have avoided reverse maps (unlike the BSD's) which tends
> > > to make the common case fast at the expense of making it more
> > > difficult to handle times when the VM system is under extreme load and
> > > we are swapping etc.
> >
> > What do you suppose is the cost of the reverse map? I get the impression
> > you think it's more expensive than it is.
>
> We can keep the typical page table cost lower than now (including reverse
> maps) just by doing some common sense small cleanups to get the page struct
> down to 48 bytes on x86

I have to admit that the first time I looked at reverse maps our struct
page was much lighter weight than it is now (64 bytes on x86 UP), and our
cost per page was noticeably fewer bytes than the BSDs':

  average_mem_per_page = sizeof(struct page) + sizeof(pte_t)
                       + sizeof(reverse_pte_t) * average_user_per_page

But struct page has grown pretty significantly since then, and could
use a cleanup.

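To make the formula above concrete, here is a rough sketch in C. Only the
64-byte and 48-byte struct page sizes come from this thread. The 4-byte
pte_t is the non-PAE x86 size, and the 8-byte reverse_pte_t (a pte pointer
plus a next pointer on a 32-bit machine) is purely an assumption for
illustration, as is the struct vm_costs type itself.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes in bytes. Only struct_page is taken from the thread (64 today,
 * 48 proposed); the other fields hold illustrative assumptions. */
struct vm_costs {
    size_t struct_page;  /* sizeof(struct page) */
    size_t pte;          /* sizeof(pte_t), 4 on non-PAE x86 */
    size_t reverse_pte;  /* sizeof(reverse_pte_t), hypothetical; 0 = no reverse maps */
};

/* average_mem_per_page = sizeof(struct page) + sizeof(pte_t)
 *                      + sizeof(reverse_pte_t) * average_user_per_page */
static double average_mem_per_page(const struct vm_costs *c,
                                   double average_user_per_page)
{
    assert(average_user_per_page >= 0.0);
    return (double)c->struct_page + (double)c->pte
         + (double)c->reverse_pte * average_user_per_page;
}
```

Under those assumed sizes and one user per page, a 64-byte struct page
without reverse maps costs 68 bytes per page, while a 48-byte struct page
with reverse maps costs 60. The comparison shifts as the average number of
users per page grows.
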
So I figure it is worth going through and computing the costs of
reverse page tables rather than dismissing them out of hand. But the
fact that the Linux VM could get good performance in most
circumstances without reverse page tables has always enchanted me.
Add to that the fact that the last time someone ran the numbers, Linux
was considerably faster than the BSDs for mm-type operations when not
swapping, and that is the common case.

I admit reverse page tables make it easier to get good paging
performance under a high load, as the algorithms are more
straightforward. But I have not seen the argument that not having
reverse maps makes it undoable. In fact, previous versions of Linux
seem to be proof that you can get at least reasonable swapping under
load without reverse page tables.

There is also the cache thrashing case. While scanning page table
entries it is probably impossible to prevent cache thrashing, but
reverse page tables look like they make it worse.

With respect to the current VM, the primary complaint I have heard is
that anonymous pages are not in the page cache so cannot be aged. At
least that was the complaint that started this thread. For adding
pages to the page cache we currently have conflicting goals. Do we
want a page in the page cache so it ages better, or do we not want to
allocate the swap space yet?

So my suggestion was to look at getting anonymous pages backed by what
amounts to a shared memory segment. In that vein, by using an
extent-based data structure we can get the cost down under the current
8 bits per page that we have for the swap counts, and make allocating
swap pages faster. And we want to cluster related swap pages anyway, so
an extent-based system is a natural fit.

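As a rough illustration of the extent idea, here is a sketch in C where
one record describes a run of consecutive swap slots that share a usage
count. The struct swap_extent layout, the sorted-array storage and both
function names are hypothetical, chosen only to show how the per-page
cost falls as runs get longer. This is not a proposed kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: `length` consecutive swap slots starting at
 * `start` that all share one usage count. */
struct swap_extent {
    uint32_t start;
    uint32_t length;
    uint32_t count;
};

/* Binary search over extents sorted by start and non-overlapping.
 * Returns the shared count, or 0 if the slot is in no extent. */
static uint32_t swap_count(const struct swap_extent *ext, size_t n,
                           uint32_t slot)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (slot < ext[mid].start)
            hi = mid;
        else if (slot - ext[mid].start >= ext[mid].length)
            lo = mid + 1;
        else
            return ext[mid].count;
    }
    return 0;
}

/* Bookkeeping bits spent per swap page covered by the extents. */
static double bits_per_page(const struct swap_extent *ext, size_t n)
{
    uint64_t pages = 0;

    for (size_t i = 0; i < n; i++)
        pages += ext[i].length;
    assert(pages > 0);
    return (double)(n * sizeof(struct swap_extent) * 8) / (double)pages;
}
```

With two extents covering 96 slots in total, this layout spends about 2
bits per page. The saving only holds when related pages really are
clustered. A swap area fragmented into single-slot extents would cost far
more than a flat count array.
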
If we lose the requirement that swapped-out pages need to be in the
page tables, it becomes a trivial issue to drop page tables with all of
their pages swapped out. Plus there are a million other special cases
we can remove from the current VM.

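A minimal sketch of why the page table entry stops mattering once an
anonymous region has a backing object: the page's location can be
recomputed from the faulting address alone, much as it is for a file or
shared memory mapping. The struct anon_region type, its fields and the
4 KiB page size are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages, as on x86 */

/* Hypothetical description of an anonymous mapping with a backing object. */
struct anon_region {
    uintptr_t start;       /* first address of the mapping */
    uintptr_t end;         /* one past the last address */
    uint64_t first_index;  /* page index in the backing object for `start` */
};

/* Recover the backing-object page index for a faulting address without
 * consulting any page table entry. Returns 0 on success, -1 if the
 * address lies outside the region. */
static int backing_index(const struct anon_region *r, uintptr_t addr,
                         uint64_t *index)
{
    assert(r->start <= r->end);
    if (addr < r->start || addr >= r->end)
        return -1;
    *index = r->first_index + ((addr - r->start) >> SKETCH_PAGE_SHIFT);
    return 0;
}
```

Because the index depends only on the region and the address, a page
table whose entries are all swapped out carries no information that
cannot be rebuilt on the next fault.
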
So right now I can see a bigger benefit from anonymous pages with a
``backing store'' than I can from reverse maps.

Eric
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/