linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <andrea@suse.de>
To: William Lee Irwin III <wli@holomorphy.com>,
	Rik van Riel <riel@redhat.com>,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	Mel Gorman <mel@csn.ul.ie>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: What to expect with the 2.6 VM
Date: Fri, 4 Jul 2003 02:40:00 +0200
Message-ID: <20030704004000.GQ23578@dualathlon.random>
In-Reply-To: <20030703201607.GK20413@holomorphy.com>

On Thu, Jul 03, 2003 at 01:16:07PM -0700, William Lee Irwin III wrote:
> I don't know what kind of moron you take me for but I don't care to be
> patronized like that.

If it's such a strong feature, go ahead and show me a patch to a
real-life application (there are plenty of things using hashes or
btrees; feel free to choose the one that, according to you, will behave
closest to the "exploit" [1]) using remap_file_pages to avoid the pte
overhead, and show the huge improvement in the numbers compared to
changing the design of the code to have some sort of locality in the I/O
access patterns (NOTE: I don't care about ram, I care about speed).
Given how hard you advocate for this, I assume you expect at least a 1%
improvement, right? Once you make the patch I can volunteer to benchmark
it if you don't have the time or hardware for that. After you've made
the patch and shown a >=1% improvement by not keeping the file mapped
linearly, but by mapping it nonlinearly using remap_file_pages, I
reserve the right to fix the app to have some sort of locality of
information so that the I/O side will be able to get a boost too.

The fact is, no matter the VM side, your app has no way to perform
anywhere near well in terms of I/O seeks if you're filling a page per
pmd, due to the huge seeks it will generate with the major faults. And
even with only minor faults, if it has no locality at all and it seeks
all over the place in an unpredictable manner, the tlb flushes will kill
performance compared to keeping the file mapped statically, and it'll
make it even slower than accessing a new pte every time.

Until you produce practical results, IMHO the usage you advocated, using
remap_file_pages to avoid doing big linear mappings that may allocate
more ptes, sounds like completely vapourware overkill overdesign that
won't last past emails.  All in my humble opinion of course. I've no
problem with being wrong, I just don't buy what you say since it is not
obvious at all given the huge cost of entering and exiting the kernel,
reaching the pagetables in software, mangling them, flushing the tlb (on
all common archs I'm assuming this doesn't only mean flushing a range
but flushing it all, but it'd be slower even with a range-flush
operation), compared to doing nothing with a static linear mapping
(regardless of the fact that there are more ptes with a big linear
mapping, I don't care to save ram).
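
To make the two access styles being compared concrete, here is a minimal
C sketch (not code from this thread). It assumes a file descriptor that
can be mapped MAP_SHARED; the names sum_linear, sum_window and
SUM_FAILED are hypothetical and exist only for illustration. The first
function maps the whole file once and then only does plain loads. The
second keeps a one-page window and calls remap_file_pages(2) before
every access, so each access pays for a kernel entry, a pagetable update
and a tlb flush. Note that remap_file_pages has been deprecated since
Linux 3.16 and is emulated on current kernels, so any timing taken today
would not reflect the 2.5/2.6 implementation discussed here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SUM_FAILED UINT64_MAX /* hypothetical error sentinel for this sketch */

/* Static linear mapping: one mmap() up front, then every access is a
 * plain load. The kernel is entered only when a page faults in. */
uint64_t sum_linear(int fd, size_t npages, const size_t *order, size_t n)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *base = mmap(NULL, npages * pagesz, PROT_READ,
                               MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return SUM_FAILED;
    }
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        assert(order[i] < npages);
        sum += base[order[i] * pagesz]; /* first byte of the chosen page */
    }
    munmap(base, npages * pagesz);
    return sum;
}

/* Nonlinear window: a single page of address space that is rebound to a
 * different file page before each access, one syscall per access. */
uint64_t sum_window(int fd, size_t npages, const size_t *order, size_t n)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *win = mmap(NULL, pagesz, PROT_READ, MAP_SHARED, fd, 0);
    if (win == MAP_FAILED) {
        perror("mmap");
        return SUM_FAILED;
    }
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        assert(order[i] < npages);
        /* prot must be 0; pgoff is counted in pages, not bytes */
        if (remap_file_pages(win, pagesz, 0, order[i], 0) != 0) {
            perror("remap_file_pages");
            munmap(win, pagesz);
            return SUM_FAILED;
        }
        sum += win[0];
    }
    munmap(win, pagesz);
    return sum;
}
```

Timing both functions over the same random page order on a file that is
already in pagecache would be one way to get the kind of number asked
for above. The sketch only measures the VM side and says nothing about
I/O seeks.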

If you really are going to change the app to use remap_file_pages,
rather than just compacting the vm side with remap_file_pages (which
will waste lots of cpu power and will run slower IMHO), you'd better
introduce locality knowledge so the I/O side will have a slight chance
to perform too and the VM side will be improved as well, potentially
also sharing the same page, not only the same pmd (and after you do
that, if you really need to save ram [not cpu] you can munmap/mmap at
the same cost, but this is just a side note, I said I don't care to save
ram, I care to perform the fastest). reiserfs and other huge btree users
have to do this locality stuff all the time with their trees, for
example to avoid a directory being completely scattered everywhere in
the tree and in turn triggering a huge amount of I/O seeks that may not
even fit in buffercache. w/o the locality information there would be no
way for reiserfs to perform with big filesystems and many directories,
this is just the simplest example I can think of among the huge btrees
that we use every day.

Again, I don't care about saving ram, we're talking 64bit, I care about
speed, I hope I already made this clear enough in the previous email.

My arguments all sound pretty straightforward to me.

Andrea

[1] I called it the exploit because it was posted originally on bugtraq
a number of years ago: the pmds weren't reclaimed, and Linus fixed it
(IIRC in 2.3) by freeing the pmds when a PGD_SIZE range was completely
released. Of course yours isn't an exploit, but it just behaves like
one by wasting lots of ram on pmds compared to the actual mapped pages.
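
As a rough illustration of the waste described in [1], the following C
sketch counts the lowest-level pagetable memory needed when pages are
touched at a fixed stride. The geometry is an assumption and not taken
from this mail: 4 KiB pages and 512 entries per pagetable page, as on
x86-64, so one pagetable page covers 2 MiB of address space. The
function name pte_table_bytes is hypothetical, and "pmd" in the mail is
read here as the pagetable page that a pmd entry points to.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry (x86-64 style), not taken from the original mail. */
#define PAGE_SIZE_BYTES 4096ULL
#define PTES_PER_TABLE  512ULL

/* Bytes of lowest-level pagetables needed to map `touched` pages that
 * sit `stride_pages` apart, starting at a table-aligned address. */
uint64_t pte_table_bytes(uint64_t touched, uint64_t stride_pages)
{
    assert(stride_pages > 0);
    if (touched == 0)
        return 0;

    uint64_t tables;
    if (stride_pages >= PTES_PER_TABLE) {
        /* Each touched page lands in its own pagetable page. */
        tables = touched;
    } else {
        /* Consecutive touched pages are less than one table apart, so
         * every table up to the one holding the last page is used. */
        uint64_t last_page = (touched - 1) * stride_pages;
        tables = last_page / PTES_PER_TABLE + 1;
    }
    return tables * PAGE_SIZE_BYTES;
}
```

Under these assumptions, 1000 pages touched one per 2 MiB
(pte_table_bytes(1000, 512)) need 4096000 bytes of pagetables, as much
memory as the mapped data itself, while 1000 contiguous pages
(pte_table_bytes(1000, 1)) need only 8192 bytes.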
