From: Andrea Arcangeli <andrea@suse.de>
To: Benjamin LaHaise <bcrl@redhat.com>
Cc: Andrew Morton <akpm@digeo.com>,
mingo@elte.hu, hugh@veritas.com, dmccr@us.ibm.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: objrmap and vmtruncate
Date: Sat, 5 Apr 2003 04:22:50 +0200
Message-ID: <20030405022250.GM16293@dualathlon.random>
In-Reply-To: <20030404205248.C21819@redhat.com>

On Fri, Apr 04, 2003 at 08:52:48PM -0500, Benjamin LaHaise wrote:
> On Sat, Apr 05, 2003 at 03:31:43AM +0200, Andrea Arcangeli wrote:
> > Also consider this significant factor: the larger the shmfs, the smaller
> > the nonlinear 1G window will be and the higher the thrashing. With 32G of
> > bigpages the remap_file_pages will thrash like crazy, generating an order
> > of magnitude more "window misses". I mean 32bit is just pushed to
> > the limit today regardless of the lack of remap_file_pages. For example, if
> > you don't use largepages, going past 16G of shm is going to be detrimental.
> > The cost of the mmap doesn't sound like the showstopper.
>
> You're guessing here. At least for oracle, that behaviour is dependent on
> the locality of accesses. Given that each user has their own process you
> can bet there is a fair amount of locality to their transactions.
>
I'm definitely not guessing about the largepage factor; you'd better
drop some ram and run with largepages. I have to guess about
remap_file_pages only because that's not backported yet (thankfully, due
to its insane API). But if largepages make such a huge difference, mmap
can't be the big cost in such a TLB thrashing scenario. largepages
shouldn't affect the mmap frequency at all.

Sure, the locality exists, but if you didn't need a moving window you
wouldn't need the vlm, and with 32G shm vs a 512M window your thrashing
will be an order of magnitude higher than with a 1G shm, obviously.
I'm guessing, but I'm guessing based on non-guesses.
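
To put rough numbers on the 32G vs 1G comparison, here is a minimal
sketch. It assumes accesses are spread uniformly over the shm, which is
an assumption of this example only and ignores the locality argued
about above. window_coverage is a hypothetical helper, not a kernel or
libc function.

```c
#include <assert.h>

/* Fraction of the shm that a fixed window covers at any one time.
 * ASSUMPTION: with uniformly random accesses this is also the chance
 * that an access lands inside the currently mapped window. */
static double window_coverage(double shm_mib, double window_mib)
{
    assert(shm_mib > 0.0 && window_mib > 0.0);
    if (window_mib >= shm_mib)
        return 1.0;   /* the whole shm fits, no moving window needed */
    return window_mib / shm_mib;
}
```

With a 512M window, window_coverage(1024, 512) is 0.5 and
window_coverage(32768, 512) is 0.015625, so the window holds half of a
1G shm but only 1/64 of a 32G one. How that translates into actual
window misses depends on the real access pattern.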
However, I'm not questioning that remap_file_pages will help; it
obviously will. I just don't think it's worthwhile enough, and I don't see
mmap as the big cost. The big cost is the pagetable mangling and TLB
flushing that will have to happen anyway, regardless of whether you overwrite
the vma with an mmap or call remap_file_pages.
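
For illustration, overwriting the vma with an mmap can look roughly like
the sketch below. It uses only the standard mmap(2) call with MAP_FIXED.
The names move_window and demo_window_move are hypothetical, and the
two-page scratch file merely stands in for a large shm segment.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Re-point an already mapped window at another (page-aligned) file
 * offset by mapping over it with MAP_FIXED. The kernel has to drop the
 * old translations for the range, which is the pagetable and TLB work
 * that any window move pays for. Returns 0 on success, -1 on failure. */
static int move_window(void *window, size_t window_len, int fd, off_t file_off)
{
    assert(window != NULL && window_len > 0);
    void *p = mmap(window, window_len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, file_off);
    return p == MAP_FAILED ? -1 : 0;
}

/* Returns 0 if the window shows page 0 before the move and page 1 after. */
static int demo_window_move(void)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *tmp = tmpfile();              /* scratch file, removed on close */
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    int ok = -1;

    if (ftruncate(fd, 2 * page) == 0 &&
        pwrite(fd, "A", 1, 0) == 1 &&
        pwrite(fd, "B", 1, page) == 1) {
        char *win = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (win != MAP_FAILED) {
            char before = win[0];       /* first byte of file page 0 */
            if (move_window(win, (size_t)page, fd, page) == 0 &&
                before == 'A' && win[0] == 'B')
                ok = 0;                 /* same address, new file page */
            munmap(win, (size_t)page);
        }
    }
    fclose(tmp);
    return ok;
}
```

Calling demo_window_move() from a main() should return 0 on a system
where a temporary file can be created. The sketch shows only the mmap
side of the comparison; it makes no claim about how remap_file_pages
performs.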
> > you could try to avoid the need for the sysctl by teaching the vm to
> > unmap such a vma, but I don't think it's worth it, and I'm sure those apps
> > prefer to have the stuff pinned anyway w/o the risk of sigbus and w/o
> > the need for mlock. It looks cleaner to me to avoid any mess with the
> > vm, and long term nobody will care about this sysctl since 64bit will run
> > so much faster w/o any remap_file_pages and tlb flush running at all
>
> It is still useful for things outside of the pure databases on 32 bits
> realm. Consider a fast bochs running 32 bit apps on a 64 bit machine --
> should it have to deal with the overhead of zillions of vmas for emulating
> page tables?
I can't understand this very well, so it may be my fault, but it doesn't
make any sense to me. I don't know how bochs works, but for certain you
won't get any help from the API of remap_file_pages implemented in
2.5.66 on a 64bit arch.

If you think you can get any benefit, then I tell you: rather than using
remap_file_pages, just go ahead and mmap the whole file, as large as
it is. Likely you're dealing with a 32bit address space, so it will be
a mere 4G. I doubt you're dealing with 1 terabyte files with bochs, which
is by definition a 32bit thing.

Map it all with mmap and access it sparsely. Then you have
remap_file_pages on the 64bit archs for free, w/o special syscalls and
w/o any sigbus handling. The kernel will do the paging for you to the
swap and back into the right place in ram w/o passing through a slower
userspace signal.
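
A minimal sketch of "map it all and access it sparsely" could look like
this. The names map_whole_file and demo_sparse_access are hypothetical,
and the 64 MiB sparse scratch file is an assumed stand-in for a 4G
guest image so that the example stays cheap to run.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the entire file in one call: no window and no remapping later.
 * Returns NULL if mmap fails. */
static unsigned char *map_whole_file(int fd, size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Touch a few widely separated bytes of the mapping and read them back.
 * Only the touched pages get faulted in. Returns 0 on success. */
static int demo_sparse_access(void)
{
    const size_t len = (size_t)64 << 20;          /* assumed 64 MiB image */
    const size_t offs[] = { 0, (size_t)17 << 20, len - 1 };
    const size_t n = sizeof offs / sizeof offs[0];
    FILE *tmp = tmpfile();                        /* removed on close */
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    int ok = -1;

    if (ftruncate(fd, (off_t)len) == 0) {         /* sparse: no data yet */
        unsigned char *mem = map_whole_file(fd, len);
        if (mem != NULL) {
            ok = 0;
            for (size_t i = 0; i < n; i++)
                mem[offs[i]] = (unsigned char)(i + 1);
            for (size_t i = 0; i < n; i++)
                if (mem[offs[i]] != (unsigned char)(i + 1))
                    ok = -1;
            if (mem[(size_t)40 << 20] != 0)       /* untouched page is zero */
                ok = -1;
            munmap(mem, len);
        }
    }
    fclose(tmp);
    return ok;
}
```

After the single mmap, every access is a plain memory reference and the
kernel faults pages in on demand; the process makes no further syscalls
to move anything around.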
I can't see any useful application of the current API of
remap_file_pages on a 64bit arch, but it's possible I'm missing
something. And no, I don't mind wasting 3.8G of address space in the
bochs process, since I still have some petabyte of it unused.
> If anything, I think we should be moving in the direction of doing more
> along the lines of remap_file_pages: things like executables might as well
> keep their state in page tables since we never discard them and instead
> toss the vma out the window.
I'm sorry, but I don't understand this very well. Could you
elaborate? What state do you want to put in the pagetables? Are you
talking about the pagetables of the cpu or simulated ones in userspace?

Andrea