From: Gerrit Huizenga <gh@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>,
andrea@suse.de, ak@suse.de, hugh@veritas.com, jdike@addtoit.com,
dvhltc@us.ibm.com, linux-mm@kvack.org
Subject: Re: [RFC] madvise(MADV_TRUNCATE)
Date: Thu, 27 Oct 2005 12:40:05 -0700
Message-ID: <E1EVDbZ-0004fp-00@w-gerrit.beaverton.ibm.com>
In-Reply-To: Your message of Thu, 27 Oct 2005 11:50:50 PDT. <20051027115050.7f5a6fb7.akpm@osdl.org>
On Thu, 27 Oct 2005 11:50:50 PDT, Andrew Morton wrote:
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> >
> > I have 2 reasons (I don't know if Andrea has more uses/reasons):
> >
> > (1) Our database folks want to drop parts of shared memory segments
> > when they see memory pressure
>
> How do they "see memory pressure"?
>
> The kernel's supposed to write the memory out to swap under memory
> pressure, so why is a manual interface needed?
>
> > or memory hotplug/virtualization stuff.
>
> Really? Are you sure? Is this the only means by which the memory hotplug
> developers can free up shmem pages? I think not...
On pSeries, an LPAR can shrink the amount of memory/number of processors
available to an OS instance. The most convenient way for this to happen
for some applications is to tell them that their world has shrunk, so
they can consciously resize their various data pools, mmap segments,
buffers, pre-fault rates, heaps, etc. in some uniform way. Once they
have been told the world is going to shrink, the LPAR can more easily
find free pages to scavenge without sending the machine into paroxysms
of paging and thrashing.
> > madvise(DONTNEED) is not really releasing the pagecache pages. So
> > they want madvise(DISCARD).
> >
> > (2) Jeff Dike wants to use this for UML.
>
> Why? For what purpose? Will he only ever want it for shmem segments?
I don't know Jeff's purpose, but this allows some large applications
to map a ridiculously large mmap segment which doesn't have to be
remapped every time the underlying hardware changes. At the same time,
some applications (DB2 is the prime example here, but Java wants this
as well) know when pages are no longer needed and would like to free
them.
In Java, for instance, the heap can use a two-hand sweep and compress,
moving active pages from one side of the heap to the other periodically.
(Actually the heap management is a bit more complex than that, but...)
The overall heap is a large virtual address space but in reality
when pages are freed from it, the application really believes those
pages can go away and should not be cached or preserved for that section.
The physical pages can be re-used immediately and re-faulted (possibly
ZFOD) if necessary afterwards.
> > Please advise on what you would prefer. A small extension to madvise()
> > to solve few problems right now OR lets do real sys_holepunch() and
> > bite the bullet (even though we may not get any more users for it).
>
> I don't think that the benefits for a full holepunch would be worth the
> complexity - nasty, complex, rarely-tested changes to every filesystem. So
> let's not go there.
>
> If we take the position that this is a shmem-specific thing and we don't
> intend to extend it to real/regular filesystems then perhaps a new syscall
> would be more appropriate. On x86 that'd probably be another entry in the
> sys_shm() switch statement. Maybe?
I believe Java uses mmap() today for this; DB2 probably uses both mmap()
and shm*().
gerrit