From: Chuck Lever <cel@monkey.org>
To: Jamie Lokier <jamie.lokier@cern.ch>
Cc: linux-mm@kvack.org
Subject: Re: MADV_DONTNEED
Date: Wed, 22 Mar 2000 12:04:58 -0500 (EST)
Message-ID: <Pine.BSO.4.10.10003221125170.16476-100000@funky.monkey.org>
In-Reply-To: <20000321022937.B4271@pcep-jamie.cern.ch>

hi jamie-

On Tue, 21 Mar 2000, Jamie Lokier wrote:
> > > In particular, using the name MADV_DONTNEED is a really bad idea.  It
> > > means completely different things on different OSes.  For example your
> > > meaning of MADV_DONTNEED is different to BSD's: a program that assumes
> > > the BSD behaviour may well crash with your implementation and will
> > > almost certainly give invalid results if it doesn't crash.
> > 
> > i'm more concerned about portability from operating systems like Solaris,
> > because there are many more server applications there than on *BSD that
> > have been designed to use these interfaces.
> ...
> > my preference is for the DU semantic of tossing dirty data instead of
> > flushing onto backing store, simply because that's what so many
> > applications expect DONTNEED to do.
> 
> That's interesting.  When I saw MADV_DONTNEED, I immediately assumed it
> was the natural counterpoint to MADV_WILLNEED.

yes, i did too.  but i realized later that "will" is *not* the opposite of
"dont".

> Useful even for
> sequential accesses, to say "my streaming window has moved beyond this
> point".  Do you agree that a counterpoint to MADV_WILLNEED is useful?

if you look at the implementation of nopage_sequential_readahead, you'll
see that it doesn't use MADV_DONTNEED, but the internal implementation of
msync(MS_INVALIDATE).  i'm not completely confident in this
implementation, but my intent was to release behind, not discard data.
so, yes, a counterpoint to WILLNEED is a good idea.  perhaps that *was*
the original intent of MADV_DONTNEED, but i don't see any documentation
that ties WILLNEED and DONTNEED together, semantically.

> > i'm not saying the *BSD way is wrong, but i think it would be a more
> > useful compromise to make *BSD functionality available via some other
> > interface (like MADV_ZERO).
> 
> You got it the wrong way around.  MADV_ZERO is more like what your
> implementation of MADV_DONTNEED does.  The BSD behaviour is nothing like
> MADV_ZERO.  BSD simply means "increment the paging priority" -- the
> page contents are unchanged.
> 
> BSD's behaviour is the obvious counterpoint to MADV_WILLNEED afaict.

it is, but it's not the behavior that most applications expect.  i'd like
to have something like this, but it should probably be named MADV_FREE, or
how about MADV_WONTNEED ? :)

so we agree that both behaviors might be useful to expose to an
application.  the only question is what to name them.

function 1 (could be MADV_DISCARD; currently MADV_DONTNEED):
  discard pages.  if they are referenced again, the process causes page
  faults to read original data (zero page for anonymous maps).

function 2 (could be MADV_FREE; currently msync(MS_INVALIDATE)):
  release pages, syncing dirty data.  if they are referenced again, the
  process causes page faults to read in latest data.

function 3 (could be MADV_ZERO):
  discard pages.  if they are referenced again, the process sees C-O-W 
  zeroed pages.

function 4 (for comparison; currently munmap):
  release pages, syncing dirty data.  if they are referenced again, the
  process causes invalid memory access faults.
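
to make function 1 concrete for the anonymous-map case, here is a minimal
sketch.  it assumes a present-day Linux system, where madvise(2) documents
that MADV_DONTNEED on an anonymous private mapping gives zero-fill-on-demand
pages on the next access; other systems may keep the old contents.  the
helper name dontneed_zeroes_anon is made up for illustration.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS and madvise() on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * hypothetical helper: returns 1 if an anonymous private page reads back
 * as all zeroes after madvise(MADV_DONTNEED), 0 if the old contents
 * survive, and -1 if any call fails.
 */
int dontneed_zeroes_anon(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;
    size_t len = (size_t)pagesize;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);                 /* dirty the page */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* referencing the range again causes a page fault; check what we get */
    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }

    munmap(p, len);
    return zeroed;
}
```

calling dontneed_zeroes_anon() from a small main() and printing the result
is enough to see which behavior a given system has.  a return of 0 would
suggest the advice was treated as a hint that leaves the page contents
alone, which is closer to the BSD meaning discussed above.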

i'm interested to hear what big database folks have to say about this.

> By the way, Linux MADV_DONTNEED does some of the things
> msync(MS_INVALIDATE) does but not others (in the implementation --
> ignore the man page).
> 
> Can you explain how the two things differ?  I.e., why does MS_INVALIDATE
> fiddle with swap cache pages.  Does this indicate a bug in your
> MADV_DONTNEED implementation?

for MADV_DONTNEED, i re-used code.  i'm not convinced that it's correct,
though, as i stated when i submitted the patch.  it may abandon swap cache
pages, and there may be some undefined interaction between file truncation
and MADV_DONTNEED.

	- Chuck Lever
--
corporate:	<chuckl@netscape.com>
personal:	<chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project:
	http://www.citi.umich.edu/projects/linux-scalability/
