From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Tue, 21 Mar 2000 02:29:37 +0100 From: Jamie Lokier Subject: MADV_DONTNEED Message-ID: <20000321022937.B4271@pcep-jamie.cern.ch> References: <20000320135939.A3390@pcep-jamie.cern.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: ; from Chuck Lever on Mon, Mar 20, 2000 at 02:09:26PM -0500 Sender: owner-linux-mm@kvack.org Return-Path: To: Chuck Lever Cc: linux-mm@kvack.org List-ID: Hi Chuck About MADV_DONTNEED ------------------- > > In particular, using the name MADV_DONTNEED is a really bad idea. It > > means completely different things on different OSes. For example your > > meaning of MADV_DONTNEED is different to BSD's: a program that assumes > > the BSD behaviour may well crash with your implementation and will > > almost certainly give invalid results if it doesn't crash. > > i'm more concerned about portability from operating systems like Solaris, > because there are many more server applications there than on *BSD that > have been designed to use these interfaces. ... > my preference is for the DU semantic of tossing dirty data instead of > flushing onto backing store, simply because that's what so many > applications expect DONTNEED to do. That's interesting. When I saw MADV_DONTNEED, I immediately assumed it was the natural counterpoint to MADV_WILLNEED. Useful even for sequential accesses, to say "my streaming window has moved beyond this point". Do you agree that a counterpoint to MADV_WILLNEED is useful? The names are so similar, I consider using MADV_DONTNEED to mean "trash this memory" quite misleading. (If there was no MADV_WILLNEED I wouldn't mind). > i'm not saying the *BSD way is wrong, but i think it would be a more > useful compromise to make *BSD functionality available via some other > interface (like MADV_ZERO). You got it the wrong way around. MADV_ZERO is more like what your implementation of MADV_DONTNEED does. The BSD behaviour is nothing like MADV_ZERO. BSD simply means "increment the paging priority" -- the page contents are unchanged. BSD's behaviour is the obvious counterpoint to MADV_WILLNEED afaict. > as far as i can tell, linux's msync(MS_INVALIDATE) behaves like freeBSD's > MADV_DONTNEED. Doesn't look like that. 1. MS_INVALIDATE only works on file mappings -- BSD's MADV_DONTNEED is defined (if you believe the documentation) for any mapping. 2. The msync() manual page doesn't agree with you, but I'm not sure about the implementation. The manual says: MS_INVALIDATE asks to invalidate other mappings of the same file (so that they can be updated with the fresh values just written). The implementation seems to invalidate _this_ mapping. Either way, they are different from BSD's MADV_DONTNEED. 3. Your MADV_DONTNEED does different things to msync(MS_INVALIDATE) Actually I like what MADV_DONTNEED does, but I would like it to have a different name to avoid potentially dangerous ambiguity with BSD's meaning. If Linux MADV_DONTNEED were just a hint it would be fine, but it actively trashes memory. By the way, Linux MADV_DONTNEED does some of the things msync(MS_INVALIDATE) does but not others (in the implementation -- ignore the man page). Can you explain how the two things differ? I.e., why does MS_INVALIDATE fiddle with swap cache pages. Does this indicate a bug in your MADV_DONTNEED implementation? > MADV_ZERO makes sense to me as an efficient way to zero a range of > addresses in a mapping. but i think it's useful as a *separate* function, > not as combined with, say, MADV_DONTNEED. Agreed. I mention DONTNEED only because some OS's documentation of DONTNEED appears to be equivalent to MADV_ZERO. And of course, on a mapping of /dev/zero they are equivalent. To be honest, the MADV_DONTNEED behaviour on private mappings is probably much more useful than zeroing a range anyway. You've always got read(/dev/zero) for the latter. enjoy, -- Jamie -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/