From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 8 Jan 2001 09:29:15 -0800 (PST)
From: Linus Torvalds
Subject: Re: Subtle MM bug
In-Reply-To: <20010108135700.O9321@redhat.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
Return-Path:
To: "Stephen C. Tweedie"
Cc: "David S. Miller", Rik van Riel, Marcelo Tosatti, linux-mm@kvack.org
List-ID:

On Mon, 8 Jan 2001, Stephen C. Tweedie wrote:
>
> > Then, with something like the above, we just try to make sure that we scan
> > the whole virtual memory space every once in a while. Make the "every once
> > in a while" be some simple heuristic like "try to keep the active list to
> > less than 50% of all memory".
>
> ... which will produce an enormous storm of soft page faults for
> workloads involving mmaping large amounts of data or where we have
> a lot of space devoted to anonymous pages, such as static
> computational workloads.

I don't think you'll find that in practice.

It would obviously trigger only under low-memory conditions _anyway_ (we
don't even get into "try_to_free_pages()" unless there is memory
pressure), so I think you're _completely_ off the mark here.

Remember: the thing doesn't require that < 50% of memory is in the page
tables. It only says: if 50% or more of memory is in the page tables, we
will always scan the page tables first when we try to find free pages.

If you have a well-behaving application that doesn't even have memory
pressure, but fills up >50% of memory in its VM, nothing will actually
happen in the steady state. It can have 99% of available memory, and not
a single soft page fault.

But think about what happens if you now start up another application.
And think about what SHOULD happen.

The 50% rule is perfectly fine: if we're starting to swap, we're better
off taking soft page faults that give us a better LRU than letting the MM
scrub the same pages over and over because it effectively only sees a
subset of the total pages (with the mapped pages being "invisible").

The fact is that we absolutely _have_ to do the VM scan in order for the
inactive lists to be at all representative of the state of affairs. If we
just rely on page_launder() and refill_inactive() as the #1 way to get
free pages, we will never consider anything but the pages that are
already on the lists.

Stephen: have you tried the behaviour of a working set that is dirty in
the VM and slightly larger than available RAM? Not pretty. We do _really_
well on many loads, but this one we do badly on. And from what I've been
able to see so far, it's because we're just too damn good at waiting on
page_launder() and doing refill_inactive_scan().

There's another advantage to the 50% rule: if we are under memory
pressure, and somebody is dirtying pages in its VM (which is otherwise an
"invisible" event to the kernel), the 50% rule is much more likely to
mean that we actually _see_ the dirtying, and can slow it down.

		Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org. For more info on Linux MM, see:
http://www.linux.eu.org/Linux-MM/
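
For illustration, here is a minimal C sketch of the "50% rule" described
in the message above. The control flow, the nr_mapped_pages counter, and
the exact signatures are assumptions made for the sketch; the real 2.4
reclaim path is not wired exactly like this, even though swap_out() and
page_launder() are the functions under discussion.

	/* Hypothetical declarations, for the sketch only. */
	extern unsigned long nr_mapped_pages;	/* pages mapped in process page tables */
	extern unsigned long num_physpages;	/* total physical pages */
	extern int swap_out(int priority, unsigned int gfp_mask);
	extern int page_launder(unsigned int gfp_mask, int sync);
	#define SWAP_CLUSTER_MAX 32

	/*
	 * Sketch of the "50% rule": if half or more of memory is mapped
	 * into process page tables, scan the virtual address spaces
	 * first.  Unmapping makes those pages visible on the inactive
	 * lists (at the cost of cheap soft faults if they are still in
	 * use), so the LRU reflects all of memory rather than only the
	 * subset already on the lists.
	 */
	static int shrink_memory(int priority, unsigned int gfp_mask)
	{
		int count = SWAP_CLUSTER_MAX;

		if (nr_mapped_pages * 2 >= num_physpages)
			count -= swap_out(priority, gfp_mask);

		/* Then fall back to the pages already on the lists. */
		if (count > 0)
			count -= page_launder(gfp_mask, 1);

		return count <= 0;
	}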