From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 18 Feb 2002 19:22:55 -0800 (PST)
From: Linus Torvalds
Subject: Re: [RFC] Page table sharing
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Daniel Phillips
Cc: Rik van Riel, Hugh Dickins, dmccr@us.ibm.com, Kernel Mailing List,
	linux-mm@kvack.org, Robert Love, mingo@redhat.co, Andrew Morton,
	manfred@colorfullife.com, wli@holomorphy.com
List-ID:

On Mon, 18 Feb 2002, Linus Torvalds wrote:
>
> We can, of course, introduce a "pmd-rmap" thing, with a pointer to a
> circular list of all mm's using that pmd inside the "struct page *" of the
> pmd. Right now the rmap patches just make the pointer point directly to
> the one exclusive mm that holds the pmd, right?

There's another approach:
 - get rid of "page_table_lock"
 - replace it with a "per-pmd lock"
 - notice that we already _have_ such a lock

The lock we have is the lock that we've always had in "struct page".

There are some interesting advantages from this:
 - we allow even more parallelism from threads across different CPU's.
 - we already have the cacheline for the pmd "struct page" because we
   needed it for the pmd count.

That still leaves the TLB invalidation issue, but we could handle that
with an alternate approach: use the same "free_pte_ctx" kind of gathering
that the zap_page_range() code uses for similar reasons (ie gather up the
pte entries that you're going to free first, and then do a global
invalidate later).

Note that this is likely to speed things up anyway (whether the pages are
gathered by rmap or by the current linear walk), by virtue of being able
to do just _one_ TLB invalidate (potentially cross-CPU) rather than having
to do it once for each page we free.

At that point you might as well make the TLB shootdown global (ie you keep
track of a mask of CPU's whose TLB's you want to kill, and any pmd that
has count > 1 just makes that mask be "all CPU's").
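
[Illustrative sketch, not part of the original message.] The gather-then-flush
idea described above can be modelled roughly as follows in plain userspace C.
This is only a sketch under stated assumptions: the names `struct gather`,
`gather_init`, `gather_page`, `gather_finish`, `NR_CPUS`, `ALL_CPUS` and
`GATHER_MAX` are hypothetical and are not the kernel's "free_pte_ctx"
interface, and the actual TLB shootdown and page freeing are reduced to a
counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS    8                                  /* assumed CPU count */
#define ALL_CPUS   ((uint32_t)((1u << NR_CPUS) - 1))  /* mask of every CPU */
#define GATHER_MAX 64                                 /* assumed batch size */

/* Hypothetical batch of pages whose ptes were cleared but not yet flushed. */
struct gather {
    uint32_t  flush_mask;         /* CPUs whose TLBs must be invalidated */
    size_t    nr;                 /* number of pages gathered so far */
    uintptr_t pages[GATHER_MAX];  /* pages to free once the flush is done */
    unsigned  flushes;            /* shootdowns issued, for illustration only */
};

static inline void gather_init(struct gather *g)
{
    g->flush_mask = 0;
    g->nr = 0;
    g->flushes = 0;
}

/*
 * Record one page for later freeing. pmd_count is the share count of the
 * pmd the pte lived in; mm_cpus is the CPU mask of the mm doing the unmap.
 * A shared pmd (count > 1) widens the mask to all CPUs, as the text suggests.
 */
static inline void gather_page(struct gather *g, int pmd_count,
                               uint32_t mm_cpus, uintptr_t page)
{
    assert(g->nr < GATHER_MAX);
    g->pages[g->nr++] = page;
    if (pmd_count > 1)
        g->flush_mask = ALL_CPUS;
    else
        g->flush_mask |= mm_cpus;
}

/*
 * Issue a single (simulated) shootdown for the whole batch and empty it.
 * Returns the mask that was flushed. A real kernel would invalidate the
 * TLBs of the CPUs in the mask here and only then free g->pages[].
 */
static inline uint32_t gather_finish(struct gather *g)
{
    uint32_t mask = g->flush_mask;

    if (g->nr > 0)
        g->flushes++;
    g->nr = 0;
    g->flush_mask = 0;
    return mask;
}
```

In this model, freeing any number of pages costs one call to
`gather_finish()`, and therefore one shootdown, and a single page from a
pmd with count > 1 is enough to turn that shootdown into a global one.
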

I'm a bit worried about the "lock each mm on the pmd-rmap list" approach,
because I think we need to lock them _all_ to be safe (as opposed to
locking them one at a time), which always implies all the nasty potential
deadlocks you get for doing multiple locking.

The "page-lock + potentially one global TLB flush" approach looks a lot
safer in this respect.

		Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/