From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233])
	by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id NAA21597
	for ; Wed, 13 Nov 2002 13:40:30 -0800 (PST)
Message-ID: <3DD2C6CC.F831A6F2@digeo.com>
Date: Wed, 13 Nov 2002 13:40:28 -0800
From: Andrew Morton
MIME-Version: 1.0
Subject: Re: 2.5.47-mm2
References: <3DD21113.B4F3857@digeo.com> <20021113091116.GG23425@holomorphy.com> <3DD287EF.DCBFB5D0@digeo.com> <20021113212252.GW22031@holomorphy.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: William Lee Irwin III
Cc: linux-mm@kvack.org
List-ID:

William Lee Irwin III wrote:
>
> On Wed, Nov 13, 2002 at 12:45:07AM -0800, Andrew Morton wrote:
> >>> page-reservation.patch
> >>>   Page reservation API
>
> William Lee Irwin III wrote:
> >> Don't drop it yet, I've got a caller of this on the back burner.
>
> On Wed, Nov 13, 2002 at 09:12:15AM -0800, Andrew Morton wrote:
> > Well so have I.  Right now, if pte_chain_alloc() fails the
> > kernel oopses.
>
> That's the one. I keep choking on mm/slab.c though. =(
>

Well my plan here is to go to all code paths which end up allocating
a pte chain and do:

	reserve_local_pages(GFP_KERNEL, 2);
	spin_lock(some_lock);
	pte_alloc_map();		/* That's one */
	pte_chain_alloc();		/* That's two */
	spin_unlock(some_lock);
	release_local_pages(GFP_KERNEL, 2);

When you're inside reserve_local_pages(), you are running atomically:
preempt is disabled.  Because the reserved pages are per-cpu.

Consequently all those pagetable allocation functions can no longer
use GFP_KERNEL and they can not have their sleep-and-try-again stuff.
They must be atomic.  That's why the above code reserved a page for them
too.

This assumes that every architecture's pagetable allocation code only
uses zero-order pages.  If that's not true I am screwed.

Only allocations which use __GFP_RESERVE may dip into those pages.

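
To make the reservation rules above concrete, here is a minimal
userspace sketch of the idea.  It is not the code from
page-reservation.patch: every toy_* name, the pool layout, and the use
of malloc() in place of the page allocator are assumptions made purely
for illustration.  It models one CPU's pool, fills it while sleeping is
still allowed, and then lets only reserve-flagged atomic allocations
dip into it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_RESERVE_MAX	8
#define TOY_PAGE_SIZE	4096
#define TOY_GFP_RESERVE	0x1u	/* hypothetical stand-in for __GFP_RESERVE */

/* Toy model of one CPU's reserved-page pool (hypothetical layout). */
struct toy_cpu_reserve {
	void *pages[TOY_RESERVE_MAX];
	int nr_pages;		/* pages currently held in the pool */
	int preempt_count;	/* nonzero models "atomic, must not sleep" */
};

/*
 * Top the pool up to nr pages while sleeping is still allowed, then
 * enter the atomic section.  On failure the caller never goes atomic;
 * pages gathered so far simply stay in the pool.
 */
static bool toy_reserve_local_pages(struct toy_cpu_reserve *r, int nr)
{
	if (nr > TOY_RESERVE_MAX)
		return false;
	while (r->nr_pages < nr) {
		void *p = malloc(TOY_PAGE_SIZE); /* models a sleeping allocation */
		if (!p)
			return false;
		r->pages[r->nr_pages++] = p;
	}
	r->preempt_count++;	/* models disabling preemption */
	return true;
}

/*
 * Atomic zero-order allocation.  Simplification: this toy has no
 * ordinary free lists, so a caller without the reserve flag just fails
 * instead of trying a normal non-sleeping allocation first.
 */
static void *toy_alloc_page_atomic(struct toy_cpu_reserve *r, unsigned flags)
{
	if (!(flags & TOY_GFP_RESERVE) || r->nr_pages == 0)
		return NULL;
	return r->pages[--r->nr_pages];
}

/* Leave the atomic section and free whatever was not consumed. */
static void toy_release_local_pages(struct toy_cpu_reserve *r)
{
	while (r->nr_pages > 0)
		free(r->pages[--r->nr_pages]);
	r->preempt_count--;	/* only call after a successful reserve */
}
```

A caller would reserve two pages, take both with TOY_GFP_RESERVE inside
the atomic section (one for the pagetable page, one for the pte chain),
and release afterwards.  A third reserve-flagged allocation, or any
allocation without the flag, gets NULL in this model.
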
With this we _could_ take out all the (nasty) dropping of page_table_lock
everywhere where we allocate a pagetable page.  But I figured I'd keep
that there because it works, and memsetting a whole page while holding
page_table_lock is unfriendly.

A similar bunch-o-crap needs to be done for ratnode allocations.

It isn't going to be pretty, but I haven't really been able to come up
with anything better.  A per-task reserved page pool would not be very
good - either we pin boatloads of memory or we do tons more allocations
and frees than necessary...

What do you think?