From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <437E2F22.6000809@argo.co.il>
Date: Fri, 18 Nov 2005 21:44:34 +0200
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [RFC][PATCH 0/8] Critical Page Pool
References: <437E2C69.4000708@us.ibm.com>
In-Reply-To: <437E2C69.4000708@us.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Matthew Dobson
Cc: linux-kernel@vger.kernel.org, Linux Memory Management
List-ID:

Matthew Dobson wrote:

>We have a clustering product that needs to be able to guarantee that the
>networking system won't stop functioning in the case of OOM/low memory
>condition. The current mempool system is inadequate because to keep the
>whole networking stack functioning, we need more than 1 or 2 slab caches to
>be guaranteed. We need to guarantee that any request made with a specific
>flag will succeed, assuming of course that you've made your "critical page
>pool" big enough.
>
>The following patch series implements such a critical page pool. It
>creates 2 userspace triggers:
>
>/proc/sys/vm/critical_pages: write the number of pages you want to reserve
>for the critical pool into this file
>
>/proc/sys/vm/in_emergency: write a non-zero value to tell the kernel that
>the system is in an emergency state and authorize the kernel to dip into
>the critical pool to satisfy critical allocations.
>
>We mark critical allocations with the __GFP_CRITICAL flag, and when the
>system is in an emergency state, we are allowed to delve into this pool to
>satisfy __GFP_CRITICAL allocations that cannot be satisfied through the
>normal means.
>
>
>

1. If you have two subsystems which allocate critical pages, how do you
protect against the condition where one subsystem allocates all the
critical memory, causing the second to oom?

2. There already exists a critical pool: ordinary allocations fail if
free memory is below some limit, but special processes (kswapd) can
allocate that memory by setting PF_MEMALLOC. Perhaps this should be
extended, possibly with a per-process threshold.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
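
To make point 2 concrete, here is a rough userspace sketch of what a
per-process threshold might look like. It is a toy model, not kernel
code: struct toy_zone, struct toy_task, reserve_floor and
toy_alloc_pages() are all hypothetical names invented for illustration,
and the memalloc field merely stands in for PF_MEMALLOC. The assumption
is that each privileged task gets its own floor, so one user of the
reserve cannot drain it completely and starve another (the concern in
point 1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical. */
struct toy_zone {
    size_t free_pages;     /* pages currently free */
    size_t min_watermark;  /* ordinary allocations may not go below this */
};

struct toy_task {
    bool memalloc;         /* stands in for PF_MEMALLOC */
    size_t reserve_floor;  /* assumed per-process threshold: pages that
                              must remain free after this task allocates */
};

/*
 * Try to take n pages from the zone on behalf of task t.
 * Ordinary tasks stop at min_watermark. A memalloc task may dip below
 * it, but only down to its own reserve_floor.
 */
bool toy_alloc_pages(struct toy_zone *z, const struct toy_task *t, size_t n)
{
    size_t floor = z->min_watermark;

    /* A privileged task gets a lower floor, never a higher one. */
    if (t->memalloc && t->reserve_floor < floor)
        floor = t->reserve_floor;

    if (n > z->free_pages)
        return false;              /* not enough pages at all */
    if (z->free_pages - n < floor)
        return false;              /* would eat into pages this task may not touch */

    z->free_pages -= n;
    return true;
}
```

With a watermark of 20 pages, a task whose floor is 10 can use at most
half of the reserve, while a kswapd-like task with a floor of 0 can
still make progress after that task has been refused.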