From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 30 Jul 2007 17:01:38 -0700
From: Ravikiran G Thirumalai
Subject: Re: [rfc] [patch] mm: zone_reclaim fix for pseudo file systems
Message-ID: <20070731000138.GA32468@localdomain>
References: <20070727232753.GA10311@localdomain> <20070730132314.f6c8b4e1.akpm@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20070730132314.f6c8b4e1.akpm@linux-foundation.org>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Andrew Morton
Cc: linux-mm@kvack.org, Christoph Lameter, shai@scalex86.org
List-ID:

On Mon, Jul 30, 2007 at 01:23:14PM -0700, Andrew Morton wrote:
>On Fri, 27 Jul 2007 16:27:53 -0700
>Ravikiran G Thirumalai wrote:
>
>> Don't go into zone_reclaim if there are no reclaimable pages.
>>
>> While using RAMFS as scratch space for some tests, we found one of the
>> processes got into zone reclaim, and got stuck trying to reclaim pages
>> from a zone.
>
>Would like to see an expanded definition of "stuck", please ;)

Well, we were running a multiprocess finite element analysis HPC benchmark,
and one of the processes went into 'system' and the benchmark never
completed. Of course this happens only when we use ramfs for scratch IO.
What I mean is, on invoking 'top', we could see that one of the processes was
spending all its time in system - 100% system, for a compute benchmark which
should not be spending any time in the system at all.

>
>ie: let's see the bug report before we see the fix?
>
>> On examination of the code, we found that the VM was fooled
>> into believing that the zone had reclaimable pages, when it actually had
>> RAMFS backed pages, which could not be written back to the disk.
>>
>> Fix this by adding a zvc "NR_PSEUDO_FS_PAGES" for file pages with no
>> backing store, and using this counter to determine if reclaim is possible.
>>
>> Patch tested,on 2.6.22. Fixes the above mentioned problem.
>
>The (cheesy) way in which reclaim currently handles this sort of thing is
>to scan like mad, then to eventually set zone->all_unreclaimable. Once
>that has been set, the kernel will reduce the amount of scanning effort it
>puts into that zone by a very large amount. If the zone later comes back
>to life, all_unreclaimable gets cleared and things proceed as normal.

I see. But this obviously does not work in this case. I have noticed the
process getting into 'system' and staying there for hours. I have never
noticed the app complete. Perhaps because I did not wait long enough.
So do you think a more aggressive auto setting/unsetting of
'all_unreclaimable' is a better approach?

> ...
>It is a numa-specific change which adds overhead to non-NUMA builds :(

I can (and will) place it with other NUMA specific counters, so the
non-NUMA builds will not have any overhead.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
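
For readers following the thread, the check described in the quoted patch
summary (skip zone reclaim when every file page in the zone belongs to a
pseudo file system) can be sketched roughly as below. This is a standalone
userspace illustration, not the posted patch: struct zone_sketch, its two
fields, and both function names are hypothetical stand-ins, and only the
counter name NR_PSEUDO_FS_PAGES comes from the message itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-zone counters; not the kernel's struct zone. */
struct zone_sketch {
    unsigned long nr_file_pages;      /* all file-backed pages in the zone */
    unsigned long nr_pseudo_fs_pages; /* file pages with no backing store,
                                         e.g. ramfs (cf. NR_PSEUDO_FS_PAGES) */
};

/* File pages that reclaim could actually drop or write back. */
unsigned long reclaimable_file_pages(const struct zone_sketch *z)
{
    /* Guard against counter skew so the subtraction cannot wrap. */
    if (z->nr_pseudo_fs_pages >= z->nr_file_pages)
        return 0;
    return z->nr_file_pages - z->nr_pseudo_fs_pages;
}

/* True only when entering zone reclaim has a chance of making progress. */
bool should_try_zone_reclaim(const struct zone_sketch *z)
{
    return reclaimable_file_pages(z) > 0;
}

/* Small self-check covering the ramfs-only case from the report. */
void zone_reclaim_sketch_selftest(void)
{
    struct zone_sketch ramfs_only = { 1000, 1000 };
    struct zone_sketch mixed = { 1000, 400 };

    assert(!should_try_zone_reclaim(&ramfs_only));
    assert(should_try_zone_reclaim(&mixed));
    assert(reclaimable_file_pages(&mixed) == 600);
}
```

With a zone holding only ramfs pages the sketch reports nothing to reclaim,
so a caller would fall through to another zone instead of scanning. Whether
the real check belongs here or in a more aggressive all_unreclaimable
heuristic is the open question in the thread.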