From mboxrd@z Thu Jan 1 00:00:00 1970
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [PATCH 00 of 24] OOM related fixes
Date: Wed, 22 Aug 2007 14:48:47 +0200
From: Andrea Arcangeli
Sender: owner-linux-mm@kvack.org
To: linux-mm@kvack.org
Cc: David Rientjes

This is a set of fixes done in the context of a quite evil workload:
many tasks reading large files from NFS in parallel, with big read
buffers, until the system goes OOM. Almost all of these fixes seem to be
required to fix the customer workload on top of an older SLES kernel.
The forward port of the fixes has already been tested successfully on
similar evil workloads.

The OOM deadlock detection triggers a couple of times against the
PG_locked deadlock:

Jun 8 13:51:19 kvm kernel: Killed process 3504 (recursive_readd)
Jun 8 13:51:19 kvm kernel: detected probable OOM deadlock, so killing another task
Jun 8 13:51:19 kvm kernel: Out of memory: kill process 3532 (recursive_readd) score 1225 or a child

Example stack trace of a TIF_MEMDIE-killed task (I have not literally
verified that this was the one with TIF_MEMDIE set, but the trace is the
same as the one I verified earlier):

recursive_rea D ffff810001056418 0 3548 3544 (NOTLB)
 ffff81000e57dba8 0000000000000082 ffff8100010af5e8 ffff8100148df730
 ffff81001ff3ea10 0000000000bd2e1b ffff8100148df908 0000000000000046
 ffff81001fd5f170 ffffffff8031c36d ffff81001fd5f170 ffff810001056418
Call Trace:
 [] __generic_unplug_device+0x13/0x24
 [] sync_page+0x0/0x40
 [] io_schedule+0xf/0x17
 [] sync_page+0x3b/0x40
 [] __wait_on_bit_lock+0x36/0x65
 [] __lock_page+0x5e/0x64
 [] wake_bit_function+0x0/0x23
 [] find_get_page+0xe/0x40
 [] do_generic_mapping_read+0x200/0x450
 [] file_read_actor+0x0/0x11d
 [] get_page_from_freelist+0x2d3/0x36e
 [] generic_file_aio_read+0x11d/0x159
 [] do_sync_read+0xc9/0x10c
 [] vma_merge+0x10c/0x195
 [] autoremove_wake_function+0x0/0x2e
 [] do_mmap_pgoff+0x5e1/0x74c
 [] vfs_read+0xaa/0x132
 [] sys_read+0x45/0x6e
 [] system_call+0x7e/0x83

At the end I merged David Rientjes's patches to adapt cpuset OOM killing
to the new changes and to further improve it.

One patch (remove_nr_scan) is controversial and can be deferred, though
I guess that if it slows down AIM we should fix that in some other way
rather than by leaving the patch out. I'll do some local testing with
AIM soon.