I have seen some OOM-killer action on my s390x system when using large
amounts of anonymous memory:

[cborntra@t63lp34 ~]$ cat memeat.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main()
{
	char *start;
	char *a;

	start = mmap(NULL, 4300000000UL, PROT_READ | PROT_WRITE,
		     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (start == MAP_FAILED) {
		printf("cannot map guest memory\n");
		exit(1);
	}
	for (a = start; a < start + 4300000000UL; a += 4096)
		*a = 'a';
	exit(0);
}
[cborntra@t63lp34 ~]$ ./memeat
Connection to t63lp34 closed.

I attached the dmesg with the OOM messages. As you can see, we are failing
several order-0 allocations with gfpmask=0x201da. The application uses
slightly more memory than is available. The thing is that there is plenty
of swap space to fulfill the (non-atomic) request:

[cborntra@t63lp34 ~]$ free
             total       used       free     shared    buffers     cached
Mem:       4166560     127148    4039412          0       2256      19752
-/+ buffers/cache:     105140    4061420
Swap:      9615904       8328    9607576

Since old kernels never showed OOM, I was able to bisect the first kernel
that shows this behaviour:

commit 8cab4754d24a0f2e05920170c845bd84472814c6
Author: Wu Fengguang

    vmscan: make mapped executable pages the first class citizen

In fact, applying this patch makes the problem go away:

--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -1345,22 +1345,8 @@ static void shrink_active_list(unsigned
 
 		/* page_referenced clears PageReferenced */
 		if (page_mapping_inuse(page) &&
-		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
+		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags))
 			nr_rotated++;
-			/*
-			 * Identify referenced, file-backed active pages and
-			 * give them one more trip around the active list. So
-			 * that executable code get better chances to stay in
-			 * memory under moderate memory pressure. Anon pages
-			 * are not likely to be evicted by use-once streaming
-			 * IO, plus JVM can create lots of anon VM_EXEC pages,
-			 * so we ignore them here.
-			 */
-			if ((vm_flags & VM_EXEC) && !PageAnon(page)) {
-				list_add(&page->lru, &l_active);
-				continue;
-			}
-		}
 
 		ClearPageActive(page);	/* we are de-activating */
 		list_add(&page->lru, &l_inactive);

The interesting part is that s390x in the default configuration has no
no-execute feature, resulting in the following map:

c0000000-1c04cd000 rwxs 00000000 00:04 18517 /dev/zero (deleted)

As you can see, this area looks file mapped (/dev/zero) and executable.
On the other hand, the !PageAnon clause should cover this case. I am lost.

Does anybody on the CC (taken from the original patch) have an idea what
the problem is and how to fix this properly?

Christian