linux-mm.kvack.org archive mirror
From: Miao Xie <miaox@cn.fujitsu.com>
To: David Rientjes <rientjes@google.com>,
	Nick Piggin <npiggin@suse.de>, Paul Menage <menage@google.com>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: [PATCH 0/2] fix oom happening when changing cpuset's mems (was: [regression] cpuset,mm: update tasks' mems_allowed in time (58568d2))
Date: Thu, 22 Apr 2010 22:11:12 +0800
Message-ID: <4BD05900.7040203@cn.fujitsu.com>


Nick Piggin reported that the allocator may see an empty nodemask when
changing cpuset's mems.

The problem is as follows:
Cpuset updates task->mems_allowed and the mempolicy by first setting all
new bits in the nodemask and then clearing all old, disallowed bits.
But the allocator may load one word of the mask before the new bits are
set and another word after the old, disallowed bits are cleared. In that
case the allocator sees an empty nodemask.

This happens only on kernels that cannot store a nodemask_t atomically
(MAX_NUMNODES > BITS_PER_LONG).
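
The interleaving described above can be replayed deterministically in
user space. This is only a sketch under stated assumptions: struct mask
is a hypothetical two-word stand-in for nodemask_t, the function name is
made up for illustration, and the real race needs two CPUs rather than
one thread stepping through both sides.

```c
#include <assert.h>

/* Hypothetical two-word mask standing in for nodemask_t on a kernel
 * with MAX_NUMNODES > BITS_PER_LONG. Not the kernel's type. */
#define MASK_WORDS 2
struct mask { unsigned long bits[MASK_WORDS]; };

/* Replay one fixed interleaving of reader (allocator) and writer
 * (cpuset update). Returns 1 if the reader's snapshot is empty. */
int torn_read_sees_empty(void)
{
    struct mask cur = { { 0UL, 1UL } };        /* old mems: a node in word 1 */
    const struct mask next = { { 1UL, 0UL } }; /* new mems: node 0 (word 0) */
    struct mask seen;
    int i;

    seen.bits[0] = cur.bits[0];        /* reader loads word 0: still empty */

    for (i = 0; i < MASK_WORDS; i++)   /* writer: set all new bits */
        cur.bits[i] |= next.bits[i];
    assert(cur.bits[0] | cur.bits[1]); /* the real mask is not empty here */

    for (i = 0; i < MASK_WORDS; i++)   /* writer: clear old disallowed bits */
        cur.bits[i] &= next.bits[i];
    assert(cur.bits[0] | cur.bits[1]); /* nor here */

    seen.bits[1] = cur.bits[1];        /* reader loads word 1: now cleared */

    return seen.bits[0] == 0 && seen.bits[1] == 0;
}
```

The writer's copy of the mask is never empty at any step, yet the
reader's word-by-word snapshot is, so torn_read_sees_empty() returns 1.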

But I found that there is also a problem on kernels that can store a
nodemask_t atomically: the allocator can't find a node to allocate a page
from while the cpuset's mems is being changed, even though there is a lot
of free memory.

I can reproduce it with the attached program by the following steps:
# mkdir /dev/cpuset
# mount -t cpuset cpuset /dev/cpuset
# mkdir /dev/cpuset/1
# echo `cat /dev/cpuset/cpus` > /dev/cpuset/1/cpus
# echo `cat /dev/cpuset/mems` > /dev/cpuset/1/mems
# echo $$ > /dev/cpuset/1/tasks
# numactl --membind=`cat /dev/cpuset/mems` ./cpuset_mem_hog <nr_tasks> &
   <nr_tasks> = max(nr_cpus - 1, 1)
# killall -s SIGUSR1 cpuset_mem_hog
# ./change_mems.sh

Several hours later, oom will happen even though there is a lot of free
memory.

The problem is as follows:
    task1                             task2
    mmap()                            mems=1
      Can alloc page on node0? NO     mems=1
                                      mems=0    change mems from 1 to 0
                                      mems=0-1    set all new bits
                                      mems=0      clear all disallowed bits
      Can alloc page on node1? NO     mems=0
      ...
    can't alloc page
      goto oom
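
The same sequence can be replayed in a few lines of C. This is a sketch
only: mems is a hypothetical one-word stand-in for task->mems_allowed,
both function names are made up for illustration, and the two tasks are
stepped through by hand in a single thread instead of actually racing.

```c
#include <assert.h>

/* Hypothetical one-word stand-in for task1's mems_allowed; every
 * single store to it is atomic, as on MAX_NUMNODES <= BITS_PER_LONG. */
static unsigned long mems;

static int node_allowed(int node)
{
    return (int)((mems >> node) & 1UL);
}

/* Replay the table: task1 walks nodes 0 and 1 while task2 changes
 * mems from 1 to 0. Returns the chosen node, or -1 for the oom path. */
int alloc_during_rebind(void)
{
    mems = 1UL << 1;        /* mems=1 */
    if (node_allowed(0))    /* task1: can alloc on node0? NO */
        return 0;
    mems |= 1UL << 0;       /* task2: set all new bits, mems=0-1 */
    mems &= 1UL << 0;       /* task2: clear disallowed bits, mems=0 */
    if (node_allowed(1))    /* task1: can alloc on node1? NO */
        return 1;
    return -1;              /* no node found: goto oom */
}

/* The same walk with no concurrent change finds node 1. */
int alloc_without_rebind(void)
{
    int node;

    mems = 1UL << 1;        /* mems=1 and it stays that way */
    for (node = 0; node < 2; node++)
        if (node_allowed(node))
            return node;
    return -1;
}
```

alloc_during_rebind() returns -1 although the mask holds exactly one
allowed node at every instant, while alloc_without_rebind() returns 1.
The failure comes from task1 testing each node against a different
version of the mask, not from a torn load.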

This patchset fixes both problems.

Thanks
Miao

[-- Attachment #2: reproduce_prog.tar.gz --]
[-- Type: application/gzip, Size: 1190 bytes --]
