From: David Rientjes <rientjes@google.com>
To: Alex Thorlton <athorlton@sgi.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, mgorman@suse.de, riel@redhat.com,
kirill.shutemov@linux.intel.com, mingo@kernel.org,
hughd@google.com, lliubbo@gmail.com, hannes@cmpxchg.org,
srivatsa.bhat@linux.vnet.ibm.com, dave.hansen@linux.intel.com,
dfults@sgi.com, hedi@sgi.com
Subject: Re: [BUG] THP allocations escape cpuset when defrag is off
Date: Wed, 23 Jul 2014 16:05:36 -0700 (PDT)
Message-ID: <alpine.DEB.2.02.1407231600110.1389@chino.kir.corp.google.com>
In-Reply-To: <20140723225742.GU8578@sgi.com>

On Wed, 23 Jul 2014, Alex Thorlton wrote:
> > It's also been a long-standing issue that cpusets and mempolicies are
> > ignored by khugepaged that allows memory to be migrated remotely to nodes
> > that are not allowed by a cpuset's mems or a mempolicy's nodemask. Even
> > with this issue fixed, you may find that some memory is migrated remotely,
> > although it may be negligible, by khugepaged.
>
> A bit here and there is manageable. There is, of course, some work to
> be done there, but for now we're mainly concerned with a job that's
> supposed to be confined to a cpuset spilling out and soaking up all the
> memory on a machine.
>
You may find my patch[*] in -mm to be helpful if you enable
zone_reclaim_mode. It changes khugepaged so that it is not allowed to
migrate any memory to a remote node where the distance between the nodes
is greater than RECLAIM_DISTANCE.
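
To make the distance check concrete, here is a rough user-space sketch of the idea, not the code from the patch itself. The distance table, NR_NODES, and the sketch_* helpers are hypothetical names made up for illustration; RECLAIM_DISTANCE is hardcoded to the x86 value of 30 mentioned in this mail, and a local distance of 10 follows the usual SLIT convention.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: hardcoded to the x86 value quoted in this mail. */
#define RECLAIM_DISTANCE 30
/* Hypothetical machine with four NUMA nodes. */
#define NR_NODES 4

/* Hypothetical SLIT-style distance table; 10 means local access. */
static const int slit[NR_NODES][NR_NODES] = {
    {10, 20, 40, 40},
    {20, 10, 40, 40},
    {40, 40, 10, 20},
    {40, 40, 20, 10},
};

/* User-space stand-in for a node distance lookup. */
static int sketch_node_distance(int from, int to)
{
    assert(from >= 0 && from < NR_NODES);
    assert(to >= 0 && to < NR_NODES);
    return slit[from][to];
}

/*
 * Return true if collapsing a hugepage onto `target` is acceptable,
 * given that the pages to be collapsed currently live on the nodes
 * flagged in `has_pages`. With zone reclaim disabled, no distance
 * restriction is applied.
 */
static bool sketch_collapse_allowed(const bool has_pages[NR_NODES],
                                    int target, bool zone_reclaim_enabled)
{
    if (!zone_reclaim_enabled)
        return true;

    for (int nid = 0; nid < NR_NODES; nid++) {
        if (has_pages[nid] &&
            sketch_node_distance(nid, target) > RECLAIM_DISTANCE)
            return false; /* would move memory too far away */
    }
    return true;
}
```

With this table, pages on node 0 could be collapsed onto node 1 (distance 20) but not onto node 2 (distance 40) while zone reclaim is enabled.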

These issues are still pending and we've encountered a couple of them in
the past few weeks ourselves. The definition of RECLAIM_DISTANCE, currently
30 for x86, relies on the SLIT to define when remote access is costly,
and there are cases where people need to alter the BIOS to work around
this definition.

We can hope that NUMA balancing will solve a lot of these problems for us,
but there's always a chance that the VM does something totally wrong, which
you've undoubtedly encountered already.

[*] http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-only-collapse-hugepages-to-nodes-with-affinity-for-zone_reclaim_mode.patch