From: Michal Hocko <mhocko@suse.com>
To: Feng Tang <feng.tang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Mel Gorman <mgorman@suse.de>,
dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node
Date: Wed, 4 Nov 2020 08:13:08 +0100 [thread overview]
Message-ID: <20201104071308.GN21990@dhcp22.suse.cz> (raw)
In-Reply-To: <1604470210-124827-1-git-send-email-feng.tang@intel.com>
On Wed 04-11-20 14:10:08, Feng Tang wrote:
> Hi,
>
> This patchset reports a problem and asks for suggestions/review on
> the RFC fix patches.
>
> We recently got an OOM report: when a user binds a docker (container)
> instance to a memory node which only has movable zones, page allocation
> failures occur that even OOM killing cannot resolve.
This is a cpuset node binding, right?
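For context, here is a minimal sketch of one way such a node binding can be
requested from userspace. The node number and the direct set_mempolicy()
call are illustrative assumptions only; a container runtime would normally
achieve the equivalent through the cpuset cgroup's cpuset.mems file:

	#include <numaif.h>	/* set_mempolicy(), MPOL_BIND; link with -lnuma */

	/* bind all further allocations of this task to (assumed) node 1 */
	static void bind_to_node1(void)
	{
		unsigned long nodemask = 1UL << 1;	/* one bit per node id */

		set_mempolicy(MPOL_BIND, &nodemask, 8 * sizeof(nodemask));
	}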
> The callstack was:
>
> [ 1387.877565] runc:[2:INIT] invoked oom-killer: gfp_mask=0x500cc2(GFP_HIGHUSER|__GFP_ACCOUNT), order=0, oom_score_adj=0
> [ 1387.877568] CPU: 8 PID: 8291 Comm: runc:[2:INIT] Tainted: G W I E 5.8.2-0.g71b519a-default #1 openSUSE Tumbleweed (unreleased)
> [ 1387.877569] Hardware name: Dell Inc. PowerEdge R640/0PHYDR, BIOS 2.6.4 04/09/2020
> [ 1387.877570] Call Trace:
> [ 1387.877579] dump_stack+0x6b/0x88
> [ 1387.877584] dump_header+0x4a/0x1e2
> [ 1387.877586] oom_kill_process.cold+0xb/0x10
> [ 1387.877588] out_of_memory.part.0+0xaf/0x230
> [ 1387.877591] out_of_memory+0x3d/0x80
> [ 1387.877595] __alloc_pages_slowpath.constprop.0+0x954/0xa20
> [ 1387.877599] __alloc_pages_nodemask+0x2d3/0x300
> [ 1387.877602] pipe_write+0x322/0x590
> [ 1387.877607] new_sync_write+0x196/0x1b0
> [ 1387.877609] vfs_write+0x1c3/0x1f0
> [ 1387.877611] ksys_write+0xa7/0xe0
> [ 1387.877617] do_syscall_64+0x52/0xd0
> [ 1387.877621] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The meminfo log only shows the movable-only node, which has plenty
> of free memory. And in our reproduction with patch 1/2, the normal
> node (with DMA/DMA32/Normal zones) also has a lot of free memory when
> the OOM happens.
OK, so you are binding to a movable-only node and your above request is
for GFP_HIGHUSER, which _cannot_ be satisfied from the movable zones
because that allocation is not movable. So the system behaves as expected.
Your cpuset is misconfigured IMHO. Movable-only nodes come with their own
risk and configuration price.
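To make the zone selection explicit: GFP_HIGHUSER does not carry
__GFP_MOVABLE, so the allocator never considers ZONE_MOVABLE for it. A
simplified sketch of that ceiling (the real kernel derives it through the
gfp_zone() table lookup, not code like this):

	/* simplified: which is the highest zone a gfp mask may use? */
	static enum zone_type highest_allowed_zone(gfp_t gfp_mask)
	{
		if (gfp_mask & __GFP_MOVABLE)
			return ZONE_MOVABLE;	/* e.g. GFP_HIGHUSER_MOVABLE */
		/*
		 * GFP_HIGHUSER, as in the report above, stops at
		 * ZONE_NORMAL (ZONE_HIGHMEM on 32-bit), so a node made up
		 * of ZONE_MOVABLE only has nothing to offer it.
		 */
		return ZONE_NORMAL;
	}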
> If we hack around it so that this (GFP_HIGHUSER|__GFP_ACCOUNT) request
> gets a page, the following full docker run (e.g. installing and running
> the 'stress-ng' stress test) will see more allocation failures from
> different kinds of requests (gfp_masks). Patch 2/2 detects the case
> where the allowed target nodes only have movable zones and loosens the
> binding check; otherwise an OOM is triggered even though the OOM killer
> cannot help, as the problem is not a lack of free memory.
Well, this breaks the cpuset containment, right? I consider this quite
unexpected for something that looks like a misconfiguration. I do agree
that this is surprising to anybody who is not really familiar with the
concept of the movable zone, but we should probably call out all these
details rather than tweak the existing semantics.
Could you be more specific about the use case here? Why do you need a
binding to a pure movable node?
--
Michal Hocko
SUSE Labs
Thread overview: 26+ messages
2020-11-04 6:10 Feng Tang
2020-11-04 6:10 ` [RFC PATCH 1/2] mm, oom: dump meminfo for all memory nodes Feng Tang
2020-11-04 7:18 ` Michal Hocko
2020-11-04 6:10 ` [RFC PATCH 2/2] mm, page_alloc: loose the node binding check to avoid helpless oom killing Feng Tang
2020-11-04 7:23 ` Michal Hocko
2020-11-04 7:13 ` Michal Hocko [this message]
2020-11-04 7:38 ` [RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node Feng Tang
2020-11-04 7:58 ` Michal Hocko
2020-11-04 8:40 ` Feng Tang
2020-11-04 8:53 ` Michal Hocko
[not found] ` <20201105014028.GA86777@shbuild999.sh.intel.com>
2020-11-05 12:08 ` Michal Hocko
2020-11-05 12:53 ` Vlastimil Babka
2020-11-05 12:58 ` Michal Hocko
2020-11-05 13:07 ` Feng Tang
2020-11-05 13:12 ` Michal Hocko
2020-11-05 13:43 ` Feng Tang
2020-11-05 16:16 ` Michal Hocko
2020-11-06 7:06 ` Feng Tang
2020-11-06 8:10 ` Michal Hocko
2020-11-06 9:08 ` Feng Tang
2020-11-06 10:35 ` Michal Hocko
2020-11-05 13:14 ` Vlastimil Babka
2020-11-05 13:19 ` Michal Hocko
2020-11-05 13:34 ` Vlastimil Babka
2020-11-06 4:32 ` Huang, Ying
2020-11-06 7:43 ` Michal Hocko