From: Mark Hills <mark@xwax.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
Subject: Re: ps lockups, cgroup memory reclaim
Date: Wed, 18 Sep 2013 01:50:05 +0100 (BST)
Message-ID: <1309180141270.29932@wes.ijneb.com>
In-Reply-To: <20130917162807.GF3278@cmpxchg.org>
On Tue, 17 Sep 2013, Johannes Weiner wrote:
> On Tue, Sep 17, 2013 at 04:50:42PM +0100, Mark Hills wrote:
> > I'm investigating intermittent kernel lockups in an HPC environment, with
> > the Red Hat kernel.
> >
> > The symptoms are seen as lockups of multiple ps commands, with one
> > consuming full CPU:
> >
> > # ps aux | grep ps
> > root 19557 68.9 0.0 108100 908 ? D Sep16 1045:37 ps --ppid 1 -o args=
> > root 19871 0.0 0.0 108100 908 ? D Sep16 0:00 ps --ppid 1 -o args=
> >
> > SIGKILL on the busy one causes the other ps processes to run to completion
> > (TERM has no effect).
>
> Any chance you can get to the stack of the non-busy blocked tasks?
>
> It would be /proc/19871/stack in this case.
I had to return the machine above to the cluster, but next time I'll log
this information.
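Next time, a small loop over /proc should capture the relevant state before
anything is killed. A sketch, assuming a standard Linux /proc layout (the
dump_task helper is my own name, and /proc/PID/stack typically needs root):

```shell
# Print the name, state and kernel stack of one task.
dump_task() {
    pid="$1"
    echo "=== pid $pid ==="
    grep -E '^(Name|State):' "/proc/$pid/status"
    cat "/proc/$pid/stack" 2>/dev/null || echo "(stack unreadable; run as root)"
}

# Dump every ps process currently in D (uninterruptible sleep) state.
for pid in $(pgrep -x ps); do
    state=$(awk '{print $3}' "/proc/$pid/stat" 2>/dev/null)
    [ "$state" = "D" ] && dump_task "$pid"
done
```

Run against both the spinning task and the blocked ones, this should show
whether the blocked ps is waiting on a lock held by the busy one.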
> > In this case I was able to run my own ps to see the process list, but not
> > always.
> >
> > perf shows the locality of the spinning, roughly:
> >
> > proc_pid_cmdline
> > get_user_pages
> > handle_mm_fault
> > mem_cgroup_try_charge_swapin
> > mem_cgroup_reclaim
> >
> > There are two entry points, the codepaths taken are better shown by the
> > attached profile of CPU time.
>
> Looks like it's spinning like crazy in shrink_mem_cgroup_zone().
> Maybe an LRU counter underflow, maybe endlessly looping on the
> should_continue_reclaim() compaction condition. But I don't see an
> obvious connection to why killing the busy task wakes up the blocked
> one.
Maybe it's as simple as a lock taken at quite a high level; perhaps even
a lock held while reading values from /proc.
But there's no need to guess; we'll find out next time from the /proc
information.
> So yeah, it would be helpful to know what that task is waiting for.
>
> > We've had this behaviour since switching to Scientific Linux 6 (based on
> > RHEL6, like CentOS) at kernel 2.6.32-279.9.1.el6.x86_64.
> >
> > The example above is kernel 2.6.32-358.el6.x86_64.
>
> Can you test with the debug build? That should trap LRU counter
> underflows at least.
Ah, excellent -- I did not realise there was a kernel-debug package.
I'll see if I can isolate this with more detail from that kernel
(including the stack trace of each task that is hung).
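For anyone following along, the debug variant ships as its own package on
RHEL6-style systems. A sketch of deploying it on one node (the exact version
string here is illustrative, not taken from this thread):

```shell
# Install the debug kernel variant and make it the default boot entry.
yum install -y kernel-debug
grubby --set-default=/boot/vmlinuz-2.6.32-358.el6.x86_64.debug
reboot
```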
> If you have the possibility to recompile the distribution kernel I can
> provide you with debug patches.
Absolutely, I can deploy a patched kernel, but not cluster-wide. Let's
hope I can get enough coverage to catch the bug.
Hopefully I'll have more information soon.
Thanks
--
Mark
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Thread overview: 4+ messages
2013-09-17 15:50 Mark Hills
2013-09-17 16:28 ` Johannes Weiner
2013-09-18 0:50 ` Mark Hills [this message]
2013-10-24 17:39 ` Mark Hills