From: Michal Hocko <mhocko@suse.com>
To: Nico Pache <npache@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Rafael Aquini <aquini@redhat.com>,
Waiman Long <longman@redhat.com>, Baoquan He <bhe@redhat.com>,
Christoph von Recklinghausen <crecklin@redhat.com>,
Don Dutile <ddutile@redhat.com>,
"Herton R . Krzesinski" <herton@redhat.com>,
David Rientjes <rientjes@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
tglx@linutronix.de, mingo@redhat.com, dvhart@infradead.org,
dave@stgolabs.net, andrealmeid@collabora.com,
peterz@infradead.org, Joel Savitz <jsavitz@redhat.com>
Subject: Re: [PATCH v4] mm/oom_kill.c: futex: Don't OOM reap a process with a futex robust list
Date: Wed, 9 Mar 2022 14:09:59 +0100
Message-ID: <YiinJ3A6WoTJLN8d@dhcp22.suse.cz>
In-Reply-To: <20220309002550.103786-1-npache@redhat.com>
On Tue 08-03-22 17:25:50, Nico Pache wrote:
> The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which can
> be targeted by the oom reaper. This mapping is also used to store the futex
> robust list; the kernel does not keep a copy of the robust list and instead
> references a userspace address to maintain the robustness during a process
> death. A race can occur between exit_mm and the oom reaper that allows
> the oom reaper to clear the memory of the futex robust list before the
> exit path has handled the futex death.
The above is missing the important part of the problem description: the
oom_reaper frees the memory which is backing the robust list. It would be
useful to explicitly link that to the resulting lockup on the futex.
> Prevent the OOM reaper from concurrently reaping the mappings if the dying
> process contains a robust_list. If the dying task_struct does not contain
> a pointer in tsk->robust_list, we can assume that either one was never
> set up for this task struct, or futex_cleanup has properly handled the
> futex death, and we can safely reap this memory.
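If I read it correctly, the check being proposed amounts to something along
these lines in the oom reaping path (a rough sketch based only on the
description above; the exact call site and the compat handling are my
assumptions rather than the actual patch):

#ifdef CONFIG_FUTEX
        /* Bail out of reaping while the robust list has not been processed. */
        if (tsk->robust_list)
                return false;
#ifdef CONFIG_COMPAT
        if (tsk->compat_robust_list)
                return false;
#endif
#endif /* CONFIG_FUTEX */
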
I do agree with Waiman that this should go into a helper function. That
said, this would only be a quick workaround. I believe it would be much
better to either do the futex cleanup in the oom_reaper context, if that
can be done without blocking, or, if that is really not feasible for some
reason, to skip over the vmas which are backing the robust list. Have you
considered either of those solutions?
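
For the latter, skipping the backing vma could look roughly like the sketch
below. This is illustrative only: the helper does not exist today, the vma
walk in __oom_reap_task_mm() currently only sees the mm so the task would
have to be plumbed through, and the compat robust list would need the same
treatment:

static bool vma_covers_robust_list(struct vm_area_struct *vma,
                                   struct task_struct *tsk)
{
        unsigned long addr = (unsigned long)tsk->robust_list;

        return addr && addr >= vma->vm_start && addr < vma->vm_end;
}

        /* ... and in the vma walk, before unmapping the range: */
        if (vma_covers_robust_list(vma, tsk))
                continue;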
--
Michal Hocko
SUSE Labs
Thread overview: 5+ messages
2022-03-09 0:25 Nico Pache
2022-03-09 0:48 ` Waiman Long
2022-03-09 13:09 ` Michal Hocko [this message]
2022-03-09 21:34 ` Nico Pache
2022-03-10 8:46 ` Michal Hocko