From: Matthew Wilcox <willy@infradead.org>
To: Tong Tiangen <tongtiangen@huawei.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
Miaohe Lin <linmiaohe@huawei.com>,
wangkefeng.wang@huawei.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()
Date: Tue, 22 Aug 2023 13:08:52 +0100
Message-ID: <ZOSlVGxcxT9JLoUv@casper.infradead.org>
In-Reply-To: <0bbbb7d8-699b-30ac-9657-840112c41a78@huawei.com>
On Tue, Aug 22, 2023 at 11:41:41AM +0800, Tong Tiangen wrote:
> On 2023/8/22 2:33, Matthew Wilcox wrote:
> > On Mon, Aug 21, 2023 at 05:13:12PM +0800, Tong Tiangen wrote:
> > > We can see that CPU1 is waiting for CPU0 to respond to the IPI, CPU0 is
> > > waiting for CPU2 to unlock tasklist_lock, and CPU2 is waiting for CPU1 to
> > > unlock page->ptl. As a result, a softlockup is triggered.
> > >
> > > collect_procs_anon() does not modify the tasklist; it only performs a
> > > read traversal. Therefore, we can use the RCU read lock instead of taking
> > > tasklist_lock, which breaks the softlockup chain above.
> >
> > The only thing that's giving me pause is that there's no discussion
> > about why this is safe. "We're not modifying it" isn't really enough
> > to justify going from read_lock() to rcu_read_lock(). When you take a
> > normal read_lock(), writers are not permitted and so you see an atomic
> > snapshot of the list. With rcu_read_lock() you can see inconsistencies.
>
> Hi Matthew:
>
> When rcu_read_lock() is used, the task list can be modified during the
> iteration, but the changes cannot be seen by the iteration. After the
> iteration completes, the task list is updated through the RCU mechanism.
> Therefore, the task list used by the iteration can also be considered a
> snapshot.
No, that's not true! You are not iterating a snapshot of the list,
you're iterating the live list. It will change under you. RCU provides
you with some guarantees about that list. See Documentation/RCU/listRCU.rst.
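To make the "live list, not a snapshot" point concrete, here is a minimal
single-threaded sketch in userspace C. It is not kernel code and uses no RCU
primitives: struct node and walk_with_concurrent_update() are hypothetical
names, and the "writer" is simulated by running it in the middle of the
reader's loop. It only models the interleaving, not memory ordering or
deferred freeing.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task on the tasklist. */
struct node {
	int id;
	struct node *next;
};

/*
 * Walk the list from *head and record each visited id in out[].
 * After the reader has visited its first node, a simulated writer
 * pushes to_add at the head (behind the reader) and unlinks the
 * last node (ahead of the reader).
 * Returns the number of nodes the reader actually visited.
 */
static size_t walk_with_concurrent_update(struct node **head,
					  struct node *to_add,
					  int *out, size_t cap)
{
	size_t n = 0;

	for (struct node *p = *head; p != NULL && n < cap; p = p->next) {
		out[n++] = p->id;
		if (n == 1) {
			struct node *q;

			/* Writer: insert a new node at the head. */
			to_add->next = *head;
			*head = to_add;

			/* Writer: unlink the current last node. */
			q = *head;
			while (q->next && q->next->next)
				q = q->next;
			q->next = NULL;
		}
	}
	return n;
}
```

Starting from the list 1 -> 2 -> 3 and adding node 4, the reader visits only
1 and 2. That matches neither the list before the update (1, 2, 3) nor the
list after it (4, 1, 2), which is the kind of inconsistency a plain
read_lock() would have excluded.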
> > For example, if new tasks can be added to the tasklist, they may not
> > be seen by an iteration. Is this OK?
>
> Newly added tasks do not access the HWPoison page, because the HWPoison
> page has been isolated from the buddy allocator
> (memory_failure()->take_page_off_buddy()). Therefore, it is safe whether
> or not a newly added task is seen by the iteration.
>
> > Tasks may be removed from the
> > tasklist after they have been seen by the iteration. Is this OK?
>
> If a task seen during the iteration is deleted from the task list after
> the iteration, its task_struct is not released, because a reference is
> taken in __add_to_kill(). Therefore, the subsequent processing in
> kill_procs() (sending signals to a task already deleted from the task
> list) is not affected. So I think it's safe too.
I don't know this code, but it seems unsafe to me. Look:

collect_procs_anon:
        for_each_process(tsk) {
                struct task_struct *t = task_early_kill(tsk, force_early);
                add_to_kill_anon_file(t, page, vma, to_kill);

add_to_kill_anon_file:
        __add_to_kill(tsk, p, vma, to_kill, 0, FSDAX_INVALID_PGOFF);

__add_to_kill:
        get_task_struct(tsk);

static inline struct task_struct *get_task_struct(struct task_struct *t)
{
        refcount_inc(&t->usage);
        return t;
}

/**
 * refcount_inc - increment a refcount
 * @r: the refcount to increment
 *
 * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
 *
 * Provides no memory ordering, it is assumed the caller already has a
 * reference on the object.
 *
 * Will WARN if the refcount is 0, as this represents a possible use-after-free
 * condition.
 */

I don't see anything that prevents that refcount_inc from seeing a zero
refcount. Usually that would be prevented by tasklist_lock, right?
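
As an illustration of the concern, here is a minimal userspace model written
with C11 atomics. It is a sketch, not the kernel implementation: struct obj,
obj_get(), obj_get_not_zero() and obj_put() are hypothetical names. obj_get()
models an unconditional increment in the style of refcount_inc(), and
obj_get_not_zero() models the "only if still live" pattern that the kernel
exposes as refcount_inc_not_zero(). Whether that pattern is the right fix for
this patch is not something the sketch tries to answer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for task_struct; only the refcount matters here. */
struct obj {
	atomic_int usage;
};

/*
 * Unconditional increment. Only valid if the caller already holds a
 * reference. The kernel's refcount_inc() WARNs when it sees zero (see
 * the kerneldoc quoted above); this plain model has no such check and
 * would silently "resurrect" a dead object.
 */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->usage, 1);
}

/*
 * Conditional increment: take a reference only while the count is
 * still nonzero. Returns false if the object is already on its way
 * to being freed.
 */
static bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->usage);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->usage, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->usage, 1) == 1;
}
```

In this model, a lookup that can race with the final obj_put() has to use
obj_get_not_zero() and skip the object when it returns false. Calling
obj_get() is only safe when something else, such as a lock that excludes the
final put, guarantees the count cannot already be zero.
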
Andrew, I think this patch is bad and needs to be dropped.