From: Robin Holt <holt@sgi.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-mm@kvack.org
Subject: Re: Question/problem with mmu_notifier_unregister.
Date: Wed, 16 Jan 2013 14:00:18 -0600 [thread overview]
Message-ID: <20130116200018.GA3460@sgi.com> (raw)
In-Reply-To: <20130115162956.GH3438@sgi.com>
On Tue, Jan 15, 2013 at 10:29:56AM -0600, Robin Holt wrote:
> Andrea,
>
> On a community or upcoming distro based kernel, I have one XPMEM test
> which is failing. The following patch (with lots of context) fixes it.
> I do not yet understand the cause of the race condition. I did hope it
> would be obvious to you and save me the time of investigating further.
>
> The first test finds mn->hlist is on the chain. The new second test
> does not. I have verified that neither xpmem nor gru is calling to
> unregister the same mmu_notifier twice. It is very difficult to find the
> problem as the test case requires a very long time to trip. To increase
> the likelihood, I run many copies in parallel. Each is 2 processes each
> with one pthread. I run two per socket on an 8 core per socket system
> with 256 socket system. When one job hits the null pointer deref, I can
> have many thousands of lines of debug before the other jobs stop running.
> Each process has /dev/gru mapped into their address space. They have
> used xpmem to attach part of the address space of the other process.
> The tasks are exiting at approximately the same time. The race seems to
> be with one pthread hitting the final mmput at about the same time as the
> other task is getting into the filp_close->flush() callout.
>
> I have a couple other things to work on today, and will likely try to
> chase this failure more tomorrow. With this patch, I have not been able
> to trigger any other issues in many hours of testing. The test likewise
> runs for many hours on a RHEL 6.3 based system and a SLES11 SP2 based
> system so it might have something to do with the change in srcu locking.
Using code inspection, I think I understand this issue some more.
Assume two tasks, one calling mmu_notifier_unregister() as a result
of a filp_close() ->flush() callout (task A), and the other calling
mmu_notifier_release() from an mmput() (task B).
A B
t1 srcu_read_lock()
t2 if (!hlist_unhashed())
t3 srcu_read_lock()
t4 srcu_read_unlock()
t5 hlist_del...()
t6 synchronize_srcu()
t7 srcu_read_unlock()
t8 hlist_del_rcu() <--- NULL pointer deref.
Does this seem plausible? Am I missing something?
My earlier patch feels wrong as I think we will see two ->release()
calls are made, but I am not sure of that.
Thanks,
Robin
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 8a5ac8c..e2c9827 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -297,37 +299,38 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
> if (!hlist_unhashed(&mn->hlist)) {
> /*
> * SRCU here will force exit_mmap to wait ->release to finish
> * before freeing the pages.
> */
> int id;
>
> id = srcu_read_lock(&srcu);
> /*
> * exit_mmap will block in mmu_notifier_release to
> * guarantee ->release is called before freeing the
> * pages.
> */
> if (mn->ops->release)
> mn->ops->release(mn, mm);
> srcu_read_unlock(&srcu, id);
>
> spin_lock(&mm->mmu_notifier_mm->lock);
> - hlist_del_rcu(&mn->hlist);
> + if (!hlist_unhashed(&mn->hlist))
> + hlist_del_rcu(&mn->hlist);
> spin_unlock(&mm->mmu_notifier_mm->lock);
> }
>
> /*
> * Wait any running method to finish, of course including
> * ->release if it was run by mmu_notifier_relase instead of us.
> */
> synchronize_srcu(&srcu);
>
> BUG_ON(atomic_read(&mm->mm_count) <= 0);
>
> mmdrop(mm);
> }
> EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>
> static int __init mmu_notifier_init(void)
> {
> return init_srcu_struct(&srcu);
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-01-16 20:00 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-15 16:29 Robin Holt
2013-01-16 20:00 ` Robin Holt [this message]
2013-01-16 21:01 ` [PATCH] [Patch] mmu_notifier_unregister NULL Pointer deref fix Robin Holt
2013-01-17 2:45 ` Xiao Guangrong
2013-01-17 11:12 ` Robin Holt
2013-01-17 12:19 ` Xiao Guangrong
2013-01-17 13:45 ` Robin Holt
2013-01-18 2:42 ` Xiao Guangrong
2013-01-18 2:48 ` Robin Holt
2013-01-18 3:04 ` Xiao Guangrong
2013-01-18 12:14 ` Robin Holt
2013-01-18 12:51 ` Xiao Guangrong
2013-01-18 15:14 ` Robin Holt
2013-01-23 2:00 ` Wanpeng Li
2013-01-23 2:00 ` Wanpeng Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130116200018.GA3460@sgi.com \
--to=holt@sgi.com \
--cc=aarcange@redhat.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox