From: Mateusz Guzik <mjguzik@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: remove unintentional voluntary preemption in get_mmap_lock_carefully
Date: Mon, 21 Aug 2023 03:13:03 +0200
Message-ID: <20230821011303.hoeqjbmjaxajh255@f>
In-Reply-To: <ZOJXgFJybD1ljCHL@casper.infradead.org>
On Sun, Aug 20, 2023 at 07:12:16PM +0100, Matthew Wilcox wrote:
> On Sun, Aug 20, 2023 at 12:43:03PM +0200, Mateusz Guzik wrote:
> > Found by checking off-CPU time during kernel build (like so:
> > "offcputime-bpfcc -Ku"), sample backtrace:
> > finish_task_switch.isra.0
> > __schedule
> > __cond_resched
> > lock_mm_and_find_vma
> > do_user_addr_fault
> > exc_page_fault
> > asm_exc_page_fault
> > - sh (4502)
>
> Now I'm awake, this backtrace really surprises me. Do we not check
> need_resched on entry? It seems terribly unlikely that need_resched
> gets set between entry and getting to this point, so I guess we must
> not.
>
> I suggest the version of the patch which puts might_sleep() before the
> mmap_read_trylock() is the right one to apply. It's basically what
> we've done forever, except that now we'll be rescheduling without the
> mmap lock held, which just seems like an overall win.
>
I can't sleep and your response made me curious: is that really safe
here?
As I wrote in another email, the routine is concerned with a case of the
kernel faulting on something it should not have. For a case like that I
find rescheduling to another thread to be most concerning.
That said, I think I found a winner: add a need_resched() check prior to
the trylock.
This adds less work than you would have added with might_sleep() (a func
call), still respects the preemption point, dodges exception table
checks in the common case and does not switch away if there is
anything fishy going on.
Or just do that might_sleep.
I'm really buggering off the subject now.
====
mm: remove unintentional voluntary preemption in get_mmap_lock_carefully
Should the trylock succeed (and thus blocking was avoided), the routine
wants to ensure blocking was still legal to do. However, the
might_sleep() used for this ends up calling __cond_resched(), injecting
a voluntary preemption point with the freshly acquired lock held.
Use __might_sleep() instead once the lock is held, but check for
preemption prior to taking it.
Found by checking off-CPU time during kernel build (like so:
"offcputime-bpfcc -Ku"), sample backtrace:
finish_task_switch.isra.0
__schedule
__cond_resched
lock_mm_and_find_vma
do_user_addr_fault
exc_page_fault
asm_exc_page_fault
- sh (4502)
10
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
mm/memory.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 1ec1ef3418bf..6dac9dbb7b59 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5258,8 +5258,8 @@ EXPORT_SYMBOL_GPL(handle_mm_fault);
 static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
 {
 	/* Even if this succeeds, make it clear we *might* have slept */
-	if (likely(mmap_read_trylock(mm))) {
-		might_sleep();
+	if (likely(!need_resched() && mmap_read_trylock(mm))) {
+		__might_sleep(__FILE__, __LINE__);
 		return true;
 	}
--
2.39.2