From: Yonghong Song <yhs@fb.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"kernel-team@fb.com" <kernel-team@fb.com>,
Luigi Rizzo <lrizzo@google.com>
Subject: Re: [PATCH mm] mm/mmap: relax mmap_assert_locked() for find_vma()
Date: Tue, 7 Sep 2021 21:39:17 -0700 [thread overview]
Message-ID: <95576066-0750-2f0f-9886-de00f62abe0e@fb.com> (raw)
In-Reply-To: <20210908030941.rfew4t63fw4s4bpz@revolver>
On 9/7/21 8:09 PM, Liam Howlett wrote:
> * Yonghong Song <yhs@fb.com> [210907 02:27]:
>> Current bpf-next bpf selftest "get_stack_raw_tp" triggered the warning:
>>
>> [ 1411.304463] WARNING: CPU: 3 PID: 140 at include/linux/mmap_lock.h:164 find_vma+0x47/0xa0
>> [ 1411.304469] Modules linked in: bpf_testmod(O) [last unloaded: bpf_testmod]
>> [ 1411.304476] CPU: 3 PID: 140 Comm: systemd-journal Tainted: G W O 5.14.0+ #53
>> [ 1411.304479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>> [ 1411.304481] RIP: 0010:find_vma+0x47/0xa0
>> [ 1411.304484] Code: de 48 89 ef e8 ba f5 fe ff 48 85 c0 74 2e 48 83 c4 08 5b 5d c3 48 8d bf 28 01 00 00 be ff ff ff ff e8 2d 9f d8 00 85 c0 75 d4 <0f> 0b 48 89 de 48 8
>> [ 1411.304487] RSP: 0018:ffffabd440403db8 EFLAGS: 00010246
>> [ 1411.304490] RAX: 0000000000000000 RBX: 00007f00ad80a0e0 RCX: 0000000000000000
>> [ 1411.304492] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
>> [ 1411.304494] RBP: ffff9cf5c2f50000 R08: ffff9cf5c3eb25d8 R09: 00000000fffffffe
>> [ 1411.304496] R10: 0000000000000001 R11: 00000000ef974e19 R12: ffff9cf5c39ae0e0
>> [ 1411.304498] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9cf5c39ae0e0
>> [ 1411.304501] FS: 00007f00ae754780(0000) GS:ffff9cf5fba00000(0000) knlGS:0000000000000000
>> [ 1411.304504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1411.304506] CR2: 000000003e34343c CR3: 0000000103a98005 CR4: 0000000000370ee0
>> [ 1411.304508] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 1411.304510] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 1411.304512] Call Trace:
>> [ 1411.304517] stack_map_get_build_id_offset+0x17c/0x260
>> [ 1411.304528] __bpf_get_stack+0x18f/0x230
>> [ 1411.304541] bpf_get_stack_raw_tp+0x5a/0x70
>> [ 1411.305752] RAX: 0000000000000000 RBX: 5541f689495641d7 RCX: 0000000000000000
>> [ 1411.305756] RDX: 0000000000000001 RSI: ffffffff9776b144 RDI: ffffffff977e1b0e
>> [ 1411.305758] RBP: ffff9cf5c02b2f40 R08: ffff9cf5ca7606c0 R09: ffffcbd43ee02c04
>> [ 1411.306978] bpf_prog_32007c34f7726d29_bpf_prog1+0xaf/0xd9c
>> [ 1411.307861] R10: 0000000000000001 R11: 0000000000000044 R12: ffff9cf5c2ef60e0
>> [ 1411.307865] R13: 0000000000000005 R14: 0000000000000000 R15: ffff9cf5c2ef6108
>> [ 1411.309074] bpf_trace_run2+0x8f/0x1a0
>> [ 1411.309891] FS: 00007ff485141700(0000) GS:ffff9cf5fae00000(0000) knlGS:0000000000000000
>> [ 1411.309896] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1411.311221] syscall_trace_enter.isra.20+0x161/0x1f0
>> [ 1411.311600] CR2: 00007ff48514d90e CR3: 0000000107114001 CR4: 0000000000370ef0
>> [ 1411.312291] do_syscall_64+0x15/0x80
>> [ 1411.312941] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 1411.313803] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [ 1411.314223] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 1411.315082] RIP: 0033:0x7f00ad80a0e0
>> [ 1411.315626] Call Trace:
>> [ 1411.315632] stack_map_get_build_id_offset+0x17c/0x260
>>
>> To reproduce, first build `test_progs` binary:
>> make -C tools/testing/selftests/bpf -j60
>> and then run the binary at tools/testing/selftests/bpf directory:
>> ./test_progs -t get_stack_raw_tp
>>
>> The warning is due to commit 5b78ed24e8ec("mm/pagemap: add mmap_assert_locked() annotations to find_vma*()")
>> which added mmap_assert_locked() in find_vma() function. The mmap_assert_locked() function
>> asserts that mm->mmap_lock needs to be held. But this is not the case for
>> bpf_get_stack() or bpf_get_stackid() helper (kernel/bpf/stackmap.c), which
>> uses mmap_read_trylock_non_owner() instead. Since mm->mmap_lock is not held
>> in bpf_get_stack[id]() use case, the above warning is emitted during test run.
>>
>> This patch fixed the issue by replacing mmap_assert_locked() with rwsem_is_locked(&mm->mmap_lock)
>> in find_vma().
>
> I'm not a fan of removing the lockdep check. I'd rather see find_vma()
> do the lockdep and call another internal function which does exactly
> what you have below. Then you could call the one and everyone else
> could keep the code they have.
Okay, will do that in the next revision.
>
> More importantly, since this is the only user of the
> mmap_read_trylock_non_owner(), is there a build bot that could be used
> to catch errors introduced in this
> path?
We do have libbpf ci to test selftest against current *bpf_next* and
some past old kernels. In this case, I think it is linux-next first
introduced issue, and then the change went to net-next and then
bpf-next. So to catch this earlier, we need to do libbpf ci on linux-next
and/or net-next. We will discuss how we could incorporate this.
>
> Thanks,
> Liam
>
>>
>> Cc: Luigi Rizzo <lrizzo@google.com>
>> Fixes: 5b78ed24e8ec("mm/pagemap: add mmap_assert_locked() annotations to find_vma*()")
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>> mm/mmap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> Alternatively we could add a bool argument to find_vma to indicate whether
>> mm->mmap_lock has been held or not. But I would like to report and try the
>> simpler solution first.
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 88dcc5c25225..8c78b85475b1 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2275,7 +2275,7 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>> struct rb_node *rb_node;
>> struct vm_area_struct *vma;
>>
>> - mmap_assert_locked(mm);
>> + VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
>> /* Check the cache first. */
>> vma = vmacache_find(mm, addr);
>> if (likely(vma))
>> --
>> 2.30.2
>>
prev parent reply other threads:[~2021-09-08 4:39 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-07 6:26 Yonghong Song
2021-09-08 3:09 ` Liam Howlett
2021-09-08 4:39 ` Yonghong Song [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=95576066-0750-2f0f-9886-de00f62abe0e@fb.com \
--to=yhs@fb.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
--cc=liam.howlett@oracle.com \
--cc=linux-mm@kvack.org \
--cc=lrizzo@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox