From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFA8AC5475B for ; Mon, 11 Mar 2024 11:02:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5C1BB6B0088; Mon, 11 Mar 2024 07:02:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 56EC86B0089; Mon, 11 Mar 2024 07:02:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 45D8E6B008A; Mon, 11 Mar 2024 07:02:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 334C16B0088 for ; Mon, 11 Mar 2024 07:02:37 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id E59E4A115F for ; Mon, 11 Mar 2024 11:02:36 +0000 (UTC) X-FDA: 81884469912.13.42BC673 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by imf04.hostedemail.com (Postfix) with ESMTP id B3CA44002A for ; Mon, 11 Mar 2024 11:02:33 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf04.hostedemail.com: domain of changbin.du@huawei.com designates 45.249.212.255 as permitted sender) smtp.mailfrom=changbin.du@huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1710154955; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gcidNvystXJxtfC0sxmBbDb3AOrGrNTytJ4nsg3MVhA=; b=PLIF4dNasbJzdWjbXfHU+Q3u56ew9I5y9YSgSPBvhP1P095R1SJsRVsdcFeG81QWplyTcN 5N4j3fCrZvFVtc5evnb5zchqCSPu4dL1H2gAn/MU6Z3s8uZxQwP9i6zLRsos0MzSrxiKHj deIsf9tmoaTkCNw7yHyu00R/pW7+rm4= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf04.hostedemail.com: domain of changbin.du@huawei.com designates 45.249.212.255 as permitted sender) smtp.mailfrom=changbin.du@huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1710154955; a=rsa-sha256; cv=none; b=1awEizFhF4ftATkYuCl++c/1VS8eZppkoJeFShdYVUtIMqqdLUNzwNC5Z8Gl9GBHEC9+xF aUamFD9SJlc6KUYGWHP5cGB63yP8fG0sjkHjHyOiaZCk5ypKXFjbvBH/2mw/Nkf4VGCLdl 334dvylC3QADVscpm6KsoYTiwT865H0= Received: from mail.maildlp.com (unknown [172.19.88.194]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4TtYhm5LwNz1Q9Ws; Mon, 11 Mar 2024 19:00:24 +0800 (CST) Received: from kwepemd100011.china.huawei.com (unknown [7.221.188.204]) by mail.maildlp.com (Postfix) with ESMTPS id 0484414040F; Mon, 11 Mar 2024 19:02:29 +0800 (CST) Received: from M910t (10.110.54.157) by kwepemd100011.china.huawei.com (7.221.188.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 11 Mar 2024 19:02:28 +0800 Date: Mon, 11 Mar 2024 19:02:23 +0800 From: Changbin Du To: , Changbin Du CC: Marco Elver , Alexander Potapenko , Andrew Morton , , , Subject: Re: [BUG] kmsan: instrumentation recursion problems Message-ID: <20240311110223.nzsplk6a6lzxmzqi@M910t> References: <20240308043448.masllzeqwht45d4j@M910t> <20240311093036.44txy57hvhevybsu@M910t> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <20240311093036.44txy57hvhevybsu@M910t> X-Originating-IP: [10.110.54.157] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemd100011.china.huawei.com (7.221.188.204) X-Rspam-User: X-Stat-Signature: m4aeendap1gnx5h65634bppx7mrq8cgk X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: B3CA44002A X-HE-Tag: 1710154953-337959 X-HE-Meta: U2FsdGVkX19t5OGYWGjkuwaqCPU2MwKOn5/1YxTzPKy+yIg4wa6DyzXt/nb+Z6gKwLzOvkbw4Vpq/kLvAuGXl/MpZsmaIq7O8vMQdzaO6SgeDqkLIPXfHkyFynimAbsyy5cq65/gc91M6AzU5BBYTQQnmTdnOnQ2G4nty25knY9DbcE6czSVqhgzk3z9Va/H6adGU5PryYeUzesRhUpI6frJUUrffL1clbrDmloIMB87MH0G6KyPK0iGqGVq8BOmgS3D6oFQtxDzsaG57VsJmuph43yK475AvFzzfpn9Oej1gc28nilqqo3/0aHEcrYGQc1QBXIk8ojQGFvftZ/aM8QmuMf/IXY5R3Vx+BoR580BslnMCaBTOoCmjI8ynhXZdnthekehp4W4sOYs0Cg++/4BHipaAxQTfc+xCn4MAH5EBr7eFIealsp6F4Q4km700HhzE7UoK/Fn6dwIEBOW4k8WAHL4gKq1JouJGS1v+JnHWjjwllUOpoC6midbanwl7bv7LVOu+6LXALyo+a8h7qXeimYmSb0pFnQ3wnW/DGzew17ChItkA40cp3l+uirscLc5Sqjn4d3uonUV8PWX4X2W7S6HGE7uFrH4BwxVm9hRiQps+9QiI2AGmFTk0hkqeBNyeUqTZ22FrKj8PEn2ISzt0S98Ul+LnwT7AdrwI+CIVyxkloW0LNv/8VCgj5Hx1puyrlOFH9S3/NrIajAyzrF0uTYZRgbcPOaSQ4wuhHexEpD/S5Zw3yMBlsEqvId+LNkHcgjWGsBmDznh00vwO62aC6aUN2FsDUlsQ2yXLtDzciRN9Y7wk2dFEBUy+3yHeKV94UkF7aH0FpcPrXkFcoMM4k42UFhFHi/mtRokurH73c+thQl9LT4RkzaMrWENRVnhwg3Xbig4P91UZhyR1zv9+63DIl85j+QxQJ2ikCOrcKQWPRsxE3dMy/IcNyooQsu+gMJvMvt53LA4jjI uNo073Tj 22FrCSFVeIh6zyiWEWe0XgA2sJZarQuZiSM7PjbUVEvL3HZUQnhwTC32FrtlkHAKn5Z3tXzLonaVEwzZUU2dQujVrldBdJ64s5fLSEKWra6f+y5s3Tkm1MFw7IcKA7YYSbyYm2Djyfy0Iihl+G3nmYby5L2gvjYuHN8PWt1OZpRjwZ1H5bfvZ5f/WPa2m04MxidJ21pzjBoiqKQbPjk/VFyuCI7BSMnUnyqWLbbMdF29dv0GJKcjgqUqOysbNKxgm308p X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Mar 11, 2024 at 05:30:36PM +0800, Changbin Du wrote: > On Fri, Mar 08, 2024 at 10:39:15AM +0100, Marco Elver wrote: > > On Fri, 8 Mar 2024 at 05:36, 'Changbin Du' via kasan-dev > > wrote: > > > > > > Hey, folks, > > > I found two instrumentation recursion issues on mainline kernel. > > > > > > 1. recur on preempt count. > > > __msan_metadata_ptr_for_load_4() -> kmsan_virt_addr_valid() -> preempt_disable() -> __msan_metadata_ptr_for_load_4() > > > > > > 2. recur in lockdep and rcu > > > __msan_metadata_ptr_for_load_4() -> kmsan_virt_addr_valid() -> pfn_valid() -> rcu_read_lock_sched() -> lock_acquire() -> rcu_is_watching() -> __msan_metadata_ptr_for_load_8() > > > > > > > > > Here is an unofficial fix, I don't know if it will generate false reports. > > > > > > $ git show > > > commit 7f0120b621c1cbb667822b0f7eb89f3c25868509 (HEAD -> master) > > > Author: Changbin Du > > > Date: Fri Mar 8 20:21:48 2024 +0800 > > > > > > kmsan: fix instrumentation recursions > > > > > > Signed-off-by: Changbin Du > > > > > > diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile > > > index 0db4093d17b8..ea925731fa40 100644 > > > --- a/kernel/locking/Makefile > > > +++ b/kernel/locking/Makefile > > > @@ -7,6 +7,7 @@ obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o > > > > > > # Avoid recursion lockdep -> sanitizer -> ... -> lockdep. > > > KCSAN_SANITIZE_lockdep.o := n > > > +KMSAN_SANITIZE_lockdep.o := n > > > > This does not result in false positives? > > This does result lots of false positives. > I saw a lot of reports but seems not related to this. > > [ 2.742743][ T0] BUG: KMSAN: uninit-value in unwind_next_frame+0x3729/0x48a0 > [ 2.744404][ T0] unwind_next_frame+0x3729/0x48a0 > [ 2.745623][ T0] arch_stack_walk+0x1d9/0x2a0 > [ 2.746838][ T0] stack_trace_save+0xb8/0x100 > [ 2.747928][ T0] set_track_prepare+0x88/0x120 > [ 2.749095][ T0] __alloc_object+0x602/0xbe0 > [ 2.750200][ T0] __create_object+0x3f/0x4e0 > [ 2.751332][ T0] pcpu_alloc+0x1e18/0x2b00 > [ 2.752401][ T0] mm_init+0x688/0xb20 > [ 2.753436][ T0] mm_alloc+0xf4/0x180 > [ 2.754510][ T0] poking_init+0x50/0x500 > [ 2.755594][ T0] start_kernel+0x3b0/0xbf0 > [ 2.756724][ T0] __pfx_reserve_bios_regions+0x0/0x10 > [ 2.758073][ T0] x86_64_start_kernel+0x92/0xa0 > [ 2.759320][ T0] secondary_startup_64_no_verify+0x176/0x17b > Above reports are triggered by KMEMLEAK and KFENCE. Now with below fix, I was able to run kmsan kernel with: CONFIG_DEBUG_KMEMLEAK=n CONFIG_KFENCE=n CONFIG_LOCKDEP=n KMEMLEAK and KFENCE generate too many false positives in unwinding code. LOCKDEP still introduces instrumenting recursions. > > > Does > > KMSAN_ENABLE_CHECKS_lockdep.o := n > > work as well? If it does, that is preferred because it makes sure > > there are no false positives if the lockdep code unpoisons data that > > is passed and used outside lockdep. > > > > lockdep has a serious impact on performance, and not sanitizing it > > with KMSAN is probably a reasonable performance trade-off. > > > Disabling checks is not working here. The recursion become this: > > __msan_metadata_ptr_for_load_4() -> kmsan_get_metadata() -> virt_to_page_or_null() -> pfn_valid() -> lock_acquire() -> __msan_unpoison_alloca() -> kmsan_get_metadata() > > > > ifdef CONFIG_FUNCTION_TRACER > > > CFLAGS_REMOVE_lockdep.o = $(CC_FLAGS_FTRACE) > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > index b2bccfd37c38..8935cc866e2d 100644 > > > --- a/kernel/rcu/tree.c > > > +++ b/kernel/rcu/tree.c > > > @@ -692,7 +692,7 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp) > > > * Make notrace because it can be called by the internal functions of > > > * ftrace, and making this notrace removes unnecessary recursion calls. > > > */ > > > -notrace bool rcu_is_watching(void) > > > +notrace __no_sanitize_memory bool rcu_is_watching(void) > > > > For all of these, does __no_kmsan_checks instead of __no_sanitize_memory work? > > Again, __no_kmsan_checks (function-only counterpart to > > KMSAN_ENABLE_CHECKS_.... := n) is preferred if it works as it avoids > > any potential false positives that would be introduced by not > > instrumenting. > > > This works because it is not unpoisoning local variables. > > > > { > > > bool ret; > > > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > > index 9116bcc90346..33aa4df8fd82 100644 > > > --- a/kernel/sched/core.c > > > +++ b/kernel/sched/core.c > > > @@ -5848,7 +5848,7 @@ static inline void preempt_latency_start(int val) > > > } > > > } > > > > > > -void preempt_count_add(int val) > > > +void __no_sanitize_memory preempt_count_add(int val) > > > { > > > #ifdef CONFIG_DEBUG_PREEMPT > > > /* > > > @@ -5880,7 +5880,7 @@ static inline void preempt_latency_stop(int val) > > > trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip()); > > > } > > > > > > -void preempt_count_sub(int val) > > > +void __no_sanitize_memory preempt_count_sub(int val) > > > { > > > #ifdef CONFIG_DEBUG_PREEMPT > > > > > > > > > -- > > > Cheers, > > > Changbin Du > > -- > Cheers, > Changbin Du -- Cheers, Changbin Du