From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB7E8C433F5 for ; Sat, 18 Sep 2021 08:07:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 60E3F61212 for ; Sat, 18 Sep 2021 08:07:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 60E3F61212 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 9ECF66B0073; Sat, 18 Sep 2021 04:07:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 99C006B0074; Sat, 18 Sep 2021 04:07:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 88AA86B0075; Sat, 18 Sep 2021 04:07:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0060.hostedemail.com [216.40.44.60]) by kanga.kvack.org (Postfix) with ESMTP id 775716B0073 for ; Sat, 18 Sep 2021 04:07:34 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 214D828DCE for ; Sat, 18 Sep 2021 08:07:34 +0000 (UTC) X-FDA: 78599964828.07.FE13BF3 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by imf12.hostedemail.com (Postfix) with ESMTP id 10DD110000B9 for ; Sat, 18 Sep 2021 08:07:32 +0000 (UTC) Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.56]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4HBNh96Zq8z57Cl; Sat, 18 Sep 2021 16:06:49 +0800 (CST) Received: from dggpemm500009.china.huawei.com (7.185.36.225) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.8; Sat, 18 Sep 2021 16:07:30 +0800 Received: from [10.174.179.24] (10.174.179.24) by dggpemm500009.china.huawei.com (7.185.36.225) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.8; Sat, 18 Sep 2021 16:07:29 +0800 Subject: Re: [PATCH v2 2/3] kfence: maximize allocation wait timeout duration To: Marco Elver , Kefeng Wang References: <20210421105132.3965998-1-elver@google.com> <20210421105132.3965998-3-elver@google.com> <6c0d5f40-5067-3a59-65fa-6977b6f70219@huawei.com> CC: , , , , , , , , From: Liu Shixin Message-ID: Date: Sat, 18 Sep 2021 16:07:29 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.179.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpemm500009.china.huawei.com (7.185.36.225) X-CFilter-Loop: Reflected X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 10DD110000B9 Authentication-Results: imf12.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=huawei.com; spf=pass (imf12.hostedemail.com: domain of liushixin2@huawei.com designates 45.249.212.189 as permitted sender) smtp.mailfrom=liushixin2@huawei.com X-Stat-Signature: ku5swfx5sbdt3eoh9b7ahyc6t4m6mepd X-HE-Tag: 1631952452-277891 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2021/9/16 16:49, Marco Elver wrote: > On Thu, 16 Sept 2021 at 03:20, Kefeng Wang wrote: >> Hi Marco, >> >> We found kfence_test will fails on ARM64 with this patch with/without >> CONFIG_DETECT_HUNG_TASK, >> >> Any thought ? > Please share log and instructions to reproduce if possible. Also, if > possible, please share bisection log that led you to this patch. > > I currently do not see how this patch would cause that, it only > increases the timeout duration. > > I know that under QEMU TCG mode, there are occasionally timeouts in > the test simply due to QEMU being extremely slow or other weirdness. > > . > Hi Marco, There are some of the results of the current test: 1. Using qemu-kvm on arm64 machine, all testcase can pass. 2. Using qemu-system-aarch64 on x86_64 machine, randomly some testcases fail. 3. Using qemu-system-aarch64 on x86_64, but removing the judgment of kfence_allocation_key in kfence_alloc(), all testcase can pass. I add some printing to the kernel and get very strange results. I add a new variable kfence_allocation_key_gate to track the state of kfence_allocation_key. As shown in the following code, theoretically, if kfence_allocation_key_gate is zero, then kfence_allocation_key must be enabled, so the value of variable error in kfence_alloc() should always be zero. In fact, all the passed testcases fit this point. But as shown in the following failed log, although kfence_allocation_key has been enabled, it's still check failed here. So I think static_key might be problematic in my qemu environment. The change of timeout is not a problem but caused us to observe this problem. I tried changing the wait_event to a loop. I set timeout to HZ and re-enable/disabled in each loop, then the failed testcase disappears. [ 3.463519] # Subtest: kfence [ 3.463629] 1..25 [ 3.465548] # test_out_of_bounds_read: test_alloc: size=128, gfp=cc0, policy=left, cache=0 [ 3.561001] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [ 3.561934] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [ 3.665449] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [ 13.464796] --------------kfence_allocation_key check failed 13839286 times---------------- [ 13.467482] kfence: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [ 13.469166] # test_out_of_bounds_read: ASSERTION FAILED at mm/kfence/kfence_test.c:308 [ 13.469166] Expected false to be true, but is false [ 13.469166] [ 13.469166] failed to allocate from KFENCE [ 13.473592] not ok 1 - test_out_of_bounds_read diff --git a/include/linux/kfence.h b/include/linux/kfence.h index 3fe6dd8a18c1..e72889606e82 100644 --- a/include/linux/kfence.h +++ b/include/linux/kfence.h @@ -25,6 +25,7 @@ extern char *__kfence_pool; #ifdef CONFIG_KFENCE_STATIC_KEYS #include DECLARE_STATIC_KEY_FALSE(kfence_allocation_key); +extern atomic_t kfence_allocation_key_gate; #else #include extern atomic_t kfence_allocation_gate; @@ -116,12 +117,20 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags); */ static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags) { + static int error; #ifdef CONFIG_KFENCE_STATIC_KEYS - if (static_branch_unlikely(&kfence_allocation_key)) + if (static_branch_unlikely(&kfence_allocation_key)) { #else - if (unlikely(!atomic_read(&kfence_allocation_gate))) + if (unlikely(!atomic_read(&kfence_allocation_gate))) { #endif + if (error) { + pr_info("--------------kfence_allocation_key check failed %d times----------------\n", error); + error = 0; + } return __kfence_alloc(s, size, flags); + } + if (!atomic_read(&kfence_allocation_key_gate)) + error++; return NULL; } diff --git a/mm/kfence/core.c b/mm/kfence/core.c index 7a97db8bc8e7..637c2efa6133 100644 --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -100,6 +100,7 @@ static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. * #ifdef CONFIG_KFENCE_STATIC_KEYS /* The static key to set up a KFENCE allocation. */ DEFINE_STATIC_KEY_FALSE(kfence_allocation_key); +atomic_t kfence_allocation_key_gate = ATOMIC_INIT(1); #endif /* Gates the allocation, ensuring only one succeeds in a given period. */ @@ -624,7 +625,9 @@ static void toggle_allocation_gate(struct work_struct *work) #ifdef CONFIG_KFENCE_STATIC_KEYS /* Enable static key, and await allocation to happen. */ static_branch_enable(&kfence_allocation_key); - + if (static_branch_unlikely(&kfence_allocation_key)) + pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~enabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"); + atomic_set(&kfence_allocation_key_gate, 0); if (sysctl_hung_task_timeout_secs) { /* * During low activity with no allocations we might wait a @@ -637,7 +640,10 @@ static void toggle_allocation_gate(struct work_struct *work) } /* Disable static key and reset timer. */ + atomic_set(&kfence_allocation_key_gate, 1); static_branch_disable(&kfence_allocation_key); + if (!static_branch_unlikely(&kfence_allocation_key)) + pr_info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~disabled~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n"); #endif queue_delayed_work(system_unbound_wq, &kfence_timer, msecs_to_jiffies(kfence_sample_interval)); thanks,