From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64E5EC05027 for ; Wed, 1 Feb 2023 14:37:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F26B06B0075; Wed, 1 Feb 2023 09:37:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EFE856B0073; Wed, 1 Feb 2023 09:37:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DC4276B0075; Wed, 1 Feb 2023 09:37:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id CEB2B6B0072 for ; Wed, 1 Feb 2023 09:37:01 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 481A71C5F1F for ; Wed, 1 Feb 2023 14:37:01 +0000 (UTC) X-FDA: 80418975042.29.FF927C6 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf11.hostedemail.com (Postfix) with ESMTP id 4AFB740009 for ; Wed, 1 Feb 2023 14:36:58 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=IEI21STL; spf=pass (imf11.hostedemail.com: domain of "SRS0=bivn=55=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 145.40.68.75 as permitted sender) smtp.mailfrom="SRS0=bivn=55=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org"; dmarc=pass (policy=none) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675262218; h=from:from:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=s8qzZBqEKpnBrNoHRkucbBdt4NP9FlaWj5A98r/NWwg=; b=HLZfIJhGh4gjBXjkVXj57nBu8ZIqCwl25WtMUgXfft/duL6mqZoRWSo6vJ/OVAyOe2zvOA y4qpVZ14UlMme+2TKThkqluX8oXGmVZ5r6a5t84/vApFAJFiGB1fg1AIURTVAMn26qu8Tm mS/we0SAhoThMocxUJyYFfosYM5g4Uo= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=IEI21STL; spf=pass (imf11.hostedemail.com: domain of "SRS0=bivn=55=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org" designates 145.40.68.75 as permitted sender) smtp.mailfrom="SRS0=bivn=55=paulmck-ThinkPad-P17-Gen-1.home=paulmck@kernel.org"; dmarc=pass (policy=none) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1675262218; a=rsa-sha256; cv=none; b=Y7EzRV8n6ceQlnY49jrEKO/NNoEC1Q3l1aRiXSUKu0p0iCyQVNPjewi0V3bzi15aA/O/N4 R5iUsnAuHVFPifmK/SKCEU++0WJbqeQfXVPQZVAFZgft41s/LK64b7vf9JRhVlQT4kgblS 7HAmqQozpPWzKj2jkCbvvP31KuB3EIo= Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C35C6B82184; Wed, 1 Feb 2023 14:36:56 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 788B8C4339E; Wed, 1 Feb 2023 14:36:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1675262215; bh=EpjNuddLZ0oyJXOpFsXevRvksJuGkUPm0MwRzNRkx5U=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=IEI21STL5fvN9f7o5cvtG5bWYaRRFaa8zcQlzevQ6q+el8REfbrc7TDL+tMhBv716 9NaO6kyn4OtiJlOyqCtn0tV2RujszrAbduznlurZ71vlRpV+uXU3jIn1XdI5qiRMBl i195juUqKAz7LtwRDOU5jTJnu+8hfXegxC4iA2BK6SBM0XxhR6c8OHk0gi6W95VSnu jnAfj9fXDCZwQTWOoPcMrCiTagPZdpwGJYdWujrwfXTWPxf4IpiZ1NzrIMsECcyKNY odGQlnOgkn60VDs6qAHhs2K9aZ+A/qnt4jT0BkEuNMwUYm3oG/GFjfIIUSmDmmyzBD 8gh3/WS2plzxw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 1B5A35C06D0; Wed, 1 Feb 2023 06:36:55 -0800 (PST) Date: Wed, 1 Feb 2023 06:36:55 -0800 From: "Paul E. McKenney" To: Yujie Liu Cc: "Liam R. Howlett" , oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Subject: Re: [linus:master] [maple_tree] 120b116208: INFO:task_blocked_for_more_than#seconds Message-ID: <20230201143655.GE2948950@paulmck-ThinkPad-P17-Gen-1> Reply-To: paulmck@kernel.org References: <202301310940.4a37c7af-yujie.liu@intel.com> <20230131202635.GA3019407@paulmck-ThinkPad-P17-Gen-1> <20230131204520.ad6cf4lvtw5uf27s@revolver> <20230131205221.GX2948950@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: 4AFB740009 X-Stat-Signature: ji7ydbrrztdkq63nrgcrfp4znas1dk4g X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1675262218-783385 X-HE-Meta: U2FsdGVkX1+VSf34UwYRKrmihn+xmAH67iPmYxaSlfLz+4YQZzVtosAR87EJcHnFyWzxbSIyWjq/bXTwItxshhCwCvxmz1IM6hVbMfdG+jmqYZBiqaspItBpa00/NnC6Q/u8Zax8R7q5ZlE6WBcsf2+KTb1eP6AMDBqAxx2ZswJyXkD6iVB3h+opuqqzaYLaOg2r+I8shJ25R9kwHgzFDkpWENbR6N35fF1apByrDhzyl1VXAyr53FalExhs4Xg/muVQYfEC0Q8oNCvkChdbqgC9cvTFnFyWUzdrUkMo0pvLiz2slAqKDCgDVkehA7F59/cVHKR5T6xs4MFDfsawFQUx0fhfrxJoF+prLHMUMXJLQ5QoI8mhY686nUBd1fwoOnatA4wcTEdQQBFb9iG/94Y1x5biU/H+x46mPmzvqBNCB3il5Fr+L7nf+o2OGvWmodWYhOdEJpdOtSAZvFixAmGj107oHlH1NQcmTmTqihc9H1Z4iwIW2dhjhxMWg2aoakBODsG9GKDlzwinh+8Cl3BHriKvqj/fNf/Vf9B8yRvNHN55WtWDViDnHHEHgB7hehMNRdAb2LLTMkXr2G2MSrDe3K4m2p2KZSq/rTViCEKwB3wv0lGwErH3C6f+gOdIs5PzheiY/5KU3tNcYfOLCVCNArYw2+g1wlJ25z+PrE+ZwSxHxbX0kcwQPrvlzhPUv+8822ExaH/8IfoiFKkZpOJgTwbUmQaWoF07+TnX6QCtjHPNXdWIhzaPtJ9X+ungQah8DrpOwbh3GRh0pD61WWIFPFy+cfdSFft7p6DT4fsIbnlzvB+38/ihanPlnsmalVZEgulRFjsWezoALG7wRJrEH0gpJpffxVFucjLoCOjUNqY2g84jk9RfD0md71n6TvtWS3sBpWEv+YpK7l05kyfgi7rV6dMA3NW+DFW4xWwOciVoeUAwNDwnyA5mz8rJOjKNMVPvkPNecYW3SsI jwhHh5/O O4tvSN5zGaqxP8NrOKATdQeY+A8EvbvN3iKGGFvxECc8O3dVK//6WFcrbxSczG0QnBOnpf2mqk3h1x6ggX9DLFltGbHjw3XCOnHjpVLBT2Nu2OlTzTgqKm5P+Ji1mihXaWWUZn15KwAAsuFeCtLaxTNpCsguIE7OV75hnDDXqLP1BlONa6TOeS/bbFb28mhSOhMD57cotR8GpYGPSUFaHNF/ouQzymY8Ox5dO7i+qwP5bpHBCP4fBEWBsGbWzE/MCsLMyeGLkqWPMpbZAFe1bRxfMYohoEBaI+lbRVRHR2ugeay9PIr7TsIRbc/KE7FlUbejqU2j3KuqtkrTwf7luo2hpRnabdrgiNazvOZhulJKpi/DlWpD6wPGAvbJ7GF3Q7MmA8ZMGhkgmky0XIWnTKeQI2b/YR6k28nO5LCfquEyzHRV5ISx6W8tz0Js9+SCU7ljtSOlXHIgWmZLvxPhah7e0JdpRuvi7vshNLHGIniwGpcOpg8EmgZ23FyMS/F/ROIALGC69svnI53AcBro+lq51e5v4eUgAwSeXRMg7zjc73yOxqZNvZaaiN9dR2AUp6ItehXKvjwyPzMK52H4bI7L45FROT5nfmP82iBGdf6PAZx6X4nZWj2/MYYdzm2jk6+LAvJHW8VsueluoW3AEfhCJKpypnuM7Vhx2rRE6wDwsHGV1ZOzpmts3TUzNScdAG8PbByxOqy/ecCorUxWrfs++06R85DxZLxS5MX0k2d/gSuvhsd7Ezj2QTWe9pPJRYFJcInn9IyXTEdt3GktOxL0q/sl4fkXlj4QOHiDa+Jb1rUzgaVnPmvU51EptXN6TEpJ8Utal5trr14DtYtCzwYYkQnWiFjU5KZv81cWx2Fk8b9QjMgY7GPfLXIDLg2jL6dGK X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Feb 01, 2023 at 04:37:00PM +0800, Yujie Liu wrote: > Hi Paul, Hi Liam, > > On Tue, Jan 31, 2023 at 12:52:21PM -0800, Paul E. McKenney wrote: > > On Tue, Jan 31, 2023 at 03:45:20PM -0500, Liam R. Howlett wrote: > > > * Paul E. McKenney [230131 15:26]: > > > > On Tue, Jan 31, 2023 at 03:18:22PM +0800, kernel test robot wrote: > > > > > Hi Liam, > > > > > > > > > > We caught a "task blocked" dmesg in maple tree test. Not sure if this > > > > > is expected for maple tree test, so we are sending this report for > > > > > your information. Thanks. > > > > > > > > > > Greeting, > > > > > > > > > > FYI, we noticed INFO:task_blocked_for_more_than#seconds due to commit (built with clang-14): > > > > > > > > > > commit: 120b116208a0877227fc82e3f0df81e7a3ed4ab1 ("maple_tree: reorganize testing to restore module testing") > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > > > > > > > in testcase: boot > > > > > > > > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G > > > > > > > > > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): > > > > > > > > > > > > > > > [ 17.318428][ T1] calling maple_tree_seed+0x0/0x15d0 @ 1 > > > > > [ 17.319219][ T1] > > > > > [ 17.319219][ T1] TEST STARTING > > > > > [ 17.319219][ T1] > > > > > [ 999.249871][ T23] INFO: task rcu_scale_shutd:59 blocked for more than 491 seconds. > > > > > [ 999.253363][ T23] Not tainted 6.1.0-rc4-00003-g120b116208a0 #1 > > > > > [ 999.254249][ T23] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > > > > [ 999.255390][ T23] task:rcu_scale_shutd state:D stack:30968 pid:59 ppid:2 flags:0x00004000 > > > > > [ 999.256934][ T23] Call Trace: > > > > > [ 999.257418][ T23] > > > > > [ 999.257900][ T23] __schedule+0x169b/0x1f90 > > > > > [ 999.261677][ T23] schedule+0x151/0x300 > > > > > [ 999.262281][ T23] ? compute_real+0xe0/0xe0 > > > > > [ 999.263364][ T23] rcu_scale_shutdown+0xdd/0x130 > > > > > [ 999.264093][ T23] ? wake_bit_function+0x2c0/0x2c0 > > > > > [ 999.268985][ T23] kthread+0x309/0x3a0 > > > > > [ 999.269958][ T23] ? compute_real+0xe0/0xe0 > > > > > [ 999.270552][ T23] ? kthread_unuse_mm+0x200/0x200 > > > > > [ 999.271281][ T23] ret_from_fork+0x1f/0x30 > > > > > [ 999.272385][ T23] > > > > > [ 999.272865][ T23] > > > > > [ 999.272865][ T23] Showing all locks held in the system: > > > > > [ 999.273988][ T23] 2 locks held by swapper/0/1: > > > > > [ 999.274684][ T23] 1 lock held by khungtaskd/23: > > > > > [ 999.275400][ T23] #0: ffffffff88346e00 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x8/0x30 > > > > > [ 999.277171][ T23] > > > > > [ 999.277525][ T23] ============================================= > > > > > [ 999.277525][ T23] > > > > > [ 1049.050884][ T1] maple_tree: 12610686 of 12610686 tests passed > > > > > > > > > > > > > > > If you fix the issue, kindly add following tag > > > > > | Reported-by: kernel test robot > > > > > | Link: https://lore.kernel.org/oe-lkp/202301310940.4a37c7af-yujie.liu@intel.com > > > > > > > > Liam brought this to my attention on IRC, and it looks like the root > > > > cause is that the rcuscale code does not deal gracefully with grace > > > > periods that are in much excess of a second in duration. > > > > > > > > Now, it might well be worth looking into why the grace periods were taking > > > > that long, but if you were running Maple Tree stress tests concurrently > > > > with rcuscale, this might well be expected behavior. > > > > > > > > > > This could be simply cpu starvation causing no foward progress in your > > > tests with the number of concurrent running tests and "-smp 2". > > > > > > It's also worth noting that building in the rcu test module makes the > > > machine turn off once the test is complete. This can be seen in your > > > console message: > > > [ 13.254240][ T1] rcu-scale:--- Start of test: nreaders=2 nwriters=2 verbose=1 shutdown=1 > > > > > > so your machine may not have finished running through the array of tests > > > you have specified to build in - which is a lot. I'm not sure if this > > > is the best approach considering the load that produces on the system > > > and how difficult it is (was) to figure out which test is causing a > > > stall, or other issue. > > > > Agreed, both rcuscale and refscale when built in turn the machine off at > > the end of the test. For providing background stress for some other test > > (in this case Maple Tree tests), rcutorture, locktorture, or scftorture > > might be better choices. > > Thanks for looking into this. This is a boot test on a randconfig > kernel, and yes, it happend to select CONFIG_RCU_SCALE_TEST=y together > with CONFIG_TEST_MAPLE_TREE=y, leading to the situation in this case. > > I've tested the patch on same config, it does clear up the "task > blocked" log, though it still waits a long time at this step. The test > result is as follows: > > [ 18.397784][ T1] calling maple_tree_seed+0x0/0x15d0 @ 1 > [ 18.398646][ T1] > [ 18.398646][ T1] TEST STARTING > [ 18.398646][ T1] > [ 1266.450656][ T1] maple_tree: 12610686 of 12610686 tests passed > [ 1266.451749][ T1] initcall maple_tree_seed+0x0/0x15d0 returned 0 after 1248053116 usecs > ... > > ========================================================================================= > compiler/kconfig/rootfs/sleep/tbox_group/testcase: > clang-14/x86_64-randconfig-a006-20230116/yocto-x86_64-minimal-20190520.cgz/300/vm-snb/boot > > commit: > 120b116208a08 ("maple_tree: reorganize testing to restore module testing") > 4b1aafbdb208f ("rcuscale: Move shutdown from wait_event() to wait_event_idle()") > > 120b116208a08 4b1aafbdb208fb4e10bf7abff1a > ---------------- --------------------------- > fail:runs %reproduction fail:runs > | | | > 5:6 -83% :6 dmesg.INFO:task_blocked_for_more_than#seconds > > Tested-by: Yujie Liu Thank you very much! I will apply this on my next rebase. Thanx, Paul