linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Yujie Liu <yujie.liu@intel.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [linus:master] [maple_tree] 120b116208: INFO:task_blocked_for_more_than#seconds
Date: Wed, 1 Feb 2023 06:36:55 -0800	[thread overview]
Message-ID: <20230201143655.GE2948950@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <Y9okrO+GOYP2Gh4K@yujie-X299>

On Wed, Feb 01, 2023 at 04:37:00PM +0800, Yujie Liu wrote:
> Hi Paul, Hi Liam,
> 
> On Tue, Jan 31, 2023 at 12:52:21PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 31, 2023 at 03:45:20PM -0500, Liam R. Howlett wrote:
> > > * Paul E. McKenney <paulmck@kernel.org> [230131 15:26]:
> > > > On Tue, Jan 31, 2023 at 03:18:22PM +0800, kernel test robot wrote:
> > > > > Hi Liam,
> > > > > 
> > > > > We caught a "task blocked" dmesg in maple tree test. Not sure if this
> > > > > is expected for maple tree test, so we are sending this report for
> > > > > your information. Thanks.
> > > > > 
> > > > > Greeting,
> > > > > 
> > > > > FYI, we noticed INFO:task_blocked_for_more_than#seconds due to commit (built with clang-14):
> > > > > 
> > > > > commit: 120b116208a0877227fc82e3f0df81e7a3ed4ab1 ("maple_tree: reorganize testing to restore module testing")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > > 
> > > > > in testcase: boot
> > > > > 
> > > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > > > > 
> > > > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > > > > 
> > > > > 
> > > > > [   17.318428][    T1] calling  maple_tree_seed+0x0/0x15d0 @ 1
> > > > > [   17.319219][    T1] 
> > > > > [   17.319219][    T1] TEST STARTING
> > > > > [   17.319219][    T1] 
> > > > > [  999.249871][   T23] INFO: task rcu_scale_shutd:59 blocked for more than 491 seconds.
> > > > > [  999.253363][   T23]       Not tainted 6.1.0-rc4-00003-g120b116208a0 #1
> > > > > [  999.254249][   T23] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > [  999.255390][   T23] task:rcu_scale_shutd state:D stack:30968 pid:59    ppid:2      flags:0x00004000
> > > > > [  999.256934][   T23] Call Trace:
> > > > > [  999.257418][   T23]  <TASK>
> > > > > [  999.257900][   T23]  __schedule+0x169b/0x1f90
> > > > > [  999.261677][   T23]  schedule+0x151/0x300
> > > > > [  999.262281][   T23]  ? compute_real+0xe0/0xe0
> > > > > [  999.263364][   T23]  rcu_scale_shutdown+0xdd/0x130
> > > > > [  999.264093][   T23]  ? wake_bit_function+0x2c0/0x2c0
> > > > > [  999.268985][   T23]  kthread+0x309/0x3a0
> > > > > [  999.269958][   T23]  ? compute_real+0xe0/0xe0
> > > > > [  999.270552][   T23]  ? kthread_unuse_mm+0x200/0x200
> > > > > [  999.271281][   T23]  ret_from_fork+0x1f/0x30
> > > > > [  999.272385][   T23]  </TASK>
> > > > > [  999.272865][   T23] 
> > > > > [  999.272865][   T23] Showing all locks held in the system:
> > > > > [  999.273988][   T23] 2 locks held by swapper/0/1:
> > > > > [  999.274684][   T23] 1 lock held by khungtaskd/23:
> > > > > [  999.275400][   T23]  #0: ffffffff88346e00 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x8/0x30
> > > > > [  999.277171][   T23] 
> > > > > [  999.277525][   T23] =============================================
> > > > > [  999.277525][   T23] 
> > > > > [ 1049.050884][    T1] maple_tree: 12610686 of 12610686 tests passed
> > > > > 
> > > > > 
> > > > > If you fix the issue, kindly add following tag
> > > > > | Reported-by: kernel test robot <yujie.liu@intel.com>
> > > > > | Link: https://lore.kernel.org/oe-lkp/202301310940.4a37c7af-yujie.liu@intel.com
> > > > 
> > > > Liam brought this to my attention on IRC, and it looks like the root
> > > > cause is that the rcuscale code does not deal gracefully with grace
> > > > periods that are in much excess of a second in duration.
> > > > 
> > > > Now, it might well be worth looking into why the grace periods were taking
> > > > that long, but if you were running Maple Tree stress tests concurrently
> > > > with rcuscale, this might well be expected behavior.
> > > > 
> > > 
> > > This could be simply cpu starvation causing no foward progress in your
> > > tests with the number of concurrent running tests and "-smp 2".
> > > 
> > > It's also worth noting that building in the rcu test module makes the
> > > machine turn off once the test is complete.  This can be seen in your
> > > console message:
> > > [   13.254240][    T1] rcu-scale:--- Start of test: nreaders=2 nwriters=2 verbose=1 shutdown=1
> > > 
> > > so your machine may not have finished running through the array of tests
> > > you have specified to build in - which is a lot.  I'm not sure if this
> > > is the best approach considering the load that produces on the system
> > > and how difficult it is (was) to figure out which test is causing a
> > > stall, or other issue.
> > 
> > Agreed, both rcuscale and refscale when built in turn the machine off at
> > the end of the test.  For providing background stress for some other test
> > (in this case Maple Tree tests), rcutorture, locktorture, or scftorture
> > might be better choices.
> 
> Thanks for looking into this. This is a boot test on a randconfig
> kernel, and yes, it happend to select CONFIG_RCU_SCALE_TEST=y together
> with CONFIG_TEST_MAPLE_TREE=y, leading to the situation in this case.
> 
> I've tested the patch on same config, it does clear up the "task
> blocked" log, though it still waits a long time at this step. The test
> result is as follows:
> 
> [   18.397784][    T1] calling  maple_tree_seed+0x0/0x15d0 @ 1
> [   18.398646][    T1]
> [   18.398646][    T1] TEST STARTING
> [   18.398646][    T1]
> [ 1266.450656][    T1] maple_tree: 12610686 of 12610686 tests passed
> [ 1266.451749][    T1] initcall maple_tree_seed+0x0/0x15d0 returned 0 after 1248053116 usecs
> ...
> 
> =========================================================================================
> compiler/kconfig/rootfs/sleep/tbox_group/testcase:
>   clang-14/x86_64-randconfig-a006-20230116/yocto-x86_64-minimal-20190520.cgz/300/vm-snb/boot
> 
> commit:
>   120b116208a08 ("maple_tree: reorganize testing to restore module testing")
>   4b1aafbdb208f ("rcuscale: Move shutdown from wait_event() to wait_event_idle()")
> 
> 120b116208a08    4b1aafbdb208fb4e10bf7abff1a
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>           5:6          -83%            :6     dmesg.INFO:task_blocked_for_more_than#seconds
> 
> Tested-by: Yujie Liu <yujie.liu@intel.com>

Thank you very much!  I will apply this on my next rebase.

							Thanx, Paul


  reply	other threads:[~2023-02-01 14:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-31  7:18 kernel test robot
2023-01-31 20:26 ` Paul E. McKenney
2023-01-31 20:45   ` Liam R. Howlett
2023-01-31 20:52     ` Paul E. McKenney
2023-02-01  8:37       ` Yujie Liu
2023-02-01 14:36         ` Paul E. McKenney [this message]
2023-02-01 14:43           ` Liam R. Howlett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230201143655.GE2948950@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=yujie.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox