From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 20 Apr 2024 22:02:41 +0800
From: Zorro Lang
To: Luis Chamberlain
Cc: fstests@vger.kernel.org, kdevops@lists.linux.dev, linux-xfs@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, willy@infradead.org,
	david@redhat.com, linmiaohe@huawei.com, muchun.song@linux.dev,
	osalvador@suse.de
Subject: Re: [PATCH] fstests: add fsstress + compaction test
Message-ID: <20240420140241.wez2x3zoirzlmat6@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com>
References: <20240418001356.95857-1-mcgrof@kernel.org>
In-Reply-To: <20240418001356.95857-1-mcgrof@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Wed, Apr 17, 2024 at 05:13:56PM -0700, Luis Chamberlain wrote:
> Running compaction while we run fsstress can crash older kernels as per
> korg#218227 [0], the fix for that [0] has been posted [1] but that patch
> is not yet on v6.9-rc4 and the patch requires changes for v6.9.
>
> Today I find that v6.9-rc4 is also hitting an unrecoverable hung task
> between compaction and fsstress while running generic/476 on the
> following kdevops test sections [2]:
>
>   * xfs_nocrc
>   * xfs_nocrc_2k
>   * xfs_nocrc_4k
>
> Analyzing the trace I see the guest uses loopback block devices for the
> fstests TEST_DEV, the loopback file uses sparsefiles on a btrfs
> partition. The contention based on traces [3] [4] seems to be that we
> somehow have fsstress + compaction race on folio_wait_bit_common().
>
> We have this happening:
>
>   a) kthread compaction --> migrate_pages_batch()
>                 --> folio_wait_bit_common()
>   b) workqueue on btrfs writeback wb_workfn --> extent_write_cache_pages()
>                 --> folio_wait_bit_common()
>   c) workqueue on loopback loop_rootcg_workfn() --> filemap_fdatawrite_wbc()
>                 --> folio_wait_bit_common()
>   d) kthread xfsaild --> blk_mq_submit_bio() --> wbt_wait()
>
> I tried to reproduce but couldn't easily do so, so I wrote this test
> to help, and with this I have 100% failure rate so far out of 2 runs.
>
> Given we also have korg#218227 and that patch likely needing
> backporting, folks will want a reproducer for this issue. This should
> hopefully help with that case and this new separate issue.
>
> To reproduce with kdevops just:
>
> make defconfig-xfs_nocrc_2k -j $(nproc)
> make -j $(nproc)
> make fstests
> make linux
> make fstests-baseline TESTS=generic/733
> tail -f guestfs/*-xfs-nocrc-2k/console.log
>
> [0] https://bugzilla.kernel.org/show_bug.cgi?id=218227
> [1] https://lore.kernel.org/all/7ee2bb8c-441a-418b-ba3a-d305f69d31c8@suse.cz/T/#u
> [2] https://github.com/linux-kdevops/kdevops/blob/main/playbooks/roles/fstests/templates/xfs/xfs.config
> [3] https://gist.github.com/mcgrof/4dfa3264f513ce6ca398414326cfab84
> [4] https://gist.github.com/mcgrof/f40a9f31a43793dac928ce287cfacfeb
>
> Signed-off-by: Luis Chamberlain
> ---
>
> Note: kdevops uses its own fork of fstests which has this merged
> already, so the above should just work. If it's your first time using
> kdevops be sure to just read the README for the first time users:
>
> https://github.com/linux-kdevops/kdevops/blob/main/docs/kdevops-first-run.md
>
>  common/rc             |  7 ++++++
>  tests/generic/744     | 56 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/744.out |  2 ++
>  3 files changed, 65 insertions(+)
>  create mode 100755 tests/generic/744
>  create mode 100644 tests/generic/744.out
>
> diff --git a/common/rc b/common/rc
> index b7b77ac1b46d..d4432f5ce259 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -120,6 +120,13 @@ _require_hugepages()
>  		_notrun "Kernel does not report huge page size"
>  }
>
> +# Requires CONFIG_COMPACTION
> +_require_compaction()

I'm not sure whether we should name it "_require_vm_compaction". Does Linux
have other kinds of "compaction", or only memory compaction?

> +{
> +	if [ ! -f /proc/sys/vm/compact_memory ]; then
> +		_notrun "Need compaction enabled CONFIG_COMPACTION=y"
> +	fi
> +}
>  # Get hugepagesize in bytes
>  _get_hugepagesize()
>  {
> diff --git a/tests/generic/744 b/tests/generic/744
> new file mode 100755
> index 000000000000..2b3c0c7e92fb
> --- /dev/null
> +++ b/tests/generic/744
> @@ -0,0 +1,56 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024 Luis Chamberlain. All Rights Reserved.
> +#
> +# FS QA Test 744
> +#
> +# fsstress + compaction test

fsstress + memory compaction?

It looks like this case is copied from g/476, with a memory compaction loop
added. That makes sense to me from the test side. I'm a bit confused by your
discussion of an old bug and a new bug(?) you just found. It looks like
you're reporting a bug and, along the way, providing a test case to fstests@.
Anyway, I think there's no objection to the test itself, right? And is this
test for some known bug or not?

> +#
> +. ./common/preamble
> +_begin_fstest auto rw long_rw stress soak smoketest
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +}
> +
> +# Import common functions.
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.

Useless comment.

> +_supported_fs generic
> +
> +_require_scratch
> +_require_compaction
> +_require_command "$KILLALL_PROG" "killall"
> +
> +echo "Silence is golden."
> +
> +_scratch_mkfs > $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
> +fsstress_args=(-w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus)
> +
> +# start a background getxattr loop for the existing xattr
> +runfile="$tmp.getfattr"
> +touch $runfile
> +while [ -e $runfile ]; do
> +	echo 1 > /proc/sys/vm/compact_memory
> +	sleep 15
> +done &
> +getfattr_pid=$!

I don't see this "getfattr_pid" used anywhere else. Better to deal with it
in _cleanup().

> +
> +test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION")
> +
> +$FSSTRESS_PROG $FSSTRESS_AVOID "${fsstress_args[@]}" >> $seqres.full
> +
> +rm -f $runfile
> +wait > /dev/null 2>&1

Better to do these things in the _cleanup() function, to make sure all
background processes are stopped in _cleanup().
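
A minimal sketch of the _cleanup() suggestion above, not a tested replacement
for the patch. The names compact_pid, compact_knob and _start_compaction_loop
are hypothetical and do not come from the patch or from fstests. The defaults
for $tmp and $KILLALL_PROG are stand-ins for values fstests normally provides.
The loop's output is sent to /dev/null, which is also an assumption of this
sketch.

```shell
# Sketch only: stand-ins so the fragment can be sourced outside fstests.
# In a real test, $tmp and $KILLALL_PROG are set up by the fstests harness.
tmp=${tmp:-/tmp/compaction-sketch.$$}
KILLALL_PROG=${KILLALL_PROG:-killall}
# Hypothetical override, only so the loop can be pointed at a scratch file.
compact_knob=${compact_knob:-/proc/sys/vm/compact_memory}

runfile="$tmp.compaction"
compact_pid=""

# Start the background compaction loop and remember its PID.
_start_compaction_loop()
{
	touch "$runfile"
	while [ -e "$runfile" ]; do
		echo 1 > "$compact_knob"
		sleep 15
	done > /dev/null 2>&1 &
	compact_pid=$!
}

_cleanup()
{
	cd /
	# Removing the run file stops the loop at its next iteration.
	rm -f "$runfile"
	if [ -n "$compact_pid" ]; then
		# Kill the loop directly instead of waiting out the sleep.
		kill "$compact_pid" > /dev/null 2>&1
		wait "$compact_pid" > /dev/null 2>&1
		compact_pid=""
	fi
	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
	rm -f $tmp.*
}
```

With something like this, the test body would call _start_compaction_loop
before running fsstress, and the trailing "rm -f $runfile; wait" pair could
go away, because _cleanup() stops the loop on both normal and early exits.
Killing the loop can leave one "sleep 15" child running until it expires.
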

> +
> +status=0
> +exit
> diff --git a/tests/generic/744.out b/tests/generic/744.out
> new file mode 100644
> index 000000000000..205c684fa995
> --- /dev/null
> +++ b/tests/generic/744.out
> @@ -0,0 +1,2 @@
> +QA output created by 744
> +Silence is golden
> --
> 2.43.0
>
>
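
On the naming question raised earlier, the renamed helper could look roughly
like the sketch below. It assumes the check itself stays as posted. The name
_require_vm_compaction is only the suggestion from this review, and the
fallback _notrun exists purely so the fragment can be sourced outside fstests,
where common/rc provides the real one.

```shell
# Fallback for running this fragment on its own; fstests defines the
# real _notrun in common/rc.
if ! command -v _notrun > /dev/null 2>&1; then
	_notrun()
	{
		echo "[not run] $*"
		exit 0
	}
fi

# Requires CONFIG_COMPACTION (memory compaction)
_require_vm_compaction()
{
	if [ ! -f /proc/sys/vm/compact_memory ]; then
		_notrun "Need memory compaction enabled, CONFIG_COMPACTION=y"
	fi
}
```

The test would then call _require_vm_compaction in place of
_require_compaction.
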