Date: Tue, 18 Jun 2024 22:10:13 +0800
From: Zorro Lang
To: Luis Chamberlain
Cc: patches@lists.linux.dev, fstests@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org, ziy@nvidia.com, vbabka@suse.cz, seanjc@google.com, willy@infradead.org, david@redhat.com, hughd@google.com, linmiaohe@huawei.com, muchun.song@linux.dev, osalvador@suse.de, p.raghav@samsung.com, da.gomez@samsung.com, hare@suse.de, john.g.garry@oracle.com
Subject: Re: [PATCH v2 5/5] fstests: add stress truncation + writeback test
Message-ID: <20240618141013.pue6syikkp5dwj5q@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com>
References: <20240615002935.1033031-1-mcgrof@kernel.org> <20240615002935.1033031-6-mcgrof@kernel.org>
In-Reply-To: <20240615002935.1033031-6-mcgrof@kernel.org>

On Fri, Jun 14, 2024 at 05:29:34PM -0700, Luis Chamberlain wrote:
> Stress test folio splits by using the new debugfs interface to target
> a new smaller folio order while triggering writeback at the same time.
>
> This is known to only create a crash with min order enabled, so for example
> with a 16k block sized XFS test profile. An xarray fix for that is merged
> already: the issue is fixed by kernel commit 2a0774c2886d ("XArray: set the
> marks correctly when splitting an entry").
>
> If inspecting more closely, you'll want to enable on your kernel boot:
>
> 	dyndbg='file mm/huge_memory.c +p'
>
> Since we want to race large folio splits we also augment the full test
> output log $seqres.full with the test specific number of successful
> splits from vmstat thp_split_page and thp_split_page_failed. The larger
> the vmstat thp_split_page the more we stress test this path.
>
> This test reproduces a really hard to reproduce crash immediately.
>
> Signed-off-by: Luis Chamberlain
> ---

Looks good to me,

Reviewed-by: Zorro Lang

>  common/rc             |  14 ++++
>  tests/generic/751     | 170 ++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/751.out |   2 +
>  3 files changed, 186 insertions(+)
>  create mode 100755 tests/generic/751
>  create mode 100644 tests/generic/751.out
>
> diff --git a/common/rc b/common/rc
> index 30beef4e5c02..31ad30276ca6 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -158,6 +158,20 @@ _require_vm_compaction()
>  		_notrun "Need compaction enabled CONFIG_COMPACTION=y"
>  	fi
>  }
> +
> +# Requires CONFIG_DEBUG_FS and truncation knobs
> +_require_split_huge_pages_knob()
> +{
> +	if [ ! -f $DEBUGFS_MNT/split_huge_pages ]; then
> +		_notrun "Needs CONFIG_DEBUG_FS and split_huge_pages"
> +	fi
> +}
> +
> +_split_huge_pages_all()
> +{
> +	echo 1 > $DEBUGFS_MNT/split_huge_pages
> +}
> +
>  # Get hugepagesize in bytes
>  _get_hugepagesize()
>  {
> diff --git a/tests/generic/751 b/tests/generic/751
> new file mode 100755
> index 000000000000..ac0ca2f07443
> --- /dev/null
> +++ b/tests/generic/751
> @@ -0,0 +1,170 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
> +#
> +# FS QA Test No. 751
> +#
> +# stress page cache truncation + writeback
> +#
> +# This aims at trying to reproduce a difficult to reproduce bug found with
> +# min order. The issue was root caused to an xarray bug when we split folios
> +# to an order other than 0. This functionality is used to support min
> +# order.
> +# The crash:
> +#
> +# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
> +# Crash excerpt is as follows:
> +#
> +# BUG: kernel NULL pointer dereference, address: 0000000000000036
> +# #PF: supervisor read access in kernel mode
> +# #PF: error_code(0x0000) - not-present page
> +# PGD 0 P4D 0
> +# Oops: 0000 [#1] PREEMPT SMP NOPTI
> +# CPU: 7 PID: 2190 Comm: kworker/u38:5 Not tainted 6.9.0-rc5+ #14
> +# Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> +# Workqueue: writeback wb_workfn (flush-7:5)
> +# RIP: 0010:filemap_get_folios_tag+0xa9/0x200
> +# Call Trace:
> +#
> +#   writeback_iter+0x17d/0x310
> +#   write_cache_pages+0x42/0xa0
> +#   iomap_writepages+0x33/0x50
> +#   xfs_vm_writepages+0x63/0x90 [xfs]
> +#   do_writepages+0xcc/0x260
> +#   __writeback_single_inode+0x3d/0x340
> +#   writeback_sb_inodes+0x1ed/0x4b0
> +#   __writeback_inodes_wb+0x4c/0xe0
> +#   wb_writeback+0x267/0x2d0
> +#   wb_workfn+0x2a4/0x440
> +#   process_one_work+0x189/0x3b0
> +#   worker_thread+0x273/0x390
> +#   kthread+0xda/0x110
> +#   ret_from_fork+0x2d/0x50
> +#   ret_from_fork_asm+0x1a/0x30
> +#
> +#
> +# This may also find truncation bugs in the future, as truncating any
> +# mapped file through the collateral of using echo 1 > split_huge_pages will
> +# always respect the min order. Truncating to a larger order is then exercised
> +# when this test is run against any filesystem LBS profile or an LBS device.
> +#
> +# If you're enabling this and want to check under the hood you may want to
> +# enable:
> +#
> +#	dyndbg='file mm/huge_memory.c +p'
> +#
> +# This test aims at increasing the rate of successful truncations so we want
> +# to increase the value of thp_split_page in $seqres.full. Using echo 1 >
> +# split_huge_pages is extremely aggressive, and even accounts for anonymous
> +# memory on a system, however we accept that tradeoff for the efficiency of
> +# doing the work in-kernel for any mapped file too.
> +# Our general goal here is to
> +# race with folio truncation + writeback.
> +
> +. ./common/preamble
> +
> +_begin_fstest auto long_rw stress soak smoketest
> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +	rm -f $runfile
> +	kill -9 $split_huge_pages_files_pid > /dev/null 2>&1
> +}
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +fio_err=$tmp.fio.err
> +
> +# real QA test starts here
> +_supported_fs generic
> +_require_test
> +_require_scratch
> +_require_debugfs
> +_require_split_huge_pages_knob
> +_require_command "$KILLALL_PROG" "killall"
> +_fixed_by_git_commit kernel 2a0774c2886d \
> +	"XArray: set the marks correctly when splitting an entry"
> +
> +proc_vmstat()
> +{
> +	awk -v name="$1" '{if ($1 ~ name) {print($2)}}' /proc/vmstat | head -1
> +}
> +
> +# we need buffered IO to force truncation races with writeback in the
> +# page cache
> +cat >$fio_config <<EOF
> +[force_large_large_folio_parallel_writes]
> +ignore_error=ENOSPC
> +nrfiles=10
> +direct=0
> +bs=4M
> +group_reporting=1
> +filesize=1GiB
> +readwrite=write
> +fallocate=none
> +numjobs=$(nproc)
> +directory=$SCRATCH_MNT
> +runtime=100*${TIME_FACTOR}
> +time_based
> +EOF
> +
> +_require_fio $fio_config
> +
> +echo "Silence is golden"
> +
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +# used to let our loops know when to stop
> +runfile="$tmp.keep.running.loop"
> +touch $runfile
> +
> +# The background ops are out of bounds, the goal is to race with fio.
> +
> +# Force folio split if possible, this seems to be screaming for MADV_NOHUGEPAGE
> +# for large folios.
> +while [ -e $runfile ]; do
> +	_split_huge_pages_all >/dev/null 2>&1
> +done &
> +split_huge_pages_files_pid=$!
> +
> +split_count_before=0
> +split_count_failed_before=0
> +
> +if grep -q thp_split_page /proc/vmstat; then
> +	split_count_before=$(proc_vmstat thp_split_page)
> +	split_count_failed_before=$(proc_vmstat thp_split_page_failed)
> +else
> +	echo "no thp_split_page in /proc/vmstat" >> $seqres.full
> +fi
> +
> +# we blast away with large writes to force large folio writes when
> +# possible.
> +echo -e "Running fio with config:\n" >> $seqres.full
> +cat $fio_config >> $seqres.full
> +$FIO_PROG $fio_config --alloc-size=$(( $(nproc) * 8192 )) \
> +	--output=$fio_out 2> $fio_err
> +FIO_ERR=$?
> +
> +rm -f $runfile
> +
> +wait > /dev/null 2>&1
> +
> +if grep -q thp_split_page /proc/vmstat; then
> +	split_count_after=$(proc_vmstat thp_split_page)
> +	split_count_failed_after=$(proc_vmstat thp_split_page_failed)
> +	thp_split_page=$((split_count_after - split_count_before))
> +	thp_split_page_failed=$((split_count_failed_after - split_count_failed_before))
> +
> +	echo "vmstat thp_split_page: $thp_split_page" >> $seqres.full
> +	echo "vmstat thp_split_page_failed: $thp_split_page_failed" >> $seqres.full
> +fi
> +
> +# exitall_on_error=ENOSPC does not work as it should, so we need this eyesore
> +if [[ $FIO_ERR -ne 0 ]] && ! grep -q "No space left on device" $fio_err; then
> +	_fail "fio failed with err: $FIO_ERR"
> +fi
> +
> +status=0
> +exit
> diff --git a/tests/generic/751.out b/tests/generic/751.out
> new file mode 100644
> index 000000000000..6479fa6f1404
> --- /dev/null
> +++ b/tests/generic/751.out
> @@ -0,0 +1,2 @@
> +QA output created by 751
> +Silence is golden
> --
> 2.43.0
>
>
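
A note on proc_vmstat() in the quoted test: its awk pattern is a regex
match, so the name thp_split_page also matches thp_split_page_failed, and
the result relies on head -1 plus the ordering of /proc/vmstat. The sketch
below shows one possible exact-match variant together with the
before/after delta the test logs. The helper names vmstat_counter and
vmstat_delta, and the sample file, are hypothetical and only for
illustration; they are not part of the patch or of fstests.

```shell
#!/bin/bash
# Hypothetical helper (not in fstests): print the value of one counter
# from a vmstat-style file. Usage: vmstat_counter NAME [FILE]
# FILE defaults to /proc/vmstat.
vmstat_counter()
{
	local name="$1" file="${2:-/proc/vmstat}"

	# Compare the first field for equality so that "thp_split_page"
	# does not also select "thp_split_page_failed".
	awk -v name="$name" '$1 == name { print $2; exit }' "$file"
}

# Hypothetical helper: difference between two samples of a counter.
# Usage: vmstat_delta BEFORE AFTER
vmstat_delta()
{
	echo $(( $2 - $1 ))
}

# Demo against a sample file so the sketch runs without /proc/vmstat.
# The _failed counter is deliberately listed first.
sample=$(mktemp)
printf 'thp_split_page_failed 3\nthp_split_page 42\n' > "$sample"
before=$(vmstat_counter thp_split_page "$sample")

printf 'thp_split_page_failed 4\nthp_split_page 50\n' > "$sample"
after=$(vmstat_counter thp_split_page "$sample")

echo "vmstat thp_split_page: $(vmstat_delta "$before" "$after")"
rm -f "$sample"
```

With the sample data above the last line printed is
"vmstat thp_split_page: 8", even though the _failed counter comes first
in the file.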