From: Zorro Lang <zlang@redhat.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: patches@lists.linux.dev, fstests@vger.kernel.org,
linux-xfs@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
ziy@nvidia.com, vbabka@suse.cz, seanjc@google.com,
willy@infradead.org, david@redhat.com, hughd@google.com,
linmiaohe@huawei.com, muchun.song@linux.dev, osalvador@suse.de,
p.raghav@samsung.com, da.gomez@samsung.com, hare@suse.de,
john.g.garry@oracle.com
Subject: Re: [PATCH v2 5/5] fstests: add stress truncation + writeback test
Date: Tue, 18 Jun 2024 22:10:13 +0800 [thread overview]
Message-ID: <20240618141013.pue6syikkp5dwj5q@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com> (raw)
In-Reply-To: <20240615002935.1033031-6-mcgrof@kernel.org>
On Fri, Jun 14, 2024 at 05:29:34PM -0700, Luis Chamberlain wrote:
> Stress test folio splits by using the new debugfs interface to a target
> a new smaller folio order while triggering writeback at the same time.
>
> This is known to only creates a crash with min order enabled, so for example
> with a 16k block sized XFS test profile, an xarray fix for that is merged
> already. This issue is fixed by kernel commit 2a0774c2886d ("XArray: set the
> marks correctly when splitting an entry").
>
> If inspecting more closely, you'll want to enable on your kernel boot:
>
> dyndbg='file mm/huge_memory.c +p'
>
> Since we want to race large folio splits we also augment the full test
> output log $seqres.full with the test specific number of successful
> splits from vmstat thp_split_page and thp_split_page_failed. The larger
> the vmstat thp_split_page the more we stress test this path.
>
> This test reproduces a really hard to reproduce crash immediately.
>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
Good to me,
Reviewed-by: Zorro Lang <zlang@redhat.com>
> common/rc | 14 ++++
> tests/generic/751 | 170 ++++++++++++++++++++++++++++++++++++++++++
> tests/generic/751.out | 2 +
> 3 files changed, 186 insertions(+)
> create mode 100755 tests/generic/751
> create mode 100644 tests/generic/751.out
>
> diff --git a/common/rc b/common/rc
> index 30beef4e5c02..31ad30276ca6 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -158,6 +158,20 @@ _require_vm_compaction()
> _notrun "Need compaction enabled CONFIG_COMPACTION=y"
> fi
> }
> +
> +# Requires CONFIG_DEBUGFS and truncation knobs
> +_require_split_huge_pages_knob()
> +{
> + if [ ! -f $DEBUGFS_MNT/split_huge_pages ]; then
> + _notrun "Needs CONFIG_DEBUGFS and split_huge_pages"
> + fi
> +}
> +
> +_split_huge_pages_all()
> +{
> + echo 1 > $DEBUGFS_MNT/split_huge_pages
> +}
> +
> # Get hugepagesize in bytes
> _get_hugepagesize()
> {
> diff --git a/tests/generic/751 b/tests/generic/751
> new file mode 100755
> index 000000000000..ac0ca2f07443
> --- /dev/null
> +++ b/tests/generic/751
> @@ -0,0 +1,170 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 Luis Chamberlain. All Rights Reserved.
> +#
> +# FS QA Test No. 751
> +#
> +# stress page cache truncation + writeback
> +#
> +# This aims at trying to reproduce a difficult to reproduce bug found with
> +# min order. The issue was root caused to an xarray bug when we split folios
> +# to another order other than 0. This functionality is used to support min
> +# order. The crash:
> +#
> +# https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
> +# Crash excerpt is as follows:
> +#
> +# BUG: kernel NULL pointer dereference, address: 0000000000000036
> +# #PF: supervisor read access in kernel mode
> +# #PF: error_code(0x0000) - not-present page
> +# PGD 0 P4D 0
> +# Oops: 0000 [#1] PREEMPT SMP NOPTI
> +# CPU: 7 PID: 2190 Comm: kworker/u38:5 Not tainted 6.9.0-rc5+ #14
> +# Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> +# Workqueue: writeback wb_workfn (flush-7:5)
> +# RIP: 0010:filemap_get_folios_tag+0xa9/0x200
> +# Call Trace:
> +# <TASK>
> +# writeback_iter+0x17d/0x310
> +# write_cache_pages+0x42/0xa0
> +# iomap_writepages+0x33/0x50
> +# xfs_vm_writepages+0x63/0x90 [xfs]
> +# do_writepages+0xcc/0x260
> +# __writeback_single_inode+0x3d/0x340
> +# writeback_sb_inodes+0x1ed/0x4b0
> +# __writeback_inodes_wb+0x4c/0xe0
> +# wb_writeback+0x267/0x2d0
> +# wb_workfn+0x2a4/0x440
> +# process_one_work+0x189/0x3b0
> +# worker_thread+0x273/0x390
> +# kthread+0xda/0x110
> +# ret_from_fork+0x2d/0x50
> +# ret_from_fork_asm+0x1a/0x30
> +# </TASK>
> +#
> +# This may also find future truncation bugs in the future, as truncating any
> +# mapped file through the collateral of using echo 1 > split_huge_pages will
> +# always respect the min order. Truncating to a larger order then is excercised
> +# when this test is run against any filesystem LBS profile or an LBS device.
> +#
> +# If you're enabling this and want to check underneath the hood you may want to
> +# enable:
> +#
> +# dyndbg='file mm/huge_memory.c +p'
> +#
> +# This tests aims at increasing the rate of successful truncations so we want
> +# to increase the value of thp_split_page in $seqres.full. Using echo 1 >
> +# split_huge_pages is extremely aggressive, and even accounts for anonymous
> +# memory on a system, however we accept that tradeoff for the efficiency of
> +# doing the work in-kernel for any mapped file too. Our general goal here is to
> +# race with folio truncation + writeback.
> +
> +. ./common/preamble
> +
> +_begin_fstest auto long_rw stress soak smoketest
> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> + cd /
> + rm -f $tmp.*
> + rm -f $runfile
> + kill -9 $split_huge_pages_files_pid > /dev/null 2>&1
> +}
> +
> +fio_config=$tmp.fio
> +fio_out=$tmp.fio.out
> +fio_err=$tmp.fio.err
> +
> +# real QA test starts here
> +_supported_fs generic
> +_require_test
> +_require_scratch
> +_require_debugfs
> +_require_split_huge_pages_knob
> +_require_command "$KILLALL_PROG" "killall"
> +_fixed_by_git_commit kernel 2a0774c2886d \
> + "XArray: set the marks correctly when splitting an entry"
> +
> +proc_vmstat()
> +{
> + awk -v name="$1" '{if ($1 ~ name) {print($2)}}' /proc/vmstat | head -1
> +}
> +
> +# we need buffered IO to force truncation races with writeback in the
> +# page cache
> +cat >$fio_config <<EOF
> +[force_large_large_folio_parallel_writes]
> +ignore_error=ENOSPC
> +nrfiles=10
> +direct=0
> +bs=4M
> +group_reporting=1
> +filesize=1GiB
> +readwrite=write
> +fallocate=none
> +numjobs=$(nproc)
> +directory=$SCRATCH_MNT
> +runtime=100*${TIME_FACTOR}
> +time_based
> +EOF
> +
> +_require_fio $fio_config
> +
> +echo "Silence is golden"
> +
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +# used to let our loops know when to stop
> +runfile="$tmp.keep.running.loop"
> +touch $runfile
> +
> +# The background ops are out of bounds, the goal is to race with fsstress.
> +
> +# Force folio split if possible, this seems to be screaming for MADV_NOHUGEPAGE
> +# for large folios.
> +while [ -e $runfile ]; do
> + _split_huge_pages_all >/dev/null 2>&1
> +done &
> +split_huge_pages_files_pid=$!
> +
> +split_count_before=0
> +split_count_failed_before=0
> +
> +if grep -q thp_split_page /proc/vmstat; then
> + split_count_before=$(proc_vmstat thp_split_page)
> + split_count_failed_before=$(proc_vmstat thp_split_page_failed)
> +else
> + echo "no thp_split_page in /proc/vmstat" >> $seqres.full
> +fi
> +
> +# we blast away with large writes to force large folio writes when
> +# possible.
> +echo -e "Running fio with config:\n" >> $seqres.full
> +cat $fio_config >> $seqres.full
> +$FIO_PROG $fio_config --alloc-size=$(( $(nproc) * 8192 )) \
> + --output=$fio_out 2> $fio_err
> +FIO_ERR=$?
> +
> +rm -f $runfile
> +
> +wait > /dev/null 2>&1
> +
> +if grep -q thp_split_page /proc/vmstat; then
> + split_count_after=$(proc_vmstat thp_split_page)
> + split_count_failed_after=$(proc_vmstat thp_split_page_failed)
> + thp_split_page=$((split_count_after - split_count_before))
> + thp_split_page_failed=$((split_count_failed_after - split_count_failed_before))
> +
> + echo "vmstat thp_split_page: $thp_split_page" >> $seqres.full
> + echo "vmstat thp_split_page_failed: $thp_split_page_failed" >> $seqres.full
> +fi
> +
> +# exitall_on_error=ENOSPC does not work as it should, so we need this eyesore
> +if [[ $FIO_ERR -ne 0 ]] && ! grep -q "No space left on device" $fio_err; then
> + _fail "fio failed with err: $FIO_ERR"
> +fi
> +
> +status=0
> +exit
> diff --git a/tests/generic/751.out b/tests/generic/751.out
> new file mode 100644
> index 000000000000..6479fa6f1404
> --- /dev/null
> +++ b/tests/generic/751.out
> @@ -0,0 +1,2 @@
> +QA output created by 751
> +Silence is golden
> --
> 2.43.0
>
>
prev parent reply other threads:[~2024-06-18 14:10 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-15 0:29 [PATCH v2 0/5] fstests: add some new LBS inspired tests Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 1/5] common: move mread() to generic helper _mread() Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 2/5] fstests: add mmap page boundary tests Luis Chamberlain
2024-06-18 14:07 ` Zorro Lang
2024-06-15 0:29 ` [PATCH v2 3/5] fstests: add fsstress + compaction test Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 4/5] _require_debugfs(): simplify and fix for debian Luis Chamberlain
2024-06-15 0:29 ` [PATCH v2 5/5] fstests: add stress truncation + writeback test Luis Chamberlain
2024-06-18 14:10 ` Zorro Lang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240618141013.pue6syikkp5dwj5q@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com \
--to=zlang@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=da.gomez@samsung.com \
--cc=david@redhat.com \
--cc=fstests@vger.kernel.org \
--cc=hare@suse.de \
--cc=hughd@google.com \
--cc=john.g.garry@oracle.com \
--cc=linmiaohe@huawei.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=p.raghav@samsung.com \
--cc=patches@lists.linux.dev \
--cc=seanjc@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox