From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Eryu Guan <eguan@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
fstests@vger.kernel.org, Xiong Zhou <xzhou@redhat.com>,
jmoyer@redhat.com, Christoph Hellwig <hch@lst.de>,
Dan Williams <dan.j.williams@intel.com>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-nvdimm@lists.01.org,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/2] dax: add regression test for stale mmap reads
Date: Tue, 25 Apr 2017 14:39:11 -0600
Message-ID: <20170425203911.GC11773@linux.intel.com>
In-Reply-To: <20170425112738.GV26397@eguan.usersys.redhat.com>

On Tue, Apr 25, 2017 at 07:27:39PM +0800, Eryu Guan wrote:
> On Mon, Apr 24, 2017 at 11:49:32AM -0600, Ross Zwisler wrote:
> > This adds a regression test for the following kernel patch:
> >
> > dax: fix data corruption due to stale mmap reads
> >
>
> Seems that this patch hasn't been merged into linus tree, thus 4.11-rc8
> kernel should fail this test, but it passed for me, tested with 4.11-rc8
> kernel on both ext4 and xfs, with both brd devices and pmem devices
> created from "memmap=10G!5G memmap=15G!15G" kernel boot command line.
> Did I miss anything?
>
> # ./check -s ext4_pmem_4k generic/427
> SECTION -- ext4_pmem_4k

Ooh, I didn't add this 'ext4_pmem_4k' section goodness, and it's not present
in the xfstests/master that I was using. Do you have patches to add that?

> RECREATING -- ext4 on /dev/pmem0
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 hp-dl360g9-15 4.11.0-rc8.kasan
> MKFS_OPTIONS -- -b 4096 /dev/pmem1
> MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/pmem1 /scratch
>
> generic/427 1s ... 1s
> Ran: generic/427
> Passed all 1 tests

Your memmap params look fine. I tested with BRD and PMEM, and with EXT4 and
XFS, and all combinations failed for me as expected with v4.11-rc8.

One issue could have been that the test file already existed when the test was
run. I wasn't removing it between runs earlier, but I've fixed that for v2.

Another possibility is that the hole we got back from the filesystem was
smaller than 2MiB. Can you try running v2 (which I'll post in a second)
against a TEST_DEV made with one of the following:

  ext4: mkfs.ext4 -b 4096 -E stride=512 -F $TEST_DEV
  xfs:  mkfs.xfs -f -d su=2m,sw=1 $TEST_DEV

This helps us get 2MiB sized and aligned allocations so we can fault in PMDs,
but I'm not sure whether or not it would matter for holes.

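For reference, the arithmetic behind those options: with 4096-byte blocks, stride=512 covers 512 * 4096 = 2097152 bytes, which is 2MiB and matches su=2m on XFS. The following is a minimal sketch of that check. The names `stride_bytes`, `is_pmd_aligned` and `PMD_SIZE` are illustrative and are not part of xfstests or the patch, and treating 2MiB as the PMD size assumes x86_64 with 4KiB base pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2MiB PMD mappings, as on x86_64 with 4KiB base pages. */
#define PMD_SIZE (2ULL * 1024 * 1024)

/* Bytes covered by a stride of `stride_blocks` filesystem blocks. */
uint64_t stride_bytes(uint64_t block_size, uint64_t stride_blocks)
{
	return block_size * stride_blocks;
}

/* True when a non-empty range starts on a 2MiB boundary and is a
 * whole multiple of 2MiB long. */
bool is_pmd_aligned(uint64_t offset, uint64_t len)
{
	return len != 0 && offset % PMD_SIZE == 0 && len % PMD_SIZE == 0;
}
```

With these helpers, the range used by the test program (offset 2MiB, length 2MiB) counts as aligned, while a range starting at offset 4096 does not.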
> Some comments inline.
>
> > The above patch fixes an issue where users of DAX can suffer data
> > corruption from stale mmap reads via the following sequence:
> >
> > - open an mmap over a 2MiB hole
> >
> > - read from a 2MiB hole, faulting in a 2MiB zero page
> >
> > - write to the hole with write(3p). The write succeeds but we incorrectly
> > leave the 2MiB zero page mapping intact.
> >
> > - via the mmap, read the data that was just written. Since the zero page
> > mapping is still intact we read back zeroes instead of the new data.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > ---
> > .gitignore | 1 +
> > src/Makefile | 2 +-
> > src/t_dax_stale_pmd.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > tests/generic/427 | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > tests/generic/427.out | 2 ++
> > tests/generic/group | 1 +
> > 6 files changed, 129 insertions(+), 1 deletion(-)
> > create mode 100644 src/t_dax_stale_pmd.c
> > create mode 100755 tests/generic/427
> > create mode 100644 tests/generic/427.out
> >
> > diff --git a/.gitignore b/.gitignore
> > index ded4a61..9664dc9 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -134,6 +134,7 @@
> > /src/renameat2
> > /src/t_rename_overwrite
> > /src/t_mmap_dio
> > +/src/t_dax_stale_pmd
> >
> > # dmapi/ binaries
> > /dmapi/src/common/cmd/read_invis
> > diff --git a/src/Makefile b/src/Makefile
> > index abfd873..7e22b50 100644
> > --- a/src/Makefile
> > +++ b/src/Makefile
> > @@ -12,7 +12,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> > godown resvtest writemod makeextents itrash rename \
> > multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> > t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> > - holetest t_truncate_self t_mmap_dio af_unix
> > + holetest t_truncate_self t_mmap_dio af_unix t_dax_stale_pmd
> >
> > LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> > preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> > diff --git a/src/t_dax_stale_pmd.c b/src/t_dax_stale_pmd.c
> > new file mode 100644
> > index 0000000..d0016eb
> > --- /dev/null
> > +++ b/src/t_dax_stale_pmd.c
> > @@ -0,0 +1,56 @@
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <libgen.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <sys/mman.h>
> > +#include <sys/stat.h>
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +
> > +#define MiB(a) ((a)*1024*1024)
> > +
> > +void err_exit(char *op)
> > +{
> > + fprintf(stderr, "%s: %s\n", op, strerror(errno));
> > + exit(1);
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > + volatile int a __attribute__((__unused__));
> > + char *buffer = "HELLO WORLD!";
> > + char *data;
> > + int fd;
> > +
> > + if (argc < 2) {
> > + printf("Usage: %s <pmem file>\n", basename(argv[0]));
> > + exit(0);
> > + }
> > +
> > + fd = open(argv[1], O_RDWR);
> > + if (fd < 0)
> > + err_exit("fd");
> ^^^^ Nitpick, the "op" should be "open"?
> > +
> > + data = mmap(NULL, MiB(2), PROT_READ, MAP_SHARED, fd, MiB(2));
> > +
> > + /*
> > + * This faults in a 2MiB zero page to satisfy the read.
> > + * 'a' is volatile so this read doesn't get optimized out.
> > + */
> > + a = data[0];
> > +
> > + pwrite(fd, buffer, strlen(buffer), MiB(2));
> > +
> > + /*
> > + * Try and use the mmap to read back the data we just wrote with
> > + * pwrite(). If the kernel bug is present the mapping from the 2MiB
> > + * zero page will still be intact, and we'll read back zeros instead.
> > + */
> > + if (strncmp(buffer, data, strlen(buffer)))
> > + err_exit("strncmp mismatch!");
>
> strncmp doesn't set errno, this err_exit message might be confusing:
> "strncmp mismatch!: Success"

Ah, thanks, fixed in v2.

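One way to avoid the misleading "strncmp mismatch!: Success" output is to append the strerror() text only when there is an errno value to report. This is a sketch under that assumption and is not necessarily what v2 does; `format_failure` is a hypothetical helper name.

```c
#include <errno.h>   /* errno values that callers pass in, e.g. ENOENT */
#include <stdio.h>
#include <string.h>

/*
 * Format a failure message into buf. When saved_errno is nonzero the
 * strerror() text is appended, like err_exit() in the patch. When it is
 * zero (for example after a strncmp() mismatch, which never sets errno)
 * only the plain message is written.
 * Returns the value of snprintf().
 */
int format_failure(char *buf, size_t len, const char *msg, int saved_errno)
{
	if (saved_errno)
		return snprintf(buf, len, "%s: %s", msg, strerror(saved_errno));
	return snprintf(buf, len, "%s", msg);
}
```

A caller would pass errno after a failed open() or mmap(), and 0 for the data comparison, then print the buffer to stderr before exiting with a nonzero status.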
> > +
> > + close(fd);
> > + return 0;
> > +}
> > diff --git a/tests/generic/427 b/tests/generic/427
> > new file mode 100755
> > index 0000000..baf1099
> > --- /dev/null
> > +++ b/tests/generic/427
> > @@ -0,0 +1,68 @@
> > +#! /bin/bash
> > +# FS QA Test 427
> > +#
> > +# This is a regression test for kernel patch:
> > +# dax: fix data corruption due to stale mmap reads
> > +# created by Ross Zwisler <ross.zwisler@linux.intel.com>
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017 Intel Corporation. All Rights Reserved.
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
> > +#
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1 # failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > + cd /
> > + rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +# remove previous $seqres.full before test
> > +rm -f $seqres.full
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_scratch_dax
>
> I don't think dax is a requirement here, this test could run on normal
> block device without "-o dax" option too. It won't hurt to run with more
> test configurations. And test on nvdimm device with dax mount option
> could be one of the test configs, e.g.
>
> TEST_DEV=/dev/pmem0
> SCRATCH_DEV=/dev/pmem1
> MOUNT_OPTIONS="-o dax"
> ...

Yep, agreed, fixed in v2.

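The point that the sequence itself does not depend on DAX can be illustrated with a version of the test logic that runs against any file of at least 4MiB. This is a minimal sketch and not the v2 test program: `check_stale_mmap_read` is a hypothetical name, and the return-value convention is an assumption made for illustration. Unlike the code quoted above, it checks the mmap() and pwrite() results.

```c
#define _XOPEN_SOURCE 700 /* for pwrite() under strict -std=c99/c11 */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MiB(a) ((a) * 1024 * 1024)

/*
 * Returns 0 if data written with pwrite() is visible through an existing
 * shared mapping, 1 if the mapping returns stale contents, and -1 if a
 * system call fails (errno is left set by the failing call).
 * The file must already be at least 4MiB long.
 */
int check_stale_mmap_read(const char *path)
{
	const char *msg = "HELLO WORLD!";
	size_t len = strlen(msg);
	volatile char sink;
	char *data;
	int fd, ret;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	data = mmap(NULL, MiB(2), PROT_READ, MAP_SHARED, fd, MiB(2));
	if (data == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Read through the mapping first so that a page is faulted in. */
	sink = data[0];
	(void)sink;

	if (pwrite(fd, msg, len, MiB(2)) != (ssize_t)len) {
		munmap(data, MiB(2));
		close(fd);
		return -1;
	}

	/* Compare what the mapping shows against what was just written. */
	ret = memcmp(msg, data, len) ? 1 : 0;

	munmap(data, MiB(2));
	close(fd);
	return ret;
}
```

On a page-cache-backed file the function is expected to return 0. A return of 1 would correspond to the stale read described in the commit message.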
> > +_require_test_program "t_dax_stale_pmd"
> > +_require_user
>
> _require_xfs_io_command "falloc"
>
> So test _notrun on ext2/3.

Fixed in v2.

> > +
> > +# real QA test starts here
> > +_scratch_mkfs >>$seqres.full 2>&1
> > +_scratch_mount "-o dax"
>
> Same here, dax is not required.

Fixed in v2.

>
> > +
> > +$XFS_IO_PROG -f -c "falloc 0 4M" $SCRATCH_MNT/testfile >> $seqres.full 2>&1
> > +chmod 0644 $SCRATCH_MNT/testfile
> > +chown $qa_user $SCRATCH_MNT/testfile
>
> Any specific reason to use $qa_user to run this test? Comments would be
> great.

Nope, just cargo-culting my way through my first xfstest. :) I've removed
this for v2.

> Thanks,
> Eryu

Thanks for the review!

> > +
> > +_user_do "src/t_dax_stale_pmd $SCRATCH_MNT/testfile"
> > +
> > +# success, all done
> > +echo "Silence is golden"
> > +status=0
> > +exit
> > diff --git a/tests/generic/427.out b/tests/generic/427.out