From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 995DDC3ABB2 for ; Wed, 28 May 2025 10:35:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D0B3B6B0082; Wed, 28 May 2025 06:35:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CBAC66B0083; Wed, 28 May 2025 06:35:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BD05A6B0088; Wed, 28 May 2025 06:35:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 9B4E16B0082 for ; Wed, 28 May 2025 06:35:05 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 10973160528 for ; Wed, 28 May 2025 10:35:05 +0000 (UTC) X-FDA: 83491958970.10.97A53C9 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf01.hostedemail.com (Postfix) with ESMTP id 3D86940013 for ; Wed, 28 May 2025 10:35:03 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf01.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1748428503; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qcNkCzA3+2tlBp9E3fjT+7g2BMf2oSzkLe7awo4R6PU=; b=AphcsEIbfwapr1LGjJg895erbvZRBaoPBakrrAi5lvK/mnL7/dj8PUQfgs4Bq8kfU2me7E Hs831OxpP7Atc9mJMKiIKH77Yuo3k37UyM/ZyieuIUifN6LbtcJ5xu32nNVNGoXNkbYYZW 1lJv3yLz+/QNzi82kiMdPcsBJ4JAQg8= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf01.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1748428503; a=rsa-sha256; cv=none; b=ob0U1G3Ay38nKK5hI8vwp6cJ2XN/9DtTPJGbBBMfqZthW4UDPeDeYhkWh1nLYTCp1aPpuO 9MY4F3vVjLivsckQluaaZQE9hbv6Vpiww2vFR5qAd1qsWza5HSmagnJhfqJblAlGEoo4Vp 4hEXIppXPKNSX15ZjoZ/pFDSdDz9RU0= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 08B89150C; Wed, 28 May 2025 03:34:46 -0700 (PDT) Received: from [10.57.94.142] (unknown [10.57.94.142]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id BF7383F673; Wed, 28 May 2025 03:35:00 -0700 (PDT) Message-ID: <232960c2-81db-47ca-a337-38c4bce5f997@arm.com> Date: Wed, 28 May 2025 11:34:59 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2] selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem Content-Language: en-GB To: David Hildenbrand , linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Andrew Morton , Shuah Khan , Lorenzo Stoakes , Ingo Molnar , Peter Xu , Dev Jain , Aishwarya TCV References: <20250509153033.952746-1-david@redhat.com> From: Ryan Roberts In-Reply-To: <20250509153033.952746-1-david@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 3D86940013 X-Stat-Signature: pbokoucp9wxaj5hgkhy94ybjtbw1u7ha X-Rspam-User: X-HE-Tag: 1748428503-616678 X-HE-Meta: U2FsdGVkX18OLlLFMNX5En3n0cm9D+j1gaML/BKGFp4T4iv3FBQ8mMgtvlGqMoo6Xel+XvimXvwVS2/mS6VV4NXLA6F/z5oUkTfMZdKSFesmMWzPUzUQUh4vDoRmWxhOFe2DmB2aUr7j932QoYPKSY5JVj43lMxx3LdlW1gvm3Ydza5GbYF/zVSbJVFvF6pl+1qRt9dPCWfnXu1C+UXbR6tyjoIdwuHeWvaIvYtMetDD23LNxzSSxn4aRVnBaPybNgm6dWNSKHIvrWifkq8OqSxuve0g8F+63TVPc+wP65/x9pmaq8ELFwIGoPrMzbAj4ApPkCfaM37EPDEP5J8RP2/pV0sjRPdS8K9JQylMRL4hZsJ3uXt8S4s/DfEYCjwl277h5hErTU8qBWo8++iDYs1GEF52yl4NoLwdYXFvLF34CyDhJHBnwn6d7HdgkA4JdBlnkWnCeMpZsn2hHuqGfk4KtZJMj6hCVs2ByiWXqTtU90RODLhT/xcEa1pdD7n4uwCDq0Fv1ij6XxgRHvRnyOrwUPvgJhMbhpAp5gFj0goxH5AID0rt7UcCzRp+ApYpnhEt+nuYcwqB5zgVEY8JgpIK1cUB6KXybCawwqYntMNfCLOVB7KVrhcuNErAMjEL4BRMNw4jM9qveSvn0dYP2wcbSWRT/gAhgYr8ZIbs7CSdkr0rQYBkoDi3OqfqnShn5SbPUW0xzMgnQDY18kJUAxAG8sZNKbxvGiKAtPXT5ogsd3yETOhrPBaUiLQCid5bs/h/wQBcCux6pmlnksjU6FLvBxjcHtIS8iSAxrEXcBVWjCYGL9DFeM1SMRfxsKmv1MEcvr+gxlH/D4B9Uqjon2Y+v4OIvBnGJHyxU0RVy2xrYjx2zsR9AgHZdNa6ZHDNjJsTdqlg6r0= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi David, On 09/05/2025 16:30, David Hildenbrand wrote: > Let's test some basic functionality using /dev/mem. These tests will > implicitly cover some PAT (Page Attribute Handling) handling on x86. > > These tests will only run when /dev/mem access to the first two pages > in physical address space is possible and allowed; otherwise, the tests > are skipped. We are seeing really horrible RAS errors with this test when run on arm64 tx2 machine. Based solely on reviewing the code, I think the problem is that tx2 doesn't have anything at phys address 0, so test_read_access() is trying to put trasactions out to a bad address on the bus. tx2 /proc/iomem: $ sudo cat /proc/iomem 30000000-37ffffff : PCI ECAM 38000000-3fffffff : PCI ECAM 40000000-5fffffff : PCI Bus 0000:00 ... Whereas my x86 box has some reserved memory: $ sudo cat /proc/iomem 00000000-00000fff : Reserved 00001000-0003dfff : System RAM ... I think perhaps the only safe way to handle this is to parse /proc/iomem for a region of "System RAM" that is at least 2 pages then use that for your read tests. This would also solve the hypothetical issue of reading something that has read size effects. I also spotted a few nits while reading the code... > > On current x86-64 with PAT inside a VM, all tests pass: > > TAP version 13 > 1..6 > # Starting 6 tests from 1 test cases. > # RUN pfnmap.madvise_disallowed ... > # OK pfnmap.madvise_disallowed > ok 1 pfnmap.madvise_disallowed > # RUN pfnmap.munmap_split ... > # OK pfnmap.munmap_split > ok 2 pfnmap.munmap_split > # RUN pfnmap.mremap_fixed ... > # OK pfnmap.mremap_fixed > ok 3 pfnmap.mremap_fixed > # RUN pfnmap.mremap_shrink ... > # OK pfnmap.mremap_shrink > ok 4 pfnmap.mremap_shrink > # RUN pfnmap.mremap_expand ... > # OK pfnmap.mremap_expand > ok 5 pfnmap.mremap_expand > # RUN pfnmap.fork ... > # OK pfnmap.fork > ok 6 pfnmap.fork > # PASSED: 6 / 6 tests passed. > # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0 > > However, we are able to trigger: > > [ 27.888251] x86/PAT: pfnmap:1790 freeing invalid memtype [mem 0x00000000-0x00000fff] > > There are probably more things worth testing in the future, such as > MAP_PRIVATE handling. But this set of tests is sufficient to cover most of > the things we will rework regarding PAT handling. > > Cc: Andrew Morton > Cc: Shuah Khan > Cc: Lorenzo Stoakes > Cc: Ingo Molnar > Cc: Peter Xu > Cc: Dev Jain > Signed-off-by: David Hildenbrand > --- > > Hopefully I didn't miss any review feedback. > > v1 -> v2: > * Rewrite using kselftest_harness, which simplifies a lot of things > * Add to .gitignore and run_vmtests.sh > * Register signal handler on demand > * Use volatile trick to force a read (not factoring out FORCE_READ just yet) > * Drop mprotect() test case > * Add some more comments why we test certain things > * Use NULL for mmap() first parameter instead of 0 > * Smaller fixes > > --- > tools/testing/selftests/mm/.gitignore | 1 + > tools/testing/selftests/mm/Makefile | 1 + > tools/testing/selftests/mm/pfnmap.c | 196 ++++++++++++++++++++++ > tools/testing/selftests/mm/run_vmtests.sh | 4 + > 4 files changed, 202 insertions(+) > create mode 100644 tools/testing/selftests/mm/pfnmap.c > > diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore > index 91db34941a143..824266982aa36 100644 > --- a/tools/testing/selftests/mm/.gitignore > +++ b/tools/testing/selftests/mm/.gitignore > @@ -20,6 +20,7 @@ mremap_test > on-fault-limit > transhuge-stress > pagemap_ioctl > +pfnmap > *.tmp* > protection_keys > protection_keys_32 > diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile > index ad4d6043a60f0..ae6f994d3add7 100644 > --- a/tools/testing/selftests/mm/Makefile > +++ b/tools/testing/selftests/mm/Makefile > @@ -84,6 +84,7 @@ TEST_GEN_FILES += mremap_test > TEST_GEN_FILES += mseal_test > TEST_GEN_FILES += on-fault-limit > TEST_GEN_FILES += pagemap_ioctl > +TEST_GEN_FILES += pfnmap > TEST_GEN_FILES += thuge-gen > TEST_GEN_FILES += transhuge-stress > TEST_GEN_FILES += uffd-stress > diff --git a/tools/testing/selftests/mm/pfnmap.c b/tools/testing/selftests/mm/pfnmap.c > new file mode 100644 > index 0000000000000..8a9d19b6020c7 > --- /dev/null > +++ b/tools/testing/selftests/mm/pfnmap.c > @@ -0,0 +1,196 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Basic VM_PFNMAP tests relying on mmap() of '/dev/mem' > + * > + * Copyright 2025, Red Hat, Inc. > + * > + * Author(s): David Hildenbrand > + */ > +#define _GNU_SOURCE > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../kselftest_harness.h" > +#include "vm_util.h" > + > +static sigjmp_buf sigjmp_buf_env; > + > +static void signal_handler(int sig) > +{ > + siglongjmp(sigjmp_buf_env, -EFAULT); > +} > + > +static int test_read_access(char *addr, size_t size, size_t pagesize) > +{ > + size_t offs; > + int ret; > + > + if (signal(SIGSEGV, signal_handler) == SIG_ERR) > + return -EINVAL; > + > + ret = sigsetjmp(sigjmp_buf_env, 1); > + if (!ret) { > + for (offs = 0; offs < size; offs += pagesize) > + /* Force a read that the compiler cannot optimize out. */ > + *((volatile char *)(addr + offs)); > + } > + if (signal(SIGSEGV, signal_handler) == SIG_ERR) I think you mean: if (signal(SIGSEGV, SIG_DFL) == SIG_ERR) Otherwise your signal_handler will remain installed, right? > + return -EINVAL; > + > + return ret; > +} > + > +FIXTURE(pfnmap) > +{ > + size_t pagesize; > + int dev_mem_fd; > + char *addr1; > + size_t size1; > + char *addr2; > + size_t size2; > +}; > + > +FIXTURE_SETUP(pfnmap) > +{ > + self->pagesize = getpagesize(); > + > + self->dev_mem_fd = open("/dev/mem", O_RDONLY); > + if (self->dev_mem_fd < 0) > + SKIP(return, "Cannot open '/dev/mem'\n"); > + > + /* We'll require the first two pages throughout our tests ... */ > + self->size1 = self->pagesize * 2; > + self->addr1 = mmap(NULL, self->size1, PROT_READ, MAP_SHARED, > + self->dev_mem_fd, 0); > + if (self->addr1 == MAP_FAILED) > + SKIP(return, "Cannot mmap '/dev/mem'\n"); > + > + /* ... and want to be able to read from them. */ > + if (test_read_access(self->addr1, self->size1, self->pagesize)) > + SKIP(return, "Cannot read-access mmap'ed '/dev/mem'\n"); > + > + self->size2 = 0; > + self->addr2 = MAP_FAILED; Do you need to init all the params in FIXTURE(pfnmap) to their "safe" values before any possible early returns? We don't want FIXTURE_TEARDOWN(pfnmap) getting confused. Thanks, Ryan > +} > + > +FIXTURE_TEARDOWN(pfnmap) > +{ > + if (self->addr2 != MAP_FAILED) > + munmap(self->addr2, self->size2); > + if (self->addr1 != MAP_FAILED) > + munmap(self->addr1, self->size1); > + if (self->dev_mem_fd >= 0) > + close(self->dev_mem_fd); > +} > + > +TEST_F(pfnmap, madvise_disallowed) > +{ > + int advices[] = { > + MADV_DONTNEED, > + MADV_DONTNEED_LOCKED, > + MADV_FREE, > + MADV_WIPEONFORK, > + MADV_COLD, > + MADV_PAGEOUT, > + MADV_POPULATE_READ, > + MADV_POPULATE_WRITE, > + }; > + int i; > + > + /* All these advices must be rejected. */ > + for (i = 0; i < ARRAY_SIZE(advices); i++) { > + EXPECT_LT(madvise(self->addr1, self->pagesize, advices[i]), 0); > + EXPECT_EQ(errno, EINVAL); > + } > +} > + > +TEST_F(pfnmap, munmap_split) > +{ > + /* > + * Unmap the first page. This munmap() call is not really expected to > + * fail, but we might be able to trigger other internal issues. > + */ > + ASSERT_EQ(munmap(self->addr1, self->pagesize), 0); > + > + /* > + * Remap the first page while the second page is still mapped. This > + * makes sure that any PAT tracking on x86 will allow for mmap()'ing > + * a page again while some parts of the first mmap() are still > + * around. > + */ > + self->size2 = self->pagesize; > + self->addr2 = mmap(NULL, self->pagesize, PROT_READ, MAP_SHARED, > + self->dev_mem_fd, 0); > + ASSERT_NE(self->addr2, MAP_FAILED); > +} > + > +TEST_F(pfnmap, mremap_fixed) > +{ > + char *ret; > + > + /* Reserve a destination area. */ > + self->size2 = self->size1; > + self->addr2 = mmap(NULL, self->size2, PROT_READ, MAP_ANON | MAP_PRIVATE, > + -1, 0); > + ASSERT_NE(self->addr2, MAP_FAILED); > + > + /* mremap() over our destination. */ > + ret = mremap(self->addr1, self->size1, self->size2, > + MREMAP_FIXED | MREMAP_MAYMOVE, self->addr2); > + ASSERT_NE(ret, MAP_FAILED); > +} > + > +TEST_F(pfnmap, mremap_shrink) > +{ > + char *ret; > + > + /* Shrinking is expected to work. */ > + ret = mremap(self->addr1, self->size1, self->size1 - self->pagesize, 0); > + ASSERT_NE(ret, MAP_FAILED); > +} > + > +TEST_F(pfnmap, mremap_expand) > +{ > + /* > + * Growing is not expected to work, and getting it right would > + * be challenging. So this test primarily serves as an early warning > + * that something that probably should never work suddenly works. > + */ > + self->size2 = self->size1 + self->pagesize; > + self->addr2 = mremap(self->addr1, self->size1, self->size2, MREMAP_MAYMOVE); > + ASSERT_EQ(self->addr2, MAP_FAILED); > +} > + > +TEST_F(pfnmap, fork) > +{ > + pid_t pid; > + int ret; > + > + /* fork() a child and test if the child can access the pages. */ > + pid = fork(); > + ASSERT_GE(pid, 0); > + > + if (!pid) { > + EXPECT_EQ(test_read_access(self->addr1, self->size1, > + self->pagesize), 0); > + exit(0); > + } > + > + wait(&ret); > + if (WIFEXITED(ret)) > + ret = WEXITSTATUS(ret); > + else > + ret = -EINVAL; > + ASSERT_EQ(ret, 0); > +} > + > +TEST_HARNESS_MAIN > diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh > index 188b125bf1f6b..dddd1dd8af145 100755 > --- a/tools/testing/selftests/mm/run_vmtests.sh > +++ b/tools/testing/selftests/mm/run_vmtests.sh > @@ -63,6 +63,8 @@ separated by spaces: > test soft dirty page bit semantics > - pagemap > test pagemap_scan IOCTL > +- pfnmap > + tests for VM_PFNMAP handling > - cow > test copy-on-write semantics > - thp > @@ -472,6 +474,8 @@ fi > > CATEGORY="pagemap" run_test ./pagemap_ioctl > > +CATEGORY="pfnmap" run_test ./pfnmap > + > # COW tests > CATEGORY="cow" run_test ./cow >