From: Zi Yan <ziy@nvidia.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: wang lian <lianux.mm@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 2/3] selftests/mm: add check_folio_orders() helper.
Date: Mon, 11 Aug 2025 14:39:08 -0400
Message-ID: <B13F65A9-B001-4494-A060-23D95055553F@nvidia.com>
In-Reply-To: <20250809201836.jegaanplfcjak44f@master>

On 9 Aug 2025, at 16:18, Wei Yang wrote:

> On Fri, Aug 08, 2025 at 03:01:43PM -0400, Zi Yan wrote:
>> The helper gathers folio order statistics for the folios within a virtual
>> address range and checks it against a given order list. It aims to provide
>> a more precise folio order check instead of just checking the existence of
>> PMD folios.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>> .../selftests/mm/split_huge_page_test.c       |   4 +-
>> tools/testing/selftests/mm/vm_util.c          | 133 ++++++++++++++++++
>> tools/testing/selftests/mm/vm_util.h          |   7 +
>> 3 files changed, 141 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
>> index cb364c5670c6..5ab488fab1cd 100644
>> --- a/tools/testing/selftests/mm/split_huge_page_test.c
>> +++ b/tools/testing/selftests/mm/split_huge_page_test.c
>> @@ -34,8 +34,6 @@ uint64_t pmd_pagesize;
>> #define PID_FMT_OFFSET "%d,0x%lx,0x%lx,%d,%d"
>> #define PATH_FMT "%s,0x%lx,0x%lx,%d"
>>
>> -#define PFN_MASK     ((1UL<<55)-1)
>> -#define KPF_THP      (1UL<<22)
>> #define GET_ORDER(nr_pages)    (31 - __builtin_clz(nr_pages))
>>
>> int is_backed_by_thp(char *vaddr, int pagemap_file, int kpageflags_file)
>> @@ -49,7 +47,7 @@ int is_backed_by_thp(char *vaddr, int pagemap_file, int kpageflags_file)
>>
>> 		if (kpageflags_file) {
>> 			pread(kpageflags_file, &page_flags, sizeof(page_flags),
>> -				(paddr & PFN_MASK) * sizeof(page_flags));
>> +				PAGEMAP_PFN(paddr) * sizeof(page_flags));
>>
>
> is_backed_by_thp() shares similar logic with get_page_flags(). I think we can
> leverage get_page_flags() here.

I was lazy for this one. I will use check_folio_orders() in the next version.
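
For reference, the lookup that is_backed_by_thp() and get_page_flags() have
in common can be sketched as below. This is only an illustration, not code
from the patch: the helper names and PM_PFN_MASK are hypothetical, and the
bit positions are the ones used by the pagemap and kpageflags interfaces
(PFN in bits 0-54 of a pagemap entry, one 64-bit flags word per PFN).

```c
#include <stdbool.h>
#include <stdint.h>

#define PM_PRESENT	(1ULL << 63)		/* page present in RAM */
#define PM_PFN_MASK	((1ULL << 55) - 1)	/* hypothetical name: bits 0-54 hold the PFN */
#define KPF_THP		(1ULL << 22)

/* Hypothetical helper: extract the PFN from one 64-bit pagemap entry. */
static uint64_t entry_to_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/* Hypothetical helper: byte offset of a PFN's flags word in kpageflags. */
static uint64_t kpageflags_offset(uint64_t pfn)
{
	return pfn * sizeof(uint64_t);
}

/* Hypothetical helper: true if the flags word marks a THP page. */
static bool flags_is_thp(uint64_t flags)
{
	return !!(flags & KPF_THP);
}
```

A caller would pread() 8 bytes at kpageflags_offset(entry_to_pfn(entry)) and
pass the result to flags_is_thp().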

>
>> 			return !!(page_flags & KPF_THP);
>> 		}
>> diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
>> index 6a239aa413e2..41d50b74b2f6 100644
>> --- a/tools/testing/selftests/mm/vm_util.c
>> +++ b/tools/testing/selftests/mm/vm_util.c
>> @@ -338,6 +338,139 @@ int detect_hugetlb_page_sizes(size_t sizes[], int max)
>> 	return count;
>> }
>>
>> +static int get_page_flags(char *vaddr, int pagemap_file, int kpageflags_file,
>> +			  uint64_t *flags)
>> +{
>
> Nit.
>
> In vm_util.c, we usually name the file descriptor as xxx_fd.

OK. I can rename them.
>
>> +	unsigned long pfn;
>> +	size_t count;
>> +
>> +	pfn = pagemap_get_pfn(pagemap_file, vaddr);
>> +	/*
>> +	 * Treat non-present page as a page without any flag, so that
>> +	 * gather_folio_orders() just record the current folio order.
>> +	 */
>> +	if (pfn == -1UL) {
>> +		*flags = 0;
>> +		return 0;
>> +	}
>> +
>> +	count = pread(kpageflags_file, flags, sizeof(*flags),
>> +		      pfn * sizeof(*flags));
>> +
>> +	if (count != sizeof(*flags))
>> +		return -1;
>> +
>> +	return 0;
>> +}
>> +
>
> Maybe a short comment documenting this function would be helpful.

Will do.

>
>> +static int gather_folio_orders(char *vaddr_start, size_t len,
>> +			       int pagemap_file, int kpageflags_file,
>> +			       int orders[], int nr_orders)
>> +{
>> +	uint64_t page_flags = 0;
>> +	int cur_order = -1;
>> +	char *vaddr;
>> +
>> +	if (!pagemap_file || !kpageflags_file)
>> +		return -1;
>> +	if (nr_orders <= 0)
>> +		return -1;
>> +
>> +	for (vaddr = vaddr_start; vaddr < vaddr_start + len; ) {
>> +		char *next_folio_vaddr;
>> +		int status;
>> +
>> +		if (get_page_flags(vaddr, pagemap_file, kpageflags_file, &page_flags))
>> +			return -1;
>> +
>> +		/* all order-0 pages with possible false positive (non folio) */
>> +		if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>> +			orders[0]++;
>> +			vaddr += psize();
>> +			continue;
>> +		}
>> +
>> +		/* skip non thp compound pages */
>> +		if (!(page_flags & KPF_THP)) {
>> +			vaddr += psize();
>> +			continue;
>> +		}
>> +
>> +		/* vaddr points to part of a THP at this point */
>> +		if (page_flags & KPF_COMPOUND_HEAD)
>> +			cur_order = 1;
>> +		else {
>> +			/* not a head nor a tail in a THP? */
>> +			if (!(page_flags & KPF_COMPOUND_TAIL))
>> +				return -1;
>> +			continue;
>> +		}
>> +
>> +		next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>> +
>> +		if (next_folio_vaddr >= vaddr_start + len)
>> +			break;
>
> Would we skip order 1 folio at the last position?
>
> For example, vaddr_start is 0x2000, len is 0x2000 and the folio at vaddr_start
> is an order 1 folio, whose size is exactly 0x2000.
>
> Then we will get next_folio_vaddr == vaddr_start + len.
>
> Could that happen?

No, that folio is not skipped. The break leaves the for loop with cur_order
still set, and the code after the loop checks cur_order and updates orders[].
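
To spell out the arithmetic for your example, here is a small model of just
this boundary path. It is a sketch under assumptions, not the patch code:
the function name is hypothetical, a head page is assumed at vaddr_start,
and the page shift is passed in explicitly (12 for 4KB pages).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: a THP head page sits at vaddr_start, so cur_order
 * starts at 1 as in gather_folio_orders(). Returns the order that the
 * post-loop check would record, or -1 if the boundary break is not taken
 * or the order does not fit in orders[].
 */
static int order_recorded_at_range_end(uintptr_t vaddr_start, size_t len,
				       int page_shift, int nr_orders)
{
	int cur_order = 1;
	uintptr_t next_folio_vaddr =
		vaddr_start + ((uintptr_t)1 << (cur_order + page_shift));

	/* Same comparison as in the loop: the probe reaches the range end. */
	if (next_folio_vaddr >= vaddr_start + len) {
		/* The loop breaks here; the post-loop check still runs. */
		if (cur_order > 0 && cur_order < nr_orders)
			return cur_order;
	}
	return -1;
}
```

With vaddr_start = 0x2000, len = 0x2000 and a page shift of 12,
next_folio_vaddr is 0x4000, which equals vaddr_start + len, so the break is
taken and order 1 is recorded after the loop.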

>
>> +
>> +		while (!(status = get_page_flags(next_folio_vaddr, pagemap_file,
>> +						 kpageflags_file,
>> +						 &page_flags))) {
>> +			/* next compound head page or order-0 page */
>> +			if ((page_flags & KPF_COMPOUND_HEAD) ||
>> +			    !(page_flags & (KPF_COMPOUND_HEAD |
>> +			      KPF_COMPOUND_TAIL))) {
>
> Maybe we can put them into one line.

Sure.

>
>> +				if (cur_order < nr_orders) {
>> +					orders[cur_order]++;
>> +					cur_order = -1;
>> +					vaddr = next_folio_vaddr;
>> +				}
>> +				break;
>> +			}
>> +
>> +			/* not a head nor a tail in a THP? */
>> +			if (!(page_flags & KPF_COMPOUND_TAIL))
>> +				return -1;
>> +
>> +			cur_order++;
>> +			next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>> +		}
>
> The while loop shares similar logic with the outer for loop. Is it possible
> to reduce some duplication?

The outer loop filters out order-0 and non-head pages, and the while loop
finds the order of the current THP/mTHP. It would be messy to combine them,
but feel free to suggest ideas if you see a way.
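
To make the two stages easier to discuss, here is a simplified model that
runs over an in-memory array of flags words instead of pagemap and
kpageflags. It is a sketch, not the patch code: model_gather_orders() is a
hypothetical name, and the inner stage counts tail pages one by one rather
than probing at power-of-two boundaries as the patch does. It also assumes
each folio lies entirely inside the range.

```c
#include <stddef.h>
#include <stdint.h>

#define KPF_COMPOUND_HEAD	(1ULL << 15)
#define KPF_COMPOUND_TAIL	(1ULL << 16)
#define KPF_THP			(1ULL << 22)

/*
 * Hypothetical model: flags[i] stands in for the kpageflags word of page i
 * in the range. Counts folios per order into orders[0..nr_orders-1].
 */
static int model_gather_orders(const uint64_t *flags, size_t nr_pages,
			       int orders[], int nr_orders)
{
	size_t i = 0;

	if (!flags || nr_orders <= 0)
		return -1;

	while (i < nr_pages) {
		size_t nr = 1;
		int order = 0;

		/* Stage 1: account order-0 pages, skip pages that start no THP. */
		if (!(flags[i] & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
			orders[0]++;
			i++;
			continue;
		}
		if (!(flags[i] & KPF_THP) || !(flags[i] & KPF_COMPOUND_HEAD)) {
			i++;	/* non-THP compound page, or a tail page */
			continue;
		}

		/* Stage 2: count the tail pages that follow this head page. */
		while (i + nr < nr_pages &&
		       (flags[i + nr] & KPF_COMPOUND_TAIL) &&
		       !(flags[i + nr] & KPF_COMPOUND_HEAD))
			nr++;

		/* Folio sizes are powers of two, so the order is log2(nr). */
		while (((size_t)1 << (order + 1)) <= nr)
			order++;

		if (order < nr_orders)
			orders[order]++;
		i += nr;	/* always advance past the folio */
	}
	return 0;
}
```

Even in this reduced form the filtering stage and the order-finding stage
stay separate, which is the structure the patch keeps.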

>
>> +
>> +		if (status)
>> +			return status;
>> +	}
>> +	if (cur_order > 0 && cur_order < nr_orders)
>> +		orders[cur_order]++;
>> +	return 0;
>> +}
>> +
>> +int check_folio_orders(char *vaddr_start, size_t len, int pagemap_file,
>> +			int kpageflags_file, int orders[], int nr_orders)
>> +{
>> +	int *vaddr_orders;
>> +	int status;
>> +	int i;
>> +
>> +	vaddr_orders = (int *)malloc(sizeof(int) * nr_orders);
>> +
>
> I took a look at thp_setting.h, which defines an array with NR_ORDERS
> elements, where NR_ORDERS is 20. Maybe we can leverage it here, since we
> don't expect the order to be larger.
>

20 is too large for current use. We can revisit this when the function
gets more users.
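
As an illustration of the sizes involved, a caller that splits PMD-order
THPs only needs pmd_order + 1 slots (10 entries when the PMD order is 9,
as with 4KB base pages and 2MB PMDs). The sketch below shows how such a
caller might fill the expected orders[] array before calling
check_folio_orders(). fill_expected_orders() is a hypothetical helper, not
part of this series, and it assumes a uniform split of every THP to
new_order.

```c
#include <string.h>

/*
 * Hypothetical helper: expected folio counts after uniformly splitting
 * nr_thps PMD-order folios down to new_order. orders[] must have at least
 * pmd_order + 1 entries.
 */
static int fill_expected_orders(int orders[], int nr_orders,
				int pmd_order, int new_order, int nr_thps)
{
	if (nr_orders <= pmd_order || new_order < 0 || new_order > pmd_order)
		return -1;

	memset(orders, 0, sizeof(int) * nr_orders);
	/* Each THP yields 2^(pmd_order - new_order) folios of new_order. */
	orders[new_order] = nr_thps << (pmd_order - new_order);
	return 0;
}
```

For example, splitting 4 order-9 THPs to order 2 gives 4 * 128 = 512
expected order-2 folios and zero for every other order.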

>> +	if (!vaddr_orders)
>> +		ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders");
>> +
>> +	memset(vaddr_orders, 0, sizeof(int) * nr_orders);
>> +	status = gather_folio_orders(vaddr_start, len, pagemap_file,
>> +				     kpageflags_file, vaddr_orders, nr_orders);
>> +	if (status)
>> +		return status;
>> +
>> +	status = 0;
>> +	for (i = 0; i < nr_orders; i++)
>> +		if (vaddr_orders[i] != orders[i]) {
>> +			ksft_print_msg("order %d: expected: %d got %d\n", i,
>> +				       orders[i], vaddr_orders[i]);
>> +			status = -1;
>> +		}
>> +
>> +	return status;
>> +}
>> +
>> /* If `ioctls' non-NULL, the allowed ioctls will be returned into the var */
>> int uffd_register_with_ioctls(int uffd, void *addr, uint64_t len,
>> 			      bool miss, bool wp, bool minor, uint64_t *ioctls)
>> diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
>> index 1843ad48d32b..02e3f1e7065b 100644
>> --- a/tools/testing/selftests/mm/vm_util.h
>> +++ b/tools/testing/selftests/mm/vm_util.h
>> @@ -18,6 +18,11 @@
>> #define PM_SWAP                       BIT_ULL(62)
>> #define PM_PRESENT                    BIT_ULL(63)
>>
>> +#define KPF_COMPOUND_HEAD             BIT_ULL(15)
>> +#define KPF_COMPOUND_TAIL             BIT_ULL(16)
>> +#define KPF_THP                       BIT_ULL(22)
>> +
>> +
>> /*
>>  * Ignore the checkpatch warning, we must read from x but don't want to do
>>  * anything with it in order to trigger a read page fault. We therefore must use
>> @@ -85,6 +90,8 @@ bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size);
>> int64_t allocate_transhuge(void *ptr, int pagemap_fd);
>> unsigned long default_huge_page_size(void);
>> int detect_hugetlb_page_sizes(size_t sizes[], int max);
>> +int check_folio_orders(char *vaddr_start, size_t len, int pagemap_file,
>> +			int kpageflags_file, int orders[], int nr_orders);
>>
>> int uffd_register(int uffd, void *addr, uint64_t len,
>> 		  bool miss, bool wp, bool minor);
>> -- 
>> 2.47.2
>
> -- 
> Wei Yang
> Help you, Help me


Best Regards,
Yan, Zi

