From: Wei Yang <richard.weiyang@gmail.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	wang lian <lianux.mm@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 2/4] selftests/mm: add check_folio_orders() helper.
Date: Wed, 13 Aug 2025 21:12:44 +0000
Message-ID: <20250813211244.ikequq4kvgs65mpp@master>
In-Reply-To: <20250812155512.926011-3-ziy@nvidia.com>

On Tue, Aug 12, 2025 at 11:55:10AM -0400, Zi Yan wrote:
[...]
>+/*
>+ * gather_folio_orders - scan through [vaddr_start, len) and record folio orders
>+ * @vaddr_start: start vaddr
>+ * @len: range length
>+ * @pagemap_fd: file descriptor to /proc/<pid>/pagemap
>+ * @kpageflags_fd: file descriptor to /proc/kpageflags
>+ * @orders: output folio order array
>+ * @nr_orders: folio order array size
>+ *
>+ * gather_folio_orders() scan through [vaddr_start, len) and check all folios
>+ * within the range and record their orders. All order-0 pages will be recorded.

I find the description here a little confusing, especially regarding the
behavior when the range is not aligned to a folio boundary.

See the code at 1) and 2) below.

>+ * Non-present vaddr is skipped.
>+ *
>+ *
>+ * Return: 0 - no error, -1 - unhandled cases
>+ */
>+static int gather_folio_orders(char *vaddr_start, size_t len,
>+			       int pagemap_fd, int kpageflags_fd,
>+			       int orders[], int nr_orders)
>+{
>+	uint64_t page_flags = 0;
>+	int cur_order = -1;
>+	char *vaddr;
>+
>+	if (!pagemap_fd || !kpageflags_fd)
>+		return -1;

If my understanding is correct, we use open() to get a file descriptor.

On error, open() returns -1. 0 is a valid file descriptor, although it is
usually taken by stdin. The check works in most cases, but it does not look
right.
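
As a sketch of what such a check could look like: fds_are_valid() below is a
hypothetical helper name, not something in the patch. It only assumes the
descriptors come from open(), which returns -1 on failure.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: open() returns -1 on
 * failure and a non-negative value (possibly 0) on success, so the
 * validity test is "fd >= 0" rather than "fd != 0".
 */
static bool fds_are_valid(int pagemap_fd, int kpageflags_fd)
{
	return pagemap_fd >= 0 && kpageflags_fd >= 0;
}
```

With this form, a descriptor of 0 is accepted and -1 is rejected, which is the
opposite of what the !fd test does.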

>+	if (nr_orders <= 0)
>+		return -1;
>+

Maybe we want to check orders[] here too?

>+	for (vaddr = vaddr_start; vaddr < vaddr_start + len;) {
>+		char *next_folio_vaddr;
>+		int status;
>+
>+		status = get_page_flags(vaddr, pagemap_fd, kpageflags_fd,
>+					&page_flags);
>+		if (status < 0)
>+			return -1;
>+
>+		/* skip non present vaddr */
>+		if (status == 1) {
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* all order-0 pages with possible false postive (non folio) */

Do we still have a false positive case? A non-present page returns 1, which is
already handled above.

>+		if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+			orders[0]++;
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* skip non thp compound pages */
>+		if (!(page_flags & KPF_THP)) {
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* vpn points to part of a THP at this point */
>+		if (page_flags & KPF_COMPOUND_HEAD)
>+			cur_order = 1;
>+		else {
>+			/* not a head nor a tail in a THP? */
>+			if (!(page_flags & KPF_COMPOUND_TAIL))
>+				return -1;

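When we reach here, we know that
(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL)) is non-zero, so at
least one of the two flags is set. Since this is the else branch of the
KPF_COMPOUND_HEAD check, KPF_COMPOUND_TAIL must be set.

It does not look possible to hit this return?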
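
To illustrate the reasoning, here is a small model of the three checks in the
order they appear in the quoted code. hits_inner_return() is a hypothetical
name used only for this sketch; the KPF_THP test is left out because it can
only skip more pages. The bit positions are those documented for
/proc/kpageflags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented for /proc/kpageflags. */
#define KPF_COMPOUND_HEAD	(1ULL << 15)
#define KPF_COMPOUND_TAIL	(1ULL << 16)

/*
 * Hypothetical model of the control flow in the quoted loop body.
 * Returns true only if the inner "return -1" would be reached.
 */
static bool hits_inner_return(uint64_t page_flags)
{
	/* order-0 path: taken when neither flag is set */
	if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL)))
		return false;

	/* head path: cur_order = 1 */
	if (page_flags & KPF_COMPOUND_HEAD)
		return false;

	/* else branch: the check under discussion */
	return !(page_flags & KPF_COMPOUND_TAIL);
}
```

For all four combinations of the two flags the function returns false, which
is the sense in which the check cannot trigger.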

>+
>+			vaddr += psize();
>+			continue;

1)

If vaddr points to the middle of a large folio, this skips that folio and
starts counting from the next one.

>+		}
>+
>+		next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>+
>+		if (next_folio_vaddr >= vaddr_start + len)
>+			break;
>+
>+		while ((status = get_page_flags(next_folio_vaddr, pagemap_fd,
>+						 kpageflags_fd,
>+						 &page_flags)) >= 0) {
>+			/*
>+			 * non present vaddr, next compound head page, or
>+			 * order-0 page
>+			 */
>+			if (status == 1 ||
>+			    (page_flags & KPF_COMPOUND_HEAD) ||
>+			    !(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+				if (cur_order < nr_orders) {
>+					orders[cur_order]++;
>+					cur_order = -1;
>+					vaddr = next_folio_vaddr;
>+				}
>+				break;
>+			}
>+
>+			/* not a head nor a tail in a THP? */
>+			if (!(page_flags & KPF_COMPOUND_TAIL))
>+				return -1;
>+
>+			cur_order++;
>+			next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));

2)

If (vaddr_start + len) points to the middle of a large folio and the folio is
larger than order 1, we may continue the loop and still count this last folio,
because next_folio_vaddr is not checked against (vaddr_start + len) inside the
while loop.

A simple chart of these cases:

          vaddr_start                   +     len
               |                               |
               v                               v
     +---------------------+              +-----------------+
     |folio 1              |              |folio 2          |
     +---------------------+              +-----------------+

folio 1 is not counted, but folio 2 is counted.

So 1) and 2) handle the boundary differently. I am not sure whether this is
the intended behavior. If it is, I think it would be better to document it;
otherwise the behavior is not obvious to the user.
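
One possible way to treat both ends the same, as a sketch only: count a folio
only when it lies entirely inside the range. struct folio_span and
folio_fully_in_range() are hypothetical names for this model and work in page
indexes rather than addresses. Counting every folio that overlaps the range
would be an equally consistent policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a folio covering pages [start, start + (1 << order)). */
struct folio_span {
	size_t start;		/* index of the head page */
	unsigned int order;	/* folio order */
};

/*
 * Count a folio only if every one of its pages lies inside
 * [range_start, range_end), so both ends of the range follow
 * the same rule.
 */
static bool folio_fully_in_range(const struct folio_span *f,
				 size_t range_start, size_t range_end)
{
	size_t end = f->start + ((size_t)1 << f->order);

	return f->start >= range_start && end <= range_end;
}
```

Under this rule neither folio 1 nor folio 2 in the chart above would be
counted, since each extends past one end of the range.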

>+		}
>+
>+		if (status < 0)
>+			return status;
>+	}
>+	if (cur_order > 0 && cur_order < nr_orders)
>+		orders[cur_order]++;

Another boundary case here.

If we get here because (next_folio_vaddr >= vaddr_start + len) broke out of
the for loop rather than the while loop, it means we found a folio head at
vaddr, but the remaining range (vaddr_start + len - vaddr) is less than or
equal to the size of an order-1 folio.

But we have not detected the real end of this folio. If the folio is larger
than order 1, we still count it as an order-1 folio.

>+	return 0;
>+}
>+
>+int check_folio_orders(char *vaddr_start, size_t len, int pagemap_fd,
>+			int kpageflags_fd, int orders[], int nr_orders)
>+{
>+	int *vaddr_orders;
>+	int status;
>+	int i;
>+
>+	vaddr_orders = (int *)malloc(sizeof(int) * nr_orders);
>+
>+	if (!vaddr_orders)
>+		ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders");
>+
>+	memset(vaddr_orders, 0, sizeof(int) * nr_orders);
>+	status = gather_folio_orders(vaddr_start, len, pagemap_fd,
>+				     kpageflags_fd, vaddr_orders, nr_orders);
>+	if (status)
>+		goto out;
>+
>+	status = 0;
>+	for (i = 0; i < nr_orders; i++)
>+		if (vaddr_orders[i] != orders[i]) {
>+			ksft_print_msg("order %d: expected: %d got %d\n", i,
>+				       orders[i], vaddr_orders[i]);
>+			status = -1;
>+		}
>+
>+out:
>+	free(vaddr_orders);
>+	return status;
>+}

-- 
Wei Yang
Help you, Help me


