linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zi Yan <ziy@nvidia.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	David Hildenbrand <david@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] mm: wrap __find_buddy_pfn() with a necessary buddy page validation.
Date: Fri, 01 Apr 2022 14:56:00 -0400	[thread overview]
Message-ID: <ADC113C0-F731-4835-AE3E-87C2302877B5@nvidia.com> (raw)
In-Reply-To: <CAHk-=wifQimj2d6npq-wCi5onYPjzQg4vyO4tFcPJJZr268cRw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7650 bytes --]

On 1 Apr 2022, at 14:29, Linus Torvalds wrote:

> On Fri, Apr 1, 2022 at 11:11 AM Zi Yan <zi.yan@sent.com> wrote:
>>
>> Whenever the buddy of a page is found from __find_buddy_pfn(),
>> page_is_buddy() should be used to check its validity. Add a helper
>> function find_buddy_page_pfn() to find the buddy page and do the check
>> together.
>
> Well, this certainly looks nicer, except now:
>
>> +extern struct page *find_buddy_page_pfn(struct page *page, unsigned int order);
>
> that 'pfn' no longer makes sense in the name, and
>
>> @@ -1075,11 +1118,11 @@ static inline void __free_one_page(struct page *page,
>>                                                                 migratetype);
>>                         return;
>>                 }
>> -               buddy_pfn = __find_buddy_pfn(pfn, order);
>> -               buddy = page + (buddy_pfn - pfn);
>>
>> -               if (!page_is_buddy(page, buddy, order))
>> +               buddy = find_buddy_page_pfn(page, order);
>> +               if (!buddy)
>>                         goto done_merging;
>> +               buddy_pfn = page_to_pfn(buddy);
>
> This case now does two "page_to_pfn()" calls (one inside
> find_buddy_page_pfn(), and one explicitly on the buddy).
>
> And those page_to_pfn() things can actually be fairly expensive. It
> *looks* like just a subtraction, but it's a pointer subtraction that
> can end up generating a divide by a non-power-of-two size.
>
> NORMALLY we try very hard to make 'sizeof struct page' be exactly 8
> words, but I do not believe this is actually guaranteed.
>
> And yeah, the divide-by-a-constant can be turned into a multiply, but
> even that is not necessarily always cheap.
>
> Now, two out of three use-cases didn't actually want the buddy_pfn(),
> but this one use-case does look like it might be performance-critical
> and a problem.

Yeah, I should have listened to your initial suggestion. Thanks for the
explanation.

How about the patch below? If it looks good, I will send v3.

diff --git a/mm/internal.h b/mm/internal.h
index 876e66237c89..754a666e5785 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -211,28 +211,8 @@ struct alloc_context {
 	bool spread_dirty_pages;
 };

-/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- *     B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- *     P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
- */
-static inline unsigned long
-__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
-{
-	return page_pfn ^ (1 << order);
-}
+extern struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+				unsigned int order, unsigned long *buddy_pfn);

 extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 				unsigned long end_pfn, struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ea106146686..8195b4f64ec5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -998,6 +998,57 @@ static inline void del_page_from_free_list(struct page *page, struct zone *zone,
 	zone->free_area[order].nr_free--;
 }

+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ *     B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ *     B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ *     P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline unsigned long
+__find_buddy_pfn(unsigned long page_pfn, unsigned int order)
+{
+	return page_pfn ^ (1 << order);
+}
+
+/*
+ * Find the buddy of @page and validate it.
+ * @page: The input page
+ * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
+ *       function is used in the performance-critical __free_one_page().
+ * @order: The order of the page
+ * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
+ *             page_to_pfn().
+ *
+ * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
+ * not the same as @page. The validation is necessary before use it.
+ *
+ * Return: the found buddy page or NULL if not found.
+ */
+struct page *find_buddy_page_pfn(struct page *page, unsigned long pfn,
+			unsigned int order, unsigned long *buddy_pfn)
+{
+	struct page *buddy;
+
+	*buddy_pfn = __find_buddy_pfn(pfn, order);
+	buddy = page + (*buddy_pfn - pfn);
+
+	if (page_is_buddy(page, buddy, order))
+		return buddy;
+	return NULL;
+}
+
 /*
  * If this is not the largest possible page, check if the buddy
  * of the next-highest order is free. If it is, it's possible
@@ -1010,18 +1061,17 @@ static inline bool
 buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
 		   struct page *page, unsigned int order)
 {
-	struct page *higher_page, *higher_buddy;
-	unsigned long combined_pfn;
+	struct page *higher_page;
+	unsigned long higher_page_pfn;

 	if (order >= MAX_ORDER - 2)
 		return false;

-	combined_pfn = buddy_pfn & pfn;
-	higher_page = page + (combined_pfn - pfn);
-	buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
-	higher_buddy = higher_page + (buddy_pfn - combined_pfn);
+	higher_page_pfn = buddy_pfn & pfn;
+	higher_page = page + (higher_page_pfn - pfn);

-	return page_is_buddy(higher_page, higher_buddy, order + 1);
+	return find_buddy_page_pfn(higher_page, higher_page_pfn, order + 1,
+			&buddy_pfn) != NULL;
 }

 /*
@@ -1075,10 +1125,9 @@ static inline void __free_one_page(struct page *page,
 								migratetype);
 			return;
 		}
-		buddy_pfn = __find_buddy_pfn(pfn, order);
-		buddy = page + (buddy_pfn - pfn);

-		if (!page_is_buddy(page, buddy, order))
+		buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
+		if (!buddy)
 			goto done_merging;

 		if (unlikely(order >= pageblock_order)) {
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f67c4c70f17f..4fbd27ba91c6 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -70,7 +70,7 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	unsigned long flags, nr_pages;
 	bool isolated_page = false;
 	unsigned int order;
-	unsigned long pfn, buddy_pfn;
+	unsigned long buddy_pfn;
 	struct page *buddy;

 	zone = page_zone(page);
@@ -89,11 +89,9 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 	if (PageBuddy(page)) {
 		order = buddy_order(page);
 		if (order >= pageblock_order && order < MAX_ORDER - 1) {
-			pfn = page_to_pfn(page);
-			buddy_pfn = __find_buddy_pfn(pfn, order);
-			buddy = page + (buddy_pfn - pfn);
-
-			if (!is_migrate_isolate_page(buddy)) {
+			buddy = find_buddy_page_pfn(page, page_to_pfn(page),
+						    order, &buddy_pfn);
+			if (buddy && !is_migrate_isolate_page(buddy)) {
 				isolated_page = !!__isolate_free_page(page, order);
 				/*
 				 * Isolating a free page in an isolated pageblock

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

  reply	other threads:[~2022-04-01 18:56 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-01 18:11 [PATCH v2 1/2] mm: page_alloc: simplify pageblock migratetype check in __free_one_page() Zi Yan
2022-04-01 18:11 ` [PATCH v2 2/2] mm: wrap __find_buddy_pfn() with a necessary buddy page validation Zi Yan
2022-04-01 18:29   ` Linus Torvalds
2022-04-01 18:56     ` Zi Yan [this message]
2022-04-01 19:01       ` Linus Torvalds
2022-04-01 19:25         ` Zi Yan
2022-04-01 22:33           ` Vlastimil Babka
2022-04-01 22:14 ` [PATCH v2 1/2] mm: page_alloc: simplify pageblock migratetype check in __free_one_page() Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ADC113C0-F731-4835-AE3E-87C2302877B5@nvidia.com \
    --to=ziy@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=osalvador@suse.de \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox