From: Ankur Arora <ankur.a.arora@oracle.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
mjguzik@gmail.com, luto@kernel.org, peterz@infradead.org,
acme@kernel.org, namhyung@kernel.org, tglx@linutronix.de,
willy@infradead.org, raghavendra.kt@amd.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
ankur.a.arora@oracle.com
Subject: [PATCH v5 13/14] mm: memory: support clearing page-extents
Date: Wed, 9 Jul 2025 17:59:25 -0700
Message-ID: <20250710005926.1159009-14-ankur.a.arora@oracle.com>
In-Reply-To: <20250710005926.1159009-1-ankur.a.arora@oracle.com>
folio_zero_user() is constrained to clear in a page-at-a-time
fashion because it supports CONFIG_HIGHMEM, under which kernel
mappings for pages in a folio are not guaranteed to be contiguous.
We don't have this problem under configurations with
CONFIG_CLEAR_PAGE_EXTENT (which implies !CONFIG_HIGHMEM), so zero in
longer page-extents there.
This is expected to be faster because the processor can now optimize
the clearing based on the knowledge of the extent.
However, clearing in larger chunks introduces two other problems:
- loss of cache locality when clearing small folios (< MAX_ORDER_NR_PAGES)
(larger folios don't carry any expectation of cache locality).
- increased preemption latency when clearing large folios.
Handle the first by splitting the clearing into three parts: the
faulting page and its immediate neighbourhood, and the regions to its
left and right; the local neighbourhood is cleared last so that its
cachelines stay hot, as illustrated below.
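As an illustrative example (numbers chosen for this description, not
taken from the code): for a 2MB folio of 512 4KB pages, a fault at
page index 100, and a neighbourhood width of 2 pages on either side,
the three regions are cleared in this order:
  right region:   pages [103, 511]
  left region:    pages [  0,  97]
  neighbourhood:  pages [ 98, 102] (cleared last, so it stays cache-hot)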
The second problem is relevant only when running under cooperative
preemption models. Limit the worst-case preemption latency by clearing
in architecture-specified ARCH_CLEAR_PAGE_EXTENT units.
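For instance, with a hypothetical ARCH_CLEAR_PAGE_EXTENT of 1024 pages
(4MB of 4KB pages; the actual unit is defined per architecture), a 1GB
gigantic folio (262144 pages) cleared under a cooperative preemption
model would be zeroed as 256 chunks of 4MB each, with a cond_resched()
between chunks.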
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
mm/memory.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 85 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index b0cda5aab398..c52806270375 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -7034,6 +7034,7 @@ static inline int process_huge_page(
return 0;
}
+#ifndef CONFIG_CLEAR_PAGE_EXTENT
static void clear_gigantic_page(struct folio *folio, unsigned long addr_hint,
unsigned int nr_pages)
{
@@ -7058,7 +7059,10 @@ static int clear_subpage(unsigned long addr, int idx, void *arg)
/**
* folio_zero_user - Zero a folio which will be mapped to userspace.
* @folio: The folio to zero.
- * @addr_hint: The address will be accessed or the base address if uncelar.
+ * @addr_hint: The address accessed by the user or the base address.
+ *
+ * folio_zero_user() uses clear_gigantic_page() or process_huge_page() to
+ * do page-at-a-time zeroing because it needs to handle CONFIG_HIGHMEM.
*/
void folio_zero_user(struct folio *folio, unsigned long addr_hint)
{
@@ -7070,6 +7074,86 @@ void folio_zero_user(struct folio *folio, unsigned long addr_hint)
process_huge_page(addr_hint, nr_pages, clear_subpage, folio);
}
+#else /* CONFIG_CLEAR_PAGE_EXTENT */
+
+static void clear_pages_resched(void *addr, int npages)
+{
+ int i, remaining;
+
+ if (preempt_model_preemptible()) {
+ clear_pages(addr, npages);
+ goto out;
+ }
+
+ for (i = 0; i < npages / ARCH_CLEAR_PAGE_EXTENT; i++) {
+ clear_pages(addr + i * ARCH_CLEAR_PAGE_EXTENT * PAGE_SIZE,
+ ARCH_CLEAR_PAGE_EXTENT);
+ cond_resched();
+ }
+
+ remaining = npages % ARCH_CLEAR_PAGE_EXTENT;
+
+ if (remaining)
+ clear_pages(addr + i * ARCH_CLEAR_PAGE_EXTENT * PAGE_SIZE,
+ remaining);
+out:
+ cond_resched();
+}
+
+/*
+ * folio_zero_user - Zero a folio which will be mapped to userspace.
+ * @folio: The folio to zero.
+ * @addr_hint: The address accessed by the user or the base address.
+ *
+ * Uses architectural support for clear_pages() to zero page extents
+ * instead of clearing page-at-a-time.
+ *
+ * Clearing of small folios (< MAX_ORDER_NR_PAGES) is split into three parts:
+ * the pages in the immediate neighbourhood of the faulting page, and the
+ * regions to its left and right; the local neighbourhood is cleared last in
+ * order to keep cache lines of the target region hot.
+ *
+ * For larger folios we assume that there is no expectation of cache locality
+ * and just do a straight zero.
+ */
+void folio_zero_user(struct folio *folio, unsigned long addr_hint)
+{
+ unsigned long base_addr = ALIGN_DOWN(addr_hint, folio_size(folio));
+ const long fault_idx = (addr_hint - base_addr) / PAGE_SIZE;
+ const struct range pg = DEFINE_RANGE(0, folio_nr_pages(folio) - 1);
+ const int width = 2; /* number of pages cleared last on either side */
+ struct range r[3];
+ int i;
+
+ if (folio_nr_pages(folio) > MAX_ORDER_NR_PAGES) {
+ clear_pages_resched(page_address(folio_page(folio, 0)), folio_nr_pages(folio));
+ return;
+ }
+
+ /*
+ * Faulting page and its immediate neighbourhood. Cleared at the end to
+ * ensure it sticks around in the cache.
+ */
+ r[2] = DEFINE_RANGE(clamp_t(s64, fault_idx - width, pg.start, pg.end),
+ clamp_t(s64, fault_idx + width, pg.start, pg.end));
+
+ /* Region to the left of the fault */
+ r[1] = DEFINE_RANGE(pg.start,
+ clamp_t(s64, r[2].start - 1, pg.start - 1, r[2].start));
+
+ /* Region to the right of the fault: always valid for the common fault_idx=0 case. */
+ r[0] = DEFINE_RANGE(clamp_t(s64, r[2].end + 1, r[2].end, pg.end + 1),
+ pg.end);
+
+ for (i = 0; i <= 2; i++) {
+ int npages = range_len(&r[i]);
+
+ if (npages > 0)
+ clear_pages_resched(page_address(folio_page(folio, r[i].start)), npages);
+ }
+}
+#endif /* CONFIG_CLEAR_PAGE_EXTENT */
+
static int copy_user_gigantic_page(struct folio *dst, struct folio *src,
unsigned long addr_hint,
struct vm_area_struct *vma,
--
2.43.5
Thread overview: 52+ messages
2025-07-10 0:59 [PATCH v5 00/14] mm: folio_zero_user: clearing of page-extents Ankur Arora
2025-07-10 0:59 ` [PATCH v5 01/14] perf bench mem: Remove repetition around time measurement Ankur Arora
2025-07-15 20:04 ` Namhyung Kim
2025-07-10 0:59 ` [PATCH v5 02/14] perf bench mem: Defer type munging of size to float Ankur Arora
2025-07-15 20:05 ` Namhyung Kim
2025-07-16 2:17 ` Ankur Arora
2025-07-10 0:59 ` [PATCH v5 03/14] perf bench mem: Move mem op parameters into a structure Ankur Arora
2025-07-15 20:06 ` Namhyung Kim
2025-07-10 0:59 ` [PATCH v5 04/14] perf bench mem: Pull out init/fini logic Ankur Arora
2025-07-15 20:09 ` Namhyung Kim
2025-07-10 0:59 ` [PATCH v5 05/14] perf bench mem: Switch from zalloc() to mmap() Ankur Arora
2025-07-15 20:09 ` Namhyung Kim
2025-07-10 0:59 ` [PATCH v5 06/14] perf bench mem: Allow mapping of hugepages Ankur Arora
2025-07-15 20:12 ` Namhyung Kim
2025-07-16 2:32 ` Ankur Arora
2025-07-10 0:59 ` [PATCH v5 07/14] perf bench mem: Allow chunking on a memory region Ankur Arora
2025-07-15 20:17 ` Namhyung Kim
2025-07-16 2:34 ` Ankur Arora
2025-07-10 0:59 ` [PATCH v5 08/14] perf bench mem: Refactor mem_options Ankur Arora
2025-07-15 20:18 ` Namhyung Kim
2025-07-10 0:59 ` [PATCH v5 09/14] perf bench mem: Add mmap() workloads Ankur Arora
2025-07-15 20:20 ` Namhyung Kim
2025-07-16 2:40 ` Ankur Arora
2025-07-10 0:59 ` [PATCH v5 10/14] x86/mm: Simplify clear_page_* Ankur Arora
2025-07-11 11:47 ` David Hildenbrand
2025-07-11 17:26 ` Ankur Arora
2025-07-11 19:03 ` David Hildenbrand
2025-07-11 19:24 ` Ankur Arora
2025-07-11 19:27 ` David Hildenbrand
2025-07-10 0:59 ` [PATCH v5 11/14] x86/clear_page: Introduce clear_pages() Ankur Arora
2025-07-10 0:59 ` [PATCH v5 12/14] mm: add config option for clearing page-extents Ankur Arora
2025-07-10 7:58 ` Andrew Morton
2025-07-10 16:31 ` Ankur Arora
2025-07-11 11:39 ` David Hildenbrand
2025-07-11 17:25 ` Ankur Arora
2025-07-11 19:14 ` David Hildenbrand
2025-07-11 19:35 ` Ankur Arora
2025-07-11 11:40 ` David Hildenbrand
2025-07-11 17:32 ` Ankur Arora
2025-07-11 19:26 ` David Hildenbrand
2025-07-11 19:42 ` Ankur Arora
2025-07-14 20:35 ` Ankur Arora
2025-07-15 20:59 ` David Hildenbrand
2025-07-10 0:59 ` Ankur Arora [this message]
2025-07-11 11:44 ` [PATCH v5 13/14] mm: memory: support " David Hildenbrand
2025-07-11 13:27 ` Raghavendra K T
2025-07-11 17:39 ` Ankur Arora
2025-07-15 22:08 ` David Hildenbrand
2025-07-16 3:19 ` Ankur Arora
2025-07-16 8:03 ` David Hildenbrand
2025-07-16 17:54 ` Ankur Arora
2025-07-10 0:59 ` [PATCH v5 14/14] x86/clear_pages: Support clearing of page-extents Ankur Arora