From: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
To: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Xu Xin <xu.xin16@zte.com.cn>,
Chengming Zhou <chengming.zhou@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Subject: [PATCH 0/3] ksm: perform a range-walk to jump over holes in break_ksm
Date: Tue, 28 Oct 2025 10:19:42 -0300
Message-ID: <20251028131945.26445-1-pedrodemargomes@gmail.com>

When unmerging an address range, unmerge_ksm_pages() walks every page
address in the specified range to locate KSM pages. This becomes highly
inefficient when scanning large virtual memory areas that consist mostly
of unmapped regions, and can block the process for several minutes.

This series makes break_ksm(), the function that unmerge_ksm_pages() calls
for every page in an address range, perform a range walk instead. This
lets it skip over entire unmapped holes in a VMA and avoids unnecessary
lookups.

As pointed out by David Hildenbrand in [1], unmerge_ksm_pages() is called
from:

* ksm_madvise() through madvise(MADV_UNMERGEABLE). There are not a lot
  of users of that function.

* __ksm_del_vma() through ksm_del_vmas(). Effectively called when
  disabling KSM for a process either through the sysctl or from s390x gmap
  code when enabling storage keys for a VM.
Consider the following test program which creates a 32 TiB mapping in
the virtual address space but only populates a single page:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>

    /* 32 TiB */
    const size_t size = 32ul * 1024 * 1024 * 1024 * 1024;

    int main(void) {
        char *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                          MAP_NORESERVE | MAP_PRIVATE | MAP_ANON, -1, 0);

        if (area == MAP_FAILED) {
            /* perror() appends ": <error>" and a newline itself. */
            perror("mmap() failed");
            return -1;
        }

        /* Populate a single page such that we get an anon_vma. */
        *area = 0;

        /* Enable KSM. */
        madvise(area, size, MADV_MERGEABLE);
        /* Unmerge the whole range; this goes through unmerge_ksm_pages(). */
        madvise(area, size, MADV_UNMERGEABLE);
        return 0;
    }

Without this patch, this program takes 9 minutes to finish, while with
this patch it finishes in less than 5 seconds.
[1] https://lore.kernel.org/linux-mm/e0886fdf-d198-4130-bd9a-be276c59da37@redhat.com/
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>

Pedro Demarchi Gomes (3):
  Revert "mm/ksm: convert break_ksm() from walk_page_range_vma() to
    folio_walk"
  ksm: perform a range-walk in break_ksm
  ksm: replace function unmerge_ksm_pages with break_ksm

 mm/ksm.c | 142 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 88 insertions(+), 54 deletions(-)

--
2.43.0