From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Xu <peterx@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
    Andrew Morton <akpm@linux-foundation.org>,
    linux-kernel@vger.kernel.org,
    Matthew Wilcox <willy@infradead.org>,
    Olivier Dion <odion@efficios.com>,
    linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/2] SKSM: Synchronous Kernel Samepage Merging
Date: Mon, 3 Mar 2025 15:01:38 -0500
Message-ID: <72810548-b917-49b7-b7ef-043c6b395d31@efficios.com>
In-Reply-To: <Z8I5iU6y_nVmCZk6@x1.local>

On 2025-02-28 17:32, Peter Xu wrote:
> On Fri, Feb 28, 2025 at 12:53:02PM -0500, Mathieu Desnoyers wrote:
>> On 2025-02-28 11:32, Peter Xu wrote:
>>> On Fri, Feb 28, 2025 at 09:59:00AM -0500, Mathieu Desnoyers wrote:
>>>> For the VM use-case, I wonder if we could just add a userfaultfd
>>>> "COW" event that would notify userspace when a COW happens ?
>>>
>>> I don't know what's the best for KSM and how well this will work, but we
>>> have such event for years.. See UFFDIO_REGISTER_MODE_WP:
>>>
>>> https://man7.org/linux/man-pages/man2/userfaultfd.2.html
>>
>> userfaultfd UFFDIO_REGISTER only seems to work if I pass an address
>> resulting from a mmap mapping, but returns EINVAL if I pass a
>> page-aligned address which sits within a private file mapping
>> (e.g. executable data).
>
> Yes, so far sync traps only supports RAM-based file systems, or anonymous.
> Generic private file mappings (that stores executables and libraries) are
> not yet supported.
>
>>
>> Also, I notice that do_wp_page() only calls handle_userfault
>> VM_UFFD_WP when vm_fault flags does not have FAULT_FLAG_UNSHARE
>> set.
>
> AFAICT that's expected, unshare should only be set on reads, never writes.
> So uffd-wp shouldn't trap any of those.
>
>>
>> AFAIU, as it stands now userfaultfd would not help tracking COW faults
>> caused by stores to private file mappings. Am I missing something ?
>
> I think you're right. So we have UFFD_FEATURE_WP_ASYNC that should work on
> most mappings. That one is async, though, so more like soft-dirty. It
> might be doable to try making it sync too without a lot of changes based on
> how async tracking works.

I'm looking more closely at admin-guide/mm/pagemap.rst and it appears to
be a good fit. Here is what I have in mind to replace the ksmd scanning
thread for the VM use-case with purely user-space driven scanning:

Within QEMU or a similar user-space process:

1) Track guest memory with the userfaultfd UFFD_FEATURE_WP_ASYNC feature and
   the UFFDIO_REGISTER_MODE_WP mode.

2) Write-protect user-space memory with the PAGEMAP_SCAN ioctl and its
   PM_SCAN_WP_MATCHING flag, to detect memory which stays invariant for a
   long time.

3) Use the PAGEMAP_SCAN ioctl with PAGE_IS_WRITTEN to detect which pages
   have been written to. Keep track of memory which is frequently modified,
   so it can be left alone and neither write-protected nor merged anymore.

4) Whenever pages stay invariant for a given period of time, merge them
   with the new madvise(2) KSM_MERGE behavior.
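
To make the bookkeeping in steps 3 and 4 more concrete, here is a rough
sketch of the per-page policy user-space could run after each scan. It is
only an illustration: the thresholds, the struct and the function name are
made up for this example, and it assumes the caller has already turned the
PAGEMAP_SCAN output into one "written since last scan" bit per page. It
does not perform the scan or the merge itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy knobs; no concrete values have been discussed. */
enum {
	STABLE_SCANS_BEFORE_MERGE = 3,	/* clean scans needed before merging */
	HOT_SCANS_BEFORE_IGNORE   = 4,	/* written scans before giving up */
};

enum page_action {
	PAGE_WAIT,		/* keep write-protecting and watching */
	PAGE_MERGE,		/* hand the page to the proposed merge call */
	PAGE_LEAVE_ALONE,	/* frequently written: stop tracking it */
};

/* Hypothetical per-page tracking state kept by the scanner. */
struct page_state {
	uint8_t stable_scans;	/* consecutive scans without a write */
	uint8_t written_scans;	/* scans in which a write was observed */
};

/*
 * Update one page after a scan. 'written' is assumed to come from a
 * PAGEMAP_SCAN result reporting PAGE_IS_WRITTEN for that page.
 * PAGE_MERGE is returned only on the scan where the page reaches the
 * stability threshold, so an unchanged page is not merged repeatedly.
 */
enum page_action page_update(struct page_state *st, bool written)
{
	assert(st != NULL);

	if (st->written_scans >= HOT_SCANS_BEFORE_IGNORE)
		return PAGE_LEAVE_ALONE;

	if (written) {
		st->stable_scans = 0;
		st->written_scans++;
		return st->written_scans >= HOT_SCANS_BEFORE_IGNORE ?
			PAGE_LEAVE_ALONE : PAGE_WAIT;
	}

	if (st->stable_scans < STABLE_SCANS_BEFORE_MERGE) {
		st->stable_scans++;
		if (st->stable_scans == STABLE_SCANS_BEFORE_MERGE)
			return PAGE_MERGE;
	}
	return PAGE_WAIT;
}
```

The scanner would call page_update() once per tracked page after each scan
and batch the PAGE_MERGE results into ranges before issuing the merge.
A write resets the stability counter, so a page whose merge was undone by
COW has to stay unchanged for the full period again before being re-merged.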

Let me know if that makes sense.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com