Date: Mon, 3 Mar 2025 15:45:15 -0500
From: Peter Xu
To: Mathieu Desnoyers
Cc: Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org,
    Matthew Wilcox, Olivier Dion, linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/2] SKSM: Synchronous Kernel Samepage Merging
References: <20250228023043.83726-1-mathieu.desnoyers@efficios.com>
    <8524caa9-e1f6-4411-b86b-d9457ddb8007@efficios.com>
    <60f148db-7586-4154-a909-d433bad39794@efficios.com>
    <72810548-b917-49b7-b7ef-043c6b395d31@efficios.com>
In-Reply-To: <72810548-b917-49b7-b7ef-043c6b395d31@efficios.com>

On Mon, Mar 03, 2025 at 03:01:38PM -0500, Mathieu Desnoyers wrote:
> On 2025-02-28 17:32, Peter Xu wrote:
> > On Fri, Feb 28, 2025 at 12:53:02PM -0500, Mathieu Desnoyers wrote:
> > > On 2025-02-28 11:32, Peter Xu wrote:
> > > > On Fri, Feb 28, 2025 at 09:59:00AM -0500, Mathieu Desnoyers wrote:
> > > > > For the VM use-case, I wonder if we could just add a userfaultfd
> > > > > "COW" event that would notify userspace when a COW happens ?
> > > >
> > > > I don't know what's the best for KSM and how well this will work, but we
> > > > have such event for years.. See UFFDIO_REGISTER_MODE_WP:
> > > >
> > > > https://man7.org/linux/man-pages/man2/userfaultfd.2.html
> > >
> > > userfaultfd UFFDIO_REGISTER only seems to work if I pass an address
> > > resulting from a mmap mapping, but returns EINVAL if I pass a
> > > page-aligned address which sits within a private file mapping
> > > (e.g. executable data).
> >
> > Yes, so far sync traps only supports RAM-based file systems, or anonymous.
> > Generic private file mappings (that stores executables and libraries) are
> > not yet supported.
> >
> > >
> > > Also, I notice that do_wp_page() only calls handle_userfault
> > > VM_UFFD_WP when vm_fault flags does not have FAULT_FLAG_UNSHARE
> > > set.
> >
> > AFAICT that's expected, unshare should only be set on reads, never writes.
> > So uffd-wp shouldn't trap any of those.
> >
> > >
> > > AFAIU, as it stands now userfaultfd would not help tracking COW faults
> > > caused by stores to private file mappings. Am I missing something ?
> >
> > I think you're right. So we have UFFD_FEATURE_WP_ASYNC that should work on
> > most mappings. That one is async, though, so more like soft-dirty. It
> > might be doable to try making it sync too without a lot of changes based on
> > how async tracking works.
>
> I'm looking more closely at admin-guide/mm/pagemap.rst and it appears to
> be a good fit. Here is what I have in mind to replace the ksmd scanning
> thread for the VM use-case by a purely user-space driven scanning:
>
> Within qemu or similar user-space process:
>
> 1) Track guest memory with the userfaultfd UFFD_FEATURE_WP_ASYNC feature and
>    UFFDIO_REGISTER_MODE_WP mode.
>
> 2) Protect user-space memory with the PAGEMAP_SCAN ioctl PM_SCAN_WP_MATCHING flag
>    to detect memory which stays invariant for a long time.
>
> 3) Use the PAGEMAP_SCAN ioctl with PAGE_IS_WRITTEN to detect which pages are written to.
>    Keep track of memory which is frequently modified, so it can be left alone and
>    not write-protected nor merged anymore.
>
> 4) Whenever pages stay invariant for a given lapse of time, merge them with the new
>    madvise(2) KSM_MERGE behavior.
>
> Let me know if that makes sense.

I can't speak to how KSM should go from there, but from a userfault tracking
POV, that makes sense to me.

Thanks,

-- 
Peter Xu
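
For illustration, the per-page bookkeeping implied by steps 3 and 4 of the quoted scheme could look roughly like the sketch below. It is only a user-space model under stated assumptions: the `written` flags stand in for what a PAGEMAP_SCAN pass with PAGE_IS_WRITTEN would report, the returned indices stand in for pages that would be handed to the proposed madvise(2) merge behavior, and every name and threshold in it (`page_track`, `scan_pass`, `STABLE_SCANS_BEFORE_MERGE`, `HOT_WRITES_BEFORE_IGNORE`) is hypothetical and does not come from the thread or from any kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knobs, chosen only for this illustration. */
#define STABLE_SCANS_BEFORE_MERGE 3 /* unwritten scans before merging */
#define HOT_WRITES_BEFORE_IGNORE  4 /* written scans before giving up */

enum page_action { PAGE_KEEP_TRACKING, PAGE_MERGE, PAGE_IGNORE };

/* Hypothetical per-page state kept by the user-space scanner. */
struct page_track {
	uint32_t stable_scans; /* consecutive scans without a write */
	uint32_t write_count;  /* scans in which the page was written */
	bool ignored;          /* too hot: left alone from now on */
	bool merged;           /* already selected for merging */
};

/*
 * Fold one scan result into a page's state. `written` models the
 * PAGE_IS_WRITTEN category reported for this page by one scan.
 */
static enum page_action page_track_update(struct page_track *pt, bool written)
{
	assert(pt != NULL);

	if (pt->ignored)
		return PAGE_IGNORE;

	if (written) {
		/* A store breaks any earlier merge (COW) and resets the age. */
		pt->stable_scans = 0;
		pt->merged = false;
		if (++pt->write_count >= HOT_WRITES_BEFORE_IGNORE) {
			pt->ignored = true;
			return PAGE_IGNORE;
		}
		return PAGE_KEEP_TRACKING;
	}

	if (pt->merged)
		return PAGE_KEEP_TRACKING;

	if (++pt->stable_scans >= STABLE_SCANS_BEFORE_MERGE) {
		pt->merged = true;
		return PAGE_MERGE;
	}
	return PAGE_KEEP_TRACKING;
}

/*
 * Process one scan over n pages. Indices of pages newly selected for
 * merging are stored in merge_idx (room for n entries); the return
 * value is how many were stored.
 */
static size_t scan_pass(struct page_track *pages, const bool *written,
			size_t n, size_t *merge_idx)
{
	size_t nr = 0;

	for (size_t i = 0; i < n; i++)
		if (page_track_update(&pages[i], written[i]) == PAGE_MERGE)
			merge_idx[nr++] = i;
	return nr;
}
```

A caller would zero-initialize one `struct page_track` per tracked page, call `scan_pass()` after each scan with that scan's written flags, and pass the returned indices to whatever merge mechanism is in use. Counting scans rather than wall-clock time keeps the sketch simple; a real scanner would probably age pages by time and let hot pages become eligible again after a quiet period.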