From: Nikita Kalyazin
Date: Wed, 2 Jul 2025 18:08:44 +0100
Subject: Re: [PATCH v2 1/4] mm: Introduce vm_uffd_ops API
To: Lorenzo Stoakes, Suren Baghdasaryan
CC: Peter Xu, Vlastimil Babka, Muchun Song, Mike Rapoport, Hugh Dickins,
 Andrew Morton, James Houghton, Liam R. Howlett, Michal Hocko,
 David Hildenbrand, Andrea Arcangeli, Oscar Salvador, Axel Rasmussen,
 Ujwal Kundur
References: <20250627154655.2085903-1-peterx@redhat.com>
 <20250627154655.2085903-2-peterx@redhat.com>
 <982f4f94-f0bf-45dd-9003-081b76e57027@lucifer.local>
In-Reply-To: <982f4f94-f0bf-45dd-9003-081b76e57027@lucifer.local>
Sender: owner-linux-mm@kvack.org

On 02/07/2025 16:56, Lorenzo Stoakes wrote:
> On Tue, Jul 01, 2025 at 10:04:28AM -0700, Suren Baghdasaryan wrote:
>> On Mon, Jun 30, 2025 at 3:16 AM Lorenzo Stoakes wrote:
>>> This feels like you're trying to put mm functionality outside of mm?
>>
>> To second that, two things stick out for me here:
>> 1. uffd_copy and uffd_get_folio seem to be at different abstraction
>> levels. uffd_copy is almost the entire copy operation for VM_SHARED
>> VMAs while uffd_get_folio is a small part of the continue operation.
>> 2. shmem_mfill_atomic_pte which becomes uffd_copy for shmem in the
>> last patch is quite a complex function which itself calls some IMO
>> pretty internal functions like mfill_atomic_install_pte(). Expecting
>> modules to implement such functionality seems like a stretch to me but
>> maybe this is for some specialized modules which are written by mm
>> experts only?
>
> To echo what Liam said - I don't think we can truly rely on expertise here
> (we make enough mistakes in core mm for that to be a dubious proposition
> even there :) and even if experts were involved, having core mm
> functionality outside of core mm carries significant risk - we are
> constantly changing things, including assumptions around sensitive topics
> such as locking (think VMA locking) - having code elsewhere significantly
> increases the risk of missing things.
>
> I am also absolutely, to be frank, not going to accept us EXPORT()'ing
> anything core.
>
> Page table manipulation really must reside in core mm and arch code only, it
> is easily some of the most subtle, confusing and dangerous code in mm (I
> have spent substantial hours banging my head against it recently), and again
> - subject to constant change.
>
> But to come back to Liam's comments and to reiterate what I was referring
> to earlier, even permitting drivers to have access to VMAs is _highly_
> problematic and has resulted in very real bugs and subtle issues that took
> many hours, much stress + gnashing of teeth to address.

The main target of this change is implementing UFFD for KVM/guest_memfd
(examples: [1], [2]) without bringing KVM-specific code into the mm
codebase. In this context, "drivers" usually means KVM, which is already
somewhat "knowledgeable" about mm. I don't think there are existing use
cases for other drivers to implement this at the moment.

Although I can't see new exports in this series, there is now a way to
limit exports to particular modules [3]. Would it help if we exported
only to KVM initially (if and when exports are actually needed)?

[1] https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/
[2] https://lore.kernel.org/all/7666ee96-6f09-4dc1-8cb2-002a2d2a29cf@amazon.com/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git/commit/?h=kbuild&id=707f853d7fa3ce323a6875487890c213e34d81a0

Thanks,
Nikita

>
> The very thing of:
>
>     xxx
>
>     yyy
>
> Means that between xxx and yyy we can make literally no assumptions about
> what just happened to all handed off state. A single instance of this has
> caused mayhem, if we did this in such a way as to affect the _many_ uffd
> hooks we could have a really serious problem.
>
> So - what seems really positive about this series is the _generalisation_
> and _abstraction_ of uffd functionality.
>
> That is something I appreciate and I think uffd sorely needs, in fact if we
> could find a way to not need to do:
>
>     if (some_uffd_predicate())
>         some_uffd_specific_fn();
>
> That'd be incredible.
>
> So I think the answer here is to do something like this, and to keep all
> the mm-specific code in core mm.
>
> Thanks, Lorenzo
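
For illustration only, the split discussed in this thread (the backend supplies a folio lookup, core code alone touches the page table, and dispatch goes through an ops table instead of per-backend predicates) might be modelled in plain userspace C as below. This is a rough sketch and not kernel code: struct mapping, struct backend_ops, core_install_pte(), core_continue() and demo_ops are all hypothetical names invented here, and the "page table" is just an array.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PAGES 4

/* Hypothetical stand-ins; none of these are real kernel types. */
struct folio { int id; };
struct mapping;

/* The only hook a backend provides: find the folio behind an offset. */
struct backend_ops {
	struct folio *(*get_folio)(struct mapping *m, size_t pgoff);
};

struct mapping {
	const struct backend_ops *ops;     /* supplied by the backend */
	struct folio *cache[NR_PAGES];     /* backend-owned folio store */
	const struct folio *pte[NR_PAGES]; /* toy page table, core-only */
};

/* Core-only helper: static, so a backend cannot call it directly. */
static int core_install_pte(struct mapping *m, size_t pgoff,
			    const struct folio *f)
{
	assert(f != NULL);
	if (m->pte[pgoff])
		return -1; /* already mapped */
	m->pte[pgoff] = f;
	return 0;
}

/* Core entry point: no per-backend if/else, only an ops dispatch. */
int core_continue(struct mapping *m, size_t pgoff)
{
	struct folio *f;

	if (pgoff >= NR_PAGES || !m->ops || !m->ops->get_folio)
		return -1;
	f = m->ops->get_folio(m, pgoff);
	if (!f)
		return -1; /* backend has nothing at this offset */
	return core_install_pte(m, pgoff, f);
}

/* A toy backend: it only looks up folios and never installs mappings. */
static struct folio *demo_get_folio(struct mapping *m, size_t pgoff)
{
	return m->cache[pgoff];
}

const struct backend_ops demo_ops = { .get_folio = demo_get_folio };
```

In this toy model a caller would point a struct mapping at demo_ops, populate cache[], and call core_continue(); the backend never sees the pte[] array, which mirrors the request to keep page table manipulation in core mm.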