From: Nikita Kalyazin
Date: Thu, 26 Jun 2025 17:09:47 +0100
Subject: Re: [PATCH 0/4] mm/userfaultfd: modulize memory types
To: Peter Xu
CC: Hugh Dickins, Oscar Salvador, Michal Hocko, David Hildenbrand, Muchun Song, Andrea Arcangeli, Ujwal Kundur, Suren Baghdasaryan, Andrew Morton, Vlastimil Babka, Liam R. Howlett, James Houghton, Mike Rapoport, Lorenzo Stoakes, Axel Rasmussen
Message-ID: <7666ee96-6f09-4dc1-8cb2-002a2d2a29cf@amazon.com>
References: <20250620190342.1780170-1-peterx@redhat.com> <114133f5-0282-463d-9d65-3143aa658806@amazon.com>

On 25/06/2025 21:17, Peter Xu wrote:
> On Wed, Jun 25, 2025 at 05:56:23PM +0100, Nikita Kalyazin wrote:
>>
>>
>> On 20/06/2025 20:03, Peter Xu wrote:
>>> [based on akpm/mm-new]
>>>
>>> This series is an alternative proposal of what Nikita proposed here on the
>>> initial three patches:
>>>
>>>   https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
>>>
>>> This is not yet relevant to any guest-memfd support, but paving way for it.
>>
>> Hi Peter,
>
> Hi, Nikita,
>
>>
>> Thanks for posting this.  I confirmed that minor fault handling was working
>> for guest_memfd based on this series and looked simple (a draft based on
>> mmap support in guest_memfd v7 [1]):
>
> Thanks for the quick spin, glad to know it works.  Some trivial things to
> mention below..

Following up, I drafted UFFDIO_COPY support for guest_memfd to confirm
it works as well:

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 8c44e4b9f5f8..b5458a22fff4 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -349,12 +349,19 @@ static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
 
 static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 {
+	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
 	struct inode *inode = file_inode(vmf->vma->vm_file);
 	struct folio *folio;
 	vm_fault_t ret = VM_FAULT_LOCKED;
 
 	filemap_invalidate_lock_shared(inode->i_mapping);
 
+	folio = filemap_get_entry(inode->i_mapping, vmf->pgoff);
+	if (!folio && vma && userfaultfd_missing(vma)) {
+		filemap_invalidate_unlock_shared(inode->i_mapping);
+		return handle_userfault(vmf, VM_UFFD_MISSING);
+	}
+
 	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
 	if (IS_ERR(folio)) {
 		int err = PTR_ERR(folio);
@@ -438,10 +445,57 @@ static int kvm_gmem_uffd_get_folio(struct inode *inode, pgoff_t pgoff,
 	return 0;
 }
 
+static int kvm_gmem_mfill_atomic_pte(pmd_t *dst_pmd,
+			   struct vm_area_struct *dst_vma,
+			   unsigned long dst_addr,
+			   unsigned long src_addr,
+			   uffd_flags_t flags,
+			   struct folio **foliop)
+{
+	struct inode *inode = file_inode(dst_vma->vm_file);
+	pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
+	struct folio *folio;
+	int ret;
+
+	folio = kvm_gmem_get_folio(inode, pgoff);
+	if (IS_ERR(folio)) {
+		ret = PTR_ERR(folio);
+		goto out;
+	}
+
+	folio_unlock(folio);
+
+	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_COPY)) {
+		void *vaddr = kmap_local_folio(folio, 0);
+		ret = copy_from_user(vaddr, (const void __user *)src_addr, PAGE_SIZE);
+		kunmap_local(vaddr);
+		if (unlikely(ret)) {
+			*foliop = folio;
+			ret = -ENOENT;
+			goto out;
+		}
+	} else { /* ZEROPAGE */
+		clear_user_highpage(&folio->page, dst_addr);
+	}
+
+	kvm_gmem_mark_prepared(folio);
+
+	ret = mfill_atomic_install_pte(dst_pmd, dst_vma, dst_addr,
+				       &folio->page, true, flags);
+
+	if (ret)
+		folio_put(folio);
+out:
+	return ret;
+}
+
 static const vm_uffd_ops kvm_gmem_uffd_ops = {
-	.uffd_features = VM_UFFD_MINOR,
-	.uffd_ioctls = BIT(_UFFDIO_CONTINUE),
+	.uffd_features = VM_UFFD_MISSING | VM_UFFD_MINOR,
+	.uffd_ioctls = BIT(_UFFDIO_COPY) |
+		       BIT(_UFFDIO_ZEROPAGE) |
+		       BIT(_UFFDIO_CONTINUE),
 	.uffd_get_folio = kvm_gmem_uffd_get_folio,
+	.uffd_copy = kvm_gmem_mfill_atomic_pte,
 };
 #endif
 
>
>>
>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> index 5abb6d52a375..6ddc73419724 100644
>> --- a/virt/kvm/guest_memfd.c
>> +++ b/virt/kvm/guest_memfd.c
>> @@ -5,6 +5,9 @@
>>  #include
>>  #include
>>  #include
>> +#ifdef CONFIG_USERFAULTFD
>
> This ifdef not needed, userfaultfd_k.h has taken care of all cases.

Good to know, thanks.

>> +#include
>> +#endif
>>
>>  #include "kvm_mm.h"
>>
>> @@ -396,6 +399,14 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>>  		kvm_gmem_mark_prepared(folio);
>>  	}
>>
>> +#ifdef CONFIG_USERFAULTFD
>
> Same here.  userfaultfd_minor() is always defined.

Thank you.

> I'll wait for a few more days for reviewers, and likely send v2 before next
> week.
>
> Thanks,
>
> --
> Peter Xu
>