Subject: Re: [PATCH v3 1/6] mm: userfaultfd: generic continue for non hugetlbfs
From: Nikita Kalyazin <kalyazin@amazon.com>
To: Peter Xu
Date: Fri, 20 Jun 2025 13:00:24 +0100
Message-ID: <2097f155-c459-40e1-93e8-3d501ae66b42@amazon.com>
References: <20250404154352.23078-1-kalyazin@amazon.com>
 <20250404154352.23078-2-kalyazin@amazon.com>
 <36d96316-fd9b-4755-bb35-d1a2cea7bb7e@amazon.com>

On 11/06/2025 13:56, Peter Xu wrote:
> On Wed, Jun 11, 2025 at 01:09:32PM +0100, Nikita Kalyazin wrote:
>>
>>
>> On 10/06/2025 23:22, Peter Xu wrote:
>>> On Fri, Apr 04, 2025 at 03:43:47PM +0000, Nikita Kalyazin wrote:
>>>> Remove shmem-specific code from UFFDIO_CONTINUE implementation for
>>>> non-huge pages by calling vm_ops->fault(). A new VMF flag,
>>>> FAULT_FLAG_USERFAULT_CONTINUE, is introduced to avoid recursive call to
>>>> handle_userfault().
>>>
>>> It's not clear yet on why this is needed to be generalized out of the blue.
>>>
>>> Some mentioning of guest_memfd use case might help for other reviewers, or
>>> some mention of the need to introduce userfaultfd support in kernel
>>> modules.
>>
>> Hi Peter,
>>
>> Sounds fair, thank you.
>>
>>>>
>>>> Suggested-by: James Houghton
>>>> Signed-off-by: Nikita Kalyazin
>>>> ---
>>>>  include/linux/mm_types.h |  4 ++++
>>>>  mm/hugetlb.c             |  2 +-
>>>>  mm/shmem.c               |  9 ++++++---
>>>>  mm/userfaultfd.c         | 37 +++++++++++++++++++++++++++----------
>>>>  4 files changed, 38 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>>> index 0234f14f2aa6..2f26ee9742bf 100644
>>>> --- a/include/linux/mm_types.h
>>>> +++ b/include/linux/mm_types.h
>>>> @@ -1429,6 +1429,9 @@ enum tlb_flush_reason {
>>>>   * @FAULT_FLAG_ORIG_PTE_VALID: whether the fault has vmf->orig_pte cached.
>>>>   *                             We should only access orig_pte if this flag set.
>>>>   * @FAULT_FLAG_VMA_LOCK: The fault is handled under VMA lock.
>>>> + * @FAULT_FLAG_USERFAULT_CONTINUE: The fault handler must not call userfaultfd
>>>> + *                                 minor handler as it is being called by the
>>>> + *                                 userfaultfd code itself.
>>>
>>> We probably shouldn't leak the "CONTINUE" concept to mm core if possible,
>>> as it's not easy to follow when without userfault minor context.  It might
>>> be better to use generic terms like NO_USERFAULT.
>>
>> Yes, I agree, can name it more generically.
>>
>>> Said that, I wonder if we'll need to add a vm_ops anyway in the latter
>>> patch, whether we can also avoid reusing fault() but instead resolve the
>>> page faults using the vm_ops hook too.  That might be helpful because then
>>> we can avoid this new FAULT_FLAG_* that is totally not useful to
>>> non-userfault users, meanwhile we also don't need to hand-cook the vm_fault
>>> struct below just to suit the current fault() interfacing.
>>
>> I'm not sure I fully understand that.  Calling fault() op helps us reuse the
>> FS specifics when resolving the fault.
>> I get that the new op can imply the
>> userfault flag so the flag doesn't need to be exposed to mm, but doing so
>> will bring duplication of the logic within FSes between this new op and the
>> fault(), unless we attempt to factor common parts out.  For example, for
>> shmem_get_folio_gfp(), we would still need to find a way to suppress the
>> call to handle_userfault() when shmem_get_folio_gfp() is called from the new
>> op.  Is that what you're proposing?
>
> Yes it is what I was proposing.  shmem_get_folio_gfp() always has that
> handling when vmf==NULL, then vma==NULL and userfault will be skipped.
>
> So what I was thinking is one vm_ops.userfaultfd_request(req), where req
> can be:
>
> (1) UFFD_REQ_GET_SUPPORTED: this should, for existing RAM-FSes return
>     both MISSING/WP/MINOR.  Here WP should mean sync-wp tracking, async
>     was so far by default almost supported everywhere except
>     VM_DROPPABLE.  For guest-memfd in the future, we can return MINOR only
>     as of now (even if I think it shouldn't be hard to support the rest
>     two..).
>
> (2) UFFD_REQ_FAULT_RESOLVE: this should play the fault() role but well
>     defined to suit userfault's need on fault resolutions.  It likely
>     doesn't need vmf as the parameter, but likely (when anon isn't taken
>     into account, after all anon have vm_ops==NULL..) the inode and
>     offsets, perhaps some flag would be needed to identify MISSING or
>     MINOR faults, for example.
>
> Maybe some more.
>
> I was even thinking whether we could merge hugetlb into the picture too on
> generalizing its fault resolutions.  Hugetlb was always special, maybe this
> is a chance too to make it generalized, but it doesn't need to happen in one
> shot even if it could work.  We could start with shmem.
>
> So this does sound like slightly involved, and I'm not yet 100% sure this
> will work, but likely.  If you want, I can take a stab at this this week or
> next just to see whether it'll work in general.
> I also don't expect this
> to depend on guest-memfd at all - it can be alone a refactoring making
> userfault module-ready.

Thanks for explaining that.  I played a bit with it myself and it appears
to be working for the MISSING mode for both shmem and guest_memfd.
Attaching my sketch below.  Please let me know if this is how you see it.
I found that arguments and return values are significantly different
between the two request types, which may be a bit confusing, although we
do not expect many callers of those.

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8483e09aeb2c..eb30b23b24d3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -603,6 +603,16 @@ struct vm_fault {
 	 */
 };
 
+#ifdef CONFIG_USERFAULTFD
+/*
+ * Used by userfaultfd_request().
+ */
+enum uffd_req {
+	UFFD_REQ_GET_SUPPORTED,	/* query supported userfaultfd modes */
+	UFFD_REQ_FAULT_RESOLVE,	/* request fault resolution */
+};
+#endif
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -680,6 +690,22 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Called by the userfaultfd code to query supported modes or request
+	 * fault resolution.
+	 * If called with req UFFD_REQ_GET_SUPPORTED, it returns a bitmask
+	 * of modes as in struct uffdio_register.  No other arguments are
+	 * used.
+	 * If called with req UFFD_REQ_FAULT_RESOLVE, it resolves the fault
+	 * using the mode specified in the mode argument.  The inode, pgoff and
+	 * foliop arguments must be set accordingly.
+	 */
+	int (*userfaultfd_request)(enum uffd_req req, int mode,
+				   struct inode *inode, pgoff_t pgoff,
+				   struct folio **foliop);
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 75342022d144..1cabb925da0e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -222,7 +222,11 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma,
 		return false;
 
 	if ((vm_flags & VM_UFFD_MINOR) &&
-	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
+	    (!is_vm_hugetlb_page(vma) &&
+	     (!vma->vm_ops || !vma->vm_ops->userfaultfd_request ||
+	      !(vma->vm_ops->userfaultfd_request(UFFD_REQ_GET_SUPPORTED, 0,
+						 NULL, 0, NULL) &
+		UFFDIO_REGISTER_MODE_MINOR))))
 		return false;
 
 	/*
@@ -243,8 +247,11 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma,
 #endif
 
 	/* By default, allow any of anon|shmem|hugetlb */
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	       vma_is_shmem(vma);
+	return vma_is_anonymous(vma) ||
+	       is_vm_hugetlb_page(vma) ||
+	       (vma->vm_ops && vma->vm_ops->userfaultfd_request &&
+		vma->vm_ops->userfaultfd_request(UFFD_REQ_GET_SUPPORTED, 0,
+						 NULL, 0, NULL));
 }
 
 static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
diff --git a/mm/shmem.c b/mm/shmem.c
index 1ede0800e846..a5b5c4131dcf 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -5203,6 +5203,40 @@ static int shmem_error_remove_folio(struct address_space *mapping,
 	return 0;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static int shmem_userfaultfd_request(enum uffd_req req, int mode,
+				     struct inode *inode, pgoff_t pgoff,
+				     struct folio **foliop)
+{
+	int ret;
+
+	switch (req) {
+	case UFFD_REQ_GET_SUPPORTED:
+		ret = UFFDIO_REGISTER_MODE_MISSING |
+		      UFFDIO_REGISTER_MODE_WP |
+		      UFFDIO_REGISTER_MODE_MINOR;
+		break;
+	case UFFD_REQ_FAULT_RESOLVE:
+		ret = shmem_get_folio(inode, pgoff, 0, foliop, SGP_NOALLOC);
+		if (ret == -ENOENT)
+			ret = -EFAULT;
+		if (ret)
+			break;
+		if (!*foliop) {
+			ret = -EFAULT;
+			break;
+		}
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+#endif
+
 static const struct address_space_operations shmem_aops = {
 	.writepage = shmem_writepage,
 	.dirty_folio = noop_dirty_folio,
@@ -5306,6 +5340,9 @@ static const struct vm_operations_struct shmem_vm_ops = {
 	.set_policy = shmem_set_policy,
 	.get_policy = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_request = shmem_userfaultfd_request,
+#endif
 };
 
 static const struct vm_operations_struct shmem_anon_vm_ops = {
@@ -5315,6 +5352,9 @@ static const struct vm_operations_struct shmem_anon_vm_ops = {
 	.set_policy = shmem_set_policy,
 	.get_policy = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_request = shmem_userfaultfd_request,
+#endif
 };
 
 int shmem_init_fs_context(struct fs_context *fc)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index d06453fa8aba..efc150bf5691 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -392,16 +392,18 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 	struct page *page;
 	int ret;
 
-	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
-	/* Our caller expects us to return -EFAULT if we failed to find folio */
-	if (ret == -ENOENT)
-		ret = -EFAULT;
+	if (!dst_vma->vm_ops || !dst_vma->vm_ops->userfaultfd_request ||
+	    !(dst_vma->vm_ops->userfaultfd_request(UFFD_REQ_GET_SUPPORTED, 0,
+						   NULL, 0, NULL) &
+	      UFFDIO_REGISTER_MODE_MINOR))
+		return -EFAULT;
+
+	ret = dst_vma->vm_ops->userfaultfd_request(UFFD_REQ_FAULT_RESOLVE,
+						   UFFDIO_REGISTER_MODE_MINOR,
+						   inode, pgoff, &folio);
 	if (ret)
 		goto out;
-	if (!folio) {
-		ret = -EFAULT;
-		goto out;
-	}
 
 	page = folio_file_page(folio, pgoff);
 	if (PageHWPoison(page)) {
@@ -770,10 +772,10 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 		return mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
 					    src_start, len, flags);
 
-	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
-		goto out_unlock;
-	if (!vma_is_shmem(dst_vma) &&
-	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
+	if (!vma_is_anonymous(dst_vma) &&
+	    (!dst_vma->vm_ops || !dst_vma->vm_ops->userfaultfd_request ||
+	     !dst_vma->vm_ops->userfaultfd_request(UFFD_REQ_GET_SUPPORTED, 0,
+						   NULL, 0, NULL)))
 		goto out_unlock;
 
 	while (src_addr < src_start + len) {

>
> Thanks,
>
> --
> Peter Xu
>