From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nikita Kalyazin, peterx@redhat.com, Hugh Dickins, Oscar Salvador,
    Michal Hocko, David Hildenbrand, Muchun Song, Andrea Arcangeli,
    Ujwal Kundur, Suren Baghdasaryan, Andrew Morton, Vlastimil Babka,
    "Liam R . Howlett", James Houghton, Mike Rapoport, Lorenzo Stoakes,
    Axel Rasmussen
Subject: [PATCH 1/4] mm: Introduce vm_uffd_ops API
Date: Fri, 20 Jun 2025 15:03:39 -0400
Message-ID: <20250620190342.1780170-2-peterx@redhat.com>
In-Reply-To: <20250620190342.1780170-1-peterx@redhat.com>
References: <20250620190342.1780170-1-peterx@redhat.com>

Introduce a generic userfaultfd API for vm_operations_struct, so that a
VMA type, especially one implemented by a module, can support userfaults
without modifying core files.  This matters most when the module can be
compiled out of the kernel: instead of having core mm reference modules
that may never exist, modules should opt in to core mm hooks.

Once this API is applied, a module that wants to support userfaultfd
should only need to touch its own file and properly define vm_uffd_ops,
instead of changing anything in core mm.

Note that this API does not cover anonymous memory.  Core mm will keep
handling anonymous memory separately for userfault operations, as before.

This patch only introduces the API, so that existing users can be moved
over to it without breaking them.

Currently the uffd_copy() API is designed to be as simple as possible, so
that moving over to it needs minimal mm changes.

Signed-off-by: Peter Xu
---
 include/linux/mm.h            | 71 +++++++++++++++++++++++++++++++++++
 include/linux/userfaultfd_k.h | 12 ------
 2 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 98a606908307..8dfd83f01d3d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -576,6 +576,70 @@ struct vm_fault {
 					 */
 };
 
+#ifdef CONFIG_USERFAULTFD
+/* A combined operation mode + behavior flags. */
+typedef unsigned int __bitwise uffd_flags_t;
+
+enum mfill_atomic_mode {
+	MFILL_ATOMIC_COPY,
+	MFILL_ATOMIC_ZEROPAGE,
+	MFILL_ATOMIC_CONTINUE,
+	MFILL_ATOMIC_POISON,
+	NR_MFILL_ATOMIC_MODES,
+};
+
+/* VMA userfaultfd operations */
+typedef struct {
+	/**
+	 * @uffd_features: features supported in bitmask.
+	 *
+	 * When the ops is defined, the driver must set non-zero features
+	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
+	 */
+	unsigned long uffd_features;
+	/**
+	 * @uffd_ioctls: ioctls supported in bitmask.
+	 *
+	 * Userfaultfd ioctls supported by the module.  Below will always
+	 * be supported by default whenever a module provides vm_uffd_ops:
+	 *
+	 *   _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+	 *
+	 * The module needs to provide all the rest optionally supported
+	 * ioctls.  For example, when VM_UFFD_MISSING was supported,
+	 * _UFFDIO_COPY must be supported as ioctl, while _UFFDIO_ZEROPAGE
+	 * is optional.
+	 */
+	unsigned long uffd_ioctls;
+	/**
+	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
+	 *
+	 * @inode: the inode for folio lookup
+	 * @pgoff: the pgoff of the folio
+	 * @folio: returned folio pointer
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
+			      struct folio **folio);
+	/**
+	 * uffd_copy: Handler to resolve UFFDIO_COPY|ZEROPAGE request.
+	 *
+	 * @dst_pmd: target pmd to resolve page fault
+	 * @dst_vma: target vma
+	 * @dst_addr: target virtual address
+	 * @src_addr: source address to copy from
+	 * @flags: userfaultfd request flags
+	 * @foliop: previously allocated folio
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_copy)(pmd_t *dst_pmd, struct vm_area_struct *dst_vma,
+			 unsigned long dst_addr, unsigned long src_addr,
+			 uffd_flags_t flags, struct folio **foliop);
+} vm_uffd_ops;
+#endif
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -653,6 +717,13 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Userfaultfd related ops.  Modules need to define this to support
+	 * userfaultfd.
+	 */
+	const vm_uffd_ops *userfaultfd_ops;
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index ccad58602846..e79c724b3b95 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -80,18 +80,6 @@ struct userfaultfd_ctx {
 
 extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason);
 
-/* A combined operation mode + behavior flags. */
-typedef unsigned int __bitwise uffd_flags_t;
-
-/* Mutually exclusive modes of operation. */
-enum mfill_atomic_mode {
-	MFILL_ATOMIC_COPY,
-	MFILL_ATOMIC_ZEROPAGE,
-	MFILL_ATOMIC_CONTINUE,
-	MFILL_ATOMIC_POISON,
-	NR_MFILL_ATOMIC_MODES,
-};
-
 #define MFILL_ATOMIC_MODE_BITS (const_ilog2(NR_MFILL_ATOMIC_MODES - 1) + 1)
 #define MFILL_ATOMIC_BIT(nr) BIT(MFILL_ATOMIC_MODE_BITS + (nr))
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
-- 
2.49.0
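
For illustration only, here is a rough userspace mock-up of how a module
might fill in vm_uffd_ops and how the constraints from the kernel-doc
comments above could be checked.  It is a sketch under stated assumptions,
not part of this patch and not kernel code: the kernel types are replaced
by stand-ins, and every DEMO_* constant and demo_* function is
hypothetical.  The bit values in particular do not match the real
VM_UFFD_* or _UFFDIO_* definitions.

```c
/* Userspace mock-up for illustration; NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types so the sketch compiles on its own. */
struct inode;
struct folio;
struct vm_area_struct;
typedef unsigned long pgoff_t;
typedef struct { unsigned long val; } pmd_t;
typedef unsigned int uffd_flags_t;

/* Hypothetical bit values; the real VM_UFFD_* and _UFFDIO_* differ. */
#define DEMO_UFFD_MISSING	(1UL << 0)
#define DEMO_UFFD_WP		(1UL << 1)
#define DEMO_UFFD_MINOR		(1UL << 2)
#define DEMO_UFFD_ALL	(DEMO_UFFD_MISSING | DEMO_UFFD_WP | DEMO_UFFD_MINOR)
#define DEMO_IOCTL_COPY		(1UL << 0)
#define DEMO_IOCTL_CONTINUE	(1UL << 1)

/* Same fields as the vm_uffd_ops typedef added by the patch. */
typedef struct {
	unsigned long uffd_features;
	unsigned long uffd_ioctls;
	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
			      struct folio **folio);
	int (*uffd_copy)(pmd_t *dst_pmd, struct vm_area_struct *dst_vma,
			 unsigned long dst_addr, unsigned long src_addr,
			 uffd_flags_t flags, struct folio **foliop);
} vm_uffd_ops;

/* Stub lookup: a real handler would search the module's page cache. */
static int demo_get_folio(struct inode *inode, pgoff_t pgoff,
			  struct folio **folio)
{
	(void)inode;
	(void)pgoff;
	*folio = NULL;
	return -ENOENT;
}

/* What a minor-fault-only module might define in its own file. */
const vm_uffd_ops demo_uffd_ops = {
	.uffd_features	= DEMO_UFFD_MINOR,
	.uffd_ioctls	= DEMO_IOCTL_CONTINUE,
	.uffd_get_folio	= demo_get_folio,
};

/* Check the constraints spelled out in the kernel-doc comments. */
bool demo_uffd_ops_valid(const vm_uffd_ops *ops)
{
	if (!ops || !ops->uffd_features)
		return false;	/* features must be non-zero */
	if (ops->uffd_features & ~DEMO_UFFD_ALL)
		return false;	/* only MISSING|WP|MINOR are allowed */
	if ((ops->uffd_features & DEMO_UFFD_MISSING) &&
	    !(ops->uffd_ioctls & DEMO_IOCTL_COPY))
		return false;	/* MISSING requires UFFDIO_COPY */
	return true;
}

/* Hypothetical registration check: are all requested modes supported? */
bool demo_vma_can_userfault(const vm_uffd_ops *ops, unsigned long modes)
{
	return demo_uffd_ops_valid(ops) && !(modes & ~ops->uffd_features);
}

/* Small self-check exercising the helpers above. */
void demo_self_check(void)
{
	struct folio *folio;

	assert(demo_uffd_ops_valid(&demo_uffd_ops));
	assert(demo_vma_can_userfault(&demo_uffd_ops, DEMO_UFFD_MINOR));
	assert(!demo_vma_can_userfault(&demo_uffd_ops, DEMO_UFFD_MISSING));
	assert(demo_uffd_ops.uffd_get_folio(NULL, 0, &folio) == -ENOENT);
}
```

Calling demo_self_check() from a main() is enough to exercise the
mock-up.  The idea it tries to show is that a core-side check only needs
to look at the ops table a VMA type provides, so the module opts in from
its own file; how the real core mm code consumes userfaultfd_ops is not
shown here and may differ.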