From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Vlastimil Babka, Suren Baghdasaryan, Muchun Song, Mike Rapoport,
	Lorenzo Stoakes, Hugh Dickins, Andrew Morton, James Houghton,
	peterx@redhat.com, "Liam R . Howlett", Nikita Kalyazin, Michal Hocko,
	David Hildenbrand, Andrea Arcangeli, Oscar Salvador, Axel Rasmussen,
	Ujwal Kundur
Subject: [PATCH v2 1/4] mm: Introduce vm_uffd_ops API
Date: Fri, 27 Jun 2025 11:46:52 -0400
Message-ID: <20250627154655.2085903-2-peterx@redhat.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250627154655.2085903-1-peterx@redhat.com>
References: <20250627154655.2085903-1-peterx@redhat.com>

Introduce a generic userfaultfd API in vm_operations_struct so that a
VMA, especially one backed by a module, can support userfaults without
modifying core mm files.  This matters most when the module can be
compiled out of the kernel: instead of having core mm reference modules
that may not exist, modules should opt in to core mm hooks.

With this API applied, a module that wants to support userfaultfd only
needs to touch its own file and define vm_uffd_ops properly; nothing in
core mm has to change.

Note that this API does not cover anonymous memory.  Core mm keeps
handling anonymous memory separately for userfault operations, as
before.

This patch only introduces the API, so that existing users can be moved
over without breaking them.

For now, uffd_copy() is deliberately kept simplistic so that moving
existing users over to the API needs minimal mm changes.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h            |  9 ++++++
 include/linux/userfaultfd_k.h | 52 +++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef40f68c1183..6a5447bd43fd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -576,6 +576,8 @@ struct vm_fault {
 					 */
 };
 
+struct vm_uffd_ops;
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -653,6 +655,13 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Userfaultfd related ops. Modules need to define this to support
+	 * userfaultfd.
+	 */
+	const struct vm_uffd_ops *userfaultfd_ops;
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index df85330bcfa6..c9a093c4502b 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -92,6 +92,58 @@ enum mfill_atomic_mode {
 	NR_MFILL_ATOMIC_MODES,
 };
 
+/* VMA userfaultfd operations */
+struct vm_uffd_ops {
+	/**
+	 * @uffd_features: features supported in bitmask.
+	 *
+	 * When the ops is defined, the driver must set non-zero features
+	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
+	 */
+	unsigned long uffd_features;
+	/**
+	 * @uffd_ioctls: ioctls supported in bitmask.
+	 *
+	 * Userfaultfd ioctls supported by the module. Below will always
+	 * be supported by default whenever a module provides vm_uffd_ops:
+	 *
+	 * _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+	 *
+	 * The module needs to provide all the rest optionally supported
+	 * ioctls. For example, when VM_UFFD_MISSING was supported,
+	 * _UFFDIO_COPY must be supported as ioctl, while _UFFDIO_ZEROPAGE
+	 * is optional.
+	 */
+	unsigned long uffd_ioctls;
+	/**
+	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
+	 *
+	 * @inode: the inode for folio lookup
+	 * @pgoff: the pgoff of the folio
+	 * @folio: returned folio pointer
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
+			      struct folio **folio);
+	/**
+	 * uffd_copy: Handler to resolve UFFDIO_COPY|ZEROPAGE request.
+	 *
+	 * @dst_pmd: target pmd to resolve page fault
+	 * @dst_vma: target vma
+	 * @dst_addr: target virtual address
+	 * @src_addr: source address to copy from
+	 * @flags: userfaultfd request flags
+	 * @foliop: previously allocated folio
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_copy)(pmd_t *dst_pmd, struct vm_area_struct *dst_vma,
+			 unsigned long dst_addr, unsigned long src_addr,
+			 uffd_flags_t flags, struct folio **foliop);
+};
+typedef struct vm_uffd_ops vm_uffd_ops;
+
 #define MFILL_ATOMIC_MODE_BITS (const_ilog2(NR_MFILL_ATOMIC_MODES - 1) + 1)
 #define MFILL_ATOMIC_BIT(nr) BIT(MFILL_ATOMIC_MODE_BITS + (nr))
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
-- 
2.49.0
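
Illustration only, not part of the patch: a userspace model of how a
memory type might opt in through vm_uffd_ops, and how a caller might
check the constraints stated in the struct comments.  Everything with a
MODEL_ or demo_ prefix, the two helper functions, and all bit values are
assumptions made for this sketch; the structs are trimmed stand-ins for
the kernel ones (uffd_copy is left out), and this patch itself adds no
core-side consumer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types; opaque on purpose. */
struct inode;
struct folio;
typedef unsigned long pgoff_t;

/* Placeholder bit values, NOT the kernel's VM_UFFD_* definitions. */
#define MODEL_UFFD_MISSING	(1UL << 0)
#define MODEL_UFFD_WP		(1UL << 1)
#define MODEL_UFFD_MINOR	(1UL << 2)
#define MODEL_UFFD_ALL \
	(MODEL_UFFD_MISSING | MODEL_UFFD_WP | MODEL_UFFD_MINOR)

/* Placeholder ioctl bits, NOT the kernel's _UFFDIO_* numbers. */
#define MODEL_IOCTL_COPY	(1UL << 0)
#define MODEL_IOCTL_ZEROPAGE	(1UL << 1)
#define MODEL_IOCTL_CONTINUE	(1UL << 2)

/* Trimmed copy of the struct added by the patch. */
struct vm_uffd_ops {
	unsigned long uffd_features;
	unsigned long uffd_ioctls;
	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
			      struct folio **folio);
};

/* Trimmed vm_operations_struct carrying only the new field. */
struct vm_operations_struct {
	const struct vm_uffd_ops *userfaultfd_ops;
};

/* Hypothetical module side: a memory type supporting minor faults only. */
static int demo_get_folio(struct inode *inode, pgoff_t pgoff,
			  struct folio **folio)
{
	(void)inode;
	(void)pgoff;
	*folio = NULL;	/* a real handler would look the folio up */
	return 0;
}

static const struct vm_uffd_ops demo_uffd_ops = {
	.uffd_features	= MODEL_UFFD_MINOR,
	.uffd_ioctls	= MODEL_IOCTL_CONTINUE,
	.uffd_get_folio	= demo_get_folio,
};

static const struct vm_operations_struct demo_vm_ops = {
	.userfaultfd_ops = &demo_uffd_ops,
};

/* Hypothetical check of the rules documented in the struct comments. */
static bool uffd_ops_valid(const struct vm_uffd_ops *ops)
{
	/* Features must be non-zero and a subset of MISSING|WP|MINOR. */
	if (!ops->uffd_features || (ops->uffd_features & ~MODEL_UFFD_ALL))
		return false;
	/* MISSING mode requires UFFDIO_COPY; ZEROPAGE stays optional. */
	if ((ops->uffd_features & MODEL_UFFD_MISSING) &&
	    !(ops->uffd_ioctls & MODEL_IOCTL_COPY))
		return false;
	return true;
}

/* Hypothetical core side: does this memory type support @mode? */
static bool vma_ops_support(const struct vm_operations_struct *vm_ops,
			    unsigned long mode)
{
	const struct vm_uffd_ops *ops = vm_ops ? vm_ops->userfaultfd_ops : NULL;

	assert(mode != 0);
	return ops && uffd_ops_valid(ops) &&
	       (ops->uffd_features & mode) == mode;
}
```

In this model a memory type without userfaultfd_ops simply fails the
check, which mirrors the opt-in direction described in the commit
message: core code consults the ops table and never names a module.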