Message-ID: <2dcaa0a6-c20d-4e57-80df-b288d2faa58d@redhat.com>
Date: Fri, 28 Feb 2025 10:07:43 +0100
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Optimizing Page Cache Readahead Behavior
To: Matthew Wilcox, Dave Chinner
Cc: Kalesh Singh, Lorenzo Stoakes, Jan Kara,
 lsf-pc@lists.linux-foundation.org, "open list:MEMORY MANAGEMENT",
 linux-fsdevel, Suren Baghdasaryan, "Liam R. Howlett", Juan Yescas,
 android-mm, Vlastimil Babka, Michal Hocko, "Cc: Android Kernel"
References: <3bd275ed-7951-4a55-9331-560981770d30@lucifer.local>
 <82fbe53b-98c4-4e55-9eeb-5a013596c4c6@lucifer.local>
From: David Hildenbrand
Organization: Red Hat

On 27.02.25 23:12, Matthew Wilcox wrote:
> On Tue, Feb 25, 2025 at 10:56:21AM +1100, Dave Chinner wrote:
>>> From the previous discussions that Matthew shared [7], it seems like
>>> Dave proposed an alternative to moving the extents to the VFS layer to
>>> invert the IO read path operations [8]. Maybe this is a more
>>> approachable solution since there is precedent for the same in the
>>> write path?
>>>
>>> [7] https://lore.kernel.org/linux-fsdevel/Zs97qHI-wA1a53Mm@casper.infradead.org/
>>> [8] https://lore.kernel.org/linux-fsdevel/ZtAPsMcc3IC1VaAF@dread.disaster.area/
>>
>> Yes, if we are going to optimise away redundant zeros being stored
>> in the page cache over holes, we need to know where the holes in the
>> file are before the page cache is populated.
>
> Well, you shot that down when I started trying to flesh it out:
> https://lore.kernel.org/linux-fsdevel/Zs+2u3%2FUsoaUHuid@dread.disaster.area/
>
>> As for efficient hole tracking in the mapping tree, I suspect that
>> we should be looking at using exceptional entries in the mapping
>> tree for holes, not inserting multiple references to the zero folio.
>> i.e. the important information for data storage optimisation is that
>> the region covers a hole, not that it contains zeros.
>
> The xarray is very much optimised for storing power-of-two sized &
> aligned objects. It makes no sense to try to track extents using the
> mapping tree. Now, if we abandon the radix tree for the maple tree, we
> could talk about storing zero extents in the same data structure.
> But that's a big change with potentially significant downsides.
> It's something I want to play with, but I'm a little busy right now.
>
>> For buffered reads, all that is required when such an exceptional
>> entry is returned is a memset of the user buffer. For buffered
>> writes, we simply treat it like a normal folio allocating write and
>> replace the exceptional entry with the allocated (and zeroed) folio.
>
> ... and unmap the zero page from any mappings.
>
>> For read page faults, the zero page gets mapped (and maybe
>> accounted) via the vma rather than the mapping tree entry. For write
>> faults, a folio gets allocated and the exception entry replaced
>> before we call into ->page_mkwrite().
>>
>> Invalidation simply removes the exceptional entries.
>
> ... and unmap the zero page from any mappings.
>

I'll add one detail for future reference; not sure about the priority
this should have, but it's one of these nasty corner cases that are not
obvious to spot when having the shared zeropage in MAP_SHARED mappings:

Currently, only FS-DAX makes use of the shared zeropage in "ordinary
MAP_SHARED" mappings. It doesn't use it for "holes" but for "logically
zero" pages, to avoid allocating disk blocks (-> translating to actual
DAX memory) on read-only access.

There is one issue between gup(FOLL_LONGTERM | FOLL_PIN) and the shared
zeropage in MAP_SHARED mappings. It so far does not apply to fsdax,
because ... we don't support FOLL_LONGTERM for fsdax at all.

I spelled out part of the issue in fce831c92092 ("mm/memory: cleanly
support zeropage in vm_insert_page*(), vm_map_pages*() and
vmf_insert_mixed()").

In general, the problem is that gup(FOLL_LONGTERM | FOLL_PIN) will have
to decide if it is okay to longterm-pin the shared zeropage in a
MAP_SHARED mapping (which might just be fine with an R/O file in some
cases?), and if not, it would have to trigger FAULT_FLAG_UNSHARE similar
to how we break COW in MAP_PRIVATE mappings (shared zeropage ->
anonymous folio).

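The decision described above can be illustrated with a toy user-space
model. This is only a sketch in Python, not kernel code: ToyMapping,
longterm_pin(), unshare() and the allow_zeropage flag are hypothetical
names that merely mimic the roles of the page cache,
gup(FOLL_LONGTERM | FOLL_PIN) and FAULT_FLAG_UNSHARE handling.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096

# Stand-in for the shared zeropage (hypothetical; never written to here).
ZERO_PAGE = bytearray(PAGE_SIZE)


@dataclass
class ToyMapping:
    """Toy page cache of one file: index -> page. A missing index is a hole."""
    pages: dict = field(default_factory=dict)

    def lookup(self, index):
        # Read side: a hole is presented as the shared zeropage.
        return self.pages.get(index, ZERO_PAGE)

    def unshare(self, index):
        # Rough analogue of handling FAULT_FLAG_UNSHARE: back the hole
        # with a real, zero-filled page and return that page.
        return self.pages.setdefault(index, bytearray(PAGE_SIZE))

    def write(self, index, data):
        # Write side: make sure a real page exists, then copy data into it.
        page = self.pages.setdefault(index, bytearray(PAGE_SIZE))
        page[:len(data)] = data


def longterm_pin(mapping, index, allow_zeropage):
    """Return the page object a long-term pinner would hold on to."""
    page = mapping.lookup(index)
    if page is ZERO_PAGE and not allow_zeropage:
        # Policy "never long-term pin the zeropage": unshare first.
        page = mapping.unshare(index)
    return page
```

In this model, pinning with allow_zeropage=True and then calling write()
on the same index leaves the pinner holding ZERO_PAGE while lookup()
returns a different page; with allow_zeropage=False both refer to the
same page, so the pinner observes the written data.
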
If gup(FOLL_LONGTERM | FOLL_PIN) just always longterm-pinned the
shared zeropage, and somebody else ended up triggering replacement of
the shared zeropage in the pagecache (e.g., write() to the file offset,
write access to the VMA that triggers a write fault etc.), you'd get a
disconnect between what the GUP user sees and what the pagecache
actually contains.

The file system fault logic will have to be taught about
FAULT_FLAG_UNSHARE and handle it accordingly (e.g., fill the file hole,
allocate disk space, allocate an actual folio ...).

Things like memfd_pin_folios() might require similar care -- that one in
particular should likely never return the shared zeropage.

Likely gup(FOLL_LONGTERM | FOLL_PIN) users like RDMA or VFIO will be
able to trigger it.

Not using the shared zeropage but instead some "hole" PTE marker could
avoid this problem. Of course, that would not allow for reading the
shared zeropage there, but maybe that's not strictly required?

--
Cheers,

David / dhildenb