From: David Howells
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
References: <20230308165251.2078898-1-dhowells@redhat.com> <20230308165251.2078898-4-dhowells@redhat.com>
To: Linus Torvalds
Cc: dhowells@redhat.com, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Daniel Golle, Guenter Roeck, Christoph Hellwig, John Hubbard, Hugh Dickins
Subject: Re: [PATCH v17 03/14] shmem: Implement splice-read
Date: Wed, 08 Mar 2023 23:42:32 +0000
Message-ID: <2139517.1678318952@warthog.procyon.org.uk>

Linus Torvalds wrote:

> I get the feeling that the zeropage case just isn't so important that we'd
> need to duplicate filemap_splice_read() just for that, and I think that the
> code should either
>
>  (a) just make a silly "read_folio()" for shmfs that just clears the page.
>
>      Ugly but maybe simple and not horrid?

The problem with that is that once a page is in the pagecache attached to a
shmem file, we can't get rid of it without deleting or truncating the file...
At least, I think that's the case.  For all other regular filesystems, a page
full of zeros can be flushed/discarded by the LRU.

shmem also has its own function for allocating folios in its pagecache; the
caller of ->read_folio() would probably have to use that.

>  (b) teach filemap_splice_read() that a NULL 'read_folio' function
> means "use the zero page"
>
>      That might not be splice() itself, but maybe in
> filemap_get_pages() or something.

It would require some special handling in filemap_get_pages() - and/or,
probably better, filemap_splice_read() since, for shmem, it's only relevant to
splice.

An additional flag could be added to filemap_get_pages() to tell it to stop at
a hole in the pagecache rather than invoking readahead.
filemap_splice_read() would then need to examine the pagecache to work out how
big the hole is and insert the appropriate number of zeropages before calling
back into filemap_get_pages() again.  Possibly it could use SEEK_DATA.

> or
>
>  (c) go even further, and teach read_folio() in general about file
> holes, and allow *any* filesystem to read zeroes that way in general
> without creating a folio for it.

Nice idea, but we'd need a way to store a "negative" marker (as opposed to
"unknown") in the pagecache for the filemap code to be able to use it.  This
sort of thing might become easier if xarray gets switched to a maple tree
implementation as that would better allow for caching of a known file hole of
arbitrary size with a single entry.

But for the moment, the filemap code would have to jump through a filesystem's
->readahead or ->read_folio vectors to work out if there's a hole there or
not - but in both cases it must already have allocated the pages it wants to
query.

David
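
To make the hole handling described under (b) a bit more concrete, here is a
minimal userspace toy model of the "stop at a hole, work out how big it is,
insert that many zeropages, then carry on" loop.  It is only a sketch under
stated assumptions: a plain bool array stands in for the pagecache, and
toy_split_range(), struct toy_segment and TOY_NR_PAGES are all hypothetical
names made up for illustration.  None of this is kernel API and it says
nothing about how filemap_get_pages() would really be changed.

```c
/* Userspace toy model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 16	/* size of the toy "file", in pages */

/* One run of a splice: either cached data or a run of zero pages. */
struct toy_segment {
	size_t start;		/* first page index of the run */
	size_t nr_pages;	/* length of the run in pages */
	bool is_hole;		/* true: supply zero pages instead of data */
};

/*
 * Split page range [first, first + count) into alternating data/hole runs.
 * cached[i] == false models "nothing in the pagecache at index i", which
 * this toy treats as a hole.  Returns the number of segments written,
 * stopping early if segs[] fills up.
 */
static size_t toy_split_range(const bool cached[TOY_NR_PAGES],
			      size_t first, size_t count,
			      struct toy_segment *segs, size_t max_segs)
{
	size_t end = first + count;
	size_t nsegs = 0;
	size_t i = first;

	assert(end <= TOY_NR_PAGES);

	while (i < end && nsegs < max_segs) {
		size_t run = i;

		/* Extend the run while the slots are of the same kind. */
		while (run < end && cached[run] == cached[i])
			run++;

		segs[nsegs].start = i;
		segs[nsegs].nr_pages = run - i;
		segs[nsegs].is_hole = !cached[i];
		nsegs++;
		i = run;
	}
	return nsegs;
}
```

With pages 0, 1 and 5 cached and a request for pages 0-7, this yields four
runs: data (0-1), hole (2-4), data (5) and hole (6-7).  The hole runs are
where a real implementation would have to supply zeropages rather than
allocate folios.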