Date: Tue, 14 Mar 2023 20:08:48 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Linus Torvalds
Cc: David Howells, Jens Axboe, Al Viro, Christoph Hellwig, Jan Kara,
 Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
 Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Daniel Golle,
 Guenter Roeck, Christoph Hellwig, John Hubbard, Hugh Dickins
Subject: Re: [PATCH v17 03/14]
shmem: Implement splice-read
References: <20230308165251.2078898-1-dhowells@redhat.com>
 <20230308165251.2078898-4-dhowells@redhat.com>

On Tue, Mar 14, 2023 at 11:02:40AM -0700, Linus Torvalds wrote:
> On Tue, Mar 14, 2023 at 9:43 AM Matthew Wilcox wrote:
> >
> > The problem is that we might have swapped out the shmem folio.  So we
> > don't want to clear the page, but ask swap to fill the page.
>
> Doesn't shmem_swapin_folio() already basically do all that work?
>
> The real oddity with shmem - compared to other filesystems - is that
> the xarray has a value entry instead of being a real folio.  And yes,
> the current filemap code will then just ignore such entries as
> "doesn't exist", and so the regular read iterators will all fail on
> it.
>
> But while filemap_get_read_batch() will stop at a value-folio, I feel
> like filemap_create_folio() should be able to turn a value page into a
> "real" page.  Right now it already allocates said page, but then I
> think filemap_add_folio() will return -EEXIST when said entry exists
> as a value.
>
> But *if* instead of -EEXIST we could just replace the value with the
> (already locked) page, and have some sane way to pass that value
> (which is the swap entry data) to readpage(), I think that should just
> do it all.

This was basically what I had in mind.  I don't think it will handle
a case like:

Alloc order-0 folio at index 4
Alloc order-0 folio at index 7
Swap out both folios
Alloc order-9 folio at indices 0-511

But I don't see where shmem currently handles that either.  Maybe it
falls back to order-0 folios instead of the crude BUG_ON I put in.
Hugh?
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 82c1262f396f..30f2502314de 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -114,12 +114,6 @@ int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
 struct folio *shmem_read_folio_gfp(struct address_space *mapping,
 		pgoff_t index, gfp_t gfp);
 
-static inline struct folio *shmem_read_folio(struct address_space *mapping,
-		pgoff_t index)
-{
-	return shmem_read_folio_gfp(mapping, index, mapping_gfp_mask(mapping));
-}
-
 static inline struct page *shmem_read_mapping_page(
 			struct address_space *mapping, pgoff_t index)
 {
diff --git a/mm/filemap.c b/mm/filemap.c
index 57c1b154fb5a..8e4f95c5b65a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -877,6 +877,8 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 				order, gfp);
 		xas_lock_irq(&xas);
 		xas_for_each_conflict(&xas, entry) {
+			if (old)
+				BUG_ON(shmem_mapping(mapping));
 			old = entry;
 			if (!xa_is_value(entry)) {
 				xas_set_err(&xas, -EEXIST);
@@ -885,7 +887,12 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 		}
 
 		if (old) {
-			if (shadowp)
+			if (shmem_mapping(mapping)) {
+				folio_set_swap_entry(folio,
+						radix_to_swp_entry(old));
+				folio_set_swapcache(folio);
+				folio_set_swapbacked(folio);
+			} else if (shadowp)
 				*shadowp = old;
 			/* entry may have been split before we acquired lock */
 			order = xa_get_order(xas.xa, xas.xa_index);
diff --git a/mm/shmem.c b/mm/shmem.c
index 8e60826e4246..ea75c7dcf5ec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2059,6 +2059,18 @@ int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
 			mapping_gfp_mask(inode->i_mapping), NULL, NULL, NULL);
 }
 
+static int shmem_read_folio(struct file *file, struct folio *folio)
+{
+	if (folio_test_swapcache(folio)) {
+		swap_readpage(&folio->page, true, NULL);
+	} else {
+		folio_zero_segment(folio, 0, folio_size(folio));
+		folio_mark_uptodate(folio);
+		folio_unlock(folio);
+	}
+	return 0;
+}
+
 /*
  * This is like autoremove_wake_function, but it removes the wait queue
  * entry unconditionally - even if something else had already woken the
@@ -2396,7 +2408,8 @@ static int shmem_fadvise_willneed(struct address_space *mapping,
 	xa_for_each_range(&mapping->i_pages, index, folio, start, end) {
 		if (!xa_is_value(folio))
 			continue;
-		folio = shmem_read_folio(mapping, index);
+		folio = shmem_read_folio_gfp(mapping, index,
+				mapping_gfp_mask(mapping));
 		if (!IS_ERR(folio))
 			folio_put(folio);
 	}
@@ -4037,6 +4050,7 @@ static int shmem_error_remove_page(struct address_space *mapping,
 }
 
 const struct address_space_operations shmem_aops = {
+	.read_folio	= shmem_read_folio,
 	.writepage	= shmem_writepage,
 	.dirty_folio	= noop_dirty_folio,
 #ifdef CONFIG_TMPFS