From: Matthew Wilcox <willy@infradead.org>
To: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Keith Busch <kbusch@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Theodore Ts'o <tytso@mit.edu>,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
Date: Sat, 4 Mar 2023 07:34:33 +0000
Message-ID: <ZAL0ifa66TfMinCh@casper.infradead.org>
In-Reply-To: <f68905c5785b355b621847974d620fb59f021a41.camel@HansenPartnership.com>

On Fri, Mar 03, 2023 at 08:11:47AM -0500, James Bottomley wrote:
> On Fri, 2023-03-03 at 03:49 +0000, Matthew Wilcox wrote:
> > On Thu, Mar 02, 2023 at 06:58:58PM -0700, Keith Busch wrote:
> > > That said, I was hoping you were going to suggest supporting 16k
> > > logical block sizes. Not a problem on some arch's, but still
> > > problematic when PAGE_SIZE is 4k. :)
> > 
> > I was hoping Luis was going to propose a session on LBA size >
> > PAGE_SIZE. Funnily, while the pressure is coming from the storage
> > vendors, I don't think there's any work to be done in the storage
> > layers.  It's purely a FS+MM problem.
> 
> Heh, I can do the fools rush in bit, especially if what we're
> interested in is the minimum it would take to support this ...
> 
> The FS problem could be solved simply by saying FS block size must
> equal device block size, then it becomes purely a MM issue.

Spoken like somebody who's never converted a filesystem to
supporting large folios.  There are a number of issues:

1. The obvious: use of PAGE_SIZE and/or PAGE_SHIFT
2. Use of kmap-family to access, e.g., directories.  You can't kmap
   an entire folio, only one page at a time.  And if a dentry is split
   across a page boundary ...
3. buffer_heads do not currently support large folios.  Working on it.
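
To make point 2 concrete, here is a small userspace model of reading a
record that straddles a page boundary when only one page of a folio can
be mapped at a time.  It is only a sketch: struct model_folio,
model_map_page() and model_copy_from_folio() are hypothetical names
invented for illustration, not kernel interfaces, and the 4k page size
is an assumption taken from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4k pages */

/*
 * Hypothetical stand-in for a large folio whose pages are not
 * contiguous in the caller's address space.
 */
struct model_folio {
	unsigned char **pages;	/* one pointer per page */
	size_t nr_pages;
};

/* Hypothetical stand-in for mapping exactly one page of the folio. */
static unsigned char *model_map_page(const struct model_folio *f, size_t idx)
{
	assert(idx < f->nr_pages);
	return f->pages[idx];
}

/*
 * Copy len bytes starting at byte offset off within the folio.
 * Each iteration touches a single page, so a record that crosses a
 * page boundary is assembled from two (or more) separate mappings.
 */
static void model_copy_from_folio(void *dst, const struct model_folio *f,
				  size_t off, size_t len)
{
	unsigned char *out = dst;

	while (len > 0) {
		size_t idx = off / MODEL_PAGE_SIZE;
		size_t in_page = off % MODEL_PAGE_SIZE;
		size_t chunk = MODEL_PAGE_SIZE - in_page;

		if (chunk > len)
			chunk = len;
		memcpy(out, model_map_page(f, idx) + in_page, chunk);
		out += chunk;
		off += chunk;
		len -= chunk;
	}
}
```

Code that instead takes a pointer into the first page and walks past
its end is exactly what breaks once the data lives in a large folio.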

Probably a few other things I forget.  But look through the recent
patches to AFS, CIFS, NFS, XFS, iomap that do folio conversions.
A lot of it is pretty mechanical, but some of it takes hard thought.
And if you have ideas about how to handle ext2 directories, I'm all ears.

> The MM
> issue could be solved by adding a page order attribute to struct
> address_space and insisting that pagecache/filemap functions in
> mm/filemap.c all have to operate on objects that are an integer
> multiple of the address space order.  The base allocator is
> filemap_alloc_folio, which already has an apparently always zero order
> parameter (hmmm...) and it always seems to be called from sites that
> have the address_space, so it could simply be modified to always
> operate at the address_space order.

Oh, I have a patch for that.  That's the easy part.  The hard part is
plugging your ears to the screams of the MM people who are convinced
that fragmentation will make it impossible to mount your filesystem.
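
For readers following along, the rule James describes (every page cache
object is a multiple of a per-mapping order) can be modelled in
userspace roughly as below.  This is not the patch mentioned above;
struct model_mapping and the helper names are hypothetical, and the
real address_space has no field by this name as far as this sketch is
concerned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-mapping attribute: log2 of pages per folio. */
struct model_mapping {
	unsigned int order;
};

/* Number of base pages covered by one folio in this mapping. */
static unsigned long model_pages_per_folio(const struct model_mapping *m)
{
	assert(m->order < 8 * sizeof(unsigned long));
	return 1UL << m->order;
}

/* Round a page index down to the start of the folio that contains it. */
static unsigned long model_align_index(const struct model_mapping *m,
				       unsigned long index)
{
	return index & ~(model_pages_per_folio(m) - 1);
}

/*
 * A range is acceptable only if it starts on a folio boundary and
 * covers a whole number of folios of the mapping's order.
 */
static bool model_range_ok(const struct model_mapping *m,
			   unsigned long index, unsigned long nr_pages)
{
	unsigned long mask = model_pages_per_folio(m) - 1;

	return nr_pages != 0 && (index & mask) == 0 && (nr_pages & mask) == 0;
}
```

With order 2 and 4k pages each folio covers 16k, which matches the
logical block size Keith asked about.  The arithmetic is trivial; what
the model leaves out is whether an allocation of that order can always
be satisfied, which is the fragmentation objection.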


