From: Dave Chinner <david@fromorbit.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, Kent Overstreet <kent.overstreet@linux.dev>
Subject: Re: [PATCH 1/2] mm: Add memalloc_nowait_{save,restore}
Date: Thu, 15 Aug 2024 12:54:17 +1000
Message-ID: <Zr1t2d/3tqNBc7qM@dread.disaster.area>
In-Reply-To: <CALOAHbCTv5w4Lg3SeA43yCAww8DobJ_CN+9BcQDMJzaHVPNZZQ@mail.gmail.com>

On Wed, Aug 14, 2024 at 03:32:26PM +0800, Yafang Shao wrote:
> On Wed, Aug 14, 2024 at 1:42 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Wed, Aug 14, 2024 at 10:19:36AM +0800, Yafang Shao wrote:
> > > On Wed, Aug 14, 2024 at 8:28 AM Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Aug 12, 2024 at 05:05:24PM +0800, Yafang Shao wrote:
> > > > > The PF_MEMALLOC_NORECLAIM flag was introduced in commit eab0af905bfc
> > > > > ("mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN"). To complement
> > > > > this, let's add two helper functions, memalloc_nowait_{save,restore}, which
> > > > > will be useful in scenarios where we want to avoid waiting for memory
> > > > > reclamation.
> > > >
> > > > Readahead already uses this context:
> > > >
> > > > static inline gfp_t readahead_gfp_mask(struct address_space *x)
> > > > {
> > > >     return mapping_gfp_mask(x) | __GFP_NORETRY | __GFP_NOWARN;
> > > > }
> > > >
> > > > and __GFP_NORETRY means minimal direct reclaim should be performed.
> > > > Most filesystems already have GFP_NOFS context from
> > > > mapping_gfp_mask(), so how much difference does completely avoiding
> > > > direct reclaim actually make under memory pressure?
> > >
> > > Besides the __GFP_NOFS, ~__GFP_DIRECT_RECLAIM also implies
> > > __GFP_NOIO. If we don't set __GFP_NOIO, the readahead can wait for IO,
> > > right?
> >
> > There's a *lot* more difference between __GFP_NORETRY and
> > __GFP_NOWAIT than just __GFP_NOIO. I don't need you to try to
> > describe to me what the differences are; What I'm asking you is this:
> >
> > > > i.e. doing some direct reclaim without blocking when under memory
> > > > pressure might actually give better performance than skipping direct
> > > > reclaim and aborting readahead altogether....
> > > >
> > > > This really, really needs some numbers (both throughput and IO
> > > > latency histograms) to go with it because we have no evidence either
> > > > way to determine what is the best approach here.
> >
> > Put simply: does the existing readahead mechanism give better results
> > than the proposed one, and if so, why wouldn't we just reenable
> > readahead unconditionally instead of making it behave differently
> > for this specific case?
>
> Are you suggesting we compare the following change with the current proposal?
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index fd34b5755c0b..ced74b1b350d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3455,7 +3455,6 @@ static inline int kiocb_set_rw_flags(struct
> kiocb *ki, rwf_t flags,
>         if (flags & RWF_NOWAIT) {
>                 if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
>                         return -EOPNOTSUPP;
> -               kiocb_flags |= IOCB_NOIO;
>         }
>         if (flags & RWF_ATOMIC) {
>                 if (rw_type != WRITE)

Yes.

> Doesn't unconditional readahead break the semantics of RWF_NOWAIT,
> which is supposed to avoid waiting for I/O? For example, it might
> trigger a pageout for a dirty page.

Yes, but only for *some filesystems* in *some configurations*.

Readahead allocation behaviour is specifically controlled by the gfp
mask set on the mapping by the filesystem at inode instantiation
time, i.e. via a call to mapping_set_gfp_mask().

XFS, for one, always clears __GFP_FS from this mask, and several
other filesystems set it to GFP_NOFS. Filesystems that do this will
not do pageout for a dirty page during memory allocation.
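
To make the mask composition concrete, here is a minimal user-space sketch of how a per-mapping gfp mask set at inode instantiation feeds into the readahead mask quoted earlier. Everything prefixed `model_` or `MODEL_` is a hypothetical stand-in written for this illustration: the bit values are arbitrary and are not the kernel's real __GFP_* values, and the helpers only mimic the shape of mapping_set_gfp_mask(), mapping_gfp_mask() and readahead_gfp_mask().

```c
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's real __GFP_* bits. */
typedef uint32_t model_gfp_t;
#define MODEL_GFP_IO             (1u << 0)
#define MODEL_GFP_FS             (1u << 1)
#define MODEL_GFP_DIRECT_RECLAIM (1u << 2)
#define MODEL_GFP_NORETRY        (1u << 3)
#define MODEL_GFP_NOWARN         (1u << 4)

/* Hypothetical stand-in for struct address_space. */
struct model_mapping {
    model_gfp_t gfp_mask;
};

static void model_mapping_set_gfp_mask(struct model_mapping *m,
                                       model_gfp_t mask)
{
    m->gfp_mask = mask;
}

static model_gfp_t model_mapping_gfp_mask(const struct model_mapping *m)
{
    return m->gfp_mask;
}

/* Same shape as the readahead_gfp_mask() helper quoted above. */
static model_gfp_t model_readahead_gfp_mask(const struct model_mapping *m)
{
    return model_mapping_gfp_mask(m) | MODEL_GFP_NORETRY | MODEL_GFP_NOWARN;
}

/*
 * Hypothetical inode setup for a filesystem that clears the FS bit,
 * so allocations against this mapping never recurse into the filesystem.
 */
static void model_fs_init_mapping(struct model_mapping *m)
{
    model_gfp_t base = MODEL_GFP_IO | MODEL_GFP_FS | MODEL_GFP_DIRECT_RECLAIM;

    model_mapping_set_gfp_mask(m, base & ~MODEL_GFP_FS);
}
```

In this model the readahead mask inherits the cleared FS bit from the mapping while direct reclaim stays enabled, which is the situation described for XFS above.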

Further, memory reclaim cannot write dirty pages to a filesystem
without a ->writepage implementation. ->writepage is almost
completely gone - neither ext4, btrfs nor XFS has a ->writepage
implementation anymore - with f2fs being the only "major" filesystem
with a ->writepage implementation remaining.

IOWs, for most readahead cases right now, direct memory reclaim will
not issue writeback IO on dirty cached file pages and in the near
future that will change to -never-.
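
The two conditions named above can be written down as a small predicate. This is a user-space sketch under stated assumptions: `struct model_aops` and the `model_` functions are hypothetical names for this illustration, and the predicate captures only the two conditions mentioned in this mail (FS recursion allowed by the gfp mask, and a ->writepage method present). The real reclaim path has further checks that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the address_space operations table. */
struct model_aops {
    int (*writepage)(void *page);
};

/* Placeholder writepage method for a filesystem that still has one. */
static int model_dummy_writepage(void *page)
{
    (void)page;
    return 0;
}

/*
 * Writeback of a dirty file page from reclaim is possible only if the
 * allocation context allows FS recursion AND the filesystem provides
 * a ->writepage method.
 */
static bool model_reclaim_may_write_file_page(bool gfp_allows_fs,
                                              const struct model_aops *aops)
{
    return gfp_allows_fs && aops != NULL && aops->writepage != NULL;
}
```

With no ->writepage method the predicate is false regardless of the gfp mask, which corresponds to the ext4, btrfs and XFS case above.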

That means the only IO that direct reclaim will be able to do is for
swapping and compaction. Both of these can be prevented simply by
setting a GFP_NOIO allocation context. IOWs, in the not-too-distant
future we won't have to turn direct reclaim off to prevent IO and
blocking in direct reclaim during readahead - GFP_NOIO context
will be all that is necessary for IOCB_NOWAIT readahead.

That's why I'm asking if just doing readahead as it stands from
RWF_NOWAIT causes any obvious problems. I think we really only
need GFP_NOIO | __GFP_NORETRY allocation context for NOWAIT
readahead IO, and that's something we already have a context API
for.
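
As a rough illustration of the scoped-context idea, the following user-space sketch models a save/restore pair in the style of memalloc_noio_save() and memalloc_noio_restore(), plus a filter in the style of current_gfp_context(). All `model_`/`MODEL_` names and bit values are hypothetical and chosen for this example only; `model_task_flags` stands in for the current task's flags word. It is a simplified model, not the kernel implementation.

```c
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's real values. */
typedef uint32_t model_gfp_t;
#define MODEL_GFP_IO             (1u << 0)
#define MODEL_GFP_FS             (1u << 1)
#define MODEL_GFP_DIRECT_RECLAIM (1u << 2)
#define MODEL_GFP_NORETRY        (1u << 3)

#define MODEL_PF_MEMALLOC_NOIO   (1u << 0)

/* Stand-in for the per-task flags word. */
static unsigned int model_task_flags;

/* Enter a NOIO scope; returns the previous state of the NOIO bit. */
static unsigned int model_memalloc_noio_save(void)
{
    unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

    model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
    return old;
}

/* Leave the scope, restoring whatever state the caller had before. */
static void model_memalloc_noio_restore(unsigned int old)
{
    model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* Apply the task's scope to a requested mask: NOIO strips IO and FS. */
static model_gfp_t model_current_gfp_context(model_gfp_t gfp)
{
    if (model_task_flags & MODEL_PF_MEMALLOC_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/*
 * Hypothetical NOWAIT readahead path: keep direct reclaim enabled, add
 * NORETRY, and rely on the NOIO scope to rule out IO from reclaim.
 * Returns the effective mask an allocation inside the scope would see.
 */
static model_gfp_t model_nowait_readahead_gfp(model_gfp_t mapping_mask)
{
    unsigned int old = model_memalloc_noio_save();
    model_gfp_t effective =
        model_current_gfp_context(mapping_mask | MODEL_GFP_NORETRY);

    model_memalloc_noio_restore(old);
    return effective;
}
```

The design point is that the scope is applied to every allocation made inside it without each call site changing its gfp argument, and the saved value lets scopes nest safely.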
-Dave.
--
Dave Chinner
david@fromorbit.com