From: Daniel Vetter <daniel@ffwll.ch>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
"Dave Chinner" <dchinner@redhat.com>, "Leo Liu" <Leo.Liu@amd.com>,
"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
"Linux MM" <linux-mm@kvack.org>,
mhocko@kernel.org
Subject: Re: [PATCH] drm/ttm: stop warning on TT shrinker failure
Date: Mon, 22 Mar 2021 15:22:02 +0100
Message-ID: <YFioChrLPkjMBTP3@phenom.ffwll.local>
In-Reply-To: <20210322140548.GN1719932@casper.infradead.org>
On Mon, Mar 22, 2021 at 02:05:48PM +0000, Matthew Wilcox wrote:
> On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > > On 20.03.21 at 14:17, Daniel Vetter wrote:
> > > > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > > > <ckoenig.leichtzumerken@gmail.com> wrote:
> > > > > > On 19.03.21 at 20:06, Daniel Vetter wrote:
> > > > > > On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König wrote:
> > > > > > > > On 19.03.21 at 18:52, Daniel Vetter wrote:
> > > > > > > > On Fri, Mar 19, 2021 at 03:08:57PM +0100, Christian König wrote:
> > > > > > > > > Don't print a warning when we fail to allocate a page for swapping things out.
> > > > > > > > >
> > > > > > > > > Also rely on memalloc_nofs_save/memalloc_nofs_restore instead of GFP_NOFS.
> > > > > > > > Uh this part doesn't make sense. Especially since you only do it for the
> > > > > > > > debugfs file, not in general. Which means you've just completely broken
> > > > > > > > the shrinker.
> > > > > > > Are you sure? My impression is that GFP_NOFS should now work much more out
> > > > > > > of the box with the memalloc_nofs_save()/memalloc_nofs_restore().
> > > > > > Yeah, if you'd put it in the right place :-)
> > > > > >
> > > > > > But also -mm folks are very clear that memalloc_no*() family is for dire
> > > > > > situation where there's really no other way out. For anything where you
> > > > > > know what you're doing, you really should use explicit gfp flags.
> > > > > My impression is just the other way around. You should try to avoid the
> > > > > NOFS/NOIO flags and use the memalloc_no* approach instead.
> > > > Where did you get that idea?
> > >
> > > Well from the kernel comment on GFP_NOFS:
> > >
> > > * %GFP_NOFS will use direct reclaim but will not use any filesystem interfaces.
> > > * Please try to avoid using this flag directly and instead use
> > > * memalloc_nofs_{save,restore} to mark the whole scope which cannot/shouldn't
> > > * recurse into the FS layer with a short explanation why. All allocation
> > > * requests will inherit GFP_NOFS implicitly.
> >
> > Huh that's interesting, since iirc Willy or Dave told me the opposite, and
> > the memalloc_no* stuff is for e.g. nfs calling into the network layer (needs
> > GFP_NOFS) or swap on top of a filesystem (even needs GFP_NOIO I think).
> >
> > Adding them, maybe I got confused.
>
> My impression is that the scoped API is preferred these days.
>
> https://www.kernel.org/doc/html/latest/core-api/gfp_mask-from-fs-io.html
>
> I'd probably need to spend a few months learning the DRM subsystem to
> have a more detailed opinion on whether passing GFP flags around explicitly
> or using the scope API is the better approach for your situation.
At the moment we're talking about a single allocation in the ttm shrinker,
which already uses GFP_NOFS explicitly.
The scoped api might make sense for the gpu scheduler, where we really
operate under GFP_NOWAIT for somewhat awkward reasons. But I also thought
that at least for GFP_NOIO you generally need a mempool and have to think
about how you guarantee forward progress anyway. Is that also a bit outdated
thinking, and nowadays we could operate under the assumption that this Just
Works? Given that GFP_NOFS seems to fall over already for us I'm not super
sure about that ...
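A toy userspace model of the mempool idea mentioned above, as a sketch only.
This is not kernel code: every model_* name is made up for illustration,
malloc() stands in for the underlying allocator, and the allocator_fails
parameter exists only to simulate memory pressure. A real mempool_alloc() can
also sleep until an element is returned, which the model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_POOL_MIN 2	/* hypothetical reserve size */

struct model_pool {
	void *reserve[MODEL_POOL_MIN];	/* elements set aside up front */
	size_t nr_free;			/* number of filled reserve slots */
	size_t elem_size;
};

/* Fill the reserve while memory is still plentiful. */
static bool model_pool_init(struct model_pool *pool, size_t elem_size)
{
	pool->elem_size = elem_size;
	pool->nr_free = 0;
	while (pool->nr_free < MODEL_POOL_MIN) {
		void *elem = malloc(elem_size);

		if (!elem) {
			/* Undo the partial fill before reporting failure. */
			while (pool->nr_free > 0)
				free(pool->reserve[--pool->nr_free]);
			return false;
		}
		pool->reserve[pool->nr_free++] = elem;
	}
	return true;
}

/* Try the normal allocator first, fall back to the reserve on failure. */
static void *model_pool_alloc(struct model_pool *pool, bool allocator_fails)
{
	void *elem = allocator_fails ? NULL : malloc(pool->elem_size);

	if (elem)
		return elem;
	if (pool->nr_free > 0)
		return pool->reserve[--pool->nr_free];
	return NULL;	/* reserve exhausted: no forward progress left */
}

/* Refill the reserve first; only then hand memory back to the allocator. */
static void model_pool_free(struct model_pool *pool, void *elem)
{
	assert(elem != NULL);
	if (pool->nr_free < MODEL_POOL_MIN)
		pool->reserve[pool->nr_free++] = elem;
	else
		free(elem);
}
```

The forward-progress guarantee only holds as long as every element taken from
the reserve is eventually freed without needing another allocation from the
same pool, which is the part you still have to think through per call site.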
> I usually defer to Michal on these kinds of questions.
>
> > > > The kernel is full of explicit gfp_t flag
> > > > passing to make this as explicit as possible. The memalloc_no* stuff
> > > > is just for when you go through entire subsystems and really can't
> > > > wire it through. I can't find the discussion anymore, but that was the
> > > > advice I got from mm/fs people.
> > > >
> > > > One reason is that generally a small GFP_KERNEL allocation never
> > > > fails. But it absolutely can fail if it's in a memalloc_no* section,
> > > > and these kinds of non-obvious non-local effects are a real pain in
> > > > testing and review. Hence explicit gfp_flag passing as much as
> > > > possible.
>
> I agree with this; it's definitely a problem with the scope API. I wanted
> to extend it to include GFP_NOWAIT, but if you do that, your chances of
> memory allocation failure go way up, so you really want to set __GFP_NOWARN
> too, but now you need to audit all the places that you're calling to be
> sure they really handle errors correctly.
>
> So I think I'm giving up on that patch set.
Yeah the auditing is what scares me, and why at least personally I prefer
explicit gfp flags. It's much easier to debug a lockdep splat involving
fs_reclaim than memory allocation failures leading to very strange bugs
because we're not handling the allocation failure properly (or maybe not
even at all).
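A toy userspace model of that non-local effect, as a sketch only. The MODEL_*
bits and model_* functions are made up for illustration and are not the
kernel's values or API. The logic just mirrors how memalloc_nofs_save() sets a
per-task flag that the allocator later uses to strip the FS bit from whatever
mask the caller passed.

```c
#include <assert.h>

/* Hypothetical flag bits; the real kernel values differ. */
#define MODEL_GFP_IO		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS		(MODEL_GFP_IO)

#define MODEL_PF_MEMALLOC_NOFS	0x1u

/* Stand-in for the per-task flags word (current->flags in the kernel). */
static unsigned int model_task_flags;

/* Open a NOFS scope and return the previous state of the scope bit. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Close the scope by putting the saved bit back, so scopes can nest. */
static void model_nofs_restore(unsigned int old)
{
	assert((old & ~MODEL_PF_MEMALLOC_NOFS) == 0);
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* What the allocator actually honours: the caller's mask minus scoped bits. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* The callee always asks for MODEL_GFP_KERNEL, whoever calls it. */
static unsigned int model_shrinker_scan(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}

/* Only this path wraps the call in a scope. */
static unsigned int model_debugfs_show(void)
{
	unsigned int old = model_nofs_save();
	unsigned int gfp = model_shrinker_scan();

	model_nofs_restore(old);
	return gfp;
}
```

model_shrinker_scan() passes the same mask in both cases; what it ends up with
depends on whether some caller further up the stack opened a scope. Reached
through model_debugfs_show() it is restricted to MODEL_GFP_NOFS, called
directly it runs with the full MODEL_GFP_KERNEL, and nothing at the allocation
site tells you which one you got.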
-Daniel
>
> > > > > > > > If this is just to paper over the seq_printf doing the wrong allocations,
> > > > > > > > then just move that out from under the fs_reclaim_acquire/release part.
> > > > > > > No, that wasn't the problem.
> > > > > > >
> > > > > > > > We have just seen too many failures to allocate pages for swapout and I think
> > > > > > > that would improve this because in a lot of cases we can then immediately
> > > > > > > swap things out instead of having to rely on upper layers.
> > > > > > Yeah, you broke it. Now the real shrinker is running with GFP_KERNEL,
> > > > > > because your memalloc_no is only around the debugfs function. And ofc it's
> > > > > > much easier to allocate with GFP_KERNEL, right until you deadlock :-)
> > > > > The problem here is that for example kswapd calls the shrinker without
> > > > > holding a FS lock as far as I can see.
> > > > >
> > > > > And it is rather sad that we can't optimize this case directly.
> > > > I'm still not clear what you want to optimize? You can check for "is
> > > > this kswapd" in pf flags, but that sounds very hairy and fragile.
> > >
> > > Well we only need the NOFS flag when the shrinker callback really comes from
> > > a memory shortage in the FS subsystem, and that is rather unlikely.
> > >
> > > When we would allow all other cases to be able to directly IO the freed up
> > > pages to swap it would certainly help.
> >
> > tbh I'm not sure. i915-gem code has played tricks with special casing the
> > kswapd path, and they do kinda scare me at least. I'm not sure whether
> > there's not some hidden dependencies there that would make this a bad
> > idea. Like afaik direct reclaim can sometimes stall for kswapd to catch up
> > a bit, or at least did in the past (I think, really not much clue about
> > this)
> >
> > The other thing is that the fs_reclaim_acquire/release annotation really
> > only works well if you use it outside of the direct reclaim path too.
> > Otherwise it's not much better than just lots of testing. That pretty much
> > means you have to annotate the kswapd path.
> > -Daniel
> >
> >
> >
> > >
> > > Christian.
> > >
> > > > -Daniel
> > > >
> > > > > Anyway you are right if some caller doesn't use the memalloc_no*()
> > > > > approach we are busted.
> > > > >
> > > > > Going to change the patch to only not warn for the moment.
> > > > >
> > > > > Regards,
> > > > > Christian.
> > > > >
> > > > > > Shrinking is hard, there's no easy way out here.
> > > > > >
> > > > > > Cheers, Daniel
> > > > > >
> > > > > > > Regards,
> > > > > > > Christian.
> > > > > > >
> > > > > > >
> > > > > > > > __GFP_NOWARN should be there indeed I think.
> > > > > > > > -Daniel
> > > > > > > >
> > > > > > > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > > > > > > ---
> > > > > > > > > drivers/gpu/drm/ttm/ttm_tt.c | 5 ++++-
> > > > > > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > > index 2f0833c98d2c..86fa3e82dacc 100644
> > > > > > > > > --- a/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > > @@ -369,7 +369,7 @@ static unsigned long ttm_tt_shrinker_scan(struct shrinker *shrink,
> > > > > > > > > };
> > > > > > > > > int ret;
> > > > > > > > > - ret = ttm_bo_swapout(&ctx, GFP_NOFS);
> > > > > > > > > + ret = ttm_bo_swapout(&ctx, GFP_KERNEL | __GFP_NOWARN);
> > > > > > > > > return ret < 0 ? SHRINK_EMPTY : ret;
> > > > > > > > > }
> > > > > > > > > @@ -389,10 +389,13 @@ static unsigned long ttm_tt_shrinker_count(struct shrinker *shrink,
> > > > > > > > > static int ttm_tt_debugfs_shrink_show(struct seq_file *m, void *data)
> > > > > > > > > {
> > > > > > > > > struct shrink_control sc = { .gfp_mask = GFP_KERNEL };
> > > > > > > > > + unsigned int flags;
> > > > > > > > > fs_reclaim_acquire(GFP_KERNEL);
> > > > > > > > > + flags = memalloc_nofs_save();
> > > > > > > > > seq_printf(m, "%lu/%lu\n", ttm_tt_shrinker_count(&mm_shrinker, &sc),
> > > > > > > > > ttm_tt_shrinker_scan(&mm_shrinker, &sc));
> > > > > > > > > + memalloc_nofs_restore(flags);
> > > > > > > > > fs_reclaim_release(GFP_KERNEL);
> > > > > > > > > return 0;
> > > > > > > > > --
> > > > > > > > > 2.25.1
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > dri-devel mailing list
> > > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch