Date: Mon, 22 Mar 2021 14:05:48 +0000
From: Matthew Wilcox
To: Christian König, Dave Chinner, Leo Liu, amd-gfx list, dri-devel, Linux MM
Cc: mhocko@kernel.org
Subject: Re: [PATCH] drm/ttm: stop warning on TT shrinker failure
Message-ID: <20210322140548.GN1719932@casper.infradead.org>
References: <20210319140857.2262-1-christian.koenig@amd.com>
 <2831bfcc-140e-dade-1f50-a6431e495e9d@gmail.com>
 <1ae415c4-8e49-5183-b44d-bc92088657d5@gmail.com>

On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > On 20.03.21 at 14:17, Daniel Vetter wrote:
> > > On Sat, Mar 20, 2021 at 10:04 AM Christian König wrote:
> > > > On 19.03.21 at 20:06, Daniel Vetter wrote:
> > > > > On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König wrote:
> > > > > > On 19.03.21 at 18:52, Daniel Vetter wrote:
> > > > > > > On Fri, Mar 19, 2021 at 03:08:57PM +0100, Christian König wrote:
> > > > > > > > Don't print a warning when we fail to allocate a page for swapping things out.
> > > > > > > >
> > > > > > > > Also rely on memalloc_nofs_save/memalloc_nofs_restore instead of GFP_NOFS.
> > > > > > > Uh this part doesn't make sense. Especially since you only do it for the
> > > > > > > debugfs file, not in general. Which means you've just completely broken
> > > > > > > the shrinker.
> > > > > > Are you sure? My impression is that GFP_NOFS should now work much more out
> > > > > > of the box with the memalloc_nofs_save()/memalloc_nofs_restore().
> > > > > Yeah, if you'd put it in the right place :-)
> > > > >
> > > > > But also -mm folks are very clear that memalloc_no*() family is for dire
> > > > > situation where there's really no other way out. For anything where you
> > > > > know what you're doing, you really should use explicit gfp flags.
> > > > My impression is just the other way around. You should try to avoid the
> > > > NOFS/NOIO flags and use the memalloc_no* approach instead.
> > > Where did you get that idea?
> >
> > Well from the kernel comment on GFP_NOFS:
> >
> >  * %GFP_NOFS will use direct reclaim but will not use any filesystem interfaces.
> >  * Please try to avoid using this flag directly and instead use
> >  * memalloc_nofs_{save,restore} to mark the whole scope which cannot/shouldn't
> >  * recurse into the FS layer with a short explanation why. All allocation
> >  * requests will inherit GFP_NOFS implicitly.
>
> Huh that's interesting, since iirc Willy or Dave told me the opposite, and
> the memalloc_no* stuff is for e.g. nfs calling into network layer (needs
> GFP_NOFS) or swap on top of a filesystem (even needs GFP_NOIO I think).
>
> Adding them, maybe I got confused.

My impression is that the scoped API is preferred these days.

https://www.kernel.org/doc/html/latest/core-api/gfp_mask-from-fs-io.html

I'd probably need to spend a few months learning the DRM subsystem to
have a more detailed opinion on whether passing GFP flags around explicitly
or using the scope API is the better approach for your situation.

I usually defer to Michal on these kinds of questions.

> > > The kernel is full of explicit gfp_t flag
> > > passing to make this as explicit as possible. The memalloc_no* stuff
> > > is just for when you go through entire subsystems and really can't
> > > wire it through. I can't find the discussion anymore, but that was the
> > > advice I got from mm/fs people.
> > >
> > > One reason is that generally a small GFP_KERNEL allocation never
> > > fails. But it absolutely can fail if it's in a memalloc_no* section,
> > > and these kind of non-obvious non-local effects are a real pain in
> > > testing and review. Hence explicit gfp_flag passing as much as
> > > possible.

I agree with this; it's definitely a problem with the scope API. I wanted
to extend it to include GFP_NOWAIT, but if you do that, your chances of
memory allocation failure go way up, so you really want to set __GFP_NOWARN
too, but now you need to audit all the places that you're calling to be
sure they really handle errors correctly.

So I think I'm giving up on that patch set.

> > > > > > > If this is just to paper over the seq_printf doing the wrong allocations,
> > > > > > > then just move that out from under the fs_reclaim_acquire/release part.
> > > > > > No, that wasn't the problem.
> > > > > >
> > > > > > We have just seen too many failures to allocate pages for swapout and I think
> > > > > > that would improve this because in a lot of cases we can then immediately
> > > > > > swap things out instead of having to rely on upper layers.
> > > > > Yeah, you broke it. Now the real shrinker is running with GFP_KERNEL,
> > > > > because your memalloc_no is only around the debugfs function. And ofc it's
> > > > > much easier to allocate with GFP_KERNEL, right until you deadlock :-)
> > > > The problem here is that for example kswapd calls the shrinker without
> > > > holding a FS lock as far as I can see.
> > > >
> > > > And it is rather sad that we can't optimize this case directly.
> > > I'm still not clear what you want to optimize? You can check for "is
> > > this kswapd" in pf flags, but that sounds very hairy and fragile.
> >
> > Well we only need the NOFS flag when the shrinker callback really comes from
> > a memory shortage in the FS subsystem, and that is rather unlikely.
> >
> > When we would allow all other cases to be able to directly IO the freed up
> > pages to swap it would certainly help.
>
> tbh I'm not sure. i915-gem code has played tricks with special casing the
> kswapd path, and they do kinda scare me at least. I'm not sure whether
> there's not some hidden dependencies there that would make this a bad
> idea. Like afaik direct reclaim can sometimes stall for kswapd to catch up
> a bit, or at least did in the past (I think, really not much clue about
> this)
>
> The other thing is that the fs_reclaim_acquire/release annotation really
> only works well if you use it outside of the direct reclaim path too.
> Otherwise it's not much better than just lots of testing. That pretty much
> means you have to annotate the kswapd path.
> -Daniel
>
>
>
> >
> > Christian.
> >
> > > -Daniel
> > >
> > > > Anyway you are right if some caller doesn't use the memalloc_no*()
> > > > approach we are busted.
> > > >
> > > > Going to change the patch to only not warn for the moment.
> > > >
> > > > Regards,
> > > > Christian.
> > > >
> > > > > Shrinking is hard, there's no easy way out here.
> > > > >
> > > > > Cheers, Daniel
> > > > >
> > > > > > Regards,
> > > > > > Christian.
> > > > > >
> > > > > >
> > > > > > > __GFP_NOWARN should be there indeed I think.
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Signed-off-by: Christian König
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/ttm/ttm_tt.c | 5 ++++-
> > > > > > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > index 2f0833c98d2c..86fa3e82dacc 100644
> > > > > > > > --- a/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> > > > > > > > @@ -369,7 +369,7 @@ static unsigned long ttm_tt_shrinker_scan(struct shrinker *shrink,
> > > > > > > >  	};
> > > > > > > >  	int ret;
> > > > > > > >
> > > > > > > > -	ret = ttm_bo_swapout(&ctx, GFP_NOFS);
> > > > > > > > +	ret = ttm_bo_swapout(&ctx, GFP_KERNEL | __GFP_NOWARN);
> > > > > > > >
> > > > > > > >  	return ret < 0 ? SHRINK_EMPTY : ret;
> > > > > > > >  }
> > > > > > > > @@ -389,10 +389,13 @@ static unsigned long ttm_tt_shrinker_count(struct shrinker *shrink,
> > > > > > > >  static int ttm_tt_debugfs_shrink_show(struct seq_file *m, void *data)
> > > > > > > >  {
> > > > > > > >  	struct shrink_control sc = { .gfp_mask = GFP_KERNEL };
> > > > > > > > +	unsigned int flags;
> > > > > > > >
> > > > > > > >  	fs_reclaim_acquire(GFP_KERNEL);
> > > > > > > > +	flags = memalloc_nofs_save();
> > > > > > > >  	seq_printf(m, "%lu/%lu\n", ttm_tt_shrinker_count(&mm_shrinker, &sc),
> > > > > > > >  		   ttm_tt_shrinker_scan(&mm_shrinker, &sc));
> > > > > > > > +	memalloc_nofs_restore(flags);
> > > > > > > >  	fs_reclaim_release(GFP_KERNEL);
> > > > > > > >
> > > > > > > >  	return 0;
> > > > > > > > --
> > > > > > > > 2.25.1
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > dri-devel mailing list
> > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
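
Illustrative note on the scoping problem raised in the thread (the NOFS scope
wraps only the debugfs function, so the real shrinker path runs with
GFP_KERNEL). The following is a userspace toy model in C, not kernel code.
Every toy_* name and every bit value is a hypothetical stand-in invented for
this sketch. Only the ideas of memalloc_nofs_save()/memalloc_nofs_restore(),
GFP_KERNEL, GFP_NOFS and __GFP_NOWARN come from the discussion, and the
masking behaviour is an assumption modeled loosely on the GFP_NOFS comment
quoted in the thread.

```c
#include <assert.h>

/* Hypothetical bit values for this toy model; the kernel's values differ. */
#define TOY_GFP_IO      0x1u
#define TOY_GFP_FS      0x2u
#define TOY_GFP_NOWARN  0x4u
#define TOY_GFP_KERNEL  (TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOFS    (TOY_GFP_IO)

#define TOY_PF_MEMALLOC_NOFS 0x1u

/* Stand-in for the per-task flags word that the scope API toggles. */
unsigned int toy_task_flags;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
unsigned int toy_memalloc_nofs_save(void)
{
	unsigned int old = toy_task_flags & TOY_PF_MEMALLOC_NOFS;

	toy_task_flags |= TOY_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope, restoring whatever state was saved on entry. */
void toy_memalloc_nofs_restore(unsigned int old)
{
	assert((old & ~TOY_PF_MEMALLOC_NOFS) == 0);
	toy_task_flags = (toy_task_flags & ~TOY_PF_MEMALLOC_NOFS) | old;
}

/* The mask an allocation effectively runs with inside the current scope. */
unsigned int toy_effective_gfp(unsigned int gfp)
{
	if (toy_task_flags & TOY_PF_MEMALLOC_NOFS)
		gfp &= ~TOY_GFP_FS;
	return gfp;
}

/* Models the shrinker scan callback after the patch: it passes
 * GFP_KERNEL | __GFP_NOWARN and relies on the caller's scope. */
unsigned int toy_shrinker_scan(void)
{
	return toy_effective_gfp(TOY_GFP_KERNEL | TOY_GFP_NOWARN);
}

/* Models the debugfs path from the patch: the scope wraps the call. */
unsigned int toy_debugfs_show(void)
{
	unsigned int flags = toy_memalloc_nofs_save();
	unsigned int used = toy_shrinker_scan();

	toy_memalloc_nofs_restore(flags);
	return used;
}

/* Models reclaim invoking the shrinker directly: no scope is active. */
unsigned int toy_reclaim_path(void)
{
	return toy_shrinker_scan();
}
```

In this model toy_debugfs_show() returns a mask with the FS bit cleared, while
toy_reclaim_path() returns one with the FS bit still set. That is the
non-local effect discussed in the thread: the callee's behaviour depends on
whether some caller further up remembered to open the scope.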