linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	 Matthew Wilcox <willy@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>
Subject: Re: [PATCH 2/2] mm: introduce PF_MEMALLOC_NOWARN
Date: Thu, 1 Feb 2024 06:03:27 -0500	[thread overview]
Message-ID: <3lzd24v653hn7ks6ocdeos6xkppk5l7pgun7nysmmmsfgrjob3@mqynwkythb4d> (raw)
In-Reply-To: <ZbeCYB1z1s78I8ql@tiehlicka>

On Mon, Jan 29, 2024 at 11:48:00AM +0100, Michal Hocko wrote:
> On Sun 28-01-24 14:43:16, Kent Overstreet wrote:
> > On Sun, Jan 28, 2024 at 04:45:32PM +0100, Michal Hocko wrote:
> > > On Fri 26-01-24 17:07:56, Kent Overstreet wrote:
> > > > If we're using PF_MEMALLOC, we might have a fallback and might not to
> > > > warn about a failing allocation - thus we need a PF_* equivalent of
> > > > __GFP_NOWARN.
> > > 
> > > Could you be more specific about the user? Is this an allocation from
> > > the reclaim path or an explicit PF_MEMALLOC one? It would be also really
> > > helpful to explain why GFP_NOWARN cannot be used directly.
> > 
> > Explicit PF_MEMALLOC.
> > 
> > It's for a call to alloc_inode(), which doesn't take gfp flags, and
> > plumbing it would require modifying a s_ops callback and would touch
> > every filesystem -
> 
> OK, I see. This should be part of the changelog.
> 
> > but we want to get away from gfp flags anyways :)
> > 
> > More specifically, the code where I'm using it is doing a "try
> > GFP_NOWAIT first; if that fails drop locks and do GFP_KERNEL" dance;
> > it's part of a cleanup for some weird lifetime stuff related to
> > fs/inode.c.
> > 
> > #define memalloc_flags_do(_flags, _do)						\
> > ({										\
> > 	unsigned _saved_flags = memalloc_flags_save(_flags);			\
> > 	typeof(_do) _ret = _do;							\
> > 	memalloc_noreclaim_restore(_saved_flags);				\
> > 	_ret;									\
> > })
> > 
> > /*
> >  * Allocate a new inode, dropping/retaking btree locks if necessary:
> >  */
> > static struct bch_inode_info *bch2_new_inode(struct btree_trans *trans)
> > {
> > 	struct bch_fs *c = trans->c;
> > 
> > 	struct bch_inode_info *inode =
> > 		memalloc_flags_do(PF_MEMALLOC|PF_MEMALLOC_NOWARN, to_bch_ei(new_inode(c->vfs_sb)));
> 
> Is this what you meant by GFP_NOWAIT allocation? You would be right
> about this allocation not entering the direct reclaim but do you realize
> this is not an ordinary NOWAIT request beause PF_MEMALLOC will allow to
> dip into memory reserves without any limits? (Unless the specific
> allocation down the road is explicitly GFP_NOMEMALLOC)
> A failure here means that the system is really in a desparate state with all
> the memory reserves gone which migt break reclaimers who rely on those
> reserves.
> 
> Are you sure it is a good idea? Unless I am missing something you are
> just giving an ordinary user access to those reserves by creating inodes
> without any bounds.

Did not realize that; thank you.

How about this version?

From 0e87e55058ccddde4b6bcc092f43e66a4e632575 Mon Sep 17 00:00:00 2001
From: Kent Overstreet <kent.overstreet@linux.dev>
Date: Thu, 25 Jan 2024 19:00:24 -0500
Subject: [PATCH] mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN

Introduce PF_MEMALLOC_* equivalents of some GFP_ flags:

PF_MEMALLOC_NORECLAIM	-> GFP_NOWAIT
PF_MEMALLOC_NOWARN	-> __GFP_NOWARN

Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index cdb8ea53c365..36de7a2343c3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1632,12 +1632,12 @@ extern struct pid *cad_pid;
 #define PF_KSWAPD		0x00020000	/* I am kswapd */
 #define PF_MEMALLOC_NOFS	0x00040000	/* All allocation requests will inherit GFP_NOFS */
 #define PF_MEMALLOC_NOIO	0x00080000	/* All allocation requests will inherit GFP_NOIO */
-#define PF_LOCAL_THROTTLE	0x00100000	/* Throttle writes only against the bdi I write to,
+#define PF_MEMALLOC_NORECLAIM	0x00100000	/* All allocation requests will inherit __GFP_NOWARN */
+#define PF_MEMALLOC_NOWARN	0x00200000	/* All allocation requests will inherit __GFP_NOWARN */
+#define PF_LOCAL_THROTTLE	0x00400000	/* Throttle writes only against the bdi I write to,
 						 * I am cleaning dirty pages from some other bdi. */
-#define PF_KTHREAD		0x00200000	/* I am a kernel thread */
-#define PF_RANDOMIZE		0x00400000	/* Randomize virtual address space */
-#define PF__HOLE__00800000	0x00800000
-#define PF__HOLE__01000000	0x01000000
+#define PF_KTHREAD		0x00800000	/* I am a kernel thread */
+#define PF_RANDOMIZE		0x01000000	/* Randomize virtual address space */
 #define PF__HOLE__02000000	0x02000000
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
 #define PF_MCE_EARLY		0x08000000      /* Early kill for mce process policy */
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index f00d7ecc2adf..c29059a76052 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -236,16 +236,25 @@ static inline gfp_t current_gfp_context(gfp_t flags)
 {
 	unsigned int pflags = READ_ONCE(current->flags);
 
-	if (unlikely(pflags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_PIN))) {
+	if (unlikely(pflags & (PF_MEMALLOC_NOIO |
+			       PF_MEMALLOC_NOFS |
+			       PF_MEMALLOC_NORECLAIM |
+			       PF_MEMALLOC_NOWARN |
+			       PF_MEMALLOC_PIN))) {
 		/*
-		 * NOIO implies both NOIO and NOFS and it is a weaker context
-		 * so always make sure it makes precedence
+		 * Stronger flags before weaker flags:
+		 * NORECLAIM implies NOIO, which in turn implies NOFS
 		 */
-		if (pflags & PF_MEMALLOC_NOIO)
+		if (pflags & PF_MEMALLOC_NORECLAIM)
+			flags &= ~__GFP_DIRECT_RECLAIM;
+		else if (pflags & PF_MEMALLOC_NOIO)
 			flags &= ~(__GFP_IO | __GFP_FS);
 		else if (pflags & PF_MEMALLOC_NOFS)
 			flags &= ~__GFP_FS;
 
+		if (pflags & PF_MEMALLOC_NOWARN)
+			flags |= __GFP_NOWARN;
+
 		if (pflags & PF_MEMALLOC_PIN)
 			flags &= ~__GFP_MOVABLE;
 	}


  reply	other threads:[~2024-02-01 11:03 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26 22:07 [PATCH 1/2] mm: introduce memalloc_flags_{save,restore} Kent Overstreet
2024-01-26 22:07 ` [PATCH 2/2] mm: introduce PF_MEMALLOC_NOWARN Kent Overstreet
2024-01-28 15:45   ` Michal Hocko
2024-01-28 19:43     ` Kent Overstreet
2024-01-29 10:48       ` Michal Hocko
2024-02-01 11:03         ` Kent Overstreet [this message]
2024-02-01 15:59           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3lzd24v653hn7ks6ocdeos6xkppk5l7pgun7nysmmmsfgrjob3@mqynwkythb4d \
    --to=kent.overstreet@linux.dev \
    --cc=djwong@kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox