linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Tejun Heo <tj@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>,
	Filipe Manana <fdmanana@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic
Date: Wed, 12 Feb 2025 17:57:04 +0100	[thread overview]
Message-ID: <Z6zS4Dtyway78Gif@tiehlicka> (raw)
In-Reply-To: <Z6u5OJIQBw8QLGe_@slm.duckdns.org>

On Tue 11-02-25 10:55:20, Tejun Heo wrote:
> Hello, Michal.
> 
> On Thu, Feb 06, 2025 at 01:26:33PM +0100, Michal Hocko wrote:
> ...
> > It has turned out that iscsid has worked around this by dropping
> > PR_SET_IO_FLUSHER (https://github.com/open-iscsi/open-iscsi/pull/382)
> > when scanning host. But we can do better in this case on the kernel side
> 
> FWIW, requiring GFP_KERNEL context for probing doesn't sound too crazy to
> me.
> 
> > @@ -2204,7 +2204,12 @@ static void pcpu_balance_workfn(struct work_struct *work)
> >  	 * to grow other chunks.  This then gives pcpu_reclaim_populated() time
> >  	 * to move fully free chunks to the active list to be freed if
> >  	 * appropriate.
> > +	 *
> > +	 * Enforce GFP_NOIO allocations because we have pcpu_alloc users
> > +	 * constrained to GFP_NOIO/NOFS contexts and they could form lock
> > +	 * dependency through pcpu_alloc_mutex
> >  	 */
> > +	unsigned int flags = memalloc_noio_save();
> 
> Just for context, the reason why the allocation mask support was limited to
> GFP_KERNEL or not rather than supporting full range of GFP flags is because
> percpu memory area expansion can involve page table allocations in the
> vmalloc area which always uses GFP_KERNEL. memalloc_noio_save() masks IO
> part out of that, right? It might be worthwhile to explain why we aren't
> passing down GPF flags throughout and instead depending on masking.

I have gone with masking because that seemed easier to review and more
robust solution. vmalloc does support NOFS/NOIO contexts these days (it
will just uses scoped masking in those cases). Propagating the gfp
throughout the worker code path is likely possible, but I haven't really
explored that in detail to be sure. Would that be preferable even if the
fix would be more involved?

> Also, doesn't the above always prevent percpu allocations from doing fs/io
> reclaims? 

Yes it does. Probably worth mentioning in the changelog. These
allocations should be rare so having a constrained reclaim didn't really
seem problematic to me. There should be kswapd running in the background
with the full reclaim power.

> ie. Shouldn't the masking only be used if the passed in gfp
> doesn't allow fs/io?

This is a good question. I have to admit that my understanding might be
incorrect but wouldn't it be possible that we could get the lock
dependency chain if GFP_KERNEL and scoped NOFS alloc_pcp calls are
competing? 

					fs/io lock
					pcpu_alloc_noprof(NOFS/NOIO)
pcpu_alloc_noprof(GFP_KERNEL)
  pcpu_schedule_balance_work
    pcpu_alloc_mutex
    					  pcpu_alloc_mutex
      allocation_deadlock throgh fs/io lock

This is currently not possible because constrained allocations only do
trylock.

Makes sense?
-- 
Michal Hocko
SUSE Labs


  reply	other threads:[~2025-02-12 16:57 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-06 12:26 Michal Hocko
2025-02-11 15:05 ` Vlastimil Babka
2025-02-11 20:55 ` Tejun Heo
2025-02-12 16:57   ` Michal Hocko [this message]
2025-02-12 18:14     ` Tejun Heo
2025-02-12 20:53       ` Michal Hocko
2025-02-12 21:30         ` Tejun Heo
2025-02-12 21:39           ` Dennis Zhou
2025-02-14 15:52             ` Michal Hocko
2025-02-21  2:36               ` Dennis Zhou
2025-02-21  9:48                 ` Vlastimil Babka
2025-03-05 15:10                   ` Michal Hocko
2025-03-05 15:35                     ` Vlastimil Babka
2025-02-14 15:43           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z6zS4Dtyway78Gif@tiehlicka \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=dennis@kernel.org \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox