linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	linux-mm@kvack.org,
	"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	John Ogness <john.ogness@linutronix.de>
Subject: Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
Date: Wed, 21 Jun 2023 17:38:03 +0200	[thread overview]
Message-ID: <ZJMZWxDrkV3M1EQH@alley> (raw)
In-Reply-To: <20230621143421.BgHjJklo@linutronix.de>

On Wed 2023-06-21 16:34:21, Sebastian Andrzej Siewior wrote:
> On 2023-06-21 22:32:40 [+0900], Tetsuo Handa wrote:
> > include/linux/seqlock.h says
> …
> > Is above understanding correct?
> 
> That is correct.
> 
> > And you are trying to replace
> > 
> >   local_irq_save(flags);
> >   printk_deferred_enter();
> >   write_seqlock(&zonelist_update_seq);
> > 
> > with
> > 
> >   write_seqlock_irqsave(&zonelist_update_seq, flags);
> >   printk_deferred_enter();
> > 
> > , aren't you?
> 
> correct.
> 
> > But include/linux/printk.h says
> > 
> >   /*
> >    * The printk_deferred_enter/exit macros are available only as a hack for
> >    * some code paths that need to defer all printk console printing. Interrupts
> >    * must be disabled for the deferred duration.
> >    */
> >   #define printk_deferred_enter __printk_safe_enter
> >   #define printk_deferred_exit __printk_safe_exit
> > 
> > which means that local_irq_save() is _required_ before printk_deferred_enter().
> 
> It says that, yes, but that requirement is described as too heavy. The
> requirement is that printk_deferred_enter() happens on the same CPU as
> printk_deferred_exit().

True.

> This can be achieved by an explicit
> local_irq_save(), yes, but also by something like spin_lock_irq() which
> _ensures_ that the task does not migrate to another CPU while the lock
> is acquired. This is the requirement by the current implementation.

Good to know.

> > If local_irq_save() is hidden by your patch, what guarantees that
> > printk_deferred_enter() and printk_deferred_exit() run on the same CPU?
> 
> Because spin_lock_irqsave() on CONFIG_PREEMPT_RT uses migrate_disable().
> The function ensures that the scheduler does not migrate the task to
> another CPU. The CPU is even block from going down (as in CPU-hotplug)
> until the matching migrate_enable() occurs.
> 
> > Also, if local_irq_save() is hidden due to RT, what guarantees that
> > 
> >   write_seqlock_irqsave(&zonelist_update_seq, flags);
> >   <<IRQ>>
> >     some_timer_function() {
> >       printk();
> >     }
> >   <<IRQ>>
> >   printk_deferred_enter();
> >
> > does not happen because write_seqlock_irqsave() does not disable IRQ?
> 
> I don't see how zonelist_update_seq and printk here are connected
> without the port lock/ or memory allocation. But there are two things
> that are different on RT which probably answer your question:
> 
> - If the reader observes an odd sequence number then it acquires the
>   lock of the sequence counter (it is held by the writer) which
>   forces the writer to complete the write critical section and then the
>   reader can continue. There are _no_ memory allocation within a
>   hard IRQ context (as in the actual interrupt). The timer (hrtimer or
>   timer_list timer) are served in task context and we have
>   forced-threaded interrupts. Clearly this means that the seqlock_t (as
>   used here) can only be used task context and not in hard IRQ context.
> 
> - The printk implementation is slightly different and it is being worked
>   to merge it upstream. The two important differences here:
>   - Printing happens by default in a dedicated printing thread.
>   - In emergency cases like panic(), printing happens directly within
>     the invocation of printk(). This requires a so called atomic console
>     which does not use the tty_port::lock.

If I get it correctly, RT solve both possible deadlocks by offloading
the nested operation into a kthread (irq and printk threads).
Plus, printk uses emergency_write() when the kthread is not usable.

If this is true then printk_safe_enter() might be a nop on RT.
All possible deadlocks are prevented either by the kthread or
the console->emergency_write() call.

But wait. AFAIK, the printk kthread is implemented only for the serial
console. So, it won't be safe for other consoles, especially
the problematic tty_insert_flip_string_and_push_buffer() call.

Note that adding printk thread for the graphical consoles will
be a real challenge. We have hard times even with the basic
UART 8250. There are still races possible in the implementation
in the RT tree...

OK, what about using migrate_disable() in printk_deferred_enter()?
Something like:

/*
 * The printk_deferred_enter/exit macros are available only as a hack.
 * They define a per-CPU context where all printk console printing
 * is deferred because it might cause a deadlock otherwise.
 *
 * It is highly recommended to use them only in a context with interrupts
 * disabled. Otherwise, other unrelated printk() calls might be deferred
 * when they interrupt/preempt the deferred code section.
 *
 * They should not be used for to deffer printing of many messages. It might
 * create softlockup when they are flushed later.
 *
 * IMPORTANT: Any new use of these MUST be consulted with printk maintainers.
 *    It might have unexpected side effects on the printk infrastructure.
 */
#ifdef CONFIG_PREEMPT_RT

#define printk_deferred_enter()			\
	do {					\
		migrate_disable();		\
		__printk_safe_enter();		\
	} while (0)

#define printk_deferred_exit()			\
	do {					\
		__printk_safe_exit();		\
		migrate_enable();		\
	} while (0)

#else  /* CONFIG_PREEMPT_RT */

#define printk_deferred_enter()			\
	do {					\
		preempt_disable();		\
		__printk_safe_enter();		\
	} while (0)

#define printk_deferred_exit()			\
	do {					\
		__printk_safe_exit();		\
		preempt_enable();		\
	} while (0)

#endif   /* CONFIG_PREEMPT_RT */

Note that I have used preempt_disable() on non-RT because it is much
cheaper. And IRQs will be disabled anyway on non-RT system
in this code path.

> > Disabling IRQ before incrementing zonelist_update_seq is _required_ for both
> > 
> >   making printk_deferred_enter() safe
> > 
> > and
> > 
> >   making sure that printk_deferred_enter() takes effect
> > 
> > .
> Did I explain why it is sufficient to do
> 	write_seqlock_irqsave()
> 	printk_deferred_enter()
> 
> assuming we have
> 
> | static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
> | {
> |         seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
> |         do_raw_write_seqcount_begin(s);
> | }

Will this prevent any printk() called on the same CPU before
printk_deferred_enter() is called?

Best Regards,
Petr


  parent reply	other threads:[~2023-06-21 15:38 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-21 10:40 Sebastian Andrzej Siewior
2023-06-21 10:59 ` Michal Hocko
2023-06-21 11:16   ` Sebastian Andrzej Siewior
2023-06-21 11:49     ` Michal Hocko
2023-06-21 13:11       ` Sebastian Andrzej Siewior
2023-06-21 13:22         ` Michal Hocko
2023-06-21 13:25           ` Sebastian Andrzej Siewior
2023-06-21 11:14 ` David Hildenbrand
2023-06-21 11:33 ` Tetsuo Handa
2023-06-21 12:40   ` Petr Mladek
2023-06-21 13:08     ` Sebastian Andrzej Siewior
2023-06-21 13:06   ` Sebastian Andrzej Siewior
2023-06-21 13:32     ` Tetsuo Handa
2023-06-21 14:34       ` Sebastian Andrzej Siewior
2023-06-21 14:50         ` Tetsuo Handa
2023-06-21 23:24           ` Tetsuo Handa
2023-06-22  7:18             ` Michal Hocko
2023-06-22 10:58               ` Tetsuo Handa
2023-06-22 12:09                 ` Michal Hocko
2023-06-22 13:36             ` Tetsuo Handa
2023-06-22 14:11               ` Petr Mladek
2023-06-22 14:28                 ` Tetsuo Handa
2023-06-23  9:35                   ` Sebastian Andrzej Siewior
2023-06-22 15:04                 ` Petr Mladek
2023-06-22 15:43                   ` Tetsuo Handa
2023-06-23  9:45                     ` Sebastian Andrzej Siewior
2023-06-23  9:51                       ` Tetsuo Handa
2023-06-23 10:11                         ` Sebastian Andrzej Siewior
2023-06-23 10:36                           ` Tetsuo Handa
2023-06-23 12:44                             ` Sebastian Andrzej Siewior
2023-06-23 12:57                               ` Michal Hocko
2023-06-23 10:53                           ` Petr Mladek
2023-06-23 11:16                             ` Tetsuo Handa
2023-06-23 13:31                             ` Sebastian Andrzej Siewior
2023-06-23 15:38                               ` Petr Mladek
2023-06-23 16:04                                 ` Sebastian Andrzej Siewior
2023-06-23  9:31               ` Sebastian Andrzej Siewior
2023-06-23  7:27           ` Sebastian Andrzej Siewior
2023-06-21 15:38         ` Petr Mladek [this message]
2023-06-23  8:12           ` Sebastian Andrzej Siewior
2023-06-23  9:21             ` Michal Hocko
2023-06-23  9:58               ` Sebastian Andrzej Siewior
2023-06-23 10:43                 ` Michal Hocko
2023-06-23 10:45                 ` Sebastian Andrzej Siewior
2023-06-23 10:50                   ` Sebastian Andrzej Siewior
2023-06-23 11:32                   ` Michal Hocko
2023-06-23 10:40             ` Petr Mladek
2023-06-23 13:24               ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZJMZWxDrkV3M1EQH@alley \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=john.ogness@linutronix.de \
    --cc=lgoncalv@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox