linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Michal Hocko <mhocko@suse.com>
Cc: "Mel Gorman" <mgorman@techsingularity.net>,
	"Patrick Daly" <quic_pdaly@quicinc.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	syzkaller-bugs@googlegroups.com,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
	syzbot <syzbot+223c7461c58c58a4cb10@syzkaller.appspotmail.com>,
	linux-mm <linux-mm@kvack.org>, "Petr Mladek" <pmladek@suse.com>,
	"Sergey Senozhatsky" <senozhatsky@chromium.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"John Ogness" <john.ogness@linutronix.de>
Subject: Re: [PATCH] mm/page_alloc: don't check zonelist_update_seq from atomic allocations
Date: Mon, 3 Apr 2023 20:14:28 +0900	[thread overview]
Message-ID: <bc630622-8b24-e4ca-2685-64880b5a7647@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <ZCqLJqxA9EuPPmAg@dhcp22.suse.cz>

On 2023/04/03 17:15, Michal Hocko wrote:
> Is this
> https://lore.kernel.org/all/0000000000001d74d205f7c1821f@google.com/ the
> underlying report ?

Yes.

> Could you explain the the deadlock scenario?

build_zonelists() from __build_all_zonelists() calls printk() with
zonelist_update_seq held.

printk() holds console_owner lock for synchronous printing, and then upon
unlock of console_owner lock, printk() holds port_lock_key and port->lock.

tty_insert_flip_string_and_push_buffer() from pty_write() conditionally calls
kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held. But since commit 3d36424b3b58,
zonelist_update_seq is checked by GFP_ATOMIC allocation (i.e. a new locking dependency
was added by that commit).

  CPU0                                                       CPU1
  pty_write() {
    tty_insert_flip_string_and_push_buffer() {
                                                             __build_all_zonelists() {
      spin_lock_irqsave(&port->lock, flags);
      tty_insert_flip_string() {
        tty_insert_flip_string_fixed_flag() {
          __tty_buffer_request_room() {
            tty_buffer_alloc() {
              kmalloc(GFP_ATOMIC | __GFP_NOWARN) {
                __alloc_pages_slowpath() {
                                                               write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd
                                                               // interrupt handler starts
                                                                 handle_irq() {
                                                                   serial8250_interrupt() {
                                                                     serial8250_tx_chars() {
                                                                       tty_port_tty_get() {
                                                                         spin_lock_irqsave(&port->lock, flags); // spins here waiting for kmalloc() from tty_insert_flip_string() to complete
                  zonelist_iter_begin() {
                    read_seqbegin(&zonelist_update_seq) {
                      // spins here waiting for interrupt handler to complete if zonelist_update_seq.seqcount is odd
                                                                         tty = tty_kref_get(port->tty);
                                                                         spin_unlock_irqrestore(&port->lock, flags);
                                                                       }
                                                                     }
                                                                   }
                                                                 }
                                                               // interrupt handler ends
                                                               write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even
                                                             }
                    }
                  }
                }
              }
            }
          }
        }
      }
      spin_unlock_irqrestore(&port->lock, flags);
    }
  }

Well, it seems that read_mems_allowed_begin() is protected by calling
local_irq_save(flags) before write_seqcount_begin(&current->mems_allowed_seq).

Can zonelist_iter_begin() be protected as well (i.e. call local_irq_save(flags)
before write_seqlock(&zonelist_update_seq)) ?

But even if write_seqlock(&zonelist_update_seq) is called with local irq disabled,
port_lock_key after all makes this warning again?



This bug report might be a suggestion that we want to use two versions of
__alloc_pages_slowpath(), one for atomic context which is geared towards smaller
kernel stack usage and simplified locking dependency (because atomic allocation can
happen from subtle context including interrupt handler) and the other for noinline
version for schedulable context which is geared towards larger kernel stack usage
and complicated locking dependency for implementing rich retry paths including
direct reclaim and OOM kill...



  reply	other threads:[~2023-04-03 11:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <000000000000b21f0a05e9ec310d@google.com>
     [not found] ` <f6bd471c-f961-ef5e-21c5-bf158be19d12@linux.intel.com>
2023-04-02 10:48   ` Tetsuo Handa
2023-04-03  8:15     ` Michal Hocko
2023-04-03 11:14       ` Tetsuo Handa [this message]
2023-04-03 12:09         ` Michal Hocko
2023-04-03 12:51           ` Tetsuo Handa
2023-04-03 13:44             ` Michal Hocko
2023-04-03 15:12               ` Petr Mladek
2023-04-04  0:37                 ` [PATCH] mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock Tetsuo Handa
2023-04-04  2:11                   ` Sergey Senozhatsky
2023-04-04  7:43                   ` Petr Mladek
2023-04-04  7:54                   ` Michal Hocko
2023-04-04  8:20                     ` Tetsuo Handa
2023-04-04 11:05                       ` Michal Hocko
2023-04-04 11:19                         ` Tetsuo Handa
2023-04-04 14:31                           ` [PATCH v2] " Tetsuo Handa
2023-04-04 15:20                             ` Michal Hocko
2023-04-05  9:02                               ` Mel Gorman
2023-04-04 21:25                             ` Andrew Morton
2023-04-05  8:28                               ` Michal Hocko
2023-04-05  8:53                                 ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bc630622-8b24-e4ca-2685-64880b5a7647@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=john.ogness@linutronix.de \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=pmladek@suse.com \
    --cc=quic_pdaly@quicinc.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=syzbot+223c7461c58c58a4cb10@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox