linux-mm.kvack.org archive mirror
From: Jan Kara <jack@suse.cz>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>, Jan Kara <jack@suse.cz>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	stable@vger.kernel.org, regressions@lists.linux.dev,
	Alasdair Kergon <agk@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>,
	dm-devel@lists.linux.dev, linux-mm@kvack.org
Subject: Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
Date: Tue, 31 Oct 2023 15:01:36 +0100
Message-ID: <20231031140136.25bio5wajc5pmdtl@quack3>
In-Reply-To: <ZUB5HFeK3eHeI8UH@mail-itl>

On Tue 31-10-23 04:48:44, Marek Marczykowski-Górecki wrote:
> On Mon, Oct 30, 2023 at 06:50:35PM +0100, Mikulas Patocka wrote:
> > On Mon, 30 Oct 2023, Marek Marczykowski-Górecki wrote:
> > > Then retried with order=PAGE_ALLOC_COSTLY_ORDER and
> > > PAGE_ALLOC_COSTLY_ORDER back at 3, and also got similar crash.
> > 
> > So, does it mean that even allocating with order=PAGE_ALLOC_COSTLY_ORDER 
> > isn't safe?
> 
> That seems to be another bug, see below.
> 
> > Try enabling CONFIG_DEBUG_VM (it also needs CONFIG_DEBUG_KERNEL) and try 
> > to provoke a similar crash. Let's see if it crashes on one of the 
> > VM_BUG_ON statements.
> 
> This was very interesting idea. With this, immediately after login I get
> the crash like below. Which makes sense, as this is when pulseaudio
> starts and opens /dev/snd/*. I then tried with the dm-crypt commit
> reverted and still got the crash! But, after blacklisting snd_pcm,
> there is no BUG splat, but the storage freeze still happens on vanilla
> 6.5.6.

OK, great. Thanks for testing.

<snip snd_pcm bug>

> Plain 6.5.6 (so order = MAX_ORDER - 1, and PAGE_ALLOC_COSTLY_ORDER=3), in frozen state:
> [  143.196106] task:blkdiscard      state:D stack:13672 pid:4884  ppid:2025   flags:0x00000002
> [  143.196130] Call Trace:
> [  143.196139]  <TASK>
> [  143.196147]  __schedule+0x30e/0x8b0
> [  143.196162]  schedule+0x59/0xb0
> [  143.196175]  schedule_timeout+0x14c/0x160
> [  143.196193]  io_schedule_timeout+0x4b/0x70
> [  143.196207]  wait_for_completion_io+0x81/0x130
> [  143.196226]  submit_bio_wait+0x5c/0x90
> [  143.196241]  blkdev_issue_discard+0x94/0xe0
> [  143.196260]  blkdev_common_ioctl+0x79e/0x9c0
> [  143.196279]  blkdev_ioctl+0xc7/0x270
> [  143.196293]  __x64_sys_ioctl+0x8f/0xd0
> [  143.196310]  do_syscall_64+0x3c/0x90

So this shows that a bio was submitted and never ran to completion.
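
A task waiting on I/O like this sits in uninterruptible sleep (state D), as
the "state:D" field in the trace shows. As a rough sketch only (the helper
names parse_stat and blocked_tasks are hypothetical, not part of any tool
used in this thread), such tasks could be listed by parsing
/proc/<pid>/stat:

```python
from pathlib import Path

def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) of tasks in uninterruptible sleep (state D)."""
    found = []
    for stat in Path(proc_root).glob("[0-9]*/stat"):
        try:
            pid, comm, state = parse_stat(stat.read_text())
        except (OSError, ValueError, IndexError):
            continue  # task exited or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling blocked_tasks() returns a list of (pid, comm) pairs. This only looks
at thread-group leaders; individual threads live under /proc/<pid>/task/ and
would need a second pass.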
 
> for f in $(grep -l crypt /proc/*/comm); do head $f ${f/comm/stack}; done
<snip some backtraces>

So this shows the dm-crypt layer isn't stuck anywhere. The allocation path
itself doesn't seem to be locking up, looping, or anything like that.
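
The shell one-liner quoted above prints the comm and kernel stack of every
task whose name contains "crypt". A roughly equivalent sketch in Python is
below; the function names and the proc_root parameter are assumptions added
for illustration, and reading /proc/<pid>/stack normally requires root:

```python
from pathlib import Path

def matching_tasks(pattern, proc_root="/proc"):
    """Yield (pid, comm) for every task whose comm contains `pattern`."""
    for comm_file in sorted(Path(proc_root).glob("[0-9]*/comm")):
        try:
            comm = comm_file.read_text().strip()
        except OSError:
            continue  # task exited while we were scanning
        if pattern in comm:
            yield comm_file.parent.name, comm

def dump_stacks(pattern, proc_root="/proc"):
    """Return {"pid (comm)": kernel stack text} for matching tasks."""
    stacks = {}
    for pid, comm in matching_tasks(pattern, proc_root):
        try:
            # /proc/<pid>/stack is normally readable by root only.
            stack = (Path(proc_root) / pid / "stack").read_text()
        except OSError as err:
            stack = f"<unreadable: {err.strerror}>"
        stacks[f"{pid} ({comm})"] = stack
    return stacks
```

Running dump_stacks("crypt") as root and printing each entry gives output
comparable to the one-liner. If every worker shows an idle stack, that
supports the reading that nothing in dm-crypt is blocked.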

> Then tried:
>  - PAGE_ALLOC_COSTLY_ORDER=4, order=4 - cannot reproduce,
>  - PAGE_ALLOC_COSTLY_ORDER=4, order=5 - cannot reproduce,
>  - PAGE_ALLOC_COSTLY_ORDER=4, order=6 - freeze rather quickly
> 
> I've retried the PAGE_ALLOC_COSTLY_ORDER=4,order=5 case several times
> and I can't reproduce the issue there. I'm confused...

And this kind of confirms that allocations > PAGE_ALLOC_COSTLY_ORDER
causing hangs is most likely just a coincidence. Rather, something either in
the block layer or in the storage driver has problems handling bios
with sufficiently high-order pages attached. This is going to be a bit
painful to debug, I'm afraid. How long does it take for you to trigger the
hang? I'm asking to get a rough estimate of how heavy tracing we can afford so
that we don't overwhelm the system...
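
For reference when reading the order values tested above: an order-n
allocation is 2^n physically contiguous pages. The small sketch below
converts orders to sizes, assuming 4 KiB base pages (an assumption; the
page size is not stated in this thread):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as on x86-64

def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one contiguous allocation of the given page order."""
    return page_size << order

for order in (3, 4, 5, 6):
    print(f"order {order}: {order_to_bytes(order) // 1024} KiB")
```

Under that assumption, the orders 4 and 5 that could not reproduce the freeze
correspond to 64 KiB and 128 KiB, and the order 6 that froze quickly
corresponds to 256 KiB.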

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

