linux-mm.kvack.org archive mirror
From: Luis Chamberlain <mcgrof@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	David Bueso <dave@stgolabs.net>, Jan Kara <jack@suse.cz>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Tso Ted <tytso@mit.edu>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Oliver Sang <oliver.sang@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Alistair Popple <apopple@nvidia.com>,
	linux-mm@kvack.org, Christian Brauner <brauner@kernel.org>,
	Hannes Reinecke <hare@suse.de>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	John Garry <john.g.garry@oracle.com>,
	linux-block@vger.kernel.org, ltp@lists.linux.it,
	Pankaj Raghav <p.raghav@samsung.com>,
	Daniel Gomez <da.gomez@samsung.com>,
	Dave Chinner <david@fromorbit.com>,
	gost.dev@samsung.com,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [linux-next:master] [block/bdev] 3c20917120: BUG:sleeping_function_called_from_invalid_context_at_mm/util.c
Date: Tue, 8 Apr 2025 12:13:06 -0700
Message-ID: <Z_V1QiXTCYQk9sfZ@bombadil.infradead.org>
In-Reply-To: <Z_VwF1MA-R7MgDVG@casper.infradead.org>

On Tue, Apr 08, 2025 at 07:51:03PM +0100, Matthew Wilcox wrote:
> On Tue, Apr 08, 2025 at 11:02:40AM -0700, Darrick J. Wong wrote:
> > On Tue, Apr 08, 2025 at 06:51:14PM +0100, Matthew Wilcox wrote:
> > > On Tue, Apr 08, 2025 at 10:48:55AM -0700, Darrick J. Wong wrote:
> > > > On Tue, Apr 08, 2025 at 10:24:40AM -0700, Luis Chamberlain wrote:
> > > > > On Tue, Apr 8, 2025 at 10:06 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > > > > > > Fun puzzle for the community is figuring out *why* oh why did a large folio
> > > > > > end up being used on buffer-heads for your use case *without* an LBS
> > > > > > device (logical block size) being present, as I assume you didn't have
> > > > > > > one, i.e. say an NVMe or virtio block device with logical block size >
> > > > > > PAGE_SIZE. The area in question would trigger on folio migration *only*
> > > > > > if you are migrating large buffer-head folios. We only create those
> > > > > 
> > > > > To be clear, large folios for buffer-heads.
> > > > > > if
> > > > > > you have an LBS device and are leveraging the block device cache or a
> > > > > > filesystem with buffer-heads with LBS (they don't exist yet other than
> > > > > > the block device cache).
> > > > 
> > > > My guess is that udev or something tries to read the disk label in
> > > > response to some uevent (mkfs, mount, unmount, etc), which creates a
> > > > large folio because min_order > 0, and attaches a buffer head.  There's
> > > > a separate crash report that I'll cc you on.
> > > 
> > > But you said:
> > > 
> > > > the machine is arm64 with 64k basepages and 4k fsblock size:
> > > 
> > > so that shouldn't be using large folios because you should have set the
> > > order to 0.  Right?  Or did you mis-speak and use a 4K PAGE_SIZE kernel
> > > with a 64k fsblocksize?
> > 
> > This particular kernel warning is arm64 with 64k base pages and a 4k
> > fsblock size, and my suspicion is that udev/libblkid are creating the
> > buffer heads or something weird like that.
> > 
> > On x64 with 4k base pages, xfs/032 creates a filesystem with 64k sector
> > size and there's an actual kernel crash resulting from a udev worker:
> > https://lore.kernel.org/linux-fsdevel/20250408175125.GL6266@frogsfrogsfrogs/T/#u
> > 
> > So I didn't misspeak, I just have two problems.  I actually have four
> > problems, but the others are loop device behavior changes.
> 
> Right, but this warning only triggers for large folios.  So somehow
> we've got a multi-page folio in the bdev's page cache.
> 
> Ah.  I see.
> 
> block/bdev.c:   mapping_set_folio_min_order(BD_INODE(bdev)->i_mapping,
> 
> so we're telling the bdev that it can go up to MAX_PAGECACHE_ORDER.

Ah yes, silly me, that would explain the large folios without LBS devices.

> And then we call readahead, which will happily put order-2 folios
> in the pagecache because of my bug that we've never bothered fixing.
> 
> We should probably fix that now, but as a temporary measure if
> you'd like to put:
> 
> mapping_set_folio_order_range(BD_INODE(bdev)->i_mapping, min, min)
> 
> instead of the mapping_set_folio_min_order(), that would make the bug
> no longer appear for you.

Agreed.
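
For anyone following along, here is a rough userspace sketch of why the
two calls differ. It is only a model, not kernel code: the model_* names,
the struct, and the value 9 for the maximum order are all made up for
illustration, and model_clamp_order() is a hypothetical stand-in for
"whatever picks a folio order", not the real readahead logic. The one
thing it relies on from the thread is that the min-order setter leaves
the upper bound at MAX_PAGECACHE_ORDER, while the range setter with
(min, min) pins both bounds.

```c
#include <assert.h>

/* Assumed value for illustration only; the real MAX_PAGECACHE_ORDER
 * depends on architecture and kernel configuration. */
#define MODEL_MAX_PAGECACHE_ORDER 9u

/* Hypothetical stand-in for the folio order bounds kept per mapping. */
struct model_mapping {
	unsigned int min_order;
	unsigned int max_order;
};

/* Models setting an explicit [min, max] folio order range. */
static void model_set_folio_order_range(struct model_mapping *m,
					unsigned int min, unsigned int max)
{
	if (min > MODEL_MAX_PAGECACHE_ORDER)
		min = MODEL_MAX_PAGECACHE_ORDER;
	if (max > MODEL_MAX_PAGECACHE_ORDER)
		max = MODEL_MAX_PAGECACHE_ORDER;
	if (max < min)
		max = min;

	m->min_order = min;
	m->max_order = max;
	assert(m->min_order <= m->max_order);
}

/* Models setting only the minimum: the upper bound stays wide open. */
static void model_set_folio_min_order(struct model_mapping *m,
				      unsigned int min)
{
	model_set_folio_order_range(m, min, MODEL_MAX_PAGECACHE_ORDER);
}

/* Hypothetical helper: the order a caller asking for 'wanted' would be
 * allowed to use under the mapping's bounds. */
static unsigned int model_clamp_order(const struct model_mapping *m,
				      unsigned int wanted)
{
	if (wanted < m->min_order)
		return m->min_order;
	if (wanted > m->max_order)
		return m->max_order;
	return wanted;
}
```

With min = 0 (no LBS device), the min-order variant still lets a request
for order 2 through in this model, whereas the (min, min) range variant
forces it back down to order 0, which matches the temporary measure
suggested above.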

  Luis

