From: Qu Wenruo <wqu@suse.com>
To: Matthew Wilcox <willy@infradead.org>, Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
vivek.kasireddy@intel.com,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: Large folios and filemap_get_folios_contig()
Date: Fri, 4 Apr 2025 07:46:59 +1030 [thread overview]
Message-ID: <59539c02-d353-4811-bcbe-080b408f445e@suse.com> (raw)
In-Reply-To: <Z-6ApNtjw9yvAGc4@casper.infradead.org>
在 2025/4/3 23:05, Matthew Wilcox 写道:
> On Thu, Apr 03, 2025 at 08:06:53PM +1030, Qu Wenruo wrote:
>> Recently I hit a bug when developing the large folios support for btrfs.
>>
>> That we call filemap_get_folios_contig(), then lock each returned folio.
>> (We also have a case where we unlock each returned folio)
>>
>> However since a large folio can be returned several times in the batch,
>> this obviously makes a deadlock, as btrfs is trying to lock the same
>> folio more than once.
>
> Sorry, what? A large folio should only be returned once. xas_next()
> moves to the next folio. How is it possible that
> filemap_get_folios_contig() returns the same folio more than once?
But that's exactly what I got from filemap_get_folios_contig():
lock_delalloc_folios: r/i=5/260 locked_folio=720896(65536) start=782336
end=819199(36864)
lock_delalloc_folios: r/i=5/260 found_folios=1
lock_delalloc_folios: r/i=5/260 i=0 folio=720896(65536)
lock_delalloc_folios: r/i=5/260 found_folios=8
lock_delalloc_folios: r/i=5/260 i=0 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=1 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=2 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=3 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=4 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=5 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=6 folio=786432(262144)
lock_delalloc_folios: r/i=5/260 i=7 folio=786432(262144)
r/i is the root and inode number from btrfs, and you can completely
ignore it.
@locked_folio is the folio we're already holding a lock, the value
inside the brackets is the folio size.
@start and @end is the range we're searching for, the value inside the
brackets is the search range length.
The first iteration returns the current locked folio, and since the
range inside that folio is only 4K, thus it's only returned once.
The next 8 slots are all inside the same large folio at 786432,
resulting duplicated entries.
>
>> Then I looked into the caller of filemap_get_folios_contig() inside
>> mm/gup, and it indeed does the correct skip.
>
> ... that code looks wrong to me.
It looks like it's xas_find() is doing the correct skip by calling
xas_next_offset() -> xas_move_index() to skip the next one.
But the filemap_get_folios_contig() only calls xas_next() by increasing
the index, not really skip to the next folio.
Although I can be totally wrong as I'm not familiar with the xarray
internals at all.
However I totally agree the duplicated behavior (and the extra handling
of duplicated entries) looks very wrong.
Thanks,
Qu
next prev parent reply other threads:[~2025-04-03 21:17 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-03 9:36 Qu Wenruo
2025-04-03 12:35 ` Matthew Wilcox
2025-04-03 21:16 ` Qu Wenruo [this message]
2025-04-04 0:50 ` Vishal Moola (Oracle)
2025-04-04 4:15 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=59539c02-d353-4811-bcbe-080b408f445e@suse.com \
--to=wqu@suse.com \
--cc=akpm@linux-foundation.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=quwenruo.btrfs@gmx.com \
--cc=vivek.kasireddy@intel.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox