* Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
@ 2024-09-12 21:18 Christian Theune
2024-09-12 21:55 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-09-12 21:18 UTC (permalink / raw)
To: linux-mm, linux-xfs, linux-fsdevel, linux-kernel
Cc: torvalds, axboe, Daniel Dao, Dave Chinner, willy, clm,
regressions, regressions
Hello everyone,
I’d like to raise awareness about a bug causing data loss somewhere in MM interacting with XFS that seems to have been around since Dec 2021 (https://github.com/torvalds/linux/commit/6795801366da0cd3d99e27c37f020a8f16714886).
We started encountering this bug when upgrading to 6.1 around June 2023 and we have had at least 16 instances with data loss in a fleet of 1.5k VMs.
This bug is very hard to reproduce but has been known to exist as a “fluke” for a while already. I have invested a number of days trying to come up with workloads to trigger it quicker than that stochastic “once every few weeks in a fleet of 1.5k machines", but it eludes me so far. I know that this also affects Facebook/Meta as well as Cloudflare who are both running newer kernels (at least 6.1, 6.6, and 6.9) with the above mentioned patch reverted. I’m from a much smaller company and seeing that those guys are running with this patch reverted (that now makes their kernel basically an untested/unsupported deviation from the mainline) smells like desparation. I’m with a much smaller team and company and I’m wondering why this isn’t tackled more urgently from more hands to make it shallow (hopefully).
The issue appears to happen mostly on nodes that are running some kind of database or specifically storage-oriented load. In our case we see this happening with PostgreSQL and MySQL. Cloudflare IIRC saw this with RocksDB load and Meta is talking about nfsd load.
I suspect low memory (but not OOM low) / pressure and maybe swap conditions seem to increase the chance of triggering it - but I might be completely wrong on that suspicion.
There is a bug report I started here back then: https://bugzilla.kernel.org/show_bug.cgi?id=217572 and there have been discussions on the XFS list: https://lore.kernel.org/lkml/CA+wXwBS7YTHUmxGP3JrhcKMnYQJcd6=7HE+E1v-guk01L2K3Zw@mail.gmail.com/T/ but ultimately this didn’t receive sufficient interested to keep it moving forward and I ran out of steam. Unfortunately we can’t be stuck on 5.15 forever and other kernel developers correctly keep pointing out that we should be updating, but that isn’t an option as long as this time bomb still exists.
Jens pointed out that Meta's findings and their notes on the revert included "When testing nfsd on top of v5.19, we hit lockups in filemap_read(). These ended up being because the xarray for the files being read had pages from other files mixed in."
XFS is known to me and admired for the very high standards they represent regarding testing and avoiding data loss but ultimately that doesn’t matter if we’re going to be stuck with this bug forever.
I’m able to help funding efforts, help creating a reproducer, generally donate my time (not a kernel developer myself) and even provide access to machines that did see the crash (but don’t carry customer data), but I’m not making any progress or getting any traction here.
Jens encouraged me to raise the visibility in this way - so that’s what I’m trying here.
Please help.
In appreciation of all the hard work everyone is putting in and with hugs and love,
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 21:18 Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards) Christian Theune
@ 2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11 ` Christian Theune
2024-09-12 22:12 ` Jens Axboe
0 siblings, 2 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-12 21:55 UTC (permalink / raw)
To: Christian Theune
Cc: linux-mm, linux-xfs, linux-fsdevel, linux-kernel, torvalds,
axboe, Daniel Dao, Dave Chinner, clm, regressions, regressions
On Thu, Sep 12, 2024 at 11:18:34PM +0200, Christian Theune wrote:
> This bug is very hard to reproduce but has been known to exist as a
> “fluke” for a while already. I have invested a number of days trying
> to come up with workloads to trigger it quicker than that stochastic
> “once every few weeks in a fleet of 1.5k machines", but it eludes
> me so far. I know that this also affects Facebook/Meta as well as
> Cloudflare who are both running newer kernels (at least 6.1, 6.6,
> and 6.9) with the above mentioned patch reverted. I’m from a much
> smaller company and seeing that those guys are running with this patch
> reverted (that now makes their kernel basically an untested/unsupported
> deviation from the mainline) smells like desparation. I’m with a
> much smaller team and company and I’m wondering why this isn’t
> tackled more urgently from more hands to make it shallow (hopefully).
This passive-aggressive nonsense is deeply aggravating. I've known
about this bug for much longer, but like you I am utterly unable to
reproduce it. I've spent months looking for the bug, and I cannot.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 21:55 ` Matthew Wilcox
@ 2024-09-12 22:11 ` Christian Theune
2024-09-12 22:12 ` Jens Axboe
1 sibling, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-12 22:11 UTC (permalink / raw)
To: Matthew Wilcox
Cc: linux-mm, linux-xfs, linux-fsdevel, linux-kernel, torvalds,
axboe, Daniel Dao, Dave Chinner, clm, regressions, regressions
Hi Matthew,
> On 12. Sep 2024, at 23:55, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Sep 12, 2024 at 11:18:34PM +0200, Christian Theune wrote:
>> This bug is very hard to reproduce but has been known to exist as a
>> “fluke” for a while already. I have invested a number of days trying
>> to come up with workloads to trigger it quicker than that stochastic
>> “once every few weeks in a fleet of 1.5k machines", but it eludes
>> me so far. I know that this also affects Facebook/Meta as well as
>> Cloudflare who are both running newer kernels (at least 6.1, 6.6,
>> and 6.9) with the above mentioned patch reverted. I’m from a much
>> smaller company and seeing that those guys are running with this patch
>> reverted (that now makes their kernel basically an untested/unsupported
>> deviation from the mainline) smells like desparation. I’m with a
>> much smaller team and company and I’m wondering why this isn’t
>> tackled more urgently from more hands to make it shallow (hopefully).
>
> This passive-aggressive nonsense is deeply aggravating. I've known
> about this bug for much longer, but like you I am utterly unable to
> reproduce it. I've spent months looking for the bug, and I cannot.
I’m sorry. I’ve honestly tried my best to not make this message personally injuring to anybody involved while trying to also communicate the seriousness of this issue that we’re stuck with. Apparently I failed.
As I’m not a kernel developer I tried to stick to describing the issue and am not sure what strategies would typically need to be applied when individual efforts fail.
I’m not sure why it’s nonsense, though.
Liebe Grüße,
Christian Theune
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11 ` Christian Theune
@ 2024-09-12 22:12 ` Jens Axboe
2024-09-12 22:25 ` Linus Torvalds
1 sibling, 1 reply; 81+ messages in thread
From: Jens Axboe @ 2024-09-12 22:12 UTC (permalink / raw)
To: Matthew Wilcox, Christian Theune
Cc: linux-mm, linux-xfs, linux-fsdevel, linux-kernel, torvalds,
Daniel Dao, Dave Chinner, clm, regressions, regressions
On 9/12/24 3:55 PM, Matthew Wilcox wrote:
> On Thu, Sep 12, 2024 at 11:18:34PM +0200, Christian Theune wrote:
>> This bug is very hard to reproduce but has been known to exist as a
>> ?fluke? for a while already. I have invested a number of days trying
>> to come up with workloads to trigger it quicker than that stochastic
>> ?once every few weeks in a fleet of 1.5k machines", but it eludes
>> me so far. I know that this also affects Facebook/Meta as well as
>> Cloudflare who are both running newer kernels (at least 6.1, 6.6,
>> and 6.9) with the above mentioned patch reverted. I?m from a much
>> smaller company and seeing that those guys are running with this patch
>> reverted (that now makes their kernel basically an untested/unsupported
>> deviation from the mainline) smells like desparation. I?m with a
>> much smaller team and company and I?m wondering why this isn?t
>> tackled more urgently from more hands to make it shallow (hopefully).
>
> This passive-aggressive nonsense is deeply aggravating. I've known
> about this bug for much longer, but like you I am utterly unable to
> reproduce it. I've spent months looking for the bug, and I cannot.
What passive aggressiveness?! There's a data corruption bug where we
know what causes it, yet we continue to ship it. That's aggravating.
People are aware of the bug, and since there's no good reproducer, it's
hard to fix. That part is fine and understandable. What seems amiss here
is the fact that large folio support for xfs hasn't just been reverted
until the issue is understood and resolved.
When I saw Christian's report, I seemed to recall that we ran into this
at Meta too. And we did, and hence have been reverting it since our 5.19
release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
things that are known broken.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:12 ` Jens Axboe
@ 2024-09-12 22:25 ` Linus Torvalds
2024-09-12 22:30 ` Jens Axboe
` (4 more replies)
0 siblings, 5 replies; 81+ messages in thread
From: Linus Torvalds @ 2024-09-12 22:25 UTC (permalink / raw)
To: Jens Axboe
Cc: Matthew Wilcox, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner, clm,
regressions, regressions
On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
>
> When I saw Christian's report, I seemed to recall that we ran into this
> at Meta too. And we did, and hence have been reverting it since our 5.19
> release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
> things that are known broken.
I do think that if we have big sites just reverting it as known broken
and can't figure out why, we should do so upstream too.
Yes, it's going to make it even harder to figure out what's wrong.
Not great. But if this causes filesystem corruption, that sure isn't
great either. And people end up going "I'll use ext4 which doesn't
have the problem", that's not exactly helpful either.
And yeah, the reason ext4 doesn't have the problem is simply because
ext4 doesn't enable large folios. So that doesn't pin anything down
either (ie it does *not* say "this is an xfs bug" - it obviously might
be, but it's probably more likely some large-folio issue).
Other filesystems do enable large folios (afs, bcachefs, erofs, nfs,
smb), but maybe just not be used under the kind of load to show it.
Honestly, the fact that it hasn't been reverted after apparently
people knowing about it for months is a bit shocking to me. Filesystem
people tend to take unknown corruption issues as a big deal. What
makes this so special? Is it because the XFS people don't consider it
an XFS issue, so...
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:25 ` Linus Torvalds
@ 2024-09-12 22:30 ` Jens Axboe
2024-09-12 22:56 ` Linus Torvalds
2024-09-13 12:11 ` Christian Brauner
` (3 subsequent siblings)
4 siblings, 1 reply; 81+ messages in thread
From: Jens Axboe @ 2024-09-12 22:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: Matthew Wilcox, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner, clm,
regressions, regressions
On 9/12/24 4:25 PM, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> When I saw Christian's report, I seemed to recall that we ran into this
>> at Meta too. And we did, and hence have been reverting it since our 5.19
>> release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
>> things that are known broken.
>
> I do think that if we have big sites just reverting it as known broken
> and can't figure out why, we should do so upstream too.
Agree. I suspect it would've come up internally shortly too, as we're
just now preparing to roll 6.11 as the next kernel. That always starts
with a list of "what commits are in our 6.9 tree that aren't upstream"
and then porting those, and this one is in that (pretty short) list.
> Yes, it's going to make it even harder to figure out what's wrong.
> Not great. But if this causes filesystem corruption, that sure isn't
> great either. And people end up going "I'll use ext4 which doesn't
> have the problem", that's not exactly helpful either.
Until someone has a good reproducer for it, it is going to remain
elusive. And it's a two-liner to enable it again for testing, hence
should not be a hard thing to do.
> And yeah, the reason ext4 doesn't have the problem is simply because
> ext4 doesn't enable large folios. So that doesn't pin anything down
> either (ie it does *not* say "this is an xfs bug" - it obviously might
> be, but it's probably more likely some large-folio issue).
>
> Other filesystems do enable large folios (afs, bcachefs, erofs, nfs,
> smb), but maybe just not be used under the kind of load to show it.
It might be an iomap thing... Other file systems do use it, but to
various degrees, and XFS is definitely the primary user.
> Honestly, the fact that it hasn't been reverted after apparently
> people knowing about it for months is a bit shocking to me. Filesystem
> people tend to take unknown corruption issues as a big deal. What
> makes this so special? Is it because the XFS people don't consider it
> an XFS issue, so...
Double agree, I was pretty surprised when I learned of all this today.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:30 ` Jens Axboe
@ 2024-09-12 22:56 ` Linus Torvalds
2024-09-13 3:44 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-12 22:56 UTC (permalink / raw)
To: Jens Axboe
Cc: Matthew Wilcox, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner, clm,
regressions, regressions
On Thu, 12 Sept 2024 at 15:30, Jens Axboe <axboe@kernel.dk> wrote:
>
> It might be an iomap thing... Other file systems do use it, but to
> various degrees, and XFS is definitely the primary user.
I have to say, I looked at the iomap code, and it's disgusting.
The "I don't support large folios" check doesn't even say "don't do
large folios". That's what the regular __filemap_get_folio() code does
for reads, and that's the sane thing to do. But that's not what the
iomap code does. AT ALL.
No, the iomap code limits "len" of a write in iomap_write_begin() to
be within one page, and then magically depends on
(a) __iomap_get_folio() using that length to decide how big a folio to allocate
(b) iomap_write_begin() doing its own "what is the real length:" based on that.
(c) the *caller* then having to do the same thing, to see what length
iomap_write_begin() _actually_ used (because it wasn't the 'bytes'
that was passed in).
Honestly, the iomap code is just odd. Having these kinds of subtle
interdependencies doesn't make sense. The two code sequences don't
even use the same logic, with iomap_write_begin() doing
if (!mapping_large_folio_support(iter->inode->i_mapping))
len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
[... alloc folio ...]
if (pos + len > folio_pos(folio) + folio_size(folio))
len = folio_pos(folio) + folio_size(folio) - pos;
and the caller (iomap_write_iter) doing
offset = offset_in_folio(folio, pos);
if (bytes > folio_size(folio) - offset)
bytes = folio_size(folio) - offset;
and yes, the two completely different ways of picking 'len' (called
'bytes' in the second case) had *better* match.
I do think they match, but code shouldn't be organized this way.
It's not just the above kind of odd thing either, it's things like
iomap_get_folio() using that fgf_set_order(len), which does
unsigned int shift = ilog2(size);
if (shift <= PAGE_SHIFT)
return 0;
so now it has done that potentially expensive ilog2() for the common
case of "len < PAGE_SIZE", but dammit, it should never have even
bothered looking at 'len' if the inode didn't support large folios in
the first place, and we shouldn't have had that special odd 'len =
min_t(..)" magic rule to force an order-0 thing, because
Yeah, yeah, modern CPU's all have reasonably cheap bit finding
instructions. But the code simply shouldn't have this kind of thing in
the first place.
The folio should have been allocated *before* iomap_write_begin(), the
"no large folios" should just have fixed the order to zero there, and
the actual real-life length of the write should have been limited in
*one* piece of code after the allocation point instead of then having
two different pieces of code depending on matching (subtle and
undocumented) logic.
Put another way: I most certainly don't see the bug here - it may look
_odd_, but not wrong - but at the same time, looking at that code
doesn't make me get the warm and fuzzies about the iomap large-folio
situation either.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:56 ` Linus Torvalds
@ 2024-09-13 3:44 ` Matthew Wilcox
2024-09-13 13:23 ` Christian Theune
0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-13 3:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Christian Theune, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, Dave Chinner, clm, regressions,
regressions
On Thu, Sep 12, 2024 at 03:56:17PM -0700, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:30, Jens Axboe <axboe@kernel.dk> wrote:
> >
> > It might be an iomap thing... Other file systems do use it, but to
> > various degrees, and XFS is definitely the primary user.
>
> I have to say, I looked at the iomap code, and it's disgusting.
I'm not going to comment on this because I think it's unrelated to
the problem.
We have reports of bad entries being returned from page cache lookups.
Sometimes they're pages which have been freed, sometimes they're pages
which are very definitely in use by a different filesystem.
I think that's what the underlying problem is here (or else we have
two problems). I'm not convinced that it's necessarily related to large
folios, but it's certainly easier to reproduce with large folios.
I've looked at a number of explanations for this. Could it be a page
that's being freed without being removed from the xarray? We seem to
have debug that would trigger in that case, so I don't think so.
Could it be a page with a messed-up refcount? Again, I think we'd
notice the VM_BUG_ON_PAGE() in put_page_testzero(), so I don't think
it's that either.
My current best guess is that we have an xarray node with a stray pointer
in it; that the node is freed from one xarray, allocated to a different
xarray, but not properly cleared. But I can't reproduce the problem,
so that's pure speculation on my part.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:25 ` Linus Torvalds
2024-09-12 22:30 ` Jens Axboe
@ 2024-09-13 12:11 ` Christian Brauner
2024-09-16 13:29 ` Matthew Wilcox
2024-09-13 15:30 ` Chris Mason
` (2 subsequent siblings)
4 siblings, 1 reply; 81+ messages in thread
From: Christian Brauner @ 2024-09-13 12:11 UTC (permalink / raw)
To: Linus Torvalds, Pankaj Raghav, Luis Chamberlain
Cc: Jens Axboe, Matthew Wilcox, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
clm, regressions, regressions
On Thu, Sep 12, 2024 at 03:25:50PM GMT, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
> >
> > When I saw Christian's report, I seemed to recall that we ran into this
> > at Meta too. And we did, and hence have been reverting it since our 5.19
> > release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
> > things that are known broken.
>
> I do think that if we have big sites just reverting it as known broken
> and can't figure out why, we should do so upstream too.
>
> Yes, it's going to make it even harder to figure out what's wrong.
> Not great. But if this causes filesystem corruption, that sure isn't
> great either. And people end up going "I'll use ext4 which doesn't
> have the problem", that's not exactly helpful either.
>
> And yeah, the reason ext4 doesn't have the problem is simply because
> ext4 doesn't enable large folios. So that doesn't pin anything down
> either (ie it does *not* say "this is an xfs bug" - it obviously might
> be, but it's probably more likely some large-folio issue).
>
> Other filesystems do enable large folios (afs, bcachefs, erofs, nfs,
> smb), but maybe just not be used under the kind of load to show it.
>
> Honestly, the fact that it hasn't been reverted after apparently
> people knowing about it for months is a bit shocking to me. Filesystem
> people tend to take unknown corruption issues as a big deal. What
> makes this so special? Is it because the XFS people don't consider it
> an XFS issue, so...
So this issue it new to me as well. One of the items this cycle is the
work to enable support for block sizes that are larger than page sizes
via the large block size (LBS) series that's been sitting in -next for a
long time. That work specifically targets xfs and builds on top of the
large folio support.
If the support for large folios is going to be reverted in xfs then I
see no point to merge the LBS work now. So I'm holding off on sending
that pull request until a decision is made (for xfs). As far as I
understand, supporting larger block sizes will not be meaningful without
large folio support.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 3:44 ` Matthew Wilcox
@ 2024-09-13 13:23 ` Christian Theune
0 siblings, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-13 13:23 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Jens Axboe, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, Dave Chinner, clm, regressions,
regressions, mironov.ivan
Hi,
> On 13. Sep 2024, at 05:44, Matthew Wilcox <willy@infradead.org> wrote:
>
> My current best guess is that we have an xarray node with a stray pointer
> in it; that the node is freed from one xarray, allocated to a different
> xarray, but not properly cleared. But I can't reproduce the problem,
> so that's pure speculation on my part.
I’d love to help with the reproduction. I understand that BZ is unloved and I guess putting everything I’ve seen so far from various sources into a single spot might help - unfortunately that creates a pretty long mail. I selectively didn’t inline some more far fetched things.
A tiny bit of context about me: I’m a seasoned developer, but not a kernel developer. I don’t know the subsystems from a code perspective. I stare at kernel code (or C code generally) mostly only when things go wrong. I did my share of debugging hard things over the last 25 years and I am good at trying to attack things from multiple angles.
I have 9 non-production VMs that exhibited the issue last year. I can put those on custom compiled kernels and instrument them as needed. Feel free to use me as a resource here.
Rabbit hole 1: the stalls and stack traces
==========================================
I’ve reviewed all of the stall messages (see below) that I could find and noticed:
- All of the VMs that are affected have at least 2 CPUs. I haven’t seen this on any single CPU VMs AFAICT, but I wouldn’t fully eliminate that it could be possible to also be happening there. OTOH (obviously?) race conditions would be easier to produce on an multi processing machine than with a single core … ;)
- I’ve only ever seen it on virtual machines, however, there’s a redhat bug report from 2023 that shows data that points to an Asus board, so it looks like a physical machine, so I guess a physical/virtual machine distinction is not relevant.
- Most call stacks come from the VFS, but I’ve seen two that originate from a page fault (if I’m reading things correctly) - so trying to swap a page in? That’s interesting because it would hint at a reproducer that doesn’t need FS code being involved.
Here’s the stalls that I could recover from my efforts last year:
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 1-....: (1 GPs behind) idle=bcec/1/0x4000000000000000 softirq=32229387/32229388 fqs=3407711
(t=6825807 jiffies g=51307757 q=12582143 ncpus=2)
CPU: 1 PID: 135430 Comm: systemd-journal Not tainted 6.1.57 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__rcu_read_unlock+0x1d/0x30
Code: ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 3c 25 c0 0b 02 00 83 af 64 04 00 00 01 75 0a 8b 87 68 04 00 00 <85> c0 75 05 c3 cc cc cc cc e9 45 fe ff ff 0f 1f 44 00 00 0f 1f 44
RSP: 0018:ffffa9c442887c78 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffca97c0ed4000 RCX: 0000000000000000
RDX: ffff88a1919bb6d0 RSI: ffff88a1919bb6d0 RDI: ffff88a187480000
RBP: 0000000000000044 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000100cca
R13: ffff88a2a48836b0 R14: 0000000000001be0 R15: ffffca97c0ed4000
FS: 00007fa45ec86c40(0000) GS:ffff88a2fad00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa45f436985 CR3: 000000010b4f8000 CR4: 00000000000006e0
Call Trace:
<IRQ>
? rcu_dump_cpu_stacks+0xc8/0x100
? rcu_sched_clock_irq.cold+0x15b/0x2fb
? sched_slice+0x87/0x140
? perf_event_task_tick+0x64/0x370
? __cgroup_account_cputime_field+0x5b/0xa0
? update_process_times+0x77/0xb0
? tick_sched_handle+0x34/0x50
? tick_sched_timer+0x6f/0x80
? tick_sched_do_timer+0xa0/0xa0
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfe/0x220
? __sysvec_apic_timer_interrupt+0x7f/0x170
? sysvec_apic_timer_interrupt+0x99/0xc0
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? __rcu_read_unlock+0x1d/0x30
? xas_load+0x30/0x40
__filemap_get_folio+0x10a/0x370
filemap_fault+0x139/0x910
? preempt_count_add+0x47/0xa0
__do_fault+0x31/0x80
do_fault+0x299/0x410
__handle_mm_fault+0x623/0xb80
handle_mm_fault+0xdb/0x2d0
do_user_addr_fault+0x19c/0x560
exc_page_fault+0x66/0x150
asm_exc_page_fault+0x22/0x30
RIP: 0033:0x7fa45f4369af
Code: Unable to access opcode bytes at 0x7fa45f436985.
RSP: 002b:00007fff3ec0a580 EFLAGS: 00010246
RAX: 0000002537ea8ea4 RBX: 00007fff3ec0aab0 RCX: 0000000000000000
RDX: 00007fa45a3dffd0 RSI: 00007fa45a3e0010 RDI: 000055e348682520
RBP: 0000000000000015 R08: 000055e34862fd00 R09: 00007fff3ec0b1b0
R10: 0000000000000000 R11: 0000000000000000 R12: 00007fff3ec0a820
R13: 00007fff3ec0a640 R14: 2f4f057952ecadbd R15: 0000000000000000
</TASK>
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 1-....: (21000 ticks this GP) idle=d1e4/1/0x4000000000000000 softirq=87308049/87308049 fqs=5541
(t=21002 jiffies g=363533457 q=100563 ncpus=5)
rcu: rcu_preempt kthread starved for 8417 jiffies! g363533457 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=4
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:0 pid:15 ppid:2 flags:0x00004000
Call Trace:
<TASK>
? rcu_gp_cleanup+0x570/0x570
__schedule+0x35d/0x1370
? get_nohz_timer_target+0x18/0x190
? _raw_spin_unlock_irqrestore+0x23/0x40
? __mod_timer+0x281/0x3d0
? rcu_gp_cleanup+0x570/0x570
schedule+0x5d/0xe0
schedule_timeout+0x94/0x150
? __bpf_trace_tick_stop+0x10/0x10
rcu_gp_fqs_loop+0x15b/0x650
rcu_gp_kthread+0x1a9/0x280
kthread+0xe9/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 4:
NMI backtrace for cpu 4
CPU: 4 PID: 529675 Comm: connection Not tainted 6.1.57 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:xas_descend+0x3/0x90
Code: 48 8b 57 08 48 89 57 10 e9 3a c6 2c 00 48 8b 57 10 48 89 07 48 c1 e8 20 48 89 57 08 e9 26 c6 2c 00 cc cc cc cc cc cc 0f b6 0e <48> 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48
RSP: 0018:ffffa37c47ccfbf8 EFLAGS: 00000246
RAX: ffff92832453e912 RBX: ffffa37c47ccfd78 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffff92832453e910 RDI: ffffa37c47ccfc08
RBP: 0000000000006305 R08: ffffa37c47ccfe70 R09: ffff92830f538138
R10: ffffa37c47ccfe68 R11: ffff92830f538138 R12: 0000000000006305
R13: ffff92832b518900 R14: 0000000000006305 R15: ffffa37c47ccfe98
FS: 00007fcbee42b6c0(0000) GS:ffff9287a9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcc0b10d0d8 CR3: 0000000107632000 CR4: 00000000003506e0
Call Trace:
<NMI>
? nmi_cpu_backtrace.cold+0x1b/0x76
? nmi_cpu_backtrace_handler+0xd/0x20
? nmi_handle+0x5d/0x120
? xas_descend+0x3/0x90
? default_do_nmi+0x69/0x170
? exc_nmi+0x13c/0x170
? end_repeat_nmi+0x16/0x67
? xas_descend+0x3/0x90
? xas_descend+0x3/0x90
? xas_descend+0x3/0x90
</NMI>
<TASK>
xas_load+0x30/0x40
filemap_get_read_batch+0x16e/0x250
filemap_get_pages+0xa9/0x630
? current_time+0x3c/0x100
? atime_needs_update+0x104/0x180
? touch_atime+0x46/0x1f0
filemap_read+0xd2/0x340
xfs_file_buffered_read+0x4f/0xd0 [xfs]
xfs_file_read_iter+0x6a/0xd0 [xfs]
vfs_read+0x23c/0x310
ksys_read+0x6b/0xf0
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x64/0xce
RIP: 0033:0x7fd0ccf0f78c
Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 a9 bb f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 ff bb f8 ff 48
RSP: 002b:00007fcbee427320 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd0ccf0f78c
RDX: 0000000000000014 RSI: 00007fcbee427500 RDI: 0000000000000129
RBP: 00007fcbee427430 R08: 0000000000000000 R09: 00a9b630ab4578b9
R10: 0000000000000001 R11: 0000000000000246 R12: 00007fcbee42a9f8
R13: 0000000000000014 R14: 00000000040ef680 R15: 0000000000000129
</TASK>
CPU: 1 PID: 529591 Comm: connection Not tainted 6.1.57 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:xas_descend+0x18/0x90
Code: c1 e8 20 48 89 57 08 e9 26 c6 2c 00 cc cc cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 <48> 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 75 08 48 3d fd 00 00 00
RSP: 0018:ffffa37c47b7fbf8 EFLAGS: 00000216
RAX: fffff6e88448e000 RBX: ffffa37c47b7fd78 RCX: 0000000000000000
RDX: 000000000000000d RSI: ffff92832453e910 RDI: ffffa37c47b7fc08
RBP: 000000000000630d R08: ffffa37c47b7fe70 R09: ffff92830f538138
R10: ffffa37c47b7fe68 R11: ffff92830f538138 R12: 000000000000630d
R13: ffff92830b9a3b00 R14: 000000000000630d R15: ffffa37c47b7fe98
FS: 00007fcbf07bb6c0(0000) GS:ffff9287a9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcc0ac8e360 CR3: 0000000107632000 CR4: 00000000003506e0
Call Trace:
<IRQ>
? rcu_dump_cpu_stacks+0xc8/0x100
? rcu_sched_clock_irq.cold+0x15b/0x2fb
? sched_slice+0x87/0x140
? perf_event_task_tick+0x64/0x370
? __cgroup_account_cputime_field+0x5b/0xa0
? update_process_times+0x77/0xb0
? tick_sched_handle+0x34/0x50
? tick_sched_timer+0x6f/0x80
? tick_sched_do_timer+0xa0/0xa0
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfe/0x220
? __sysvec_apic_timer_interrupt+0x7f/0x170
? sysvec_apic_timer_interrupt+0x99/0xc0
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? xas_descend+0x18/0x90
xas_load+0x30/0x40
filemap_get_read_batch+0x16e/0x250
filemap_get_pages+0xa9/0x630
? current_time+0x3c/0x100
? atime_needs_update+0x104/0x180
? touch_atime+0x46/0x1f0
filemap_read+0xd2/0x340
xfs_file_buffered_read+0x4f/0xd0 [xfs]
xfs_file_read_iter+0x6a/0xd0 [xfs]
vfs_read+0x23c/0x310
ksys_read+0x6b/0xf0
do_syscall_64+0x3a/0x90
</TASK>
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-....: (21000 ticks this GP) idle=91fc/1/0x4000000000000000 softirq=85252827/85252827 fqs=4704
(t=21002 jiffies g=167843445 q=13889 ncpus=3)
CPU: 0 PID: 2202919 Comm: .postgres-wrapp Not tainted 6.1.31 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:xas_descend+0x26/0x70
Code: cc cc cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 <75> 08 48 3d fd 00 00 0>
RSP: 0018:ffffb427c4917bf0 EFLAGS: 00000246
RAX: ffff98871f8dbdaa RBX: ffffb427c4917d70 RCX: 0000000000000002
RDX: 0000000000000005 RSI: ffff988876d3c000 RDI: ffffb427c4917c00
RBP: 000000000000f177 R08: ffffb427c4917e68 R09: ffff988846485d38
R10: ffffb427c4917e60 R11: ffff988846485d38 R12: 000000000000f177
R13: ffff988827b4ae00 R14: 000000000000f176 R15: ffffb427c4917e90
FS: 00007ff8de817800(0000) GS:ffff98887ac00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff881c8c000 CR3: 000000010dfea000 CR4: 00000000000006f0
(stack trace is missing in this one)
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 1-....: (20915 ticks this GP) idle=4b8c/1/0x4000000000000000 softirq=138338523/138338526 fqs=6063
(t=21000 jiffies g=180955121 q=35490 ncpus=2)
CPU: 1 PID: 1415835 Comm: .postgres-wrapp Not tainted 6.1.57 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:filemap_get_read_batch+0x16e/0x250
Code: 85 ff 00 00 00 48 83 c4 40 5b 5d c3 cc cc cc cc f0 ff 0e 0f 84 e1 00 00 00 48 c7 44 24 18 03 00 00 00 48 89 e7 e8 42 ab 6d 00 <48> 89 c7 48 85 ff 74 ba 48 81 ff 06 04 00 00 0f 85 fe fe ff ff 48
RSP: 0018:ffffac01c6887c00 EFLAGS: 00000246
RAX: ffffe5a104574000 RBX: ffffac01c6887d70 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff96db861bcb68 RDI: ffffac01c6887c00
RBP: 0000000000014781 R08: ffffac01c6887e68 R09: ffff96dad46fad38
R10: ffffac01c6887e60 R11: ffff96dad46fad38 R12: 0000000000014781
R13: ffff96db86f47000 R14: 0000000000014780 R15: ffffac01c6887e90
FS: 00007f9ba0a12800(0000) GS:ffff96dbbbd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9b5050a018 CR3: 000000010ac82000 CR4: 00000000000006e0
Call Trace:
<IRQ>
? rcu_dump_cpu_stacks+0xc8/0x100
? rcu_sched_clock_irq.cold+0x15b/0x2fb
? sched_slice+0x87/0x140
? timekeeping_update+0xdd/0x130
? __cgroup_account_cputime_field+0x5b/0xa0
? update_process_times+0x77/0xb0
? update_wall_time+0xc/0x20
? tick_sched_handle+0x34/0x50
? tick_sched_timer+0x6f/0x80
? tick_sched_do_timer+0xa0/0xa0
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfe/0x220
? __sysvec_apic_timer_interrupt+0x7f/0x170
? sysvec_apic_timer_interrupt+0x99/0xc0
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? filemap_get_read_batch+0x16e/0x250
filemap_get_pages+0xa9/0x630
? iomap_iter+0x78/0x310
? iomap_file_buffered_write+0x8f/0x2f0
filemap_read+0xd2/0x340
xfs_file_buffered_read+0x4f/0xd0 [xfs]
xfs_file_read_iter+0x6a/0xd0 [xfs]
vfs_read+0x23c/0x310
__x64_sys_pread64+0x94/0xc0
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x64/0xce
RIP: 0033:0x7f9ba0b0d787
Code: 48 e8 5d dc f2 ff 41 b8 02 00 00 00 e9 38 f6 ff ff 66 90 f3 0f 1e fa 80 3d 7d bc 0e 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 48 83 ec 28 48 89 54 24 10 48 89 74 24
RSP: 002b:00007ffe56bb0878 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9ba0b0d787
RDX: 0000000000002000 RSI: 00007f9b5c85ce80 RDI: 000000000000003a
RBP: 0000000000000001 R08: 000000000a00000d R09: 0000000000000000
R10: 0000000014780000 R11: 0000000000000202 R12: 00007f9b90052ab0
R13: 00005566dc227f75 R14: 00005566dc22c510 R15: 00005566de3cf0c0
</TASK>
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 1-....: (21000 ticks this GP) idle=947c/1/0x4000000000000000 softirq=299845076/299845076 fqs=5249
(t=21002 jiffies g=500931101 q=17117 ncpus=2)
CPU: 1 PID: 1660396 Comm: nix-collect-gar Not tainted 6.1.55 #1-NixOS
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__xas_next+0x0/0xe0
Code: 48 3d 00 10 00 00 77 c8 48 89 c8 c3 cc cc cc cc e9 f5 fe ff ff 48 c7 47 18 01 00 00 00 31 c9 48 89 c8 c3 cc cc cc cc 0f 1f 00 <48> 8b 47 18 a8 02 75 0e 48 83 47 08 01 48 85 c0 0f 84 b5 00 00 00
RSP: 0018:ffffb170866f7bf8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffb170866f7d70 RCX: 0000000000000000
RDX: 0000000000000020 RSI: ffff8ab97d5306d8 RDI: ffffb170866f7c00
RBP: 00000000000011e4 R08: 0000000000000000 R09: ffff8ab9a4dc3d38
R10: ffffb170866f7e60 R11: ffff8ab9a4dc3d38 R12: 00000000000011e4
R13: ffff8ab946fda400 R14: 00000000000011e4 R15: ffffb170866f7e90
FS: 00007f17d22e3f80(0000) GS:ffff8ab9bdd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000013279e8 CR3: 00000000137c8000 CR4: 00000000000006e0
Call Trace:
<IRQ>
? rcu_dump_cpu_stacks+0xc8/0x100
? rcu_sched_clock_irq.cold+0x15b/0x2fb
? sched_slice+0x87/0x140
? perf_event_task_tick+0x64/0x370
? __cgroup_account_cputime_field+0x5b/0xa0
? update_process_times+0x77/0xb0
? tick_sched_handle+0x34/0x50
? tick_sched_timer+0x6f/0x80
? tick_sched_do_timer+0xa0/0xa0
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfe/0x220
? __sysvec_apic_timer_interrupt+0x7f/0x170
? sysvec_apic_timer_interrupt+0x99/0xc0
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? __xas_prev+0xe0/0xe0
? xas_load+0x30/0x40
filemap_get_read_batch+0x16e/0x250
filemap_get_pages+0xa9/0x630
? atime_needs_update+0x104/0x180
? touch_atime+0x46/0x1f0
filemap_read+0xd2/0x340
xfs_file_buffered_read+0x4f/0xd0 [xfs]
xfs_file_read_iter+0x6a/0xd0 [xfs]
vfs_read+0x23c/0x310
__x64_sys_pread64+0x94/0xc0
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x64/0xce
RIP: 0033:0x7f17d3a2d7c7
Code: 08 89 3c 24 48 89 4c 24 18 e8 75 db f8 ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 c5 db f8 ff 48 8b
RSP: 002b:00007ffffd9d0fb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f17d3a2d7c7
RDX: 0000000000001000 RSI: 000056435be0ccf8 RDI: 0000000000000006
RBP: 00000000011e4000 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000011e4000 R11: 0000000000000293 R12: 0000000000001000
R13: 000056435be0ccf8 R14: 0000000000001000 R15: 000056435bdea370
</TASK>
I’ve pulled together the various states the stall was detected in a more compact form:
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? __rcu_read_unlock+0x1d/0x30
? xas_load+0x30/0x40
__filemap_get_folio+0x10a/0x370
filemap_fault+0x139/0x910
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? __xas_prev+0xe0/0xe0
? xas_load+0x30/0x40
filemap_get_read_batch+0x16e/0x250
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? filemap_get_read_batch+0x16e/0x250
xas_load+0x30/0x40
filemap_get_read_batch+0x16e/0x250
RIP: 0010:xas_descend+0x26/0x70 (this one was missing the stack trace)
I tried reading through the xarray code, but my C and kernel knowledge is stretched thin trying to understand some of the internals: I couldn’t figure out how __rcu_read_unlock appears from within xas_load, similar to __xas_prev. I stopped diving deeper at that point.
My original bug report also includes an initial grab of multiple stall reports over time on a single machine where the situation unfolded with different stack traces over many hours. It’s a bit long so I’m opting to provide the link: https://bugzilla.kernel.org/show_bug.cgi?id=217572#c0
I also was wondering whether the stall is stuck or spinning and one of my early comments noticed that with 3 CPUs I had a total of 60% spent in system time, so this sounds like it might be spinning between xas_load and xas_descend. I see there’s some kind of retry mechanism in there and while-loops that might get stuck if the data structures are borked. I think it’s alternating between xas_load and xas_descend, though, so not stuck in xas_descend’s loop itself.
The redhat report "large folio related page cache iteration hang” (https://bugzilla.redhat.com/show_bug.cgi?id=2213967) does show a “kernel bug” message in addition to the known stack around xas_load:
kernel: watchdog: BUG: soft lockup - CPU#28 stuck for 26s! [rocksdb:low:2195]
kernel: Modules linked in: tls nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel tcp_bbr rfkill ip_set nf_tables nfnetlink nct6775 nct6775_core tun hwmon_vid jc42 vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm snd_timer cdc_ether irqbypass acpi_ipmi snd usbnet wmi_bmof rapl ipmi_si k10temp soundcore i2c_piix4 joydev mii ipmi_devintf ipmi_msghandler fuse loop xfs uas usb_storage raid1 hid_cp2112 igb crct10dif_pclmul ast crc32_pclmul nvme crc32c_intel polyval_clmulni dca polyval_generic i2c_algo_bit nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
kernel: CPU: 28 PID: 2195 Comm: rocksdb:low Not tainted 6.3.5-100.fc37.x86_64 #1
kernel: Hardware name: To Be Filled By O.E.M. X570D4U/X570D4U, BIOS T1.29b 05/17/2022
kernel: RIP: 0010:xas_load+0x45/0x50
kernel: Code: 3d 00 10 00 00 77 07 5b 5d c3 cc cc cc cc 0f b6 4b 10 48 8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 cf fd ff ff 80 7d 00 00 <75> c7 eb d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
kernel: RSP: 0018:ffffaab80392fb40 EFLAGS: 00000246
kernel: RAX: fffff69f82a7c000 RBX: ffffaab80392fb58 RCX: 0000000000000000
kernel: RDX: 0000000000000010 RSI: ffff94a4268a6480 RDI: ffffaab80392fb58
kernel: RBP: ffff94a4268a6480 R08: 0000000000000000 R09: 000000000000424a
kernel: R10: ffff94af1ec69ab0 R11: 0000000000000000 R12: 0000000000001610
kernel: R13: 000000000000160c R14: 000000000000160c R15: ffffaab80392fdf0
kernel: FS: 00007f49f7bfe6c0(0000) GS:ffff94b63f100000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f01446e9000 CR3: 000000014a4be000 CR4: 0000000000750ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: <IRQ>
kernel: ? watchdog_timer_fn+0x1a8/0x210
kernel: ? __pfx_watchdog_timer_fn+0x10/0x10
kernel: ? __hrtimer_run_queues+0x112/0x2b0
kernel: ? hrtimer_interrupt+0xf8/0x230
kernel: ? __sysvec_apic_timer_interrupt+0x61/0x130
kernel: ? sysvec_apic_timer_interrupt+0x6d/0x90
kernel: </IRQ>
kernel: <TASK>
kernel: ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
kernel: ? xas_load+0x45/0x50
kernel: filemap_get_read_batch+0x179/0x270
kernel: filemap_get_pages+0xab/0x6a0
kernel: ? touch_atime+0x48/0x1b0
kernel: ? filemap_read+0x33f/0x350
kernel: filemap_read+0xdf/0x350
kernel: xfs_file_buffered_read+0x4f/0xd0 [xfs]
kernel: xfs_file_read_iter+0x74/0xe0 [xfs]
kernel: vfs_read+0x240/0x310
kernel: __x64_sys_pread64+0x98/0xd0
kernel: do_syscall_64+0x5f/0x90
kernel: ? native_flush_tlb_local+0x34/0x40
kernel: ? flush_tlb_func+0x10d/0x240
kernel: ? do_syscall_64+0x6b/0x90
kernel: ? sched_clock_cpu+0xf/0x190
kernel: ? irqtime_account_irq+0x40/0xc0
kernel: ? __irq_exit_rcu+0x4b/0xf0
kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
kernel: RIP: 0033:0x7f4a0c23c227
kernel: Code: 08 89 3c 24 48 89 4c 24 18 e8 b5 e3 f8 ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 05 e4 f8 ff 48 8b
kernel: RSP: 002b:00007f49f7bf8310 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
kernel: RAX: ffffffffffffffda RBX: 000000000000424a RCX: 00007f4a0c23c227
kernel: RDX: 000000000000424a RSI: 00007f04294a35c0 RDI: 00000000000004be
kernel: RBP: 00007f49f7bf8460 R08: 0000000000000000 R09: 00007f49f7bf84a0
kernel: R10: 000000000160c718 R11: 0000000000000293 R12: 000000000000424a
kernel: R13: 000000000160c718 R14: 00007f04294a35c0 R15: 0000000000000000
kernel: </TASK>
...
kernel: ------------[ cut here ]------------
kernel: kernel BUG at fs/inode.c:612!
kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
kernel: CPU: 21 PID: 2195 Comm: rocksdb:low Tainted: G L 6.3.5-100.fc37.x86_64 #1
kernel: Hardware name: To Be Filled By O.E.M. X570D4U/X570D4U, BIOS T1.29b 05/17/2022
kernel: RIP: 0010:clear_inode+0x76/0x80
kernel: Code: 2d a8 40 75 2b 48 8b 93 28 01 00 00 48 8d 83 28 01 00 00 48 39 c2 75 1a 48 c7 83 98 00 00 00 60 00 00 00 5b 5d c3 cc cc cc cc <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 90 90 90 90 90 90 90 90 90 90 90 90
kernel: RSP: 0018:ffffaab80392fe58 EFLAGS: 00010002
kernel: RAX: 0000000000000000 RBX: ffff94af1ec69938 RCX: 0000000000000000
kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff94af1ec69ab8
kernel: RBP: ffff94af1ec69ab8 R08: ffffaab80392fd38 R09: 0000000000000002
kernel: R10: 0000000000000001 R11: 0000000000000005 R12: ffffffffc08b9860
kernel: R13: ffff94af1ec69938 R14: 00000000ffffff9c R15: ffff94979dd5da40
kernel: FS: 00007f49f7bfe6c0(0000) GS:ffff94b63ef40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007eefca8e2000 CR3: 000000014a4be000 CR4: 0000000000750ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: <TASK>
kernel: ? die+0x36/0x90
kernel: ? do_trap+0xda/0x100
kernel: ? clear_inode+0x76/0x80
kernel: ? do_error_trap+0x6a/0x90
kernel: ? clear_inode+0x76/0x80
kernel: ? exc_invalid_op+0x50/0x70
kernel: ? clear_inode+0x76/0x80
kernel: ? asm_exc_invalid_op+0x1a/0x20
kernel: ? clear_inode+0x76/0x80
kernel: ? clear_inode+0x1d/0x80
kernel: evict+0x1b8/0x1d0
kernel: do_unlinkat+0x174/0x320
kernel: __x64_sys_unlink+0x42/0x70
kernel: do_syscall_64+0x5f/0x90
kernel: ? __irq_exit_rcu+0x4b/0xf0
kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc
kernel: RIP: 0033:0x7f4a0c23faab
kernel: Code: f0 ff ff 73 01 c3 48 8b 0d 82 63 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 63 0d 00 f7 d8 64 89 01 48
kernel: RSP: 002b:00007f49f7bfab58 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
kernel: RAX: ffffffffffffffda RBX: 00007f49f7bfac38 RCX: 00007f4a0c23faab
kernel: RDX: 00007f49f7bfadd0 RSI: 00007f4a0bc2fd30 RDI: 00007f49dd3c32d0
kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: ffffffffffffdf58 R11: 0000000000000206 R12: 0000000000280bc0
kernel: R13: 00007f4a0bca77b8 R14: 00007f49f7bfadd0 R15: 00007f49f7bfadd0
kernel: </TASK>
kernel: Modules linked in: tls nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel tcp_bbr rfkill ip_set nf_tables nfnetlink nct6775 nct6775_core tun hwmon_vid jc42 vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm snd_timer cdc_ether irqbypass acpi_ipmi snd usbnet wmi_bmof rapl ipmi_si k10temp soundcore i2c_piix4 joydev mii ipmi_devintf ipmi_msghandler fuse loop xfs uas usb_storage raid1 hid_cp2112 igb crct10dif_pclmul ast crc32_pclmul nvme crc32c_intel polyval_clmulni dca polyval_generic i2c_algo_bit nvme_core ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
kernel: ---[ end trace 0000000000000000 ]---
kernel: RIP: 0010:clear_inode+0x76/0x80
kernel: Code: 2d a8 40 75 2b 48 8b 93 28 01 00 00 48 8d 83 28 01 00 00 48 39 c2 75 1a 48 c7 83 98 00 00 00 60 00 00 00 5b 5d c3 cc cc cc cc <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 90 90 90 90 90 90 90 90 90 90 90 90
kernel: RSP: 0018:ffffaab80392fe58 EFLAGS: 00010002
kernel: RAX: 0000000000000000 RBX: ffff94af1ec69938 RCX: 0000000000000000
kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff94af1ec69ab8
kernel: RBP: ffff94af1ec69ab8 R08: ffffaab80392fd38 R09: 0000000000000002
kernel: R10: 0000000000000001 R11: 0000000000000005 R12: ffffffffc08b9860
kernel: R13: ffff94af1ec69938 R14: 00000000ffffff9c R15: ffff94979dd5da40
kernel: FS: 00007f49f7bfe6c0(0000) GS:ffff94b63ef40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007eefca8e2000 CR3: 000000014a4be000 CR4: 0000000000750ee0
kernel: PKRU: 55555554
kernel: note: rocksdb:low[2195] exited with irqs disabled
kernel: note: rocksdb:low[2195] exited with preempt_count 1
Above report is showing rocksdb as the workload with relatively short uptimes around 30 minutes. Maybe there’s a reproducer around there somewhere? I’ve CCed the reporter from there to maybe get some insight on his workload.
Rabbit hole 2: things that were already considered
==================================================
There were a number of potential vectors and bugfixes that were discussed/referenced but haven’t turned out to fix the issue overall. Some of them might be obvious red herrings by now, but I’m not sure which.
* [GIT PULL] xfs, iomap: fix data corruption due to stale cached iomaps (https://lore.kernel.org/linux-fsdevel/20221129001632.GX3600936@dread.disaster.area/)
* cbc02854331e ("XArray: Do not return sibling entries from xa_load()”) did not help here
* I think i’ve seen that the affected data on disk ended up being null bytes, but I can’t verify that.
* There was a fix close to this in “_filemap_get_folio and NULL pointer dereference” (https://bugzilla.kernel.org/show_bug.cgi?id=217441) and "having TRANSPARENT_HUGEPAGE enabled hangs some applications (supervisor read access in kernel mode)” https://bugzilla.kernel.org/show_bug.cgi?id=216646 but their traces looked slightly different from the ones discussed here as did their outcome. Interestingly: those are also on the page fault path, not an fs path.
* memcg was in the stack and under question at some point but it also happens without it
* i was wondering whether increased readahead sizes might cause issues (most our VMs run 128kb but DB VMs run with 1MiB. However, this might also be a red herring as the single vs. multi core situation correlates strongly in our case).
Maybe offtopic but maybe it spurs ideas – a situation that felt similar to the stalls here: I remember debugging a memory issue in Python’s small object allocator a number of years ago that resulted in segfaults and I wonder whether the stalls we’re seeing are only a delayed symptom of an earlier corruption somewhere else. The Python issue was a third party module that caused an out of bounds write into an adjacent byte that was used as a pointer for arena management. That one was also extremely hard to track down due to this indirection / "magic at a distance” behaviour.
That’s all the input I have.
My offer stands: I can take time and run a number of machines that exhibited the behaviour on custom kernels to gather data.
Cheers,
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:25 ` Linus Torvalds
2024-09-12 22:30 ` Jens Axboe
2024-09-13 12:11 ` Christian Brauner
@ 2024-09-13 15:30 ` Chris Mason
2024-09-13 15:51 ` Matthew Wilcox
2024-09-13 16:04 ` David Howells
2024-09-16 0:00 ` Dave Chinner
4 siblings, 1 reply; 81+ messages in thread
From: Chris Mason @ 2024-09-13 15:30 UTC (permalink / raw)
To: Linus Torvalds, Jens Axboe
Cc: Matthew Wilcox, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
On 9/12/24 6:25 PM, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> When I saw Christian's report, I seemed to recall that we ran into this
>> at Meta too. And we did, and hence have been reverting it since our 5.19
>> release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
>> things that are known broken.
>
> I do think that if we have big sites just reverting it as known broken
> and can't figure out why, we should do so upstream too.
I've mentioned this in the past to both Willy and Dave Chinner, but so
far all of my attempts to reproduce it on purpose have failed. It's
awkward because I don't like to send bug reports that I haven't
reproduced on a non-facebook kernel, but I'm pretty confident this bug
isn't specific to us.
I'll double down on repros again during plumbers and hopefully come up
with a recipe for explosions. On other important datapoint is that we
also enable huge folios via tmpfs mount -o huge=within_size.
That hasn't hit problems, and we've been doing it for years, but of
course the tmpfs usage is pretty different from iomap/xfs.
We have two workloads that have reliably seen large folios bugs in prod.
This is all on bare metal systems, some are two socket, some single,
nothing really exotic.
1) On 5.19 kernels, knfsd reading and writing to XFS. We needed
O(hundreds) of knfsd servers running for about 8 hours to see one hit.
The issue looked similar to Christian Theune's rcu stalls, but since it
was just one CPU spinning away, I was able to perf probe and drgn my way
to some details. The xarray for the file had a series of large folios:
[ index 0 large folio from the correct file ]
[ index 1: large folio from the correct file ]
...
[ index N: large folio from a completely different file ]
[ index N+1: large folio from the correct file ]
I'm being sloppy with index numbers, but the important part is that
we've got a large folio from the wrong file in the middle of the bunch.
filemap_read() iterates over batches of folios from the xarray, but if
one of the folios in the batch has folio->offset out of order with the
rest, the whole thing turns into a infinite loop. It's not really a
filemap_read() bug, the batch coming back from the xarray is just incorrect.
2) On 6.9 kernels, we saw a BUG_ON() during inode eviction because
mapping->nrpages was non-zero. I'm assuming it's really just a
different window into the same bug. Crash dump analysis was less
conclusive because the xarray itself was always empty, but turning off
large folios made the problem go away.
This happened ~5-10 times a day, and the service had a few thousand
machines running 6.9. If I can't make an artificial repro, I'll try and
talk the service owners into setting up a production shadow to hammer on
it with additional debugging.
We also disabled large folios for our 6.4 kernel, but Stefan actually
tracked that bug down:
commit a48d5bdc877b85201e42cef9c2fdf5378164c23a
Author: Stefan Roesch <shr@devkernel.io>
Date: Mon Nov 6 10:19:18 2023 -0800
mm: fix for negative counter: nr_file_hugepages
We didn't have time to revalidate with large folios back on afterwards.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 15:30 ` Chris Mason
@ 2024-09-13 15:51 ` Matthew Wilcox
2024-09-13 16:33 ` Chris Mason
0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-13 15:51 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Jens Axboe, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
On Fri, Sep 13, 2024 at 11:30:41AM -0400, Chris Mason wrote:
> I've mentioned this in the past to both Willy and Dave Chinner, but so
> far all of my attempts to reproduce it on purpose have failed. It's
> awkward because I don't like to send bug reports that I haven't
> reproduced on a non-facebook kernel, but I'm pretty confident this bug
> isn't specific to us.
I don't think the bug is specific to you either. It's been hit by
several people ... but it's really hard to hit ;-(
> I'll double down on repros again during plumbers and hopefully come up
> with a recipe for explosions. On other important datapoint is that we
I appreciate the effort!
> The issue looked similar to Christian Theune's rcu stalls, but since it
> was just one CPU spinning away, I was able to perf probe and drgn my way
> to some details. The xarray for the file had a series of large folios:
>
> [ index 0 large folio from the correct file ]
> [ index 1: large folio from the correct file ]
> ...
> [ index N: large folio from a completely different file ]
> [ index N+1: large folio from the correct file ]
>
> I'm being sloppy with index numbers, but the important part is that
> we've got a large folio from the wrong file in the middle of the bunch.
If you could get the precise index numbers, that would be an important
clue. It would be interesting to know the index number in the xarray
where the folio was found rather than folio->index (as I suspect that
folio->index is completely bogus because folio->mapping is wrong).
But gathering that info is going to be hard.
Maybe something like this?
+++ b/mm/filemap.c
@@ -2317,6 +2317,12 @@ static void filemap_get_read_batch(struct address_space *mapping,
if (unlikely(folio != xas_reload(&xas)))
goto put_folio;
+{
+ struct address_space *fmapping = READ_ONCE(folio->mapping);
+ if (fmapping != NULL && fmapping != mapping)
+ printk("bad folio at %lx\n", xas.xa_index);
+}
+
if (!folio_batch_add(fbatch, folio))
break;
if (!folio_test_uptodate(folio))
(could use VM_BUG_ON_FOLIO() too, but i'm not sure that the identity of
the bad folio we've found is as interesting as where we found it)
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:25 ` Linus Torvalds
` (2 preceding siblings ...)
2024-09-13 15:30 ` Chris Mason
@ 2024-09-13 16:04 ` David Howells
2024-09-13 16:37 ` Chris Mason
2024-09-16 0:00 ` Dave Chinner
4 siblings, 1 reply; 81+ messages in thread
From: David Howells @ 2024-09-13 16:04 UTC (permalink / raw)
To: Chris Mason
Cc: dhowells, Linus Torvalds, Jens Axboe, Matthew Wilcox,
Christian Theune, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, Dave Chinner, regressions, regressions
Chris Mason <clm@meta.com> wrote:
> I've mentioned this in the past to both Willy and Dave Chinner, but so
> far all of my attempts to reproduce it on purpose have failed.
Could it be a splice bug?
David
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 15:51 ` Matthew Wilcox
@ 2024-09-13 16:33 ` Chris Mason
2024-09-13 18:15 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Chris Mason @ 2024-09-13 16:33 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Jens Axboe, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
[-- Attachment #1: Type: text/plain, Size: 2470 bytes --]
On 9/13/24 11:51 AM, Matthew Wilcox wrote:
> On Fri, Sep 13, 2024 at 11:30:41AM -0400, Chris Mason wrote:
>> I've mentioned this in the past to both Willy and Dave Chinner, but so
>> far all of my attempts to reproduce it on purpose have failed. It's
>> awkward because I don't like to send bug reports that I haven't
>> reproduced on a non-facebook kernel, but I'm pretty confident this bug
>> isn't specific to us.
>
> I don't think the bug is specific to you either. It's been hit by
> several people ... but it's really hard to hit ;-(
>
>> I'll double down on repros again during plumbers and hopefully come up
>> with a recipe for explosions. On other important datapoint is that we
>
> I appreciate the effort!
>
>> The issue looked similar to Christian Theune's rcu stalls, but since it
>> was just one CPU spinning away, I was able to perf probe and drgn my way
>> to some details. The xarray for the file had a series of large folios:
>>
>> [ index 0 large folio from the correct file ]
>> [ index 1: large folio from the correct file ]
>> ...
>> [ index N: large folio from a completely different file ]
>> [ index N+1: large folio from the correct file ]
>>
>> I'm being sloppy with index numbers, but the important part is that
>> we've got a large folio from the wrong file in the middle of the bunch.
>
> If you could get the precise index numbers, that would be an important
> clue. It would be interesting to know the index number in the xarray
> where the folio was found rather than folio->index (as I suspect that
> folio->index is completely bogus because folio->mapping is wrong).
> But gathering that info is going to be hard.
This particular debug session was late at night while we were urgently
trying to roll out some NFS features. I didn't really save many of the
details because my plan was to reproduce it and make a full bug report.
Also, I was explaining the details to people in workplace chat, which is
wildly bad at rendering long lines of structured text, especially when
half the people in the chat are on a mobile device.
You're probably wondering why all of that is important...what I'm really
trying to say is that I've attached a screenshot of the debugging output.
It came from a older drgn script, where I'm still clinging to "radix",
and you probably can't trust the string representation of the page flags
because I wasn't yet using Omar's helpers and may have hard coded them
from an older kernel.
-chris
[-- Attachment #2: xarray-debug.png --]
[-- Type: image/png, Size: 933545 bytes --]
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 16:04 ` David Howells
@ 2024-09-13 16:37 ` Chris Mason
0 siblings, 0 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-13 16:37 UTC (permalink / raw)
To: David Howells
Cc: Linus Torvalds, Jens Axboe, Matthew Wilcox, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
Dave Chinner, regressions, regressions
On 9/13/24 12:04 PM, David Howells wrote:
> Chris Mason <clm@meta.com> wrote:
>
>> I've mentioned this in the past to both Willy and Dave Chinner, but so
>> far all of my attempts to reproduce it on purpose have failed.
>
> Could it be a splice bug?
I really wanted it to be a splice bug, but I believe the 6.9 workload I
mentioned isn't using splice. I didn't 100% verify though.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 16:33 ` Chris Mason
@ 2024-09-13 18:15 ` Matthew Wilcox
2024-09-13 21:24 ` Linus Torvalds
0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-13 18:15 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Jens Axboe, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
On Fri, Sep 13, 2024 at 12:33:49PM -0400, Chris Mason wrote:
> > If you could get the precise index numbers, that would be an important
> > clue. It would be interesting to know the index number in the xarray
> > where the folio was found rather than folio->index (as I suspect that
> > folio->index is completely bogus because folio->mapping is wrong).
> > But gathering that info is going to be hard.
>
> This particular debug session was late at night while we were urgently
> trying to roll out some NFS features. I didn't really save many of the
> details because my plan was to reproduce it and make a full bug report.
>
> Also, I was explaining the details to people in workplace chat, which is
> wildly bad at rendering long lines of structured text, especially when
> half the people in the chat are on a mobile device.
>
> You're probably wondering why all of that is important...what I'm really
> trying to say is that I've attached a screenshot of the debugging output.
>
> It came from a older drgn script, where I'm still clinging to "radix",
> and you probably can't trust the string representation of the page flags
> because I wasn't yet using Omar's helpers and may have hard coded them
> from an older kernel.
That's all _fine_. This is enormously helpful.
First, we see the same folio appear three times. I think that's
particularly significant. Modulo 64 (number of entries/node), the indices
the bad folio are found at is 16, 32 and 48. So I think the _current_
order of folio is 4, but at the time the folio was put in the xarray,
it was order 6. Except ... at order-6 we elide a level of the xarray.
So we shouldn't be able to see this. Hm.
Oh! I think split is the key. Let's say we have an order-6 (or
larger) folio. And we call split_huge_page() (whatever it's called
in your kernel version). That calls xas_split_alloc() followed
by xas_split(). xas_split_alloc() puts entry in node->slots[0] and
initialises node->slots[1..XA_CHUNK_SIZE] to a sibling entry.
Now, if we do allocate those node in xas_split_alloc(), we're supposed to
free them with radix_tree_node_rcu_free() which zeroes all the slots.
But what if we don't, somehow? (this is my best current theory).
Then we allocate the node to a different tree, but any time we try to
look something up, unless it's the index for which we allocated the node,
we find a sibling entry and it points to a stale pointer.
I'm going to think on this a bit more, but so far this is all good
evidence for my leading theory.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 18:15 ` Matthew Wilcox
@ 2024-09-13 21:24 ` Linus Torvalds
2024-09-13 21:30 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-13 21:24 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chris Mason, Jens Axboe, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
On Fri, 13 Sept 2024 at 11:15, Matthew Wilcox <willy@infradead.org> wrote:
>
> Oh! I think split is the key. Let's say we have an order-6 (or
> larger) folio. And we call split_huge_page() (whatever it's called
> in your kernel version). That calls xas_split_alloc() followed
> by xas_split(). xas_split_alloc() puts entry in node->slots[0] and
> initialises node->slots[1..XA_CHUNK_SIZE] to a sibling entry.
Hmm. The splitting does seem to be not just indicated by the debug
logs, but it ends up being a fairly complicated case. *The* most
complicated case of adding a new folio by far, I'd say.
And I wonder if it's even necessary?
Because I think the *common* case is through filemap_add_folio(),
isn't it? And that code path really doesn't care what the size of the
folio is.
So instead of splitting, that code path would seem to be perfectly
happy with instead erroring out, and simply re-doing the new folio
allocation using the same size that the old conflicting folio had (at
which point it won't be conflicting any more).
No?
It's possible that I'm entirely missing something, but at least the
filemap_add_folio() case looks like it really would actually be
happier with a "oh, that size conflicts with an existing entry, let's
just allocate a smaller size then"
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 21:24 ` Linus Torvalds
@ 2024-09-13 21:30 ` Matthew Wilcox
0 siblings, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-13 21:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: Chris Mason, Jens Axboe, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, Dave Chinner,
regressions, regressions
On Fri, Sep 13, 2024 at 02:24:02PM -0700, Linus Torvalds wrote:
> On Fri, 13 Sept 2024 at 11:15, Matthew Wilcox <willy@infradead.org> wrote:
> >
> > Oh! I think split is the key. Let's say we have an order-6 (or
> > larger) folio. And we call split_huge_page() (whatever it's called
> > in your kernel version). That calls xas_split_alloc() followed
> > by xas_split(). xas_split_alloc() puts entry in node->slots[0] and
> > initialises node->slots[1..XA_CHUNK_SIZE] to a sibling entry.
>
> Hmm. The splitting does seem to be not just indicated by the debug
> logs, but it ends up being a fairly complicated case. *The* most
> complicated case of adding a new folio by far, I'd say.
>
> And I wonder if it's even necessary?
Unfortunately, we need to handle things like "we are truncating a file
which has a folio which now extends many pages beyond the end of the
file" and so we have to split the folio which now crosses EOF. Or we
could write it back and drop it, but that has its own problems.
Part of the "large block size" patches sitting in Christian's tree is
solving these problems for folios which can't be split down to order-0,
so there may be ways we can handle this better now, but if we don't
split we might end up wasting a lot of memory in file tails.
> It's possible that I'm entirely missing something, but at least the
> filemap_add_folio() case looks like it really would actually be
> happier with a "oh, that size conflicts with an existing entry, let's
> just allocate a smaller size then"
Pretty sure we already do that; it's mostly handled through the
readahead path which checks for conflicting folios already in the cache.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-12 22:25 ` Linus Torvalds
` (3 preceding siblings ...)
2024-09-13 16:04 ` David Howells
@ 2024-09-16 0:00 ` Dave Chinner
2024-09-16 4:20 ` Linus Torvalds
2024-09-16 7:14 ` Christian Theune
4 siblings, 2 replies; 81+ messages in thread
From: Dave Chinner @ 2024-09-16 0:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jens Axboe, Matthew Wilcox, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, clm,
regressions, regressions
On Thu, Sep 12, 2024 at 03:25:50PM -0700, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
> Honestly, the fact that it hasn't been reverted after apparently
> people knowing about it for months is a bit shocking to me. Filesystem
> people tend to take unknown corruption issues as a big deal. What
> makes this so special? Is it because the XFS people don't consider it
> an XFS issue, so...
I don't think this is a data corruption/loss problem - it certainly
hasn't ever appeared that way to me. The "data loss" appeared to be
in incomplete postgres dump files after the system was rebooted and
this is exactly what would happen when you randomly crash the
system. i.e. dirty data in memory is lost, and application data
being written at the time is in an inconsistent state after the
system recovers. IOWs, there was no clear evidence of actual data
corruption occuring, and data loss is definitely expected when the
page cache iteration hangs and the system is forcibly rebooted
without being able to sync or unmount the filesystems...
All the hangs seem to be caused by folio lookup getting stuck
on a rogue xarray entry in truncate or readahead. If we find an
invalid entry or a folio from a different mapping or with a
unexpected index, we skip it and try again. Hence this does not
appear to be a data corruption vector, either - it results in a
livelock from endless retry because of the bad entry in the xarray.
This endless retry livelock appears to be what is being reported.
IOWs, there is no evidence of real runtime data corruption or loss
from this pagecache livelock bug. We also haven't heard of any
random file data corruption events since we've enabled large folios
on XFS. Hence there really is no evidence to indicate that there is
a large folio xarray lookup bug that results in data corruption in
the existing code, and therefore there is no obvious reason for
turning off the functionality we are already building significant
new functionality on top of.
It's been 10 months since I asked Christain to help isolate a
reproducer so we can track this down. Nothing came from that, so
we're still at exactly where we were at back in november 2023 -
waiting for information on a way to reproduce this issue more
reliably.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 0:00 ` Dave Chinner
@ 2024-09-16 4:20 ` Linus Torvalds
2024-09-16 8:47 ` Chris Mason
2024-09-16 7:14 ` Christian Theune
1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-16 4:20 UTC (permalink / raw)
To: Dave Chinner
Cc: Jens Axboe, Matthew Wilcox, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, clm,
regressions, regressions
On Mon, 16 Sept 2024 at 02:00, Dave Chinner <david@fromorbit.com> wrote:
>
> I don't think this is a data corruption/loss problem - it certainly
> hasn't ever appeared that way to me. The "data loss" appeared to be
> in incomplete postgres dump files after the system was rebooted and
> this is exactly what would happen when you randomly crash the
> system.
Ok, that sounds better, indeed.
Of course, "hang due to internal xarray corruption" isn't _much_
better, but still..
> All the hangs seem to be caused by folio lookup getting stuck
> on a rogue xarray entry in truncate or readahead. If we find an
> invalid entry or a folio from a different mapping or with a
> unexpected index, we skip it and try again.
We *could* perhaps change the "retry the optimistic lookup forever" to
be a "retry and take lock after optimistic failure". At least in the
common paths.
That's what we do with some dcache locking, because the "retry on
race" caused some potential latency issues under ridiculous loads.
And if we retry with the lock, at that point we can actually notice
corruption, because at that point we can say "we have the lock, and we
see a bad folio with the wrong mapping pointer, and now it's not some
possible race condition due to RCU".
That, in turn, might then result in better bug reports. Which would at
least be forward progress rather than "we have this bug".
Let me think about it. Unless somebody else gets to it before I do
(hint hint to anybody who is comfy with that filemap_read() path etc).
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 0:00 ` Dave Chinner
2024-09-16 4:20 ` Linus Torvalds
@ 2024-09-16 7:14 ` Christian Theune
2024-09-16 12:16 ` Matthew Wilcox
2024-09-18 8:31 ` Christian Theune
1 sibling, 2 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-16 7:14 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Jens Axboe, Matthew Wilcox, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, clm, regressions,
regressions
> On 16. Sep 2024, at 02:00, Dave Chinner <david@fromorbit.com> wrote:
>
> On Thu, Sep 12, 2024 at 03:25:50PM -0700, Linus Torvalds wrote:
>> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
>> Honestly, the fact that it hasn't been reverted after apparently
>> people knowing about it for months is a bit shocking to me. Filesystem
>> people tend to take unknown corruption issues as a big deal. What
>> makes this so special? Is it because the XFS people don't consider it
>> an XFS issue, so...
>
> I don't think this is a data corruption/loss problem - it certainly
> hasn't ever appeared that way to me. The "data loss" appeared to be
> in incomplete postgres dump files after the system was rebooted and
> this is exactly what would happen when you randomly crash the
> system. i.e. dirty data in memory is lost, and application data
> being written at the time is in an inconsistent state after the
> system recovers. IOWs, there was no clear evidence of actual data
> corruption occuring, and data loss is definitely expected when the
> page cache iteration hangs and the system is forcibly rebooted
> without being able to sync or unmount the filesystems…
> All the hangs seem to be caused by folio lookup getting stuck
> on a rogue xarray entry in truncate or readahead. If we find an
> invalid entry or a folio from a different mapping or with a
> unexpected index, we skip it and try again. Hence this does not
> appear to be a data corruption vector, either - it results in a
> livelock from endless retry because of the bad entry in the xarray.
> This endless retry livelock appears to be what is being reported.
>
> IOWs, there is no evidence of real runtime data corruption or loss
> from this pagecache livelock bug. We also haven't heard of any
> random file data corruption events since we've enabled large folios
> on XFS. Hence there really is no evidence to indicate that there is
> a large folio xarray lookup bug that results in data corruption in
> the existing code, and therefore there is no obvious reason for
> turning off the functionality we are already building significant
> new functionality on top of.
Right, understood.
However, the timeline of one of the encounters with PostgreSQL (the first comment in Bugzilla) involved still makes me feel uneasy:
T0 : one postgresql process blocked with a different trace (not involving xas_load)
T+a few minutes : another process stuck with the relevant xas_load/descend trace
T+a few more minutes : other processes blocked in xas_load (this time the systemd journal)
T+14m : the journal gets coredumped, likely due to some watchdog
Things go back to normal.
T+14h : another postgres process gets fully stuck on the xas_load/descend trace
I agree with your analysis if the process gets stuck in an infinite loop, but I’ve seen at least one instance where it appears to have left the loop at some point and IMHO that would be a condition that would allow data corruption.
> It's been 10 months since I asked Christain to help isolate a
> reproducer so we can track this down. Nothing came from that, so
> we're still at exactly where we were at back in november 2023 -
> waiting for information on a way to reproduce this issue more
> reliably.
Sorry for dropping the ball from my side as well - I’ve learned my lesson from trying to go through Bugzilla here. ;)
You mentioned above that this might involve read-ahead code and that’s something I noticed before: the machines that carry databases do run with a higher read-ahead setting (1MiB vs. 128k elsewhere).
Also, I’m still puzzled about the one variation that seems to involve page faults and not XFS. That’s something I haven’t seen a response to yet whether this IS in fact interesting or not.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 4:20 ` Linus Torvalds
@ 2024-09-16 8:47 ` Chris Mason
2024-09-17 9:32 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Chris Mason @ 2024-09-16 8:47 UTC (permalink / raw)
To: Linus Torvalds, Dave Chinner
Cc: Jens Axboe, Matthew Wilcox, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/16/24 12:20 AM, Linus Torvalds wrote:
> On Mon, 16 Sept 2024 at 02:00, Dave Chinner <david@fromorbit.com> wrote:
>>
>> I don't think this is a data corruption/loss problem - it certainly
>> hasn't ever appeared that way to me. The "data loss" appeared to be
>> in incomplete postgres dump files after the system was rebooted and
>> this is exactly what would happen when you randomly crash the
>> system.
>
> Ok, that sounds better, indeed.
I think Dave is right because in practice most filesystems have enough
files of various sizes that we're likely to run into the lockups or BUGs
already mentioned.
But, if the impacted files are relatively small (say 16K), and all
exactly the same size, we could probably share pages between them and
give the wrong data to applications.
It should crash eventually, that's probably the nrpages > 0 assertions
we hit during inode eviction on 6.9, but it seems like there's a window
to return the wrong data.
filemap_fault() has:
if (unlikely(folio->mapping != mapping)) {
So I think we're probably in better shape on mmap.
>
> Of course, "hang due to internal xarray corruption" isn't _much_
> better, but still..
>
>> All the hangs seem to be caused by folio lookup getting stuck
>> on a rogue xarray entry in truncate or readahead. If we find an
>> invalid entry or a folio from a different mapping or with a
>> unexpected index, we skip it and try again.
>
> We *could* perhaps change the "retry the optimistic lookup forever" to
> be a "retry and take lock after optimistic failure". At least in the
> common paths.
>
> That's what we do with some dcache locking, because the "retry on
> race" caused some potential latency issues under ridiculous loads.
>
> And if we retry with the lock, at that point we can actually notice
> corruption, because at that point we can say "we have the lock, and we
> see a bad folio with the wrong mapping pointer, and now it's not some
> possible race condition due to RCU".
>
> That, in turn, might then result in better bug reports. Which would at
> least be forward progress rather than "we have this bug".
>
> Let me think about it. Unless somebody else gets to it before I do
> (hint hint to anybody who is comfy with that filemap_read() path etc).
I've got a bunch of assertions around incorrect folio->mapping and I'm
trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
on those, and our systems do run pretty short on ram, so it feels right
at least. We'll see.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 7:14 ` Christian Theune
@ 2024-09-16 12:16 ` Matthew Wilcox
2024-09-18 8:31 ` Christian Theune
1 sibling, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-16 12:16 UTC (permalink / raw)
To: Christian Theune
Cc: Dave Chinner, Linus Torvalds, Jens Axboe, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, clm, regressions,
regressions
On Mon, Sep 16, 2024 at 09:14:45AM +0200, Christian Theune wrote:
> Also, I’m still puzzled about the one variation that seems to involve page faults and not XFS. That’s something I haven’t seen a response to yet whether this IS in fact interesting or not.
It's not; once the page cache is corrupted, it doesn't matter whether
we go through the filesystem to get the page or not.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-13 12:11 ` Christian Brauner
@ 2024-09-16 13:29 ` Matthew Wilcox
2024-09-18 9:51 ` Christian Brauner
0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-16 13:29 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, Pankaj Raghav, Luis Chamberlain, Jens Axboe,
Christian Theune, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, Dave Chinner, clm, regressions,
regressions
On Fri, Sep 13, 2024 at 02:11:22PM +0200, Christian Brauner wrote:
> So this issue it new to me as well. One of the items this cycle is the
> work to enable support for block sizes that are larger than page sizes
> via the large block size (LBS) series that's been sitting in -next for a
> long time. That work specifically targets xfs and builds on top of the
> large folio support.
>
> If the support for large folios is going to be reverted in xfs then I
> see no point to merge the LBS work now. So I'm holding off on sending
> that pull request until a decision is made (for xfs). As far as I
> understand, supporting larger block sizes will not be meaningful without
> large folio support.
This is unwarranted; please send this pull request. We're not going to
rip out all of the infrastructure although we might end up disabling it
by default. There's a bunch of other work queued up behind that, and not
having it in Linus' tree is just going to make everything more painful.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 8:47 ` Chris Mason
@ 2024-09-17 9:32 ` Matthew Wilcox
2024-09-17 9:36 ` Chris Mason
` (2 more replies)
0 siblings, 3 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-17 9:32 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
> I've got a bunch of assertions around incorrect folio->mapping and I'm
> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
> on those, and our systems do run pretty short on ram, so it feels right
> at least. We'll see.
I've been running with some variant of this patch the whole way across
the Atlantic, and not hit any problems. But maybe with the right
workload ...?
There are two things being tested here. One is whether we have a
cross-linked node (ie a node that's in two trees at the same time).
The other is whether the slab allocator is giving us a node that already
contains non-NULL entries.
If you could throw this on top of your kernel, we might stand a chance
of catching the problem sooner. If it is one of these problems and not
something weirder.
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 0b618ec04115..006556605eb3 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1179,6 +1179,8 @@ struct xa_node {
void xa_dump(const struct xarray *);
void xa_dump_node(const struct xa_node *);
+void xa_dump_index(unsigned long index, unsigned int shift);
+void xa_dump_entry(const void *entry, unsigned long index, unsigned long shift);
#ifdef XA_DEBUG
#define XA_BUG_ON(xa, x) do { \
diff --git a/lib/xarray.c b/lib/xarray.c
index 32d4bac8c94c..6bb35bdca30e 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -6,6 +6,8 @@
* Author: Matthew Wilcox <willy@infradead.org>
*/
+#define XA_DEBUG
+
#include <linux/bitmap.h>
#include <linux/export.h>
#include <linux/list.h>
@@ -206,6 +208,7 @@ static __always_inline void *xas_descend(struct xa_state *xas,
unsigned int offset = get_offset(xas->xa_index, node);
void *entry = xa_entry(xas->xa, node, offset);
+ XA_NODE_BUG_ON(node, node->array != xas->xa);
xas->xa_node = node;
while (xa_is_sibling(entry)) {
offset = xa_to_sibling(entry);
@@ -309,6 +312,7 @@ bool xas_nomem(struct xa_state *xas, gfp_t gfp)
return false;
xas->xa_alloc->parent = NULL;
XA_NODE_BUG_ON(xas->xa_alloc, !list_empty(&xas->xa_alloc->private_list));
+ XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
xas->xa_node = XAS_RESTART;
return true;
}
@@ -345,6 +349,7 @@ static bool __xas_nomem(struct xa_state *xas, gfp_t gfp)
return false;
xas->xa_alloc->parent = NULL;
XA_NODE_BUG_ON(xas->xa_alloc, !list_empty(&xas->xa_alloc->private_list));
+ XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
xas->xa_node = XAS_RESTART;
return true;
}
@@ -388,6 +393,7 @@ static void *xas_alloc(struct xa_state *xas, unsigned int shift)
}
XA_NODE_BUG_ON(node, shift > BITS_PER_LONG);
XA_NODE_BUG_ON(node, !list_empty(&node->private_list));
+ XA_NODE_BUG_ON(node, memchr_inv(&node->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
node->shift = shift;
node->count = 0;
node->nr_values = 0;
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-17 9:32 ` Matthew Wilcox
@ 2024-09-17 9:36 ` Chris Mason
2024-09-17 10:11 ` Christian Theune
2024-09-17 11:13 ` Chris Mason
2 siblings, 0 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-17 9:36 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On 9/17/24 5:32 AM, Matthew Wilcox wrote:
> On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
>> I've got a bunch of assertions around incorrect folio->mapping and I'm
>> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
>> on those, and our systems do run pretty short on ram, so it feels right
>> at least. We'll see.
>
> I've been running with some variant of this patch the whole way across
> the Atlantic, and not hit any problems. But maybe with the right
> workload ...?
>
> There are two things being tested here. One is whether we have a
> cross-linked node (ie a node that's in two trees at the same time).
> The other is whether the slab allocator is giving us a node that already
> contains non-NULL entries.
>
> If you could throw this on top of your kernel, we might stand a chance
> of catching the problem sooner. If it is one of these problems and not
> something weirder.
>
I was able to corrupt the xarray one time, hitting a crash during
unmount. It wasn't the xfs filesystem I was actually hammering so I
guess that tells us something, but it was after ~3 hours of stress runs,
so not really useful.
I'll try with your patch as well.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-17 9:32 ` Matthew Wilcox
2024-09-17 9:36 ` Chris Mason
@ 2024-09-17 10:11 ` Christian Theune
2024-09-17 11:13 ` Chris Mason
2 siblings, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-17 10:11 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chris Mason, Linus Torvalds, Dave Chinner, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
> On 17. Sep 2024, at 11:32, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
>> I've got a bunch of assertions around incorrect folio->mapping and I'm
>> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
>> on those, and our systems do run pretty short on ram, so it feels right
>> at least. We'll see.
>
> I've been running with some variant of this patch the whole way across
> the Atlantic, and not hit any problems. But maybe with the right
> workload ...?
I can start running my non-prod machines that were affected previously. I’d run this on top of 6.11?
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-17 9:32 ` Matthew Wilcox
2024-09-17 9:36 ` Chris Mason
2024-09-17 10:11 ` Christian Theune
@ 2024-09-17 11:13 ` Chris Mason
2024-09-17 13:25 ` Matthew Wilcox
2 siblings, 1 reply; 81+ messages in thread
From: Chris Mason @ 2024-09-17 11:13 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
[-- Attachment #1: Type: text/plain, Size: 7813 bytes --]
On 9/17/24 5:32 AM, Matthew Wilcox wrote:
> On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
>> I've got a bunch of assertions around incorrect folio->mapping and I'm
>> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
>> on those, and our systems do run pretty short on ram, so it feels right
>> at least. We'll see.
>
> I've been running with some variant of this patch the whole way across
> the Atlantic, and not hit any problems. But maybe with the right
> workload ...?
>
> There are two things being tested here. One is whether we have a
> cross-linked node (ie a node that's in two trees at the same time).
> The other is whether the slab allocator is giving us a node that already
> contains non-NULL entries.
>
> If you could throw this on top of your kernel, we might stand a chance
> of catching the problem sooner. If it is one of these problems and not
> something weirder.
>
This fires in roughly 10 seconds for me on top of v6.11. Since array seems
to always be 1, I'm not sure if the assertion is right, but hopefully you
can trigger yourself.
reader.c is attached. It just has one thread doing large reads and two
threads fadvising things away. The important part seems to be two threads
in parallel calling fadvise DONTNEED at the same time, just one thread
wasn't enough.
root@kerneltest003-kvm ~]# cat small.sh
#!/bin/bash
mkfs.xfs -f /dev/vdb
mount /dev/vdb /xfs
fallocate -l10g /xfs/file1
./reader /xfs/file1
[root@kerneltest003-kvm ~]# ./small.sh
meta-data=/dev/vdb isize=512 agcount=10, agsize=268435455 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1 nrext64=0
data = bsize=4096 blocks=2684354550, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
[ 102.013720] XFS (vdb): Mounting V5 Filesystem c3531255-dee1-4b86-8e14-2baa3cc900f8
[ 102.029638] XFS (vdb): Ending clean mount
[ 104.204205] node ffff888119f86ba8 offset 13 parent ffff888119f84988 shift 6 count 0 values 0 array 0000000000000001 list ffffffff81f93230 0000000000000000 marks 0 0 0
+[ 104.206996] ------------[ cut here ]------------
[ 104.207948] kernel BUG at lib/xarray.c:211!
[ 104.208729] Oops: invalid opcode: 0000 [#1] SMP PTI
[ 104.209627] CPU: 51 UID: 0 PID: 862 Comm: reader Not tainted 6.11.0-dirty #24
[ 104.211232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 104.213402] RIP: 0010:xas_load+0xe4/0x120
[ 104.214144] Code: 00 10 00 00 76 c4 48 83 fa 02 75 ad 41 b8 02 04 00 00 eb a5 40 f6 c6 03 75 12 48 89 f7 e8 44 f5 ff ff 0f 0b 49 83 f8 02 75 10 <0f> 0b 48 c7 c7 76 58 98 82 e8 7e 3b 1a ff eb e8 40 f6 c6 03 75 0a
[ 104.217593] RSP: 0018:ffffc90001b57b90 EFLAGS: 00010296
[ 104.218729] RAX: 0000000000000000 RBX: ffffc90001b57bc8 RCX: 0000000000000000
[ 104.220019] RDX: ffff88b177aee180 RSI: ffff88b177ae0b80 RDI: ffff88b177ae0b80
[ 104.221394] RBP: 000000000027ffff R08: ffffffff8396b4a8 R09: 0000000000000003
[ 104.222679] R10: ffffffff8326b4c0 R11: ffffffff837eb4c0 R12: ffffc90001b57d48
[ 104.223985] R13: ffffc90001b57c48 R14: ffffc90001b57c50 R15: 0000000000000000
[ 104.225277] FS: 00007fcee02006c0(0000) GS:ffff88b177ac0000(0000) knlGS:0000000000000000
[ 104.226726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 104.227768] CR2: 00007fcee01fff78 CR3: 000000011bdc2004 CR4: 0000000000770ef0
[ 104.229055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 104.230341] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 104.231625] PKRU: 55555554
[ 104.232131] Call Trace:
[ 104.232586] <TASK>
[ 104.232984] ? die+0x33/0x90
[ 104.233531] ? do_trap+0xda/0x100
[ 104.234206] ? do_error_trap+0x65/0x80
[ 104.234893] ? xas_load+0xe4/0x120
[ 104.235524] ? exc_invalid_op+0x4e/0x70
[ 104.236231] ? xas_load+0xe4/0x120
[ 104.236855] ? asm_exc_invalid_op+0x16/0x20
[ 104.237638] ? xas_load+0xe4/0x120
[ 104.238268] xas_find+0x18c/0x1f0
[ 104.238878] find_lock_entries+0x6d/0x2f0
[ 104.239617] mapping_try_invalidate+0x5e/0x150
[ 104.240432] ? update_load_avg+0x78/0x750
[ 104.241167] ? psi_group_change+0x122/0x310
[ 104.241929] ? sched_balance_newidle+0x306/0x3b0
[ 104.242770] ? psi_task_switch+0xd6/0x230
[ 104.243506] ? __switch_to_asm+0x2a/0x60
[ 104.244224] ? __schedule+0x316/0xa00
[ 104.244896] ? schedule+0x1c/0xd0
[ 104.245530] ? schedule_preempt_disabled+0xa/0x10
[ 104.246386] ? __mutex_lock.constprop.0+0x2cf/0x5a0
[ 104.247274] ? __lru_add_drain_all+0x150/0x1e0
[ 104.248089] generic_fadvise+0x230/0x280
[ 104.248802] ? __fdget+0x8c/0xe0
[ 104.249407] ksys_fadvise64_64+0x4c/0xa0
[ 104.250126] __x64_sys_fadvise64+0x18/0x20
[ 104.250868] do_syscall_64+0x5b/0x170
[ 104.251543] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 104.252463] RIP: 0033:0x7fcee0e5cd6e
[ 104.253131] Code: b8 ff ff ff ff eb c3 67 e8 7f cf 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 41 89 ca b8 dd 00 00 00 0f 05 <89> c2 f7 da 3d 00 f0 ff ff b8 00 00 00 00 0f 47 c2 c3 41 57 41 56
[ 104.256446] RSP: 002b:00007fcee01ffe88 EFLAGS: 00000202 ORIG_RAX: 00000000000000dd
[ 104.257800] RAX: ffffffffffffffda RBX: 00007fcee0200cdc RCX: 00007fcee0e5cd6e
[ 104.259085] RDX: 0000000280000000 RSI: 0000000000000000 RDI: 0000000000000003
[ 104.260365] RBP: 00007fcee01ffed0 R08: 0000000000000000 R09: 00007fcee02006c0
[ 104.261648] R10: 0000000000000004 R11: 0000000000000202 R12: ffffffffffffff88
[ 104.262964] R13: 0000000000000000 R14: 00007ffc16078a70 R15: 00007fcedfa00000
[ 104.264258] </TASK>
[ 104.264669] Modules linked in: intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel bochs drm_vram_helper drm_kms_helper kvm drm_ttm_helper intel_agp ttm i2c_piix4 intel_gtt agpgart i2c_smbus evdev button serio_raw sch_fq_codel usbip_core drm loop drm_panel_orientation_quirks backlight bpf_preload virtio_rng ip_tables autofs4
[ 104.270152] ---[ end trace 0000000000000000 ]---
[ 104.271179] RIP: 0010:xas_load+0xe4/0x120
[ 104.271968] Code: 00 10 00 00 76 c4 48 83 fa 02 75 ad 41 b8 02 04 00 00 eb a5 40 f6 c6 03 75 12 48 89 f7 e8 44 f5 ff ff 0f 0b 49 83 f8 02 75 10 <0f> 0b 48 c7 c7 76 58 98 82 e8 7e 3b 1a ff eb e8 40 f6 c6 03 75 0a
[ 104.275460] RSP: 0018:ffffc90001b57b90 EFLAGS: 00010296
[ 104.276481] RAX: 0000000000000000 RBX: ffffc90001b57bc8 RCX: 0000000000000000
[ 104.277797] RDX: ffff88b177aee180 RSI: ffff88b177ae0b80 RDI: ffff88b177ae0b80
[ 104.279101] RBP: 000000000027ffff R08: ffffffff8396b4a8 R09: 0000000000000003
[ 104.280400] R10: ffffffff8326b4c0 R11: ffffffff837eb4c0 R12: ffffc90001b57d48
[ 104.281705] R13: ffffc90001b57c48 R14: ffffc90001b57c50 R15: 0000000000000000
[ 104.283014] FS: 00007fcee02006c0(0000) GS:ffff88b177ac0000(0000) knlGS:0000000000000000
[ 104.284487] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 104.285539] CR2: 00007fcee01fff78 CR3: 000000011bdc2004 CR4: 0000000000770ef0
[ 104.286838] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 104.288139] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 104.289468] PKRU: 55555554
[ 104.289983] Kernel panic - not syncing: Fatal exception
[ 104.292343] Kernel Offset: disabled
[ 104.292990] ---[ end Kernel panic - not syncing: Fatal exception ]---
[-- Attachment #2: reader.c --]
[-- Type: text/plain, Size: 2147 bytes --]
/*
* gcc -Wall -o reader reader.c -lpthread
*/
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <unistd.h>
#include <errno.h>
#include <err.h>
#include <pthread.h>
struct thread_data {
int fd;
size_t size;
};
static void *drop_pages(void *arg)
{
struct thread_data *td = arg;
int ret;
unsigned long nr_pages = td->size / 4096;
unsigned int seed = 0x55443322;
off_t offset;
unsigned long nr_drops = 0;
while (1) {
offset = rand_r(&seed) % nr_pages;
offset = offset * 4096;
ret = posix_fadvise(td->fd, offset, 4096, POSIX_FADV_DONTNEED);
if (ret < 0)
err(1, "fadvise dontneed");
/* every once and a while, drop everything */
if (nr_drops > nr_pages / 2) {
ret = posix_fadvise(td->fd, 0, td->size, POSIX_FADV_DONTNEED);
if (ret < 0)
err(1, "fadvise dontneed");
fprintf(stderr, "+");
nr_drops = 0;
}
nr_drops++;
}
return NULL;
}
#define READ_BUF (2 * 1024 * 1024)
static void *read_pages(void *arg)
{
struct thread_data *td = arg;
char buf[READ_BUF];
ssize_t ret;
loff_t offset;
while (1) {
offset = 0;
while(offset < td->size) {
ret = pread(td->fd, buf, READ_BUF, offset);
if (ret < 0)
err(1, "read");
if (ret == 0)
break;
offset += ret;
}
}
return NULL;
}
int main(int ac, char **av)
{
int fd;
int ret;
struct stat st;
struct thread_data td;
pthread_t drop_tid;
pthread_t drop2_tid;
pthread_t read_tid;
if (ac != 2)
err(1, "usage: reader filename\n");
fd = open(av[1], O_RDONLY, 0600);
if (fd < 0)
err(1, "unable to open %s", av[1]);
ret = fstat(fd, &st);
if (ret < 0)
err(1, "stat");
td.fd = fd;
td.size = st.st_size;
ret = pthread_create(&drop_tid, NULL, drop_pages, &td);
if (ret)
err(1, "pthread_create");
ret = pthread_create(&drop2_tid, NULL, drop_pages, &td);
if (ret)
err(1, "pthread_create");
ret = pthread_create(&read_tid, NULL, read_pages, &td);
if (ret)
err(1, "pthread_create");
pthread_join(drop_tid, NULL);
pthread_join(drop2_tid, NULL);
pthread_join(read_tid, NULL);
}
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-17 11:13 ` Chris Mason
@ 2024-09-17 13:25 ` Matthew Wilcox
2024-09-18 6:37 ` Jens Axboe
0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-17 13:25 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Tue, Sep 17, 2024 at 01:13:05PM +0200, Chris Mason wrote:
> On 9/17/24 5:32 AM, Matthew Wilcox wrote:
> > On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
> >> I've got a bunch of assertions around incorrect folio->mapping and I'm
> >> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
> >> on those, and our systems do run pretty short on ram, so it feels right
> >> at least. We'll see.
> >
> > I've been running with some variant of this patch the whole way across
> > the Atlantic, and not hit any problems. But maybe with the right
> > workload ...?
> >
> > There are two things being tested here. One is whether we have a
> > cross-linked node (ie a node that's in two trees at the same time).
> > The other is whether the slab allocator is giving us a node that already
> > contains non-NULL entries.
> >
> > If you could throw this on top of your kernel, we might stand a chance
> > of catching the problem sooner. If it is one of these problems and not
> > something weirder.
> >
>
> This fires in roughly 10 seconds for me on top of v6.11. Since array seems
> to always be 1, I'm not sure if the assertion is right, but hopefully you
> can trigger yourself.
Whoops.
$ git grep XA_RCU_FREE
lib/xarray.c:#define XA_RCU_FREE ((struct xarray *)1)
lib/xarray.c: node->array = XA_RCU_FREE;
so you walked into a node which is currently being freed by RCU. Which
isn't a problem, of course. I don't know why I do that; it doesn't seem
like anyone tests it. The jetlag is seriously kicking in right now,
so I'm going to refrain from saying anything more because it probably
won't be coherent.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-17 13:25 ` Matthew Wilcox
@ 2024-09-18 6:37 ` Jens Axboe
2024-09-18 9:28 ` Chris Mason
0 siblings, 1 reply; 81+ messages in thread
From: Jens Axboe @ 2024-09-18 6:37 UTC (permalink / raw)
To: Matthew Wilcox, Chris Mason
Cc: Linus Torvalds, Dave Chinner, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/17/24 7:25 AM, Matthew Wilcox wrote:
> On Tue, Sep 17, 2024 at 01:13:05PM +0200, Chris Mason wrote:
>> On 9/17/24 5:32 AM, Matthew Wilcox wrote:
>>> On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
>>>> I've got a bunch of assertions around incorrect folio->mapping and I'm
>>>> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
>>>> on those, and our systems do run pretty short on ram, so it feels right
>>>> at least. We'll see.
>>>
>>> I've been running with some variant of this patch the whole way across
>>> the Atlantic, and not hit any problems. But maybe with the right
>>> workload ...?
>>>
>>> There are two things being tested here. One is whether we have a
>>> cross-linked node (ie a node that's in two trees at the same time).
>>> The other is whether the slab allocator is giving us a node that already
>>> contains non-NULL entries.
>>>
>>> If you could throw this on top of your kernel, we might stand a chance
>>> of catching the problem sooner. If it is one of these problems and not
>>> something weirder.
>>>
>>
>> This fires in roughly 10 seconds for me on top of v6.11. Since array seems
>> to always be 1, I'm not sure if the assertion is right, but hopefully you
>> can trigger yourself.
>
> Whoops.
>
> $ git grep XA_RCU_FREE
> lib/xarray.c:#define XA_RCU_FREE ((struct xarray *)1)
> lib/xarray.c: node->array = XA_RCU_FREE;
>
> so you walked into a node which is currently being freed by RCU. Which
> isn't a problem, of course. I don't know why I do that; it doesn't seem
> like anyone tests it. The jetlag is seriously kicking in right now,
> so I'm going to refrain from saying anything more because it probably
> won't be coherent.
Based on a modified reproducer from Chris (N threads reading from a
file, M threads dropping pages), I can pretty quickly reproduce the
xas_descend() spin on 6.9 in a vm with 128 cpus. Here's some debugging
output with a modified version of your patch too, that ignores
XA_RCU_FREE:
node ffff8e838a01f788 max 59 parent 0000000000000000 shift 0 count 0 values 0 array ffff8e839dfa86a0 list ffff8e838a01f7a0 ffff8e838a01f7a0 marks 0 0 0
WARNING: CPU: 106 PID: 1554 at lib/xarray.c:405 xas_alloc.cold+0x26/0x4b
which is:
XA_NODE_BUG_ON(node, memchr_inv(&node->slots, 0, sizeof(void *) * XA_CHUN K_SIZE));
and:
node ffff8e838a01f788 offset 59 parent ffff8e838b0419c8 shift 0 count 252 values 0 array ffff8e839dfa86a0 list ffff8e838a01f7a0 ffff8e838a01f7a0 marks 0 0 0
which is:
XA_NODE_BUG_ON(node, node->count > XA_CHUNK_SIZE);
and for this particular run, 2 threads spinning:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-1 rcu_node (CPUs 16-31): P1555
rcu: Tasks blocked on level-1 rcu_node (CPUs 64-79): P1556
rcu: (detected by 97, t=2102 jiffies, g=7821, q=293800 ncpus=128)
task:reader state:R running task stack:0 pid:1555 tgid:1551 ppid:1 flags:0x00004006
Call Trace:
<TASK>
? __schedule+0x37f/0xaa0
? sysvec_apic_timer_interrupt+0x96/0xb0
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? xas_load+0x74/0xe0
? xas_load+0x10/0xe0
? xas_find+0x162/0x1b0
? find_lock_entries+0x1ac/0x360
? find_lock_entries+0x76/0x360
? mapping_try_invalidate+0x5d/0x130
? generic_fadvise+0x110/0x240
? xfd_validate_state+0x1e/0x70
? ksys_fadvise64_64+0x50/0x90
? __x64_sys_fadvise64+0x18/0x20
? do_syscall_64+0x5d/0x180
? entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
task:reader state:R running task stack:0 pid:1556 tgid:1551 ppid:1 flags:0x00004006
The reproducer takes ~30 seconds, and will lead to anywhere from 1..N
threads spinning here.
Now for the kicker - this doesn't reproduce in 6.10 and onwards. There
are only a few changes here that are relevant, seemingly, and the prime
candidates are:
commit a4864671ca0bf51c8e78242951741df52c06766f
Author: Kairui Song <kasong@tencent.com>
Date: Tue Apr 16 01:18:55 2024 +0800
lib/xarray: introduce a new helper xas_get_order
and the followup filemap change:
commit 6758c1128ceb45d1a35298912b974eb4895b7dd9
Author: Kairui Song <kasong@tencent.com>
Date: Tue Apr 16 01:18:56 2024 +0800
mm/filemap: optimize filemap folio adding
and reverting those two on 6.10 hits it again almost immediately. Didn't
look into these commit, but looks like they inadvertently also fixed
this corruption issue.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 7:14 ` Christian Theune
2024-09-16 12:16 ` Matthew Wilcox
@ 2024-09-18 8:31 ` Christian Theune
1 sibling, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-18 8:31 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Jens Axboe, Matthew Wilcox, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, clm, regressions,
regressions
> On 16. Sep 2024, at 09:14, Christian Theune <ct@flyingcircus.io> wrote:
>
>>
>> On 16. Sep 2024, at 02:00, Dave Chinner <david@fromorbit.com> wrote:
>>
>> I don't think this is a data corruption/loss problem - it certainly
>> hasn't ever appeared that way to me. The "data loss" appeared to be
>> in incomplete postgres dump files after the system was rebooted and
>> this is exactly what would happen when you randomly crash the
>> system. i.e. dirty data in memory is lost, and application data
>> being written at the time is in an inconsistent state after the
>> system recovers. IOWs, there was no clear evidence of actual data
>> corruption occuring, and data loss is definitely expected when the
>> page cache iteration hangs and the system is forcibly rebooted
>> without being able to sync or unmount the filesystems…
>> All the hangs seem to be caused by folio lookup getting stuck
>> on a rogue xarray entry in truncate or readahead. If we find an
>> invalid entry or a folio from a different mapping or with a
>> unexpected index, we skip it and try again. Hence this does not
>> appear to be a data corruption vector, either - it results in a
>> livelock from endless retry because of the bad entry in the xarray.
>> This endless retry livelock appears to be what is being reported.
>>
>> IOWs, there is no evidence of real runtime data corruption or loss
>> from this pagecache livelock bug. We also haven't heard of any
>> random file data corruption events since we've enabled large folios
>> on XFS. Hence there really is no evidence to indicate that there is
>> a large folio xarray lookup bug that results in data corruption in
>> the existing code, and therefore there is no obvious reason for
>> turning off the functionality we are already building significant
>> new functionality on top of.
I’ve been chewing more on this and reviewed the tickets I have. We did see a PostgreSQL database ending up reporting "ERROR: invalid page in block 30896 of relation base/16389/103292”.
My understanding of the argument that this bug does not corrupt data is that the error would only lead to a crash-consistent state. So applications that can properly recover from a crash-consistent state would only experience data loss to the point of the crash (which is fine and expected) but should not end up in a further corrupted state.
PostgreSQL reporting this error indicates - to my knowledge - that it did not see a crash consistent state of the file system.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 6:37 ` Jens Axboe
@ 2024-09-18 9:28 ` Chris Mason
2024-09-18 12:23 ` Chris Mason
2024-09-18 13:34 ` Matthew Wilcox
0 siblings, 2 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-18 9:28 UTC (permalink / raw)
To: Jens Axboe, Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
[-- Attachment #1: Type: text/plain, Size: 5699 bytes --]
One or more of the originally attached files triggered the rule module.access.rule.exestrip_notify
The following attachments were deleted from the original message.
radixcheck.py
Original Message:
On 9/18/24 2:37 AM, Jens Axboe wrote:
> On 9/17/24 7:25 AM, Matthew Wilcox wrote:
>> On Tue, Sep 17, 2024 at 01:13:05PM +0200, Chris Mason wrote:
>>> On 9/17/24 5:32 AM, Matthew Wilcox wrote:
>>>> On Mon, Sep 16, 2024 at 10:47:10AM +0200, Chris Mason wrote:
>>>>> I've got a bunch of assertions around incorrect folio->mapping and I'm
>>>>> trying to bash on the ENOMEM for readahead case. There's a GFP_NOWARN
>>>>> on those, and our systems do run pretty short on ram, so it feels right
>>>>> at least. We'll see.
>>>>
>>>> I've been running with some variant of this patch the whole way across
>>>> the Atlantic, and not hit any problems. But maybe with the right
>>>> workload ...?
>>>>
>>>> There are two things being tested here. One is whether we have a
>>>> cross-linked node (ie a node that's in two trees at the same time).
>>>> The other is whether the slab allocator is giving us a node that already
>>>> contains non-NULL entries.
>>>>
>>>> If you could throw this on top of your kernel, we might stand a chance
>>>> of catching the problem sooner. If it is one of these problems and not
>>>> something weirder.
>>>>
>>>
>>> This fires in roughly 10 seconds for me on top of v6.11. Since array seems
>>> to always be 1, I'm not sure if the assertion is right, but hopefully you
>>> can trigger yourself.
>>
>> Whoops.
>>
>> $ git grep XA_RCU_FREE
>> lib/xarray.c:#define XA_RCU_FREE ((struct xarray *)1)
>> lib/xarray.c: node->array = XA_RCU_FREE;
>>
>> so you walked into a node which is currently being freed by RCU. Which
>> isn't a problem, of course. I don't know why I do that; it doesn't seem
>> like anyone tests it. The jetlag is seriously kicking in right now,
>> so I'm going to refrain from saying anything more because it probably
>> won't be coherent.
>
> Based on a modified reproducer from Chris (N threads reading from a
> file, M threads dropping pages), I can pretty quickly reproduce the
> xas_descend() spin on 6.9 in a vm with 128 cpus. Here's some debugging
> output with a modified version of your patch too, that ignores
> XA_RCU_FREE:
Jens and I are running slightly different versions of reader.c, but we're
seeing the same thing. v6.11 is lasts all night long, and reverting those
two commits falls over in about 5 minutes or less.
I switched from a VM to bare metal, and managed to hit an assertion I'd
added to filemap_get_read_batch() (should look familiar):
{
struct address_space *fmapping = READ_ONCE(folio->mapping);
BUG_ON(fmapping && fmapping != mapping);
}
Walking the xarray in the crashdump shows that it's probably the same
corruption I saw in 5.19. drgn is printing like so:
print("0x%x mapping 0x%x radix index %d page index %d flags 0x%x (%s) size %d" % (page.address_of_(), page.mapping.value_(), index, page.index, page.flags, decode_page_flags(page), folio._folio_nr_pages))
And I attached radixcheck.py if you want to see the full script.
These are all from the correct mapping:
0xffffea0088b17200 mapping 0xffff88a22a9614e8 radix index 53 page index 53 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 59472
0xffffea008773e940 mapping 0xffff88a22a9614e8 radix index 54 page index 54 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 4244589144
0xffffea0084ad1d00 mapping 0xffff88a22a9614e8 radix index 55 page index 55 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 4040059330
0xffffea0088c9d840 mapping 0xffff88a22a9614e8 radix index 56 page index 56 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 5958
0xffffea00879c6300 mapping 0xffff88a22a9614e8 radix index 57 page index 57 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 112
0xffffea0086630980 mapping 0xffff88a22a9614e8 radix index 58 page index 58 flags 0x15ffff000000000c (PG_referenced|PG_uptodate|PG_reported) size 4025236287
0xffffea0008eb6580 mapping 0xffff88a22a9614e8 radix index 59 page index 59 flags 0x5ffff000000012c (PG_referenced|PG_uptodate|PG_lru|PG_active|PG_reported) size 269
0xffffea00072db000 mapping 0xffff88a22a9614e8 radix index 60 page index 60 flags 0x5ffff000000416c (PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|PG_private|PG_reported) size 4
0xffffea000919b600 mapping 0xffff88a22a9614e8 radix index 64 page index 64 flags 0x5ffff000000416c (PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|PG_private|PG_reported) size 4
These last 3 are not:
0xffffea0008fa7000 mapping 0xffff888124910768 radix index 208 page index 192 flags 0x5ffff000000416c (PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|PG_private|PG_reported) size 64
0xffffea0008fa7000 mapping 0xffff888124910768 radix index 224 page index 192 flags 0x5ffff000000416c (PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|PG_private|PG_reported) size 64
0xffffea0008fa7000 mapping 0xffff888124910768 radix index 240 page index 192 flags 0x5ffff000000416c (PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|PG_private|PG_reported) size 64
I think the bug was in __filemap_add_folio()'s usage of xarray_split_alloc()
and the tree changing before taking the lock. It's just a guess, but that
was always my biggest suspect.
To reproduce, I used:
mkfs.xfs -f <some device>
mount some_device /xfs
for x in `seq 1 8` ; do
fallocate -l100m /xfs/file$x
./reader /xfs/file$x &
done
New reader.c attached. Jens changed his so that every
reader thread was using its own offset in the file,
and he found that reproduced more consistently.
-chris
[-- Attachment #2: reader.c --]
[-- Type: text/plain, Size: 1808 bytes --]
/*
* gcc -Wall -o reader reader.c -lpthread
*/
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <unistd.h>
#include <errno.h>
#include <err.h>
#include <pthread.h>
struct thread_data {
int fd;
int read_size;
size_t size;
};
static void *drop_pages(void *arg)
{
struct thread_data *td = arg;
int ret;
while (1) {
ret = posix_fadvise(td->fd, 0, td->size, POSIX_FADV_DONTNEED);
if (ret < 0)
err(1, "fadvise dontneed");
}
return NULL;
}
#define READ_BUF (2 * 1024 * 1024)
static void *read_pages(void *arg)
{
struct thread_data *td = arg;
char buf[READ_BUF];
ssize_t ret;
loff_t offset = 8192;
while (1) {
ret = pread(td->fd, buf, td->read_size, offset);
if (ret < 0)
err(1, "read");
if (ret == 0)
break;
}
return NULL;
}
int main(int ac, char **av)
{
int fd;
int ret;
struct stat st;
int sizes[9] = { 0, 0, 8192, 16834, 32768, 65536, 128 * 1024, 256 * 1024, 1024 * 1024 };
int nr_tids = 9;
struct thread_data tds[9];
int i;
int sleeps = 0;
pthread_t tids[nr_tids];
if (ac != 2)
err(1, "usage: reader filename\n");
fd = open(av[1], O_RDONLY, 0600);
if (fd < 0)
err(1, "unable to open %s", av[1]);
ret = fstat(fd, &st);
if (ret < 0)
err(1, "stat");
for (i = 0; i < nr_tids; i++) {
struct thread_data *td = tds + i;
td->fd = fd;
td->size = st.st_size;
td->read_size = sizes[i];
if (i < 2)
ret = pthread_create(tids + i, NULL, drop_pages, td);
else
ret = pthread_create(tids + i, NULL, read_pages, td);
if (ret)
err(1, "pthread_create");
}
for (i = 0; i < nr_tids; i++) {
pthread_detach(tids[i]);
}
while(1) {
sleep(122);
sleeps++;
fprintf(stderr, ":%d:", sleeps * 122);
}
}
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-16 13:29 ` Matthew Wilcox
@ 2024-09-18 9:51 ` Christian Brauner
0 siblings, 0 replies; 81+ messages in thread
From: Christian Brauner @ 2024-09-18 9:51 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Pankaj Raghav, Luis Chamberlain, Jens Axboe,
Christian Theune, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, Dave Chinner, clm, regressions,
regressions
On Mon, Sep 16, 2024 at 02:29:49PM GMT, Matthew Wilcox wrote:
> On Fri, Sep 13, 2024 at 02:11:22PM +0200, Christian Brauner wrote:
> > So this issue it new to me as well. One of the items this cycle is the
> > work to enable support for block sizes that are larger than page sizes
> > via the large block size (LBS) series that's been sitting in -next for a
> > long time. That work specifically targets xfs and builds on top of the
> > large folio support.
> >
> > If the support for large folios is going to be reverted in xfs then I
> > see no point to merge the LBS work now. So I'm holding off on sending
> > that pull request until a decision is made (for xfs). As far as I
> > understand, supporting larger block sizes will not be meaningful without
> > large folio support.
>
> This is unwarranted; please send this pull request. We're not going to
> rip out all of the infrastructure although we might end up disabling it
> by default. There's a bunch of other work queued up behind that, and not
> having it in Linus' tree is just going to make everything more painful.
Now that there's a reproducer and hopefully soon a fix I think we can
try and merge this next week.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 9:28 ` Chris Mason
@ 2024-09-18 12:23 ` Chris Mason
2024-09-18 13:34 ` Matthew Wilcox
1 sibling, 0 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-18 12:23 UTC (permalink / raw)
To: Jens Axboe, Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/18/24 5:28 AM, Chris Mason wrote:
> And I attached radixcheck.py if you want to see the full script.
Since the attachment didn't actually make it through:
#!/usr/bin/env -S drgn -c vmcore
from drgn.helpers.linux.fs import *
from drgn.helpers.linux.mm import *
from drgn.helpers.linux.list import *
from drgn.helpers.linux.xarray import *
from drgn import *
import os
import sys
import time
mapping = Object(prog, 'struct address_space', address=0xffff88a22a9614e8)
#p = path_lookup(prog, sys.argv[1]);
#mapping = p.dentry.d_inode.i_mapping
for index, x in xa_for_each(mapping.i_pages.address_of_()):
if xa_is_zero(x):
continue
if xa_is_value(x):
continue
page = Object(prog, 'struct page', address=x)
folio = Object(prog, 'struct folio', address=x)
print("0x%x mapping 0x%x radix index %d page index %d flags 0x%x (%s) size %d" % (page.address_of_(), page.mapping.value_(), index, page.index, page.flags, decode_page_flags(page), folio._folio_nr_pages))
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 9:28 ` Chris Mason
2024-09-18 12:23 ` Chris Mason
@ 2024-09-18 13:34 ` Matthew Wilcox
2024-09-18 13:51 ` Linus Torvalds
2024-09-19 1:43 ` Dave Chinner
1 sibling, 2 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-18 13:34 UTC (permalink / raw)
To: Chris Mason
Cc: Jens Axboe, Linus Torvalds, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, Sep 18, 2024 at 11:28:52AM +0200, Chris Mason wrote:
> I think the bug was in __filemap_add_folio()'s usage of xarray_split_alloc()
> and the tree changing before taking the lock. It's just a guess, but that
> was always my biggest suspect.
Oh god, that's it.
there should have been an xas_reset() after calling xas_split_alloc().
and 6758c1128ceb calls xas_reset() after calling xas_split_alloc().
i wonder if xas_split_alloc() should call xas_reset() to prevent this
from ever being a problem again?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 13:34 ` Matthew Wilcox
@ 2024-09-18 13:51 ` Linus Torvalds
2024-09-18 14:12 ` Matthew Wilcox
2024-09-19 1:43 ` Dave Chinner
1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-18 13:51 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chris Mason, Jens Axboe, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, 18 Sept 2024 at 15:35, Matthew Wilcox <willy@infradead.org> wrote:
>
> Oh god, that's it.
>
> there should have been an xas_reset() after calling xas_split_alloc().
I think it is worse than that.
Even *without* an xas_split_alloc(), I think the old code was wrong,
because it drops the xas lock without doing the xas_reset.
> i wonder if xas_split_alloc() should call xas_reset() to prevent this
> from ever being a problem again?
See above: I think the code in filemap_add_folio() was buggy entirely
unrelated to the xas_split_alloc(), although it is probably *much*
easier to trigger issues with it (ie the alloc will just make any
races much bigger)
But even when it doesn't do the alloc, it takes and drops the lock,
and it's unclear how much xas state it just randomly re-uses over the
lock drop.
(Maybe none of the other operations end up mattering, but it does look
very wrong).
So I think it might be better to do the xas_reset() when you do the
xas_lock_irq(), no? Isn't _that_ the a more logical point where "any
old state is unreliable, now we need to reset the walk"?
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 13:51 ` Linus Torvalds
@ 2024-09-18 14:12 ` Matthew Wilcox
2024-09-18 14:39 ` Linus Torvalds
2024-09-18 16:37 ` Chris Mason
0 siblings, 2 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-18 14:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Chris Mason, Jens Axboe, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, Sep 18, 2024 at 03:51:39PM +0200, Linus Torvalds wrote:
> On Wed, 18 Sept 2024 at 15:35, Matthew Wilcox <willy@infradead.org> wrote:
> >
> > Oh god, that's it.
> >
> > there should have been an xas_reset() after calling xas_split_alloc().
>
> I think it is worse than that.
>
> Even *without* an xas_split_alloc(), I think the old code was wrong,
> because it drops the xas lock without doing the xas_reset.
That's actually OK. The first time around the loop, we haven't walked the
tree, so we start from the top as you'd expect. The only other reason to
go around the loop again is that memory allocation failed for a node, and
in that case we call xas_nomem() and that (effectively) calls xas_reset().
So in terms of the expected API for xa_state users, it would be consistent
for xas_split_alloc() to call xas_reset().
You might argue that this API is too subtle, but it was intended to
be easy to use. The problem was that xas_split_alloc() got added much
later and I forgot to maintain the invariant that makes it work as well
as be easy to use.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 14:12 ` Matthew Wilcox
@ 2024-09-18 14:39 ` Linus Torvalds
2024-09-18 17:12 ` Matthew Wilcox
2024-09-18 16:37 ` Chris Mason
1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-18 14:39 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chris Mason, Jens Axboe, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, 18 Sept 2024 at 16:12, Matthew Wilcox <willy@infradead.org> wrote:
>
>
> That's actually OK. The first time around the loop, we haven't walked the
> tree, so we start from the top as you'd expect. The only other reason to
> go around the loop again is that memory allocation failed for a node, and
> in that case we call xas_nomem() and that (effectively) calls xas_reset().
Well, that's quite subtle and undocumented. But yes, I see the
(open-coded) xas_reset() in xas_nomem().
So yes, in practice it seems to be only the xas_split_alloc() path in
there that can have this problem, but maybe this should at the very
least be very documented.
The fact that this bug was fixed basically entirely by mistake does
say "this is much too subtle".
Of course, the fact that an xas_reset() not only resets the walk, but
also clears any pending errors (because it's all the same "xa_node"
thing), doesn't make things more obvious. Because right now you
*could* treat errors as "cumulative", but if a xas_split_alloc() does
an xas_reset() on success, that means that it's actually a big
conceptual change and you can't do the "cumulative" thing any more.
End result: it would probably make sense to change "cas_split_alloc()"
to explicitly *not* have that "check xas_error() afterwards as if it
could be cumulative", and instead make it very clearly have no history
and change the semantics to
(a) return the error - instead of having people have to check for
errors separately afterwards
(b) do the xas_reset() in the success path
so that it explicitly does *not* work for accumulating previous errors
(which presumably was never really the intent of the interface, but
people certainly _could_ use it that way).
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 14:12 ` Matthew Wilcox
2024-09-18 14:39 ` Linus Torvalds
@ 2024-09-18 16:37 ` Chris Mason
1 sibling, 0 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-18 16:37 UTC (permalink / raw)
To: Matthew Wilcox, Linus Torvalds
Cc: Jens Axboe, Dave Chinner, Christian Theune, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/18/24 10:12 AM, Matthew Wilcox wrote:
> On Wed, Sep 18, 2024 at 03:51:39PM +0200, Linus Torvalds wrote:
>> On Wed, 18 Sept 2024 at 15:35, Matthew Wilcox <willy@infradead.org> wrote:
>>>
>>> Oh god, that's it.
>>>
>>> there should have been an xas_reset() after calling xas_split_alloc().
>>
>> I think it is worse than that.
>>
>> Even *without* an xas_split_alloc(), I think the old code was wrong,
>> because it drops the xas lock without doing the xas_reset.
>
> That's actually OK. The first time around the loop, we haven't walked the
> tree, so we start from the top as you'd expect. The only other reason to
> go around the loop again is that memory allocation failed for a node, and
> in that case we call xas_nomem() and that (effectively) calls xas_reset().
>
> So in terms of the expected API for xa_state users, it would be consistent
> for xas_split_alloc() to call xas_reset().
>
> You might argue that this API is too subtle, but it was intended to
> be easy to use. The problem was that xas_split_alloc() got added much
> later and I forgot to maintain the invariant that makes it work as well
> as be easy to use.
>
Ok, missing xas_reset() makes a ton of sense as the root cause, and it
also explains why tmpfs hasn't seen the problem.
We'll start validating 6.11 and make noise if the large folios cause
problems again. Thanks everyone!
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 14:39 ` Linus Torvalds
@ 2024-09-18 17:12 ` Matthew Wilcox
0 siblings, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-18 17:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Chris Mason, Jens Axboe, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, Sep 18, 2024 at 04:39:56PM +0200, Linus Torvalds wrote:
> The fact that this bug was fixed basically entirely by mistake does
> say "this is much too subtle".
Yup.
> Of course, the fact that an xas_reset() not only resets the walk, but
> also clears any pending errors (because it's all the same "xa_node"
> thing), doesn't make things more obvious. Because right now you
> *could* treat errors as "cumulative", but if a xas_split_alloc() does
> an xas_reset() on success, that means that it's actually a big
> conceptual change and you can't do the "cumulative" thing any more.
So ... the way xas was intended to work is that the first thing we did
that set an error meant that everything after it was a no-op. You
can see that in functions like xas_start() which do:
if (xas_error(xas))
return NULL;
obviously something like xas_unlock() isn't a noop because you still
want to unlock even if you had an error.
The xas_split_alloc() was done in too much of a hurry. I had thought
that I wouldn't need it, and then found out that it was a prerequisite
for something I needed to do, and so I wasn't in the right frame of mind
when I wrote it.
It's actually a giant pain and I wanted to redo it even before this, as
well as clear up some pieces from xas_nomem() / __xas_nomem(). The
restriction on "we can only split to one additional level" is awful,
and has caused some contortions elsewhere.
> End result: it would probably make sense to change "xas_split_alloc()"
> to explicitly *not* have that "check xas_error() afterwards as if it
> could be cumulative", and instead make it very clearly have no history
> and change the semantics to
What it really should do is just return if it's already in an error state.
That makes it consistent with the rest of the API, and we don't have to
worry about it losing an already-found error.
But also all the other infelicities with it need to be fixed.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-18 13:34 ` Matthew Wilcox
2024-09-18 13:51 ` Linus Torvalds
@ 2024-09-19 1:43 ` Dave Chinner
2024-09-19 3:03 ` Linus Torvalds
1 sibling, 1 reply; 81+ messages in thread
From: Dave Chinner @ 2024-09-19 1:43 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Chris Mason, Jens Axboe, Linus Torvalds, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, Sep 18, 2024 at 02:34:57PM +0100, Matthew Wilcox wrote:
> On Wed, Sep 18, 2024 at 11:28:52AM +0200, Chris Mason wrote:
> > I think the bug was in __filemap_add_folio()'s usage of xarray_split_alloc()
> > and the tree changing before taking the lock. It's just a guess, but that
> > was always my biggest suspect.
>
> Oh god, that's it.
>
> there should have been an xas_reset() after calling xas_split_alloc().
>
> and 6758c1128ceb calls xas_reset() after calling xas_split_alloc().
Should we be asking for 6758c1128ceb to be backported to all
stable kernels then?
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 1:43 ` Dave Chinner
@ 2024-09-19 3:03 ` Linus Torvalds
2024-09-19 3:12 ` Linus Torvalds
0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-19 3:03 UTC (permalink / raw)
To: Dave Chinner
Cc: Matthew Wilcox, Chris Mason, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Thu, 19 Sept 2024 at 03:43, Dave Chinner <david@fromorbit.com> wrote:
>
> Should we be asking for 6758c1128ceb to be backported to all
> stable kernels then?
I think we should just do the simple one-liner of adding a
"xas_reset()" to after doing xas_split_alloc() (or do it inside the
xas_split_alloc()).
That said, I do also think it would be really good if the 'xa_lock*()'
family of functions also had something like a
WARN_ON_ONCE(xas->xa_node && !xa_err(xas->xa_node));
which I think would have caught this. Because right now nothing at all
checks "we dropped the xa lock, and held xas state over it".
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 3:03 ` Linus Torvalds
@ 2024-09-19 3:12 ` Linus Torvalds
2024-09-19 3:38 ` Jens Axboe
2024-09-19 6:34 ` Christian Theune
0 siblings, 2 replies; 81+ messages in thread
From: Linus Torvalds @ 2024-09-19 3:12 UTC (permalink / raw)
To: Dave Chinner
Cc: Matthew Wilcox, Chris Mason, Jens Axboe, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I think we should just do the simple one-liner of adding a
> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
> xas_split_alloc()).
.. and obviously that should be actually *verified* to fix the issue
not just with the test-case that Chris and Jens have been using, but
on Christian's real PostgreSQL load.
Christian?
Note that the xas_reset() needs to be done after the check for errors
- or like Willy suggested, xas_split_alloc() needs to be re-organized.
So the simplest fix is probably to just add a
if (xas_error(&xas))
goto error;
}
+ xas_reset(&xas);
xas_lock_irq(&xas);
xas_for_each_conflict(&xas, entry) {
old = entry;
in __filemap_add_folio() in mm/filemap.c
(The above is obviously a whitespace-damaged pseudo-patch for the
pre-6758c1128ceb state. I don't actually carry a stable tree around on
my laptop, but I hope it's clear enough what I'm rambling about)
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 3:12 ` Linus Torvalds
@ 2024-09-19 3:38 ` Jens Axboe
2024-09-19 4:32 ` Linus Torvalds
2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 6:34 ` Christian Theune
1 sibling, 2 replies; 81+ messages in thread
From: Jens Axboe @ 2024-09-19 3:38 UTC (permalink / raw)
To: Linus Torvalds, Dave Chinner
Cc: Matthew Wilcox, Chris Mason, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/18/24 9:12 PM, Linus Torvalds wrote:
> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> I think we should just do the simple one-liner of adding a
>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
>> xas_split_alloc()).
>
> .. and obviously that should be actually *verified* to fix the issue
> not just with the test-case that Chris and Jens have been using, but
> on Christian's real PostgreSQL load.
>
> Christian?
>
> Note that the xas_reset() needs to be done after the check for errors
> - or like Willy suggested, xas_split_alloc() needs to be re-organized.
>
> So the simplest fix is probably to just add a
>
> if (xas_error(&xas))
> goto error;
> }
> + xas_reset(&xas);
> xas_lock_irq(&xas);
> xas_for_each_conflict(&xas, entry) {
> old = entry;
>
> in __filemap_add_folio() in mm/filemap.c
>
> (The above is obviously a whitespace-damaged pseudo-patch for the
> pre-6758c1128ceb state. I don't actually carry a stable tree around on
> my laptop, but I hope it's clear enough what I'm rambling about)
I kicked off a quick run with this on 6.9 with my debug patch as well,
and it still fails for me... I'll double check everything is sane. For
reference, below is the 6.9 filemap patch.
diff --git a/mm/filemap.c b/mm/filemap.c
index 30de18c4fd28..88093e2b7256 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
if (order > folio_order(folio))
xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
order, gfp);
+ xas_reset(&xas);
xas_lock_irq(&xas);
xas_for_each_conflict(&xas, entry) {
old = entry;
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 3:38 ` Jens Axboe
@ 2024-09-19 4:32 ` Linus Torvalds
2024-09-19 4:42 ` Jens Axboe
2024-09-19 4:36 ` Matthew Wilcox
1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-19 4:32 UTC (permalink / raw)
To: Jens Axboe
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Thu, 19 Sept 2024 at 05:38, Jens Axboe <axboe@kernel.dk> wrote:
>
> I kicked off a quick run with this on 6.9 with my debug patch as well,
> and it still fails for me... I'll double check everything is sane. For
> reference, below is the 6.9 filemap patch.
Ok, that's interesting. So it's *not* just about "that code didn't do
xas_reset() after xas_split_alloc()".
Now, another thing that commit 6758c1128ceb ("mm/filemap: optimize
filemap folio adding") does is that it now *only* calls xa_get_order()
under the xa lock, and then it verifies it against the
xas_split_alloc() that it did earlier.
The old code did "xas_split_alloc()" with one order (all outside the
lock), and then re-did the xas_get_order() lookup inside the lock. But
if it changed in between, it ended up doing the "xas_split()" with the
new order, even though "xas_split_alloc()" was done with the *old*
order.
That seems dangerous, and maybe the lack of xas_reset() was never the
*major* issue?
Willy? You know this code much better than I do. Maybe we should just
back-port 6758c1128ceb in its entirety.
Regardless, I'd want to make sure that we really understand the root
cause. Because it certainly looks like *just* the lack of xas_reset()
wasn't it.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 3:38 ` Jens Axboe
2024-09-19 4:32 ` Linus Torvalds
@ 2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 4:46 ` Jens Axboe
` (2 more replies)
1 sibling, 3 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-19 4:36 UTC (permalink / raw)
To: Jens Axboe
Cc: Linus Torvalds, Dave Chinner, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Wed, Sep 18, 2024 at 09:38:41PM -0600, Jens Axboe wrote:
> On 9/18/24 9:12 PM, Linus Torvalds wrote:
> > On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> I think we should just do the simple one-liner of adding a
> >> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
> >> xas_split_alloc()).
> >
> > .. and obviously that should be actually *verified* to fix the issue
> > not just with the test-case that Chris and Jens have been using, but
> > on Christian's real PostgreSQL load.
> >
> > Christian?
> >
> > Note that the xas_reset() needs to be done after the check for errors
> > - or like Willy suggested, xas_split_alloc() needs to be re-organized.
> >
> > So the simplest fix is probably to just add a
> >
> > if (xas_error(&xas))
> > goto error;
> > }
> > + xas_reset(&xas);
> > xas_lock_irq(&xas);
> > xas_for_each_conflict(&xas, entry) {
> > old = entry;
> >
> > in __filemap_add_folio() in mm/filemap.c
> >
> > (The above is obviously a whitespace-damaged pseudo-patch for the
> > pre-6758c1128ceb state. I don't actually carry a stable tree around on
> > my laptop, but I hope it's clear enough what I'm rambling about)
>
> I kicked off a quick run with this on 6.9 with my debug patch as well,
> and it still fails for me... I'll double check everything is sane. For
> reference, below is the 6.9 filemap patch.
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 30de18c4fd28..88093e2b7256 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
> if (order > folio_order(folio))
> xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
> order, gfp);
> + xas_reset(&xas);
> xas_lock_irq(&xas);
> xas_for_each_conflict(&xas, entry) {
> old = entry;
My brain is still mushy, but I think there is still a problem (both with
the simple fix for 6.9 and indeed with 6.10).
For splitting a folio, we have the folio locked, so we know it's not
going anywhere. The tree may get rearranged around it while we don't
have the xa_lock, but we're somewhat protected.
In this case we're splitting something that was, at one point, a shadow
entry. There's no struct there to lock. So I think we can have a
situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9)
into the nodes we allocate in xas_split_alloc(). In 6.10, that's at
least guaranteed to be a shadow entry, but in 6.9, it might already be a
folio by this point because we've raced with something else also doing a
split.
Probably xas_split_alloc() needs to just do the alloc, like the name
says, and drop the 'entry' argument. ICBW, but I think it explains
what you're seeing? Maybe it doesn't?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 4:32 ` Linus Torvalds
@ 2024-09-19 4:42 ` Jens Axboe
0 siblings, 0 replies; 81+ messages in thread
From: Jens Axboe @ 2024-09-19 4:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On 9/18/24 10:32 PM, Linus Torvalds wrote:
> On Thu, 19 Sept 2024 at 05:38, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> I kicked off a quick run with this on 6.9 with my debug patch as well,
>> and it still fails for me... I'll double check everything is sane. For
>> reference, below is the 6.9 filemap patch.
Confirmed with a few more runs, still hits, basically as quickly as it
did before. So no real change observed with the added xas_reset().
> Ok, that's interesting. So it's *not* just about "that code didn't do
> xas_reset() after xas_split_alloc()".
>
> Now, another thing that commit 6758c1128ceb ("mm/filemap: optimize
> filemap folio adding") does is that it now *only* calls xa_get_order()
> under the xa lock, and then it verifies it against the
> xas_split_alloc() that it did earlier.
>
> The old code did "xas_split_alloc()" with one order (all outside the
> lock), and then re-did the xas_get_order() lookup inside the lock. But
> if it changed in between, it ended up doing the "xas_split()" with the
> new order, even though "xas_split_alloc()" was done with the *old*
> order.
>
> That seems dangerous, and maybe the lack of xas_reset() was never the
> *major* issue?
>
> Willy? You know this code much better than I do. Maybe we should just
> back-port 6758c1128ceb in its entirety.
>
> Regardless, I'd want to make sure that we really understand the root
> cause. Because it certainly looks like *just* the lack of xas_reset()
> wasn't it.
Just for sanity's sake, I backported 6758c1128ceb (and the associated
xarray xas_get_order() change) to 6.9 and kicked that off.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 4:36 ` Matthew Wilcox
@ 2024-09-19 4:46 ` Jens Axboe
2024-09-19 5:20 ` Jens Axboe
2024-09-19 4:46 ` Linus Torvalds
2024-09-20 13:54 ` Chris Mason
2 siblings, 1 reply; 81+ messages in thread
From: Jens Axboe @ 2024-09-19 4:46 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On 9/18/24 10:36 PM, Matthew Wilcox wrote:
> On Wed, Sep 18, 2024 at 09:38:41PM -0600, Jens Axboe wrote:
>> On 9/18/24 9:12 PM, Linus Torvalds wrote:
>>> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>>
>>>> I think we should just do the simple one-liner of adding a
>>>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
>>>> xas_split_alloc()).
>>>
>>> .. and obviously that should be actually *verified* to fix the issue
>>> not just with the test-case that Chris and Jens have been using, but
>>> on Christian's real PostgreSQL load.
>>>
>>> Christian?
>>>
>>> Note that the xas_reset() needs to be done after the check for errors
>>> - or like Willy suggested, xas_split_alloc() needs to be re-organized.
>>>
>>> So the simplest fix is probably to just add a
>>>
>>> if (xas_error(&xas))
>>> goto error;
>>> }
>>> + xas_reset(&xas);
>>> xas_lock_irq(&xas);
>>> xas_for_each_conflict(&xas, entry) {
>>> old = entry;
>>>
>>> in __filemap_add_folio() in mm/filemap.c
>>>
>>> (The above is obviously a whitespace-damaged pseudo-patch for the
>>> pre-6758c1128ceb state. I don't actually carry a stable tree around on
>>> my laptop, but I hope it's clear enough what I'm rambling about)
>>
>> I kicked off a quick run with this on 6.9 with my debug patch as well,
>> and it still fails for me... I'll double check everything is sane. For
>> reference, below is the 6.9 filemap patch.
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 30de18c4fd28..88093e2b7256 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
>> if (order > folio_order(folio))
>> xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
>> order, gfp);
>> + xas_reset(&xas);
>> xas_lock_irq(&xas);
>> xas_for_each_conflict(&xas, entry) {
>> old = entry;
>
> My brain is still mushy, but I think there is still a problem (both with
> the simple fix for 6.9 and indeed with 6.10).
>
> For splitting a folio, we have the folio locked, so we know it's not
> going anywhere. The tree may get rearranged around it while we don't
> have the xa_lock, but we're somewhat protected.
>
> In this case we're splitting something that was, at one point, a shadow
> entry. There's no struct there to lock. So I think we can have a
> situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9)
> into the nodes we allocate in xas_split_alloc(). In 6.10, that's at
> least guaranteed to be a shadow entry, but in 6.9, it might already be a
> folio by this point because we've raced with something else also doing a
> split.
>
> Probably xas_split_alloc() needs to just do the alloc, like the name
> says, and drop the 'entry' argument. ICBW, but I think it explains
> what you're seeing? Maybe it doesn't?
Since I can hit it pretty reliably and quickly, I'm happy to test
whatever you want on top of 6.9. From the other email, I backported:
a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
6758c1128ceb ("mm/filemap: optimize filemap folio adding")
to 6.9 and kicked off a test with that 5 min ago, and it's still going.
I'd say with 90% confidence that it should've hit already, but let's
leave it churning for an hour and see what pops out the other end.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 4:46 ` Jens Axboe
@ 2024-09-19 4:46 ` Linus Torvalds
2024-09-20 13:54 ` Chris Mason
2 siblings, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2024-09-19 4:46 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Jens Axboe, Dave Chinner, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Thu, 19 Sept 2024 at 06:36, Matthew Wilcox <willy@infradead.org> wrote:
>
> Probably xas_split_alloc() needs to just do the alloc, like the name
> says, and drop the 'entry' argument. ICBW, but I think it explains
> what you're seeing? Maybe it doesn't?
.. or we make the rule be that you have to re-check that the order and
the entry still matches when you do the actual xas_split()..
Like commit 6758c1128ceb does, in this case.
We do have another xas_split_alloc() - in the hugepage case - but
there we do have
xas_lock(&xas);
xas_reset(&xas);
if (xas_load(&xas) != folio)
goto fail;
and the folio is locked over the whole sequence, so I think that code
is probably safe and guarantees that we're splitting with the same
details we alloc'ed.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 4:46 ` Jens Axboe
@ 2024-09-19 5:20 ` Jens Axboe
0 siblings, 0 replies; 81+ messages in thread
From: Jens Axboe @ 2024-09-19 5:20 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Dave Chinner, Chris Mason, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On 9/18/24 10:46 PM, Jens Axboe wrote:
> On 9/18/24 10:36 PM, Matthew Wilcox wrote:
>> On Wed, Sep 18, 2024 at 09:38:41PM -0600, Jens Axboe wrote:
>>> On 9/18/24 9:12 PM, Linus Torvalds wrote:
>>>> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
>>>> <torvalds@linux-foundation.org> wrote:
>>>>>
>>>>> I think we should just do the simple one-liner of adding a
>>>>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
>>>>> xas_split_alloc()).
>>>>
>>>> .. and obviously that should be actually *verified* to fix the issue
>>>> not just with the test-case that Chris and Jens have been using, but
>>>> on Christian's real PostgreSQL load.
>>>>
>>>> Christian?
>>>>
>>>> Note that the xas_reset() needs to be done after the check for errors
>>>> - or like Willy suggested, xas_split_alloc() needs to be re-organized.
>>>>
>>>> So the simplest fix is probably to just add a
>>>>
>>>> if (xas_error(&xas))
>>>> goto error;
>>>> }
>>>> + xas_reset(&xas);
>>>> xas_lock_irq(&xas);
>>>> xas_for_each_conflict(&xas, entry) {
>>>> old = entry;
>>>>
>>>> in __filemap_add_folio() in mm/filemap.c
>>>>
>>>> (The above is obviously a whitespace-damaged pseudo-patch for the
>>>> pre-6758c1128ceb state. I don't actually carry a stable tree around on
>>>> my laptop, but I hope it's clear enough what I'm rambling about)
>>>
>>> I kicked off a quick run with this on 6.9 with my debug patch as well,
>>> and it still fails for me... I'll double check everything is sane. For
>>> reference, below is the 6.9 filemap patch.
>>>
>>> diff --git a/mm/filemap.c b/mm/filemap.c
>>> index 30de18c4fd28..88093e2b7256 100644
>>> --- a/mm/filemap.c
>>> +++ b/mm/filemap.c
>>> @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
>>> if (order > folio_order(folio))
>>> xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
>>> order, gfp);
>>> + xas_reset(&xas);
>>> xas_lock_irq(&xas);
>>> xas_for_each_conflict(&xas, entry) {
>>> old = entry;
>>
>> My brain is still mushy, but I think there is still a problem (both with
>> the simple fix for 6.9 and indeed with 6.10).
>>
>> For splitting a folio, we have the folio locked, so we know it's not
>> going anywhere. The tree may get rearranged around it while we don't
>> have the xa_lock, but we're somewhat protected.
>>
>> In this case we're splitting something that was, at one point, a shadow
>> entry. There's no struct there to lock. So I think we can have a
>> situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9)
>> into the nodes we allocate in xas_split_alloc(). In 6.10, that's at
>> least guaranteed to be a shadow entry, but in 6.9, it might already be a
>> folio by this point because we've raced with something else also doing a
>> split.
>>
>> Probably xas_split_alloc() needs to just do the alloc, like the name
>> says, and drop the 'entry' argument. ICBW, but I think it explains
>> what you're seeing? Maybe it doesn't?
>
> Since I can hit it pretty reliably and quickly, I'm happy to test
> whatever you want on top of 6.9. From the other email, I backported:
>
> a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
> 6758c1128ceb ("mm/filemap: optimize filemap folio adding")
>
> to 6.9 and kicked off a test with that 5 min ago, and it's still going.
> I'd say with 90% confidence that it should've hit already, but let's
> leave it churning for an hour and see what pops out the other end.
45 min later, I think I can conclusively call the backport of those two
on top of 6.9 good.
Below is what I'm running, which is those two commits (modulo the test
bits, for clarify). Rather than attempt to fix this differently for 6.9,
perhaps not a bad idea to just get those two into stable? It's not a lot
of churn, and at least that keeps it consistent rather than doing
something differently for stable.
I'll try and do a patch that just ensures the order is consistent across
lock cycles as Linus suggested, just to verify that this is indeed the
main issue. Will keep the xas_reset() as well.
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index cb571dfcf4b1..da2f5bba7944 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1548,6 +1551,7 @@ void xas_create_range(struct xa_state *);
#ifdef CONFIG_XARRAY_MULTI
int xa_get_order(struct xarray *, unsigned long index);
+int xas_get_order(struct xa_state *xas);
void xas_split(struct xa_state *, void *entry, unsigned int order);
void xas_split_alloc(struct xa_state *, void *entry, unsigned int order, gfp_t);
#else
@@ -1556,6 +1560,11 @@ static inline int xa_get_order(struct xarray *xa, unsigned long index)
return 0;
}
+static inline int xas_get_order(struct xa_state *xas)
+{
+ return 0;
+}
+
static inline void xas_split(struct xa_state *xas, void *entry,
unsigned int order)
{
diff --git a/lib/xarray.c b/lib/xarray.c
index 5e7d6334d70d..c0514fb16d33 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1765,39 +1780,52 @@ void *xa_store_range(struct xarray *xa, unsigned long first,
EXPORT_SYMBOL(xa_store_range);
/**
- * xa_get_order() - Get the order of an entry.
- * @xa: XArray.
- * @index: Index of the entry.
+ * xas_get_order() - Get the order of an entry.
+ * @xas: XArray operation state.
+ *
+ * Called after xas_load, the xas should not be in an error state.
*
* Return: A number between 0 and 63 indicating the order of the entry.
*/
-int xa_get_order(struct xarray *xa, unsigned long index)
+int xas_get_order(struct xa_state *xas)
{
- XA_STATE(xas, xa, index);
- void *entry;
int order = 0;
- rcu_read_lock();
- entry = xas_load(&xas);
-
- if (!entry)
- goto unlock;
-
- if (!xas.xa_node)
- goto unlock;
+ if (!xas->xa_node)
+ return 0;
for (;;) {
- unsigned int slot = xas.xa_offset + (1 << order);
+ unsigned int slot = xas->xa_offset + (1 << order);
if (slot >= XA_CHUNK_SIZE)
break;
- if (!xa_is_sibling(xas.xa_node->slots[slot]))
+ if (!xa_is_sibling(xa_entry(xas->xa, xas->xa_node, slot)))
break;
order++;
}
- order += xas.xa_node->shift;
-unlock:
+ order += xas->xa_node->shift;
+ return order;
+}
+EXPORT_SYMBOL_GPL(xas_get_order);
+
+/**
+ * xa_get_order() - Get the order of an entry.
+ * @xa: XArray.
+ * @index: Index of the entry.
+ *
+ * Return: A number between 0 and 63 indicating the order of the entry.
+ */
+int xa_get_order(struct xarray *xa, unsigned long index)
+{
+ XA_STATE(xas, xa, index);
+ int order = 0;
+ void *entry;
+
+ rcu_read_lock();
+ entry = xas_load(&xas);
+ if (entry)
+ order = xas_get_order(&xas);
rcu_read_unlock();
return order;
diff --git a/mm/filemap.c b/mm/filemap.c
index 30de18c4fd28..b8d525825d3f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -852,7 +852,9 @@ noinline int __filemap_add_folio(struct address_space *mapping,
struct folio *folio, pgoff_t index, gfp_t gfp, void **shadowp)
{
XA_STATE(xas, &mapping->i_pages, index);
- bool huge = folio_test_hugetlb(folio);
+ void *alloced_shadow = NULL;
+ int alloced_order = 0;
+ bool huge;
bool charged = false;
long nr = 1;
@@ -869,6 +871,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
xas_set_order(&xas, index, folio_order(folio));
+ huge = folio_test_hugetlb(folio);
nr = folio_nr_pages(folio);
gfp &= GFP_RECLAIM_MASK;
@@ -876,13 +879,10 @@ noinline int __filemap_add_folio(struct address_space *mapping,
folio->mapping = mapping;
folio->index = xas.xa_index;
- do {
- unsigned int order = xa_get_order(xas.xa, xas.xa_index);
+ for (;;) {
+ int order = -1, split_order = 0;
void *entry, *old = NULL;
- if (order > folio_order(folio))
- xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
- order, gfp);
xas_lock_irq(&xas);
xas_for_each_conflict(&xas, entry) {
old = entry;
@@ -890,19 +890,33 @@ noinline int __filemap_add_folio(struct address_space *mapping,
xas_set_err(&xas, -EEXIST);
goto unlock;
}
+ /*
+ * If a larger entry exists,
+ * it will be the first and only entry iterated.
+ */
+ if (order == -1)
+ order = xas_get_order(&xas);
+ }
+
+ /* entry may have changed before we re-acquire the lock */
+ if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
+ xas_destroy(&xas);
+ alloced_order = 0;
}
if (old) {
- if (shadowp)
- *shadowp = old;
- /* entry may have been split before we acquired lock */
- order = xa_get_order(xas.xa, xas.xa_index);
- if (order > folio_order(folio)) {
+ if (order > 0 && order > folio_order(folio)) {
/* How to handle large swap entries? */
BUG_ON(shmem_mapping(mapping));
+ if (!alloced_order) {
+ split_order = order;
+ goto unlock;
+ }
xas_split(&xas, old, order);
xas_reset(&xas);
}
+ if (shadowp)
+ *shadowp = old;
}
xas_store(&xas, folio);
@@ -918,9 +932,24 @@ noinline int __filemap_add_folio(struct address_space *mapping,
__lruvec_stat_mod_folio(folio,
NR_FILE_THPS, nr);
}
+
unlock:
xas_unlock_irq(&xas);
- } while (xas_nomem(&xas, gfp));
+
+ /* split needed, alloc here and retry. */
+ if (split_order) {
+ xas_split_alloc(&xas, old, split_order, gfp);
+ if (xas_error(&xas))
+ goto error;
+ alloced_shadow = old;
+ alloced_order = split_order;
+ xas_reset(&xas);
+ continue;
+ }
+
+ if (!xas_nomem(&xas, gfp))
+ break;
+ }
if (xas_error(&xas))
goto error;
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 3:12 ` Linus Torvalds
2024-09-19 3:38 ` Jens Axboe
@ 2024-09-19 6:34 ` Christian Theune
2024-09-19 6:57 ` Linus Torvalds
1 sibling, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-09-19 6:34 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
> On 19. Sep 2024, at 05:12, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> I think we should just do the simple one-liner of adding a
>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
>> xas_split_alloc()).
>
> .. and obviously that should be actually *verified* to fix the issue
> not just with the test-case that Chris and Jens have been using, but
> on Christian's real PostgreSQL load.
>
> Christian?
Happy to! I see there’s still some back and forth on the specific patches. Let me know which kernel version and which patches I should start trying out. I’m loosing track while following the discussion.
In preparation: I’m wondering whether the known reproducer gives insight how I might force my load to trigger it more easily? Would running the reproducer above and combining that with a running PostgreSQL benchmark make sense?
Otherwise we’d likely only be getting insight after weeks of not seeing crashes …
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 6:34 ` Christian Theune
@ 2024-09-19 6:57 ` Linus Torvalds
2024-09-19 10:19 ` Christian Theune
0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-19 6:57 UTC (permalink / raw)
To: Christian Theune
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On Thu, 19 Sept 2024 at 08:35, Christian Theune <ct@flyingcircus.io> wrote:
>
> Happy to! I see there’s still some back and forth on the specific
> patches. Let me know which kernel version and which patches I should
> start trying out. I’m loosing track while following the discussion.
Yeah, right now Jens is still going to run some more testing, but I
think the plan is to just backport
a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
6758c1128ceb ("mm/filemap: optimize filemap folio adding")
and I think we're at the point where you might as well start testing
that if you have the cycles for it. Jens is mostly trying to confirm
the root cause, but even without that, I think you running your load
with those two changes back-ported is worth it.
(Or even just try running it on plain 6.10 or 6.11, both of which
already has those commits)
> In preparation: I’m wondering whether the known reproducer gives
> insight how I might force my load to trigger it more easily? Would
> running the reproducer above and combining that with a running
> PostgreSQL benchmark make sense?
>
> Otherwise we’d likely only be getting insight after weeks of not
> seeing crashes …
So considering how well the reproducer works for Jens and Chris, my
main worry is whether your load might have some _additional_ issue.
Unlikely, but still .. The two commits fix the repproducer, so I think
the important thing to make sure is that it really fixes the original
issue too.
And yeah, I'd be surprised if it doesn't, but at the same time I would
_not_ suggest you try to make your load look more like the case we
already know gets fixed.
So yes, it will be "weeks of not seeing crashes" until we'd be
_really_ confident it's all the same thing, but I'd rather still have
you test that, than test something else than what caused issues
originally, if you see what I mean.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 6:57 ` Linus Torvalds
@ 2024-09-19 10:19 ` Christian Theune
2024-09-30 17:34 ` Christian Theune
0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-09-19 10:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
> On 19. Sep 2024, at 08:57, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> Yeah, right now Jens is still going to run some more testing, but I
> think the plan is to just backport
>
> a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
> 6758c1128ceb ("mm/filemap: optimize filemap folio adding")
>
> and I think we're at the point where you might as well start testing
> that if you have the cycles for it. Jens is mostly trying to confirm
> the root cause, but even without that, I think you running your load
> with those two changes back-ported is worth it.
>
> (Or even just try running it on plain 6.10 or 6.11, both of which
> already has those commits)
I’ve discussed this with my team and we’re preparing to switch all our
non-prod machines as well as those production machines that have shown
the error before.
This will require a bit of user communication and reboot scheduling.
Our release prep will be able to roll this out starting early next week
and the production machines in question around Sept 30.
We would run with 6.11 as our understanding so far is that running the
most current kernel would generate the most insight and is easier to
work with for you all?
(Generally we run the mostly vanilla LTS that has surpassed x.y.50+ so
we might later downgrade to 6.6 when this is fixed.)
> So considering how well the reproducer works for Jens and Chris, my
> main worry is whether your load might have some _additional_ issue.
>
> Unlikely, but still .. The two commits fix the repproducer, so I think
> the important thing to make sure is that it really fixes the original
> issue too.
>
> And yeah, I'd be surprised if it doesn't, but at the same time I would
> _not_ suggest you try to make your load look more like the case we
> already know gets fixed.
>
> So yes, it will be "weeks of not seeing crashes" until we'd be
> _really_ confident it's all the same thing, but I'd rather still have
> you test that, than test something else than what caused issues
> originally, if you see what I mean.
Agreed, I’m all onboard with that.
Liebe Grüße,
Christian Theune
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 4:46 ` Jens Axboe
2024-09-19 4:46 ` Linus Torvalds
@ 2024-09-20 13:54 ` Chris Mason
2024-09-24 15:58 ` Matthew Wilcox
` (2 more replies)
2 siblings, 3 replies; 81+ messages in thread
From: Chris Mason @ 2024-09-20 13:54 UTC (permalink / raw)
To: Matthew Wilcox, Jens Axboe
Cc: Linus Torvalds, Dave Chinner, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/19/24 12:36 AM, Matthew Wilcox wrote:
> On Wed, Sep 18, 2024 at 09:38:41PM -0600, Jens Axboe wrote:
>> On 9/18/24 9:12 PM, Linus Torvalds wrote:
>>> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>>
>>>> I think we should just do the simple one-liner of adding a
>>>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the
>>>> xas_split_alloc()).
>>>
>>> .. and obviously that should be actually *verified* to fix the issue
>>> not just with the test-case that Chris and Jens have been using, but
>>> on Christian's real PostgreSQL load.
>>>
>>> Christian?
>>>
>>> Note that the xas_reset() needs to be done after the check for errors
>>> - or like Willy suggested, xas_split_alloc() needs to be re-organized.
>>>
>>> So the simplest fix is probably to just add a
>>>
>>> if (xas_error(&xas))
>>> goto error;
>>> }
>>> + xas_reset(&xas);
>>> xas_lock_irq(&xas);
>>> xas_for_each_conflict(&xas, entry) {
>>> old = entry;
>>>
>>> in __filemap_add_folio() in mm/filemap.c
>>>
>>> (The above is obviously a whitespace-damaged pseudo-patch for the
>>> pre-6758c1128ceb state. I don't actually carry a stable tree around on
>>> my laptop, but I hope it's clear enough what I'm rambling about)
>>
>> I kicked off a quick run with this on 6.9 with my debug patch as well,
>> and it still fails for me... I'll double check everything is sane. For
>> reference, below is the 6.9 filemap patch.
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 30de18c4fd28..88093e2b7256 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping,
>> if (order > folio_order(folio))
>> xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
>> order, gfp);
>> + xas_reset(&xas);
>> xas_lock_irq(&xas);
>> xas_for_each_conflict(&xas, entry) {
>> old = entry;
>
> My brain is still mushy, but I think there is still a problem (both with
> the simple fix for 6.9 and indeed with 6.10).
>
> For splitting a folio, we have the folio locked, so we know it's not
> going anywhere. The tree may get rearranged around it while we don't
> have the xa_lock, but we're somewhat protected.
>
> In this case we're splitting something that was, at one point, a shadow
> entry. There's no struct there to lock. So I think we can have a
> situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9)
> into the nodes we allocate in xas_split_alloc(). In 6.10, that's at
> least guaranteed to be a shadow entry, but in 6.9, it might already be a
> folio by this point because we've raced with something else also doing a
> split.
>
> Probably xas_split_alloc() needs to just do the alloc, like the name
> says, and drop the 'entry' argument. ICBW, but I think it explains
> what you're seeing? Maybe it doesn't?
Jens and I went through a lot of iterations making the repro more
reliable, and we were able to pretty consistently show a UAF with
the debug code that Willy suggested:
XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
But, I didn't really catch what Willy was saying about xas_split_alloc()
until this morning.
xas_split_alloc() does the allocation and also shoves an entry into some of
the slots. When the tree changes, the entry we've stored is wildly
wrong, but xas_reset() doesn't undo any of that. So when we actually
use the xas->xa_alloc nodes we've setup, they are pointing to the
wrong things.
Which is probably why the commits in 6.10 added this:
/* entry may have changed before we re-acquire the lock */
if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
xas_destroy(&xas);
alloced_order = 0;
}
The only way to undo the work done by xas_split_alloc() is to call
xas_destroy().
To prove this theory, I tried making a minimal version that also
called destroy, but it all ended up less minimal than the code
that's actually in 6.10. I've got a long test going now with
an extra cond_resched() to make the race bigger, and a printk of victory.
It hasn't fired yet, and I need to hop on an airplane, so I'll just leave
it running for now. But long story short, I think we should probably
just tag all of these for stable:
https://lore.kernel.org/all/20240415171857.19244-2-ryncsn@gmail.com/T/#mdb85922624c39ea7efb775a044af4731890ff776
Also, Willy's proposed changes to xas_split_alloc() seem like a good
idea.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-20 13:54 ` Chris Mason
@ 2024-09-24 15:58 ` Matthew Wilcox
2024-09-24 17:16 ` Sam James
2024-09-24 19:17 ` Chris Mason
2 siblings, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-24 15:58 UTC (permalink / raw)
To: Chris Mason
Cc: Jens Axboe, Linus Torvalds, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Fri, Sep 20, 2024 at 03:54:55PM +0200, Chris Mason wrote:
> On 9/19/24 12:36 AM, Matthew Wilcox wrote:
> > My brain is still mushy, but I think there is still a problem (both with
> > the simple fix for 6.9 and indeed with 6.10).
> >
> > For splitting a folio, we have the folio locked, so we know it's not
> > going anywhere. The tree may get rearranged around it while we don't
> > have the xa_lock, but we're somewhat protected.
> >
> > In this case we're splitting something that was, at one point, a shadow
> > entry. There's no struct there to lock. So I think we can have a
> > situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9)
> > into the nodes we allocate in xas_split_alloc(). In 6.10, that's at
> > least guaranteed to be a shadow entry, but in 6.9, it might already be a
> > folio by this point because we've raced with something else also doing a
> > split.
> >
> > Probably xas_split_alloc() needs to just do the alloc, like the name
> > says, and drop the 'entry' argument. ICBW, but I think it explains
> > what you're seeing? Maybe it doesn't?
>
> Jens and I went through a lot of iterations making the repro more
> reliable, and we were able to pretty consistently show a UAF with
> the debug code that Willy suggested:
>
> XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
>
> But, I didn't really catch what Willy was saying about xas_split_alloc()
> until this morning.
>
> xas_split_alloc() does the allocation and also shoves an entry into some of
> the slots. When the tree changes, the entry we've stored is wildly
> wrong, but xas_reset() doesn't undo any of that. So when we actually
> use the xas->xa_alloc nodes we've setup, they are pointing to the
> wrong things.
>
> Which is probably why the commits in 6.10 added this:
>
> /* entry may have changed before we re-acquire the lock */
> if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
> xas_destroy(&xas);
> alloced_order = 0;
> }
>
> The only way to undo the work done by xas_split_alloc() is to call
> xas_destroy().
I hadn't fully understood this until today. Here's what the code in 6.9
did (grossly simplified):
do {
unsigned int order = xa_get_order(xas.xa, xas.xa_index);
if (order > folio_order(folio))
xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
order, gfp);
xas_lock_irq(&xas);
if (old) {
order = xa_get_order(xas.xa, xas.xa_index);
if (order > folio_order(folio)) {
xas_split(&xas, old, order);
}
}
xas_store(&xas, folio);
xas_unlock_irq(&xas);
} while (xas_nomem(&xas, gfp));
The intent was that xas_store() would use the node allocated by
xas_nomem() and xas_split() would use the nodes allocated by
xas_split_alloc(). That doesn't end up happening if the split already
happened before getting the lock. So if we were looking for a minimal
fix for pre-6.10, calling xas_destroy if we don't call xas_split()
would fix the problem. But I think we're better off backporting the
6.10 patches.
For 6.12, I'm going to put this in -next:
http://git.infradead.org/?p=users/willy/xarray.git;a=commitdiff;h=6684aba0780da9f505c202f27e68ee6d18c0aa66
and then send it to Linus in a couple of weeks as an "obviously correct"
bit of hardening. We really should have called xas_reset() before
retaking the lock.
Beyond that, I really want to revisit how, when and what we split.
A few months ago we came to the realisation that splitting order-9
folios to 512 order-0 folios was just legacy thinking. What each user
really wants is to specify a precise page and say "I want this page to
end up in a folio that is of order N" (where N is smaller than the order
of the folio that it's currently in). That is, if we truncate a file
which is currently a multiple of 2MB in size to one which has a tail of,
say, 13377ea bytes, we'd want to create a 1MB folio which we leave at
the end of the file, then a 512kB folio which we free, then a 256kB
folio which we keep, a 128kB folio which we discard, a 64kB folio which
we discard, ...
So we need to do that first, then all this code becomes way easier and
xas_split_alloc() no longer needs to fill in the node at the wrong time.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-20 13:54 ` Chris Mason
2024-09-24 15:58 ` Matthew Wilcox
@ 2024-09-24 17:16 ` Sam James
2024-09-25 16:06 ` Kairui Song
2024-09-24 19:17 ` Chris Mason
2 siblings, 1 reply; 81+ messages in thread
From: Sam James @ 2024-09-24 17:16 UTC (permalink / raw)
To: clm, stable, Kairui Song, Matthew Wilcox
Cc: axboe, ct, david, dqminh, linux-fsdevel, linux-kernel, linux-mm,
linux-xfs, regressions, regressions, torvalds
Kairui, could you send them to the stable ML to be queued if Willy is
fine with it?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-20 13:54 ` Chris Mason
2024-09-24 15:58 ` Matthew Wilcox
2024-09-24 17:16 ` Sam James
@ 2024-09-24 19:17 ` Chris Mason
2024-09-24 19:24 ` Linus Torvalds
2 siblings, 1 reply; 81+ messages in thread
From: Chris Mason @ 2024-09-24 19:17 UTC (permalink / raw)
To: Matthew Wilcox, Jens Axboe
Cc: Linus Torvalds, Dave Chinner, Christian Theune, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On 9/20/24 3:54 PM, Chris Mason wrote:
[ ... ]
> xas_split_alloc() does the allocation and also shoves an entry into some of
> the slots. When the tree changes, the entry we've stored is wildly
> wrong, but xas_reset() doesn't undo any of that. So when we actually
> use the xas->xa_alloc nodes we've setup, they are pointing to the
> wrong things.
>
> Which is probably why the commits in 6.10 added this:
>
> /* entry may have changed before we re-acquire the lock */
> if (alloced_order && (old != alloced_shadow || order != alloced_order)) {
> xas_destroy(&xas);
> alloced_order = 0;
> }
>
> The only way to undo the work done by xas_split_alloc() is to call
> xas_destroy().
>
> To prove this theory, I tried making a minimal version that also
> called destroy, but it all ended up less minimal than the code
> that's actually in 6.10. I've got a long test going now with
> an extra cond_resched() to make the race bigger, and a printk of victory.
>
> It hasn't fired yet, and I need to hop on an airplane, so I'll just leave
> it running for now. But long story short, I think we should probably
> just tag all of these for stable:
>
> https://lore.kernel.org/all/20240415171857.19244-2-ryncsn@gmail.com/T/#mdb85922624c39ea7efb775a044af4731890ff776
>
> Also, Willy's proposed changes to xas_split_alloc() seem like a good
> idea.
A few days of load later and some extra printks, it turns out that
taking the writer lock in __filemap_add_folio() makes us dramatically
more likely to just return EEXIST than go into the xas_split_alloc() dance.
With the changes in 6.10, we only get into that xas_destroy() case above
when the conflicting entry is a shadow entry, so I changed my repro to
use memory pressure instead of fadvise.
I also added a schedule_timeout(1) after the split alloc, and with all
of that I'm able to consistently make the xas_destroy() case trigger
without causing any system instability. Kairui Song's patches do seem
to have fixed things nicely.
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-24 19:17 ` Chris Mason
@ 2024-09-24 19:24 ` Linus Torvalds
0 siblings, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2024-09-24 19:24 UTC (permalink / raw)
To: Chris Mason
Cc: Matthew Wilcox, Jens Axboe, Dave Chinner, Christian Theune,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Tue, 24 Sept 2024 at 12:18, Chris Mason <clm@meta.com> wrote:
>
> A few days of load later and some extra printks, it turns out that
> taking the writer lock in __filemap_add_folio() makes us dramatically
> more likely to just return EEXIST than go into the xas_split_alloc() dance.
.. and that sounds like a good thing, except for the test coverage, I guess.
Which you seem to have fixed:
> With the changes in 6.10, we only get into that xas_destroy() case above
> when the conflicting entry is a shadow entry, so I changed my repro to
> use memory pressure instead of fadvise.
>
> I also added a schedule_timeout(1) after the split alloc, and with all
> of that I'm able to consistently make the xas_destroy() case trigger
> without causing any system instability. Kairui Song's patches do seem
> to have fixed things nicely.
<confused thumbs up / fingers crossed emoji>
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-24 17:16 ` Sam James
@ 2024-09-25 16:06 ` Kairui Song
2024-09-25 16:42 ` Christian Theune
2024-09-27 14:51 ` Sam James
0 siblings, 2 replies; 81+ messages in thread
From: Kairui Song @ 2024-09-25 16:06 UTC (permalink / raw)
To: Sam James, stable
Cc: clm, Matthew Wilcox, axboe, ct, david, dqminh, linux-fsdevel,
linux-kernel, linux-mm, linux-xfs, regressions, regressions,
torvalds
On Wed, Sep 25, 2024 at 1:16 AM Sam James <sam@gentoo.org> wrote:
>
> Kairui, could you send them to the stable ML to be queued if Willy is
> fine with it?
>
Hi Sam,
Thanks for adding me to the discussion.
Yes I'd like to, just not sure if people are still testing and
checking the commits.
And I haven't sent seperate fix just for stable fix before, so can
anyone teach me, should I send only two patches for a minimal change,
or send a whole series (with some minor clean up patch as dependency)
for minimal conflicts? Or the stable team can just pick these up?
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-25 16:06 ` Kairui Song
@ 2024-09-25 16:42 ` Christian Theune
2024-09-27 14:51 ` Sam James
1 sibling, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-25 16:42 UTC (permalink / raw)
To: Kairui Song
Cc: Sam James, stable, clm, Matthew Wilcox, axboe, Dave Chinner,
dqminh, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
regressions, regressions, torvalds
> On 25. Sep 2024, at 18:06, Kairui Song <ryncsn@gmail.com> wrote:
>
> On Wed, Sep 25, 2024 at 1:16 AM Sam James <sam@gentoo.org> wrote:
>>
>> Kairui, could you send them to the stable ML to be queued if Willy is
>> fine with it?
>>
>
> Hi Sam,
>
> Thanks for adding me to the discussion.
>
> Yes I'd like to, just not sure if people are still testing and
> checking the commits.
As the one who raised the issue recently: we’re rolling out 6.11 for testing on a couple hundred machines right now. I’ve scheduled this internally to run 8-12 weeks due to the fleeting nature and will report back if it pops up again or after that time has elapsed.
AFAICT this is a fix in any case even if we should find more issues in my fleet later.
Cheers,
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-25 16:06 ` Kairui Song
2024-09-25 16:42 ` Christian Theune
@ 2024-09-27 14:51 ` Sam James
2024-09-27 14:58 ` Jens Axboe
1 sibling, 1 reply; 81+ messages in thread
From: Sam James @ 2024-09-27 14:51 UTC (permalink / raw)
To: Kairui Song, Greg KH
Cc: stable, clm, Matthew Wilcox, axboe, ct, david, dqminh,
linux-fsdevel, linux-kernel, linux-mm, linux-xfs, regressions,
regressions, torvalds
Kairui Song <ryncsn@gmail.com> writes:
> On Wed, Sep 25, 2024 at 1:16 AM Sam James <sam@gentoo.org> wrote:
>>
>> Kairui, could you send them to the stable ML to be queued if Willy is
>> fine with it?
>>
>
> Hi Sam,
Hi Kairui,
>
> Thanks for adding me to the discussion.
>
> Yes I'd like to, just not sure if people are still testing and
> checking the commits.
>
> And I haven't sent seperate fix just for stable fix before, so can
> anyone teach me, should I send only two patches for a minimal change,
> or send a whole series (with some minor clean up patch as dependency)
> for minimal conflicts? Or the stable team can just pick these up?
Please see https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html.
If Option 2 can't work (because of conflicts), please follow Option 3
(https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html#option-3).
Just explain the background and link to this thread in a cover letter
and mention it's your first time. Greg didn't bite me when I fumbled my
way around it :)
(greg, please correct me if I'm talking rubbish)
thanks,
sam
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-27 14:51 ` Sam James
@ 2024-09-27 14:58 ` Jens Axboe
2024-10-01 21:10 ` Kairui Song
0 siblings, 1 reply; 81+ messages in thread
From: Jens Axboe @ 2024-09-27 14:58 UTC (permalink / raw)
To: Sam James, Kairui Song, Greg KH
Cc: stable, clm, Matthew Wilcox, ct, david, dqminh, linux-fsdevel,
linux-kernel, linux-mm, linux-xfs, regressions, regressions,
torvalds
On 9/27/24 8:51 AM, Sam James wrote:
> Kairui Song <ryncsn@gmail.com> writes:
>
>> On Wed, Sep 25, 2024 at 1:16?AM Sam James <sam@gentoo.org> wrote:
>>>
>>> Kairui, could you send them to the stable ML to be queued if Willy is
>>> fine with it?
>>>
>>
>> Hi Sam,
>
> Hi Kairui,
>
>>
>> Thanks for adding me to the discussion.
>>
>> Yes I'd like to, just not sure if people are still testing and
>> checking the commits.
>>
>> And I haven't sent seperate fix just for stable fix before, so can
>> anyone teach me, should I send only two patches for a minimal change,
>> or send a whole series (with some minor clean up patch as dependency)
>> for minimal conflicts? Or the stable team can just pick these up?
>
> Please see https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html.
>
> If Option 2 can't work (because of conflicts), please follow Option 3
> (https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html#option-3).
>
> Just explain the background and link to this thread in a cover letter
> and mention it's your first time. Greg didn't bite me when I fumbled my
> way around it :)
>
> (greg, please correct me if I'm talking rubbish)
It needs two cherry picks, one of them won't pick cleanly. So I suggest
whoever submits this to stable does:
1) Cherry pick the two commits, fixup the simple issue with one of them.
I forget what it was since it's been a week and a half since I did
it, but it's trivial to fixup.
Don't forget to add the "commit XXX upstream" to the commit message.
2) Test that it compiles and boots and send an email to
stable@vger.kernel.org with the patches attached and CC the folks in
this thread, to help spot if there are mistakes.
and that should be it. Worst case, we'll need a few different patches
since this affects anything back to 5.19, and each currently maintained
stable kernel version will need it.
--
Jens Axboe
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-19 10:19 ` Christian Theune
@ 2024-09-30 17:34 ` Christian Theune
2024-09-30 18:46 ` Linus Torvalds
` (2 more replies)
0 siblings, 3 replies; 81+ messages in thread
From: Christian Theune @ 2024-09-30 17:34 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
Hi,
we’ve been running a number of VMs since last week on 6.11. We’ve encountered one hung task situation multiple times now that seems to be resolving itself after a bit of time, though. I do not see spinning CPU during this time.
The situation seems to be related to cgroups-based IO throttling / weighting so far:
Here are three examples of similar tracebacks where jobs that do perform a certain amount of IO (either given a weight or given an explicit limit like this:
IOWeight=10
IOReadIOPSMax=/dev/vda 188
IOWriteIOPSMax=/dev/vda 188
Telemetry for the affected VM does not show that it actually reaches 188 IOPS (the load is mostly writing) but creates a kind of gaussian curve …
The underlying storage and network was completely inconspicuous during the whole time.
Sep 27 00:51:20 <redactedhostname>13 kernel: INFO: task nix-build:5300 blocked for more than 122 seconds.
Sep 27 00:51:20 <redactedhostname>13 kernel: Not tainted 6.11.0 #1-NixOS
Sep 27 00:51:20 <redactedhostname>13 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 27 00:51:20 <redactedhostname>13 kernel: task:nix-build state:D stack:0 pid:5300 tgid:5298 ppid:5297 flags:0x00000002
Sep 27 00:51:20 <redactedhostname>13 kernel: Call Trace:
Sep 27 00:51:20 <redactedhostname>13 kernel: <TASK>
Sep 27 00:51:20 <redactedhostname>13 kernel: __schedule+0x3a3/0x1300
Sep 27 00:51:20 <redactedhostname>13 kernel: ? xfs_vm_writepages+0x67/0x90 [xfs]
Sep 27 00:51:20 <redactedhostname>13 kernel: schedule+0x27/0xf0
Sep 27 00:51:20 <redactedhostname>13 kernel: io_schedule+0x46/0x70
Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_bit_common+0x13f/0x340
Sep 27 00:51:20 <redactedhostname>13 kernel: ? __pfx_wake_page_function+0x10/0x10
Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_writeback+0x2b/0x80
Sep 27 00:51:20 <redactedhostname>13 kernel: __filemap_fdatawait_range+0x80/0xe0
Sep 27 00:51:20 <redactedhostname>13 kernel: filemap_write_and_wait_range+0x85/0xb0
Sep 27 00:51:20 <redactedhostname>13 kernel: xfs_setattr_size+0xd9/0x3c0 [xfs]
Sep 27 00:51:20 <redactedhostname>13 kernel: xfs_vn_setattr+0x81/0x150 [xfs]
Sep 27 00:51:20 <redactedhostname>13 kernel: notify_change+0x2ed/0x4f0
Sep 27 00:51:20 <redactedhostname>13 kernel: ? do_truncate+0x98/0xf0
Sep 27 00:51:20 <redactedhostname>13 kernel: do_truncate+0x98/0xf0
Sep 27 00:51:20 <redactedhostname>13 kernel: do_ftruncate+0xfe/0x160
Sep 27 00:51:20 <redactedhostname>13 kernel: __x64_sys_ftruncate+0x3e/0x70
Sep 27 00:51:20 <redactedhostname>13 kernel: do_syscall_64+0xb7/0x200
Sep 27 00:51:20 <redactedhostname>13 kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f
Sep 27 00:51:20 <redactedhostname>13 kernel: RIP: 0033:0x7f1ed1912c2b
Sep 27 00:51:20 <redactedhostname>13 kernel: RSP: 002b:00007f1eb73fd3f8 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
Sep 27 00:51:20 <redactedhostname>13 kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1ed1912c2b
Sep 27 00:51:20 <redactedhostname>13 kernel: RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000012
Sep 27 00:51:20 <redactedhostname>13 kernel: RBP: 0000000000000012 R08: 0000000000000000 R09: 00007f1eb73fd3a0
Sep 27 00:51:20 <redactedhostname>13 kernel: R10: 0000000000132000 R11: 0000000000000246 R12: 00005601d0150290
Sep 27 00:51:20 <redactedhostname>13 kernel: R13: 00005601d58ae0b8 R14: 0000000000000001 R15: 00005601d58bec58
Sep 27 00:51:20 <redactedhostname>13 kernel: </TASK>
Sep 28 10:13:04 release2405dev00 kernel: INFO: task nix-channel:507080 blocked for more than 122 seconds.
Sep 28 10:13:04 release2405dev00 kernel: Not tainted 6.11.0 #1-NixOS
Sep 28 10:13:04 release2405dev00 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:13:04 release2405dev00 kernel: task:nix-channel state:D stack:0 pid:507080 tgid:507080 ppid:507061 flags:0x00000002
Sep 28 10:13:04 release2405dev00 kernel: Call Trace:
Sep 28 10:13:04 release2405dev00 kernel: <TASK>
Sep 28 10:13:04 release2405dev00 kernel: __schedule+0x3a3/0x1300
Sep 28 10:13:04 release2405dev00 kernel: ? xfs_vm_writepages+0x67/0x90 [xfs]
Sep 28 10:13:04 release2405dev00 kernel: schedule+0x27/0xf0
Sep 28 10:13:04 release2405dev00 kernel: io_schedule+0x46/0x70
Sep 28 10:13:04 release2405dev00 kernel: folio_wait_bit_common+0x13f/0x340
Sep 28 10:13:04 release2405dev00 kernel: ? __pfx_wake_page_function+0x10/0x10
Sep 28 10:13:04 release2405dev00 kernel: folio_wait_writeback+0x2b/0x80
Sep 28 10:13:04 release2405dev00 kernel: __filemap_fdatawait_range+0x80/0xe0
Sep 28 10:13:04 release2405dev00 kernel: file_write_and_wait_range+0x88/0xb0
Sep 28 10:13:04 release2405dev00 kernel: xfs_file_fsync+0x5e/0x2a0 [xfs]
Sep 28 10:13:04 release2405dev00 kernel: __x64_sys_fdatasync+0x52/0x90
Sep 28 10:13:04 release2405dev00 kernel: do_syscall_64+0xb7/0x200
Sep 28 10:13:04 release2405dev00 kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f
Sep 28 10:13:04 release2405dev00 kernel: RIP: 0033:0x7f5b9371270a
Sep 28 10:13:04 release2405dev00 kernel: RSP: 002b:00007ffd678149f0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
Sep 28 10:13:04 release2405dev00 kernel: RAX: ffffffffffffffda RBX: 0000559a4d023a18 RCX: 00007f5b9371270a
Sep 28 10:13:04 release2405dev00 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
Sep 28 10:13:04 release2405dev00 kernel: RBP: 0000000000000000 R08: 0000000000000001 R09: 0000559a4d027878
Sep 28 10:13:04 release2405dev00 kernel: R10: 0000000000000016 R11: 0000000000000293 R12: 0000000000000001
Sep 28 10:13:04 release2405dev00 kernel: R13: 000000000000002e R14: 0000559a4d0278fc R15: 00007ffd67814bf0
Sep 28 10:13:04 release2405dev00 kernel: </TASK>
Sep 28 03:39:19 <redactedhostname>10 kernel: INFO: task nix-build:94696 blocked for more than 122 seconds.
Sep 28 03:39:19 <redactedhostname>10 kernel: Not tainted 6.11.0 #1-NixOS
Sep 28 03:39:19 <redactedhostname>10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 03:39:19 <redactedhostname>10 kernel: task:nix-build state:D stack:0 pid:94696 tgid:94696 ppid:94695 flags:0x00000002
Sep 28 03:39:19 <redactedhostname>10 kernel: Call Trace:
Sep 28 03:39:19 <redactedhostname>10 kernel: <TASK>
Sep 28 03:39:19 <redactedhostname>10 kernel: __schedule+0x3a3/0x1300
Sep 28 03:39:19 <redactedhostname>10 kernel: schedule+0x27/0xf0
Sep 28 03:39:19 <redactedhostname>10 kernel: io_schedule+0x46/0x70
Sep 28 03:39:19 <redactedhostname>10 kernel: folio_wait_bit_common+0x13f/0x340
Sep 28 03:39:19 <redactedhostname>10 kernel: ? __pfx_wake_page_function+0x10/0x10
Sep 28 03:39:19 <redactedhostname>10 kernel: folio_wait_writeback+0x2b/0x80
Sep 28 03:39:19 <redactedhostname>10 kernel: truncate_inode_partial_folio+0x5e/0x1b0
Sep 28 03:39:19 <redactedhostname>10 kernel: truncate_inode_pages_range+0x1de/0x400
Sep 28 03:39:19 <redactedhostname>10 kernel: evict+0x29f/0x2c0
Sep 28 03:39:19 <redactedhostname>10 kernel: ? iput+0x6e/0x230
Sep 28 03:39:19 <redactedhostname>10 kernel: ? _atomic_dec_and_lock+0x39/0x50
Sep 28 03:39:19 <redactedhostname>10 kernel: do_unlinkat+0x2de/0x330
Sep 28 03:39:19 <redactedhostname>10 kernel: __x64_sys_unlink+0x3f/0x70
Sep 28 03:39:19 <redactedhostname>10 kernel: do_syscall_64+0xb7/0x200
Sep 28 03:39:19 <redactedhostname>10 kernel: entry_SYSCALL_64_after_hwframe+0x77/0x7f
Sep 28 03:39:19 <redactedhostname>10 kernel: RIP: 0033:0x7f37c062d56b
Sep 28 03:39:19 <redactedhostname>10 kernel: RSP: 002b:00007fff71638018 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
Sep 28 03:39:19 <redactedhostname>10 kernel: RAX: ffffffffffffffda RBX: 0000562038c30500 RCX: 00007f37c062d56b
Sep 28 03:39:19 <redactedhostname>10 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000562038c31c80
Sep 28 03:39:19 <redactedhostname>10 kernel: RBP: 0000562038c30690 R08: 0000000000016020 R09: 0000000000000000
Sep 28 03:39:19 <redactedhostname>10 kernel: R10: 0000000000000050 R11: 0000000000000206 R12: 00007fff71638058
Sep 28 03:39:19 <redactedhostname>10 kernel: R13: 00007fff7163803c R14: 00007fff71638960 R15: 0000562040b8a500
Sep 28 03:39:19 <redactedhostname>10 kernel: </TASK>
Hope this helps,
Christian
> On 19. Sep 2024, at 12:19, Christian Theune <ct@flyingcircus.io> wrote:
>
>
>
>> On 19. Sep 2024, at 08:57, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>
>> Yeah, right now Jens is still going to run some more testing, but I
>> think the plan is to just backport
>>
>> a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
>> 6758c1128ceb ("mm/filemap: optimize filemap folio adding")
>>
>> and I think we're at the point where you might as well start testing
>> that if you have the cycles for it. Jens is mostly trying to confirm
>> the root cause, but even without that, I think you running your load
>> with those two changes back-ported is worth it.
>>
>> (Or even just try running it on plain 6.10 or 6.11, both of which
>> already has those commits)
>
> I’ve discussed this with my team and we’re preparing to switch all our
> non-prod machines as well as those production machines that have shown
> the error before.
>
> This will require a bit of user communication and reboot scheduling.
> Our release prep will be able to roll this out starting early next week
> and the production machines in question around Sept 30.
>
> We would run with 6.11 as our understanding so far is that running the
> most current kernel would generate the most insight and is easier to
> work with for you all?
>
> (Generally we run the mostly vanilla LTS that has surpassed x.y.50+ so
> we might later downgrade to 6.6 when this is fixed.)
>
>> So considering how well the reproducer works for Jens and Chris, my
>> main worry is whether your load might have some _additional_ issue.
>>
>> Unlikely, but still .. The two commits fix the repproducer, so I think
>> the important thing to make sure is that it really fixes the original
>> issue too.
>>
>> And yeah, I'd be surprised if it doesn't, but at the same time I would
>> _not_ suggest you try to make your load look more like the case we
>> already know gets fixed.
>>
>> So yes, it will be "weeks of not seeing crashes" until we'd be
>> _really_ confident it's all the same thing, but I'd rather still have
>> you test that, than test something else than what caused issues
>> originally, if you see what I mean.
>
> Agreed, I’m all onboard with that.
>
> Liebe Grüße,
> Christian Theune
>
> --
> Christian Theune · ct@flyingcircus.io · +49 345 219401 0
> Flying Circus Internet Operations GmbH · https://flyingcircus.io
> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
>
Liebe Grüße,
Christian Theune
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 17:34 ` Christian Theune
@ 2024-09-30 18:46 ` Linus Torvalds
2024-09-30 19:25 ` Christian Theune
2024-10-01 0:56 ` Chris Mason
2024-10-01 2:22 ` Dave Chinner
2 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-30 18:46 UTC (permalink / raw)
To: Christian Theune
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On Mon, 30 Sept 2024 at 10:35, Christian Theune <ct@flyingcircus.io> wrote:
>
> Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_bit_common+0x13f/0x340
> Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_writeback+0x2b/0x80
Gaah. Every single case you point to is that folio_wait_writeback() case.
And this might be an old old annoyance.
folio_wait_writeback() is insane. It does
while (folio_test_writeback(folio)) {
trace_folio_wait_writeback(folio, folio_mapping(folio));
folio_wait_bit(folio, PG_writeback);
}
and the reason that is insane is that PG_writeback isn't some kind of
exclusive state. So folio_wait_bit() will return once somebody has
ended writeback, but *new* writeback can easily have been started
afterwards. So then we go back to wait...
And even after it eventually returns (possibly after having waited for
hundreds of other processes writing back that folio - imagine lots of
other threads doing writes to it and 'fdatasync()' or whatever) the
caller *still* can't actually assume that the writeback bit is clear,
because somebody else might have started writeback again.
Anyway, it's insane, but it's insane for a *reason*. We've tried to
fix this before, long before it was a folio op. See commit
c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple
pending writebacks").
IOW, this code is known-broken and might have extreme unfairness
issues (although I had blissfully forgotten about it), because while
the actual writeback *bit* itself is set and cleared atomically, the
wakeup for the bit is asynchronous and can be delayed almost
arbitrarily, so you can get basically spurious wakeups that were from
a previous bit clear.
So the "wait many times" is crazy, but it's sadly a necessary crazy as
things are right now.
Now, many callers hold the page lock while doing this, and in that
case new writeback cases shouldn't happen, and so repeating the loop
should be extremely limited.
But "many" is not "all". For example, __filemap_fdatawait_range() very
much doesn't hold the lock on the pages it waits for, so afaik this
can cause that unfairness and starvation issue.
That said, while every one of your traces are for that
folio_wait_writeback(), the last one is for the truncate case, and
that one *does* hold the page lock and so shouldn't see this potential
unfairness issue.
So the code here is questionable, and might cause some issues, but the
starvation of folio_wait_writeback() can't explain _all_ the cases you
see.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 18:46 ` Linus Torvalds
@ 2024-09-30 19:25 ` Christian Theune
2024-09-30 20:12 ` Linus Torvalds
0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-09-30 19:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
> On 30. Sep 2024, at 20:46, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Mon, 30 Sept 2024 at 10:35, Christian Theune <ct@flyingcircus.io> wrote:
>>
>> Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_bit_common+0x13f/0x340
>> Sep 27 00:51:20 <redactedhostname>13 kernel: folio_wait_writeback+0x2b/0x80
>
> Gaah. Every single case you point to is that folio_wait_writeback() case.
>
> And this might be an old old annoyance.
I’m being told that I’m somewhat of a truffle pig for dirty code … how long ago does “old old” refer to, btw?
> […]
> IOW, this code is known-broken and might have extreme unfairness
> issues (although I had blissfully forgotten about it), because while
> the actual writeback *bit* itself is set and cleared atomically, the
> wakeup for the bit is asynchronous and can be delayed almost
> arbitrarily, so you can get basically spurious wakeups that were from
> a previous bit clear.
I wonder whether the extreme unfairness gets exacerbated when in a cgroup throttled context … It’s a limited number of workloads we
have seen this with, some of which are parallelized and others aren’t. (and I guess non-parallelized code shouldn’t suffer much from this?)
Maybe I can reproduce this more easily and ...
> So the code here is questionable, and might cause some issues, but the
> starvation of folio_wait_writeback() can't explain _all_ the cases you
> see.
… also get you more data and dig for maybe more cases more systematically.
Anything particular you’d like me to look for? Any specific additional data
points that would help?
We’re going to keep with 6.11 in staging and avoid rolling it out to the production machines for now.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 19:25 ` Christian Theune
@ 2024-09-30 20:12 ` Linus Torvalds
2024-09-30 20:56 ` Matthew Wilcox
0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-09-30 20:12 UTC (permalink / raw)
To: Christian Theune
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
On Mon, 30 Sept 2024 at 12:25, Christian Theune <ct@flyingcircus.io> wrote:
>
> I’m being told that I’m somewhat of a truffle pig for dirty code … how long ago does “old old” refer to, btw?
It's basically been that way forever. The code has changed many times,
but we've basically always had that "wait on bit will wait not until
the next wakeup, but until it actually sees the bit being clear".
And by "always" I mean "going back at least to before the git tree". I
didn't search further. It's not new.
The only reason I pointed at that (relatively recent) commit from 2021
is that when we rewrote the page bit waiting logic (for some unrelated
horrendous scalability issues with tens of thousands of pages on wait
queues), the rewritten code _tried_ to not do it, and instead go "we
were woken up by a bit clear op, so now we've waited enough".
And that then caused problems as explained in that commit c2407cf7d22d
("mm: make wait_on_page_writeback() wait for multiple pending
writebacks") because the wakeups aren't atomic wrt the actual bit
setting/clearing/testing.
IOW - that 2021 commit didn't _introduce_ the issue, it just went back
to the horrendous behavior that we've always had, and temporarily
tried to avoid.
Note that "horrendous behavior" is really "you probably can't hit it
under any normal load". So it's not like it's a problem in practice.
Except your load clearly triggers *something*. And maybe this is part of it.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 20:12 ` Linus Torvalds
@ 2024-09-30 20:56 ` Matthew Wilcox
2024-09-30 22:42 ` Davidlohr Bueso
2024-09-30 23:53 ` Linus Torvalds
0 siblings, 2 replies; 81+ messages in thread
From: Matthew Wilcox @ 2024-09-30 20:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christian Theune, Dave Chinner, Chris Mason, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Mon, Sep 30, 2024 at 01:12:37PM -0700, Linus Torvalds wrote:
> It's basically been that way forever. The code has changed many times,
> but we've basically always had that "wait on bit will wait not until
> the next wakeup, but until it actually sees the bit being clear".
>
> And by "always" I mean "going back at least to before the git tree". I
> didn't search further. It's not new.
>
> The only reason I pointed at that (relatively recent) commit from 2021
> is that when we rewrote the page bit waiting logic (for some unrelated
> horrendous scalability issues with tens of thousands of pages on wait
> queues), the rewritten code _tried_ to not do it, and instead go "we
> were woken up by a bit clear op, so now we've waited enough".
>
> And that then caused problems as explained in that commit c2407cf7d22d
> ("mm: make wait_on_page_writeback() wait for multiple pending
> writebacks") because the wakeups aren't atomic wrt the actual bit
> setting/clearing/testing.
Could we break out if folio->mapping has changed? Clearly if it has,
we're no longer waiting for the folio we thought we were waiting for,
but for a folio which now belongs to a different file.
maybe this:
+void __folio_wait_writeback(struct address_space *mapping, struct folio *folio)
+{
+ while (folio_test_writeback(folio) && folio->mapping == mapping) {
+ trace_folio_wait_writeback(folio, mapping);
+ folio_wait_bit(folio, PG_writeback);
+ }
+}
[...]
void folio_wait_writeback(struct folio *folio)
{
- while (folio_test_writeback(folio)) {
- trace_folio_wait_writeback(folio, folio_mapping(folio));
- folio_wait_bit(folio, PG_writeback);
- }
+ __folio_wait_writeback(folio->mapping, folio);
}
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 20:56 ` Matthew Wilcox
@ 2024-09-30 22:42 ` Davidlohr Bueso
2024-09-30 23:00 ` Davidlohr Bueso
2024-09-30 23:53 ` Linus Torvalds
1 sibling, 1 reply; 81+ messages in thread
From: Davidlohr Bueso @ 2024-09-30 22:42 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Linus Torvalds, Christian Theune, Dave Chinner, Chris Mason,
Jens Axboe, linux-mm, linux-xfs, linux-fsdevel, linux-kernel,
Daniel Dao, regressions, regressions
On Mon, 30 Sep 2024, Matthew Wilcox wrote:\n
>On Mon, Sep 30, 2024 at 01:12:37PM -0700, Linus Torvalds wrote:
>> It's basically been that way forever. The code has changed many times,
>> but we've basically always had that "wait on bit will wait not until
>> the next wakeup, but until it actually sees the bit being clear".
>>
>> And by "always" I mean "going back at least to before the git tree". I
>> didn't search further. It's not new.
>>
>> The only reason I pointed at that (relatively recent) commit from 2021
>> is that when we rewrote the page bit waiting logic (for some unrelated
>> horrendous scalability issues with tens of thousands of pages on wait
>> queues), the rewritten code _tried_ to not do it, and instead go "we
>> were woken up by a bit clear op, so now we've waited enough".
>>
>> And that then caused problems as explained in that commit c2407cf7d22d
>> ("mm: make wait_on_page_writeback() wait for multiple pending
>> writebacks") because the wakeups aren't atomic wrt the actual bit
>> setting/clearing/testing.
>
>Could we break out if folio->mapping has changed? Clearly if it has,
>we're no longer waiting for the folio we thought we were waiting for,
>but for a folio which now belongs to a different file.
>
>maybe this:
>
>+void __folio_wait_writeback(struct address_space *mapping, struct folio *folio)
>+{
>+ while (folio_test_writeback(folio) && folio->mapping == mapping) {
READ_ONCE(folio->mapping)?
>+ trace_folio_wait_writeback(folio, mapping);
>+ folio_wait_bit(folio, PG_writeback);
>+ }
>+}
>
>[...]
>
> void folio_wait_writeback(struct folio *folio)
> {
>- while (folio_test_writeback(folio)) {
>- trace_folio_wait_writeback(folio, folio_mapping(folio));
>- folio_wait_bit(folio, PG_writeback);
>- }
>+ __folio_wait_writeback(folio->mapping, folio);
> }
Also, the last sentence in the description would need to be dropped.
Thanks,
Davidlohr
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 22:42 ` Davidlohr Bueso
@ 2024-09-30 23:00 ` Davidlohr Bueso
0 siblings, 0 replies; 81+ messages in thread
From: Davidlohr Bueso @ 2024-09-30 23:00 UTC (permalink / raw)
To: Matthew Wilcox, Linus Torvalds, Christian Theune, Dave Chinner,
Chris Mason, Jens Axboe, linux-mm, linux-xfs, linux-fsdevel,
linux-kernel, Daniel Dao, regressions, regressions
On Mon, 30 Sep 2024, Davidlohr Bueso wrote:\n
>Also, the last sentence in the description would need to be dropped.
No never mind this, it is fine. I was mostly thinking about the pathological
unbounded scenario which is removed, but after re-reading the description
it is still valid.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 20:56 ` Matthew Wilcox
2024-09-30 22:42 ` Davidlohr Bueso
@ 2024-09-30 23:53 ` Linus Torvalds
1 sibling, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2024-09-30 23:53 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Christian Theune, Dave Chinner, Chris Mason, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Mon, 30 Sept 2024 at 13:57, Matthew Wilcox <willy@infradead.org> wrote:
>
> Could we break out if folio->mapping has changed? Clearly if it has,
> we're no longer waiting for the folio we thought we were waiting for,
> but for a folio which now belongs to a different file.
Sounds like a sane check to me, but it's also not clear that this
would make any difference.
The most likely reason for starvation I can see is a slow thread
(possibly due to cgroup throttling like Christian alluded to) would
simply be continually unlucky, because every time it gets woken up,
some other thread has already dirtied the data and caused writeback
again.
I would think that kind of behavior (perhaps some DB transaction
header kind of folio) would be more likely than the mapping changing
(and then remaining under writeback for some other mapping).
But I really don't know.
I would much prefer to limit the folio_wait_bit() loop based on something else.
For example, the basic reason for that loop (unless there is some
other hidden one) is that the folio writeback bit is not atomic wrt
the wakeup. Maybe we could *make* it atomic, by simply taking the
folio waitqueue lock before clearing the bit?
(Only if it has the "waiters" bit set, of course!)
Handwavy.
Anyway, this writeback handling is nasty. folio_end_writeback() has a
big comment about the subtle folio reference issue too, and ignoring
that we also have this:
if (__folio_end_writeback(folio))
folio_wake_bit(folio, PG_writeback);
(which is the cause of the non-atomicity: __folio_end_writeback() will
clear the bit, and return the "did we have waiters", and then
folio_wake_bit() will get the waitqueue lock and wake people up).
And notice how __folio_end_writeback() clears the bit with
ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback);
which does that "clear bit and look it it had waiters" atomically. But
that function then has a comment that says
* This must only be used for flags which are changed with the folio
* lock held. For example, it is unsafe to use for PG_dirty as that
* can be set without the folio lock held. [...]
but the code that uses it here does *NOT* hold the folio lock.
I think the comment is wrong, and the code is fine (the important
point is that the folio lock _serialized_ the writers, and while
clearing doesn't hold the folio lock, you can't clear it without
setting it, and setting the writeback flag *does* hold the folio
lock).
So my point is not that this code is wrong, but that this code is all
kinds of subtle and complex. I think it would be good to change the
rules so that we serialize with waiters, but being complex and subtle
means it sounds all kinds of nasty.
Linus
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 17:34 ` Christian Theune
2024-09-30 18:46 ` Linus Torvalds
@ 2024-10-01 0:56 ` Chris Mason
2024-10-01 7:54 ` Christian Theune
2024-10-10 6:29 ` Christian Theune
2024-10-01 2:22 ` Dave Chinner
2 siblings, 2 replies; 81+ messages in thread
From: Chris Mason @ 2024-10-01 0:56 UTC (permalink / raw)
To: Christian Theune, Linus Torvalds
Cc: Dave Chinner, Matthew Wilcox, Jens Axboe, linux-mm, linux-xfs,
linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
[-- Attachment #1: Type: text/plain, Size: 2681 bytes --]
On 9/30/24 7:34 PM, Christian Theune wrote:
> Hi,
>
> we’ve been running a number of VMs since last week on 6.11. We’ve encountered one hung task situation multiple times now that seems to be resolving itself after a bit of time, though. I do not see spinning CPU during this time.
>
> The situation seems to be related to cgroups-based IO throttling / weighting so far:
>
> Here are three examples of similar tracebacks where jobs that do perform a certain amount of IO (either given a weight or given an explicit limit like this:
>
> IOWeight=10
> IOReadIOPSMax=/dev/vda 188
> IOWriteIOPSMax=/dev/vda 188
>
> Telemetry for the affected VM does not show that it actually reaches 188 IOPS (the load is mostly writing) but creates a kind of gaussian curve …
>
> The underlying storage and network was completely inconspicuous during the whole time.
Not disagreeing with Linus at all, but given that you've got IO
throttling too, we might really just be waiting. It's hard to tell
because the hung task timeouts only give you information about one process.
I've attached a minimal version of a script we use here to show all the
D state processes, it might help explain things. The only problem is
you have to actually ssh to the box and run it when you're stuck.
The idea is to print the stack trace of every D state process, and then
also print out how often each unique stack trace shows up. When we're
deadlocked on something, there are normally a bunch of the same stack
(say waiting on writeback) and then one jerk sitting around in a
different stack who is causing all the trouble.
(I made some quick changes to make this smaller, so apologies if you get
silly errors)
Example output:
sudo ./walker.py
15 rcu_tasks_trace_kthread D
[<0>] __wait_rcu_gp+0xab/0x120
[<0>] synchronize_rcu+0x46/0xd0
[<0>] rcu_tasks_wait_gp+0x86/0x2a0
[<0>] rcu_tasks_one_gp+0x300/0x430
[<0>] rcu_tasks_kthread+0x9a/0xb0
[<0>] kthread+0xad/0xe0
[<0>] ret_from_fork+0x1f/0x30
1440504 dd D
[<0>] folio_wait_bit_common+0x149/0x2d0
[<0>] filemap_read+0x7bd/0xd10
[<0>] blkdev_read_iter+0x5b/0x130
[<0>] __x64_sys_read+0x1ce/0x3f0
[<0>] do_syscall_64+0x3d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
-----
stack summary
1 hit:
[<0>] __wait_rcu_gp+0xab/0x120
[<0>] synchronize_rcu+0x46/0xd0
[<0>] rcu_tasks_wait_gp+0x86/0x2a0
[<0>] rcu_tasks_one_gp+0x300/0x430
[<0>] rcu_tasks_kthread+0x9a/0xb0
[<0>] kthread+0xad/0xe0
[<0>] ret_from_fork+0x1f/0x30
-----
[<0>] folio_wait_bit_common+0x149/0x2d0
[<0>] filemap_read+0x7bd/0xd10
[<0>] blkdev_read_iter+0x5b/0x130
[<0>] __x64_sys_read+0x1ce/0x3f0
[<0>] do_syscall_64+0x3d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[-- Attachment #2: walker.py.txt --]
[-- Type: text/plain, Size: 3020 bytes --]
#!/usr/bin/env python3
#
# this walks all the tasks on the system and prints out a stack trace
# of any tasks waiting in D state. If you pass -a, it will print out
# the stack of every task it finds.
#
# It also makes a histogram of the common stacks so you can see where
# more of the tasks are. Usually when we're deadlocked, we care about
# the least common stacks.
#
import sys
import os
import argparse
parser = argparse.ArgumentParser(description='Show kernel stacks')
parser.add_argument('-a', '--all_tasks', action='store_true', help='Dump all stacks')
parser.add_argument('-p', '--pid', type=str, help='Filter on pid')
parser.add_argument('-c', '--command', type=str, help='Filter on command name')
options = parser.parse_args()
stacks = {}
# parse the units from a number and normalize into KB
def parse_number(s):
try:
words = s.split()
unit = words[-1].lower()
number = int(words[1])
tag = words[0].lower().rstrip(':')
# we store in kb
if unit == "mb":
number = number * 1024
elif unit == "gb":
number = number * 1024 * 1024
elif unit == "tb":
number = number * 1024 * 1024
return (tag, number)
except:
return (None, None)
# read /proc/pid/stack and add it to the hashes
def add_stack(path, pid, cmd, status):
global stacks
try:
stack = open(os.path.join(path, "stack"), 'r').read()
except:
return
if (status != "D" and not options.all_tasks):
return
print("%s %s %s" % (pid, cmd, status))
print(stack)
v = stacks.get(stack)
if v:
v += 1
else:
v = 1
stacks[stack] = v
# worker to read all the files for one individual task
def run_one_task(path):
try:
stat = open(os.path.join(path, "stat"), 'r').read()
except:
return
words = stat.split()
pid, cmd, status = words[0:3]
cmd = cmd.lstrip('(')
cmd = cmd.rstrip(')')
if options.command and options.command != cmd:
return
add_stack(path, pid, cmd, status)
def print_usage():
sys.stderr.write("Usage: %s [-a]\n" % sys.argv[0])
sys.exit(1)
# for a given pid in string form, read the files from proc
def run_pid(name):
try:
pid = int(name)
except:
return
p = os.path.join("/proc", name, "task")
if not os.path.exists(p):
return
try:
for t in os.listdir(p):
run_one_task(os.path.join(p, t))
except:
pass
if options.pid:
run_pid(options.pid)
else:
for name in os.listdir("/proc"):
run_pid(name)
values = {}
for stack, count in stacks.items():
l = values.setdefault(count, [])
l.append(stack)
counts = list(values.keys())
counts.sort(reverse=True)
if counts:
print("-----\nstack summary\n")
for x in counts:
if x == 1:
print("1 hit:")
else:
print("%d hits: " % x)
l = values[x]
for stack in l:
print(stack)
print("-----")
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-30 17:34 ` Christian Theune
2024-09-30 18:46 ` Linus Torvalds
2024-10-01 0:56 ` Chris Mason
@ 2024-10-01 2:22 ` Dave Chinner
2 siblings, 0 replies; 81+ messages in thread
From: Dave Chinner @ 2024-10-01 2:22 UTC (permalink / raw)
To: Christian Theune
Cc: Linus Torvalds, Matthew Wilcox, Chris Mason, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On Mon, Sep 30, 2024 at 07:34:39PM +0200, Christian Theune wrote:
> Hi,
>
> we’ve been running a number of VMs since last week on 6.11. We’ve
> encountered one hung task situation multiple times now that seems
> to be resolving itself after a bit of time, though. I do not see
> spinning CPU during this time.
>
> The situation seems to be related to cgroups-based IO throttling /
> weighting so far:
.....
> Sep 28 03:39:19 <redactedhostname>10 kernel: INFO: task nix-build:94696 blocked for more than 122 seconds.
> Sep 28 03:39:19 <redactedhostname>10 kernel: Not tainted 6.11.0 #1-NixOS
> Sep 28 03:39:19 <redactedhostname>10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 03:39:19 <redactedhostname>10 kernel: task:nix-build state:D stack:0 pid:94696 tgid:94696 ppid:94695 flags:0x00000002
> Sep 28 03:39:19 <redactedhostname>10 kernel: Call Trace:
> Sep 28 03:39:19 <redactedhostname>10 kernel: <TASK>
> Sep 28 03:39:19 <redactedhostname>10 kernel: __schedule+0x3a3/0x1300
> Sep 28 03:39:19 <redactedhostname>10 kernel: schedule+0x27/0xf0
> Sep 28 03:39:19 <redactedhostname>10 kernel: io_schedule+0x46/0x70
> Sep 28 03:39:19 <redactedhostname>10 kernel: folio_wait_bit_common+0x13f/0x340
> Sep 28 03:39:19 <redactedhostname>10 kernel: folio_wait_writeback+0x2b/0x80
> Sep 28 03:39:19 <redactedhostname>10 kernel: truncate_inode_partial_folio+0x5e/0x1b0
> Sep 28 03:39:19 <redactedhostname>10 kernel: truncate_inode_pages_range+0x1de/0x400
> Sep 28 03:39:19 <redactedhostname>10 kernel: evict+0x29f/0x2c0
> Sep 28 03:39:19 <redactedhostname>10 kernel: do_unlinkat+0x2de/0x330
That's not what I'd call expected behaviour.
By the time we are that far through eviction of a newly unlinked
inode, we've already removed the inode from the writeback lists and
we've supposedly waited for all writeback to complete.
IOWs, there shouldn't be a cached folio in writeback state at this
point in time - we're supposed to have guaranteed all writeback has
already compelted before we call truncate_inode_pages_final()....
So how are we getting a partial folio that is still under writeback
at this point in time?
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-01 0:56 ` Chris Mason
@ 2024-10-01 7:54 ` Christian Theune
2024-10-10 6:29 ` Christian Theune
1 sibling, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-10-01 7:54 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
> On 1. Oct 2024, at 02:56, Chris Mason <clm@meta.com> wrote:
>
> I've attached a minimal version of a script we use here to show all the
> D state processes, it might help explain things. The only problem is
> you have to actually ssh to the box and run it when you're stuck.
Thanks, I’ll dig into this next week when I’m back from vacation.
I can set up alerts when this happens and hope that I’ll be fast enough as the situation does seem to resolve itselve at some point. It’s happened quite a bit in the fleet so I guess I should be able to catch it.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-09-27 14:58 ` Jens Axboe
@ 2024-10-01 21:10 ` Kairui Song
0 siblings, 0 replies; 81+ messages in thread
From: Kairui Song @ 2024-10-01 21:10 UTC (permalink / raw)
To: Jens Axboe
Cc: Sam James, Greg KH, stable, clm, Matthew Wilcox, ct, david,
dqminh, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
regressions, regressions, torvalds
On Fri, Sep 27, 2024 at 10:58 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 9/27/24 8:51 AM, Sam James wrote:
> > Kairui Song <ryncsn@gmail.com> writes:
> >
> >> On Wed, Sep 25, 2024 at 1:16?AM Sam James <sam@gentoo.org> wrote:
> >>>
> >>> Kairui, could you send them to the stable ML to be queued if Willy is
> >>> fine with it?
> >>>
> >>
> >> Hi Sam,
> >
> > Hi Kairui,
> >
> >>
> >> Thanks for adding me to the discussion.
> >>
> >> Yes I'd like to, just not sure if people are still testing and
> >> checking the commits.
> >>
> >> And I haven't sent seperate fix just for stable fix before, so can
> >> anyone teach me, should I send only two patches for a minimal change,
> >> or send a whole series (with some minor clean up patch as dependency)
> >> for minimal conflicts? Or the stable team can just pick these up?
> >
> > Please see https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html.
> >
> > If Option 2 can't work (because of conflicts), please follow Option 3
> > (https://www.kernel.org/doc/html/v6.11/process/stable-kernel-rules.html#option-3).
> >
> > Just explain the background and link to this thread in a cover letter
> > and mention it's your first time. Greg didn't bite me when I fumbled my
> > way around it :)y
> >
> > (greg, please correct me if I'm talking rubbish)
>
> It needs two cherry picks, one of them won't pick cleanly. So I suggest
> whoever submits this to stable does:
>
> 1) Cherry pick the two commits, fixup the simple issue with one of them.
> I forget what it was since it's been a week and a half since I did
> it, but it's trivial to fixup.
>
> Don't forget to add the "commit XXX upstream" to the commit message.
>
> 2) Test that it compiles and boots and send an email to
> stable@vger.kernel.org with the patches attached and CC the folks in
> this thread, to help spot if there are mistakes.
>
> and that should be it. Worst case, we'll need a few different patches
> since this affects anything back to 5.19, and each currently maintained
> stable kernel version will need it.
>
Hi Sam, Jens,
Thanks very much, currently maintained upstream kernels are
6.10, 6.6, 6.1, 5.15, 5.10, 5.4, 4.19.
I think only 6.6 and 6.1 need backport, I've sent a fix for these two,
it's three checkpicks from the one 6.10 series so the conflict is
minimal. The stable series can be applied without conflict for both.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-01 0:56 ` Chris Mason
2024-10-01 7:54 ` Christian Theune
@ 2024-10-10 6:29 ` Christian Theune
2024-10-11 7:27 ` Christian Theune
1 sibling, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-10-10 6:29 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
> On 1. Oct 2024, at 02:56, Chris Mason <clm@meta.com> wrote:
>
> Not disagreeing with Linus at all, but given that you've got IO
> throttling too, we might really just be waiting. It's hard to tell
> because the hung task timeouts only give you information about one process.
>
> I've attached a minimal version of a script we use here to show all the
> D state processes, it might help explain things. The only problem is
> you have to actually ssh to the box and run it when you're stuck.
>
> The idea is to print the stack trace of every D state process, and then
> also print out how often each unique stack trace shows up. When we're
> deadlocked on something, there are normally a bunch of the same stack
> (say waiting on writeback) and then one jerk sitting around in a
> different stack who is causing all the trouble.
I think I should be able to trigger this. I’ve seen around a 100 of those issues over the last week and the chance of it happening correlates with a certain workload that should be easy to trigger. Also, the condition remains for at around 5 minutes, so I should be able to trace it when I see the alert in an interactive session.
I’ve verified I can run your script and I’ll get back to you in the next days.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-10 6:29 ` Christian Theune
@ 2024-10-11 7:27 ` Christian Theune
2024-10-11 9:08 ` Christian Theune
0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-10-11 7:27 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
Hi,
> On 10. Oct 2024, at 08:29, Christian Theune <ct@flyingcircus.io> wrote:
>
>
>> On 1. Oct 2024, at 02:56, Chris Mason <clm@meta.com> wrote:
>>
>> Not disagreeing with Linus at all, but given that you've got IO
>> throttling too, we might really just be waiting. It's hard to tell
>> because the hung task timeouts only give you information about one process.
>>
>> I've attached a minimal version of a script we use here to show all the
>> D state processes, it might help explain things. The only problem is
>> you have to actually ssh to the box and run it when you're stuck.
>>
>> The idea is to print the stack trace of every D state process, and then
>> also print out how often each unique stack trace shows up. When we're
>> deadlocked on something, there are normally a bunch of the same stack
>> (say waiting on writeback) and then one jerk sitting around in a
>> different stack who is causing all the trouble.
>
> I think I should be able to trigger this. I’ve seen around a 100 of those issues over the last week and the chance of it happening correlates with a certain workload that should be easy to trigger. Also, the condition remains for at around 5 minutes, so I should be able to trace it when I see the alert in an interactive session.
>
> I’ve verified I can run your script and I’ll get back to you in the next days.
I wasn’t able to create a reproducer after all so I’ve set up alerting.
I just caught one right away, but it unblocked quickly after I logged in:
The original message that triggered the alert was:
[Oct11 09:18] INFO: task nix-build:157920 blocked for more than 122 seconds.
[ +0.000937] Not tainted 6.11.0 #1-NixOS
[ +0.000540] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.000902] task:nix-build state:D stack:0 pid:157920 tgid:157920 ppid:157919 flags:0x00000002
[ +0.001098] Call Trace:
[ +0.000306] <TASK>
[ +0.000279] __schedule+0x3a3/0x1300
[ +0.000478] schedule+0x27/0xf0
[ +0.000392] io_schedule+0x46/0x70
[ +0.000436] folio_wait_bit_common+0x13f/0x340
[ +0.000572] ? __pfx_wake_page_function+0x10/0x10
[ +0.000592] folio_wait_writeback+0x2b/0x80
[ +0.000466] truncate_inode_partial_folio+0x5e/0x1b0
[ +0.000586] truncate_inode_pages_range+0x1de/0x400
[ +0.000595] evict+0x29f/0x2c0
[ +0.000396] ? iput+0x6e/0x230
[ +0.000408] ? _atomic_dec_and_lock+0x39/0x50
[ +0.000542] do_unlinkat+0x2de/0x330
[ +0.000402] __x64_sys_unlink+0x3f/0x70
[ +0.000419] do_syscall_64+0xb7/0x200
[ +0.000407] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ +0.000556] RIP: 0033:0x7f2bb5d1056b
[ +0.000473] RSP: 002b:00007ffc013c8588 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
[ +0.000942] RAX: ffffffffffffffda RBX: 000055963c267500 RCX: 00007f2bb5d1056b
[ +0.000859] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055963c268c80
[ +0.000800] RBP: 000055963c267690 R08: 0000000000016020 R09: 0000000000000000
[ +0.000977] R10: 00000000000000f0 R11: 0000000000000206 R12: 00007ffc013c85c8
[ +0.000826] R13: 00007ffc013c85ac R14: 00007ffc013c8ed0 R15: 00005596441e42b0
[ +0.000833] </TASK>
Then after logging in I caught it once with walker.py - this was about a minute after the alert triggered I think. I’ll add timestamps to walker.py in the next instances:
157920 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] truncate_inode_partial_folio+0x5e/0x1b0
[<0>] truncate_inode_pages_range+0x1de/0x400
[<0>] evict+0x29f/0x2c0
[<0>] do_unlinkat+0x2de/0x330
[<0>] __x64_sys_unlink+0x3f/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] truncate_inode_partial_folio+0x5e/0x1b0
[<0>] truncate_inode_pages_range+0x1de/0x400
[<0>] evict+0x29f/0x2c0
[<0>] do_unlinkat+0x2de/0x330
[<0>] __x64_sys_unlink+0x3f/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
I tried once again after 1-2 seconds and got this:
157920 nix-build D
[<0>] xlog_wait_on_iclog+0x167/0x180 [xfs]
[<0>] xfs_log_force_seq+0x8d/0x150 [xfs]
[<0>] xfs_file_fsync+0x195/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] xlog_wait_on_iclog+0x167/0x180 [xfs]
[<0>] xfs_log_force_seq+0x8d/0x150 [xfs]
[<0>] xfs_file_fsync+0x195/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
and after that the process was done and exited. The last traceback looks unlocked already.
I’m going to gather a few more instances during the day and will post them as a batch later.
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-11 7:27 ` Christian Theune
@ 2024-10-11 9:08 ` Christian Theune
2024-10-11 13:06 ` Chris Mason
0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-10-11 9:08 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
[-- Attachment #1: Type: text/plain, Size: 1117 bytes --]
> On 11. Oct 2024, at 09:27, Christian Theune <ct@flyingcircus.io> wrote:
>
> I’m going to gather a few more instances during the day and will post them as a batch later.
I’ve received 8 alerts in the last hours and managed to get detailed, repeated walker output from two of them:
- FC-41287.log
- FC-41289.log
The other logs are tracebacks as the kernel reported them but the situation resolved itself faster than I could log in and run the walker script. In FC-41289.log I’m also providing output from `ps auxf` to see what the process tree looks like, maybe that helps, too.
My observations:
- different entry points from the XFS code: unlink, f(data)sync, truncate
- in none of the cases I caught I could see any real competing traffic (aside from maybe occasional journal writes and very little background noise), all affected machines are staging environments that saw basically no usage during that timeframe
I’m stopping my alerting now as it’s been interrupting me every few minutes and I’m running out of steam sitting around waiting for the alert. ;)
Christian
[-- Attachment #2: FC-41281.log --]
[-- Type: application/octet-stream, Size: 2345 bytes --]
[195020.405783] INFO: task nix-build:157920 blocked for more than 122 seconds.
[195020.406720] Not tainted 6.11.0 #1-NixOS
[195020.407260] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[195020.408162] task:nix-build state:D stack:0 pid:157920 tgid:157920 ppid:157919 flags:0x00000002
[195020.409260] Call Trace:
[195020.409566] <TASK>
[195020.409845] __schedule+0x3a3/0x1300
[195020.410323] schedule+0x27/0xf0
[195020.410715] io_schedule+0x46/0x70
[195020.411151] folio_wait_bit_common+0x13f/0x340
[195020.411723] ? __pfx_wake_page_function+0x10/0x10
[195020.412315] folio_wait_writeback+0x2b/0x80
[195020.412781] truncate_inode_partial_folio+0x5e/0x1b0
[195020.413367] truncate_inode_pages_range+0x1de/0x400
[195020.413962] evict+0x29f/0x2c0
[195020.414358] ? iput+0x6e/0x230
[195020.414766] ? _atomic_dec_and_lock+0x39/0x50
[195020.415308] do_unlinkat+0x2de/0x330
[195020.415710] __x64_sys_unlink+0x3f/0x70
[195020.416129] do_syscall_64+0xb7/0x200
[195020.416536] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[195020.417092] RIP: 0033:0x7f2bb5d1056b
[195020.417565] RSP: 002b:00007ffc013c8588 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
[195020.418507] RAX: ffffffffffffffda RBX: 000055963c267500 RCX: 00007f2bb5d1056b
[195020.419366] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055963c268c80
[195020.420166] RBP: 000055963c267690 R08: 0000000000016020 R09: 0000000000000000
[195020.421143] R10: 00000000000000f0 R11: 0000000000000206 R12: 00007ffc013c85c8
[195020.421969] R13: 00007ffc013c85ac R14: 00007ffc013c8ed0 R15: 00005596441e42b0
[195020.422802] </TASK>
157920 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] truncate_inode_partial_folio+0x5e/0x1b0
[<0>] truncate_inode_pages_range+0x1de/0x400
[<0>] evict+0x29f/0x2c0
[<0>] do_unlinkat+0x2de/0x330
[<0>] __x64_sys_unlink+0x3f/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] truncate_inode_partial_folio+0x5e/0x1b0
[<0>] truncate_inode_pages_range+0x1de/0x400
[<0>] evict+0x29f/0x2c0
[<0>] do_unlinkat+0x2de/0x330
[<0>] __x64_sys_unlink+0x3f/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[-- Attachment #3: FC-41282.log --]
[-- Type: application/octet-stream, Size: 1781 bytes --]
[208400.702546] INFO: task nix-build:330993 blocked for more than 122 seconds.
[208400.703012] Not tainted 6.11.0 #1-NixOS
[208400.703260] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[208400.703760] task:nix-build state:D stack:0 pid:330993 tgid:330993 ppid:330992 flags:0x00004002
[208400.704588] Call Trace:
[208400.704744] <TASK>
[208400.704874] __schedule+0x3a3/0x1300
[208400.705085] ? wb_update_bandwidth+0x52/0x70
[208400.705329] schedule+0x27/0xf0
[208400.705523] io_schedule+0x46/0x70
[208400.705734] folio_wait_bit_common+0x13f/0x340
[208400.706021] ? __pfx_wake_page_function+0x10/0x10
[208400.706296] folio_wait_writeback+0x2b/0x80
[208400.706644] __filemap_fdatawait_range+0x80/0xe0
[208400.707037] filemap_write_and_wait_range+0x85/0xb0
[208400.707436] xfs_setattr_size+0xd9/0x3c0 [xfs]
[208400.707955] xfs_vn_setattr+0x81/0x150 [xfs]
[208400.708365] notify_change+0x2ed/0x4f0
[208400.708638] ? do_truncate+0x98/0xf0
[208400.708855] do_truncate+0x98/0xf0
[208400.709050] do_ftruncate+0xfe/0x160
[208400.709329] __x64_sys_ftruncate+0x3e/0x70
[208400.709656] do_syscall_64+0xb7/0x200
[208400.710041] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[208400.710367] RIP: 0033:0x7fab32912c2b
[208400.710614] RSP: 002b:00007ffee94d7e18 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
[208400.711093] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fab32912c2b
[208400.711503] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000011
[208400.711947] RBP: 0000000000000011 R08: 0000000000000000 R09: 00007ffee94d7dc0
[208400.712355] R10: 0000000000068000 R11: 0000000000000246 R12: 000055f3aca90b20
[208400.712776] R13: 000055f3acb2d3d8 R14: 0000000000000001 R15: 000055f3acb3a3a8
[208400.713222] </TASK>
[-- Attachment #4: FC-41283.log --]
[-- Type: application/octet-stream, Size: 1724 bytes --]
[820710.966217] INFO: task nix-build:884370 blocked for more than 122 seconds.
[820710.966643] Not tainted 6.11.0 #1-NixOS
[820710.966890] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[820710.967307] task:nix-build state:D stack:0 pid:884370 tgid:884370 ppid:884369 flags:0x00000002
[820710.967913] Call Trace:
[820710.968056] <TASK>
[820710.968189] __schedule+0x3a3/0x1300
[820710.968391] schedule+0x27/0xf0
[820710.968563] io_schedule+0x46/0x70
[820710.968758] folio_wait_bit_common+0x13f/0x340
[820710.968998] ? __pfx_wake_page_function+0x10/0x10
[820710.969258] folio_wait_writeback+0x2b/0x80
[820710.969485] truncate_inode_partial_folio+0x5e/0x1b0
[820710.969753] truncate_inode_pages_range+0x1de/0x400
[820710.970041] evict+0x29f/0x2c0
[820710.970230] ? iput+0x6e/0x230
[820710.970399] ? _atomic_dec_and_lock+0x39/0x50
[820710.970633] do_unlinkat+0x2de/0x330
[820710.970837] __x64_sys_unlink+0x3f/0x70
[820710.971042] do_syscall_64+0xb7/0x200
[820710.971257] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[820710.971528] RIP: 0033:0x7f09e0e2d56b
[820710.971740] RSP: 002b:00007ffed1ddeb58 EFLAGS: 00000202 ORIG_RAX: 0000000000000057
[820710.972131] RAX: ffffffffffffffda RBX: 00005587092aa500 RCX: 00007f09e0e2d56b
[820710.972503] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005587092abc80
[820710.972875] RBP: 00005587092aa690 R08: 0000000000016020 R09: 0000000000000000
[820710.973249] R10: 0000000000000080 R11: 0000000000000202 R12: 00007ffed1ddeb98
[820710.973623] R13: 00007ffed1ddeb7c R14: 00007ffed1ddf4a0 R15: 0000558711268cd0
[820710.973998] </TASK>
[820710.974122] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
[-- Attachment #5: FC-41285.log --]
[-- Type: application/octet-stream, Size: 1567 bytes --]
[217499.576744] INFO: task nix-build:176931 blocked for more than 122 seconds.
[217499.577213] Not tainted 6.11.0 #1-NixOS
[217499.577455] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[217499.577910] task:nix-build state:D stack:0 pid:176931 tgid:176931 ppid:176930 flags:0x00004002
[217499.578417] Call Trace:
[217499.578560] <TASK>
[217499.578697] __schedule+0x3a3/0x1300
[217499.578920] ? xfs_vm_writepages+0x67/0x90 [xfs]
[217499.579333] schedule+0x27/0xf0
[217499.579515] io_schedule+0x46/0x70
[217499.579721] folio_wait_bit_common+0x13f/0x340
[217499.579981] ? __pfx_wake_page_function+0x10/0x10
[217499.580241] folio_wait_writeback+0x2b/0x80
[217499.580475] __filemap_fdatawait_range+0x80/0xe0
[217499.580740] file_write_and_wait_range+0x88/0xb0
[217499.581004] xfs_file_fsync+0x5e/0x2a0 [xfs]
[217499.581586] __x64_sys_fdatasync+0x52/0x90
[217499.581856] do_syscall_64+0xb7/0x200
[217499.582069] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[217499.582349] RIP: 0033:0x7f56be82f70a
[217499.582563] RSP: 002b:00007fff458db490 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[217499.582988] RAX: ffffffffffffffda RBX: 000055af3319bbf8 RCX: 00007f56be82f70a
[217499.583372] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
[217499.583760] RBP: 0000000000000000 R08: 0000000000000001 R09: 000055af3b0fefe8
[217499.584169] R10: 000000000000007e R11: 0000000000000293 R12: 0000000000000001
[217499.584552] R13: 0000000000000197 R14: 000055af3b0ff33e R15: 00007fff458db690
[217499.584951] </TASK>
[-- Attachment #6: FC-41286.log --]
[-- Type: application/octet-stream, Size: 2237 bytes --]
[217499.576744] INFO: task nix-build:176931 blocked for more than 122 seconds.
[217499.577213] Not tainted 6.11.0 #1-NixOS
[217499.577455] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[217499.577910] task:nix-build state:D stack:0 pid:176931 tgid:176931 ppid:176930 flags:0x00004002
[217499.578417] Call Trace:
[217499.578560] <TASK>
[217499.578697] __schedule+0x3a3/0x1300
[217499.578920] ? xfs_vm_writepages+0x67/0x90 [xfs]
[217499.579333] schedule+0x27/0xf0
[217499.579515] io_schedule+0x46/0x70
[217499.579721] folio_wait_bit_common+0x13f/0x340
[217499.579981] ? __pfx_wake_page_function+0x10/0x10
[217499.580241] folio_wait_writeback+0x2b/0x80
[217499.580475] __filemap_fdatawait_range+0x80/0xe0
[217499.580740] file_write_and_wait_range+0x88/0xb0
[217499.581004] xfs_file_fsync+0x5e/0x2a0 [xfs]
[217499.581586] __x64_sys_fdatasync+0x52/0x90
[217499.581856] do_syscall_64+0xb7/0x200
[217499.582069] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[217499.582349] RIP: 0033:0x7f56be82f70a
[217499.582563] RSP: 002b:00007fff458db490 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[217499.582988] RAX: ffffffffffffffda RBX: 000055af3319bbf8 RCX: 00007f56be82f70a
[217499.583372] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
[217499.583760] RBP: 0000000000000000 R08: 0000000000000001 R09: 000055af3b0fefe8
[217499.584169] R10: 000000000000007e R11: 0000000000000293 R12: 0000000000000001
[217499.584552] R13: 0000000000000197 R14: 000055af3b0ff33e R15: 00007fff458db690
[217499.584951] </TASK>
[217565.040136] systemd[1]: fc-agent.service: Deactivated successfully.
[217565.041118] systemd[1]: Finished Flying Circus Management Task.
[217565.041814] systemd[1]: fc-agent.service: Consumed 18.400s CPU time, received 28.9M IP traffic, sent 158.2K IP traffic.
[217637.400585] systemd[1]: Created slice Slice /user/1003.
[217637.407307] systemd[1]: Starting User Runtime Directory /run/user/1003...
[217637.426906] systemd[1]: Finished User Runtime Directory /run/user/1003.
[217637.439512] systemd[1]: Starting User Manager for UID 1003...
[217637.644565] systemd[1]: Started User Manager for UID 1003.
[217637.652243] systemd[1]: Started Session 3 of User ctheune.
[-- Attachment #7: FC-41287.log --]
[-- Type: application/octet-stream, Size: 5752 bytes --]
[215042.580872] INFO: task nix-build:240798 blocked for more than 122 seconds.
[215042.581318] Not tainted 6.11.0 #1-NixOS
[215042.581624] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[215042.582070] task:nix-build state:D stack:0 pid:240798 tgid:240798 ppid:240797 flags:0x00000002
[215042.582573] Call Trace:
[215042.582713] <TASK>
[215042.582860] __schedule+0x3a3/0x1300
[215042.583069] ? xfs_vm_writepages+0x67/0x90 [xfs]
[215042.583469] schedule+0x27/0xf0
[215042.583651] io_schedule+0x46/0x70
[215042.583859] folio_wait_bit_common+0x13f/0x340
[215042.584108] ? __pfx_wake_page_function+0x10/0x10
[215042.584364] folio_wait_writeback+0x2b/0x80
[215042.584594] __filemap_fdatawait_range+0x80/0xe0
[215042.584856] file_write_and_wait_range+0x88/0xb0
[215042.585109] xfs_file_fsync+0x5e/0x2a0 [xfs]
[215042.585471] __x64_sys_fdatasync+0x52/0x90
[215042.585698] do_syscall_64+0xb7/0x200
[215042.585916] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[215042.586191] RIP: 0033:0x7ff0c831270a
[215042.586406] RSP: 002b:00007ffe1482b960 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[215042.586818] RAX: ffffffffffffffda RBX: 0000564877f8abf8 RCX: 00007ff0c831270a
[215042.587197] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
[215042.587574] RBP: 0000000000000000 R08: 0000000000000001 R09: 000056487ff73f78
[215042.587960] R10: 0000000000000082 R11: 0000000000000293 R12: 0000000000000001
[215042.588337] R13: 00000000000001a0 R14: 000056487ff742e0 R15: 00007ffe1482bb60
[215042.588716] </TASK>
[215120.626730] systemd[1]: Created slice Slice /user/1003.
[215120.633868] systemd[1]: Starting User Runtime Directory /run/user/1003...
[215120.664698] systemd[1]: Finished User Runtime Directory /run/user/1003.
[215120.673752] systemd[1]: Starting User Manager for UID 1003...
[215121.175903] systemd[1]: Started User Manager for UID 1003.
[215121.182026] systemd[1]: Started Session 1 of User ctheune.
[215135.177690429]
240798 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[215140.478882357]
240798 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[215145.029642882]
240798 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[215150.173831058]
240798 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[215155.155491198]
240798 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] file_write_and_wait_range+0x88/0xb0
[<0>] xfs_file_fsync+0x5e/0x2a0 [xfs]
[<0>] __x64_sys_fdatasync+0x52/0x90
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[215163.267601] systemd[1]: fc-agent.service: Deactivated successfully.
[215163.268172] systemd[1]: Finished Flying Circus Management Task.
[215163.269162] systemd[1]: fc-agent.service: Consumed 19.683s CPU time, received 28.9M IP traffic, sent 152.5K IP traffic.
[-- Attachment #8: FC-41288.log --]
[-- Type: application/octet-stream, Size: 1684 bytes --]
[217748.915126] INFO: task nix-build:198761 blocked for more than 122 seconds.
[217748.916085] Not tainted 6.11.0 #1-NixOS
[217748.916639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[217748.917636] task:nix-build state:D stack:0 pid:198761 tgid:198761 ppid:198760 flags:0x00000002
[217748.918897] Call Trace:
[217748.919271] <TASK>
[217748.919593] __schedule+0x3a3/0x1300
[217748.920118] ? xfs_btree_insrec+0x32c/0x570 [xfs]
[217748.921070] schedule+0x27/0xf0
[217748.921483] io_schedule+0x46/0x70
[217748.921921] folio_wait_bit_common+0x13f/0x340
[217748.922508] ? __pfx_wake_page_function+0x10/0x10
[217748.923115] folio_wait_writeback+0x2b/0x80
[217748.923647] truncate_inode_partial_folio+0x5e/0x1b0
[217748.924286] truncate_inode_pages_range+0x1de/0x400
[217748.924903] evict+0x29f/0x2c0
[217748.925325] ? iput+0x6e/0x230
[217748.925722] ? _atomic_dec_and_lock+0x39/0x50
[217748.926290] do_unlinkat+0x2de/0x330
[217748.926751] __x64_sys_unlink+0x3f/0x70
[217748.927247] do_syscall_64+0xb7/0x200
[217748.927716] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[217748.928375] RIP: 0033:0x7f177b02d56b
[217748.928860] RSP: 002b:00007ffe6bb88658 EFLAGS: 00000202 ORIG_RAX: 0000000000000057
[217748.929804] RAX: ffffffffffffffda RBX: 000055fd811835b0 RCX: 00007f177b02d56b
[217748.930700] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055fd81184d30
[217748.931615] RBP: 000055fd81183740 R08: 0000000000016020 R09: 0000000000000000
[217748.932508] R10: 0000000000000030 R11: 0000000000000202 R12: 00007ffe6bb88698
[217748.933422] R13: 00007ffe6bb8867c R14: 00007ffe6bb88fa0 R15: 000055fd890e1a50
[217748.934341] </TASK>
[-- Attachment #9: FC-41289.log --]
[-- Type: application/octet-stream, Size: 166376 bytes --]
[218237.291578] INFO: task nix-build:176536 blocked for more than 122 seconds.
[218237.292026] Not tainted 6.11.0 #1-NixOS
[218237.292261] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[218237.292695] task:nix-build state:D stack:0 pid:176536 tgid:176536 ppid:176535 flags:0x00000002
[218237.293188] Call Trace:
[218237.293326] <TASK>
[218237.293458] __schedule+0x3a3/0x1300
[218237.293673] ? xfs_vm_writepages+0x67/0x90 [xfs]
[218237.294063] schedule+0x27/0xf0
[218237.294240] io_schedule+0x46/0x70
[218237.294426] folio_wait_bit_common+0x13f/0x340
[218237.294696] ? __pfx_wake_page_function+0x10/0x10
[218237.295038] folio_wait_writeback+0x2b/0x80
[218237.295270] __filemap_fdatawait_range+0x80/0xe0
[218237.295541] filemap_write_and_wait_range+0x85/0xb0
[218237.295804] xfs_setattr_size+0xd9/0x3c0 [xfs]
[218237.296173] xfs_vn_setattr+0x81/0x150 [xfs]
[218237.296530] notify_change+0x2ed/0x4f0
[218237.296777] ? do_truncate+0x98/0xf0
[218237.296996] do_truncate+0x98/0xf0
[218237.297183] do_ftruncate+0xfe/0x160
[218237.297378] __x64_sys_ftruncate+0x3e/0x70
[218237.297632] do_syscall_64+0xb7/0x200
[218237.297836] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[218237.298111] RIP: 0033:0x7f0453b12c2b
[218237.298316] RSP: 002b:00007ffe9f6db828 EFLAGS: 00000246 ORIG_RAX: 000000000000004d
[218237.298742] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0453b12c2b
[218237.299116] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000011
[218237.299492] RBP: 0000000000000011 R08: 0000000000000000 R09: 00007ffe9f6db7d0
[218237.299869] R10: 0000000000018000 R11: 0000000000000246 R12: 00005562ff27fb20
[218237.300241] R13: 00005562ff31c3d8 R14: 0000000000000001 R15: 00005562ff3293a8
[218237.300631] </TASK>
[218261.984778] systemd[1]: Created slice Slice /user/1003.
[218261.989545] systemd[1]: Starting User Runtime Directory /run/user/1003...
[218262.000938] systemd[1]: Finished User Runtime Directory /run/user/1003.
[218262.005583] systemd[1]: Starting User Manager for UID 1003...
[218262.105005] systemd[1]: Started User Manager for UID 1003.
[218262.109759] systemd[1]: Started Session 7 of User ctheune.
[218269.921479398]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[218274.052571366]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[218278.588908363]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[218283.450120071]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[218287.296514668]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
[218290.957136179]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
# dstat
You did not select any stats, using -cdngy by default.
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read writ| recv send| in out | int csw
5 1 92 1 1|7869B 145k| 0 0 | 17B 442B| 611 801
2 1 0 97 0| 0 1488k| 21k 7707B| 0 0 |1030 1385
2 1 0 96 1| 0 1076k| 20k 5616B| 0 0 | 981 1296
1 0 0 99 0| 0 1196k| 11k 454B| 0 0 | 711 987 ^C
[218298.44665228]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Oct08 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Oct08 0:00 \_ [pool_workqueue_release]
root 4 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-rcu_gp]
root 5 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-sync_wq]
root 6 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-slub_flushwq]
root 7 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-netns]
root 10 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/0:0H-kblockd]
root 13 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mm_percpu_wq]
root 14 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_kthread]
root 15 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_rude_kthread]
root 16 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_trace_kthread]
root 17 0.0 0.0 0 0 ? S Oct08 0:25 \_ [ksoftirqd/0]
root 18 0.0 0.0 0 0 ? I Oct08 1:12 \_ [rcu_preempt]
root 19 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_par_gp_kthread_worker/0]
root 20 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_gp_kthread_worker]
root 21 0.0 0.0 0 0 ? S Oct08 0:00 \_ [migration/0]
root 22 0.0 0.0 0 0 ? S Oct08 0:00 \_ [idle_inject/0]
root 23 0.0 0.0 0 0 ? S Oct08 0:00 \_ [cpuhp/0]
root 24 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kdevtmpfs]
root 25 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-inet_frag_wq]
root 26 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kauditd]
root 27 0.0 0.0 0 0 ? S Oct08 0:00 \_ [khungtaskd]
root 28 0.0 0.0 0 0 ? S Oct08 0:00 \_ [oom_reaper]
root 29 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-writeback]
root 30 0.0 0.0 0 0 ? S Oct08 0:02 \_ [kcompactd0]
root 31 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [ksmd]
root 32 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [khugepaged]
root 33 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kintegrityd]
root 34 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kblockd]
root 35 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-blkcg_punt_bio]
root 36 0.0 0.0 0 0 ? S Oct08 0:00 \_ [irq/9-acpi]
root 37 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md]
root 38 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md_bitmap]
root 39 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-devfreq_wq]
root 44 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kswapd0]
root 45 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kthrotld]
root 46 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mld]
root 47 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ipv6_addrconf]
root 54 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kstrp]
root 55 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/u5:0]
root 102 0.0 0.0 0 0 ? S Oct08 0:00 \_ [hwrng]
root 109 0.0 0.0 0 0 ? S Oct08 0:00 \_ [watchdogd]
root 149 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ata_sff]
root 150 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_0]
root 151 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_0]
root 152 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_1]
root 153 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_1]
root 184 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfsalloc]
root 185 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs_mru_cache]
root 186 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vda1]
root 187 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vda1]
root 188 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vda1]
root 189 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vda1]
root 190 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vda1]
root 191 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vda1]
root 192 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vda1]
root 193 0.0 0.0 0 0 ? S Oct08 0:20 \_ [xfsaild/vda1]
root 531 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 644 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vdc1]
root 645 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vdc1]
root 646 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vdc1]
root 647 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vdc1]
root 648 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vdc1]
root 649 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vdc1]
root 650 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vdc1]
root 651 0.0 0.0 0 0 ? S Oct08 0:05 \_ [xfsaild/vdc1]
root 723 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ttm]
root 1286 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-tls-strp]
root 2772 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 171717 0.0 0.0 0 0 ? I 09:03 0:00 \_ [kworker/u4:3-events_power_efficient]
root 174477 0.0 0.0 0 0 ? I 10:01 0:00 \_ [kworker/0:2-xfs-conv/vdc1]
root 174683 0.0 0.0 0 0 ? I 10:06 0:00 \_ [kworker/u4:2-events_power_efficient]
root 175378 0.0 0.0 0 0 ? I 10:20 0:00 \_ [kworker/u4:4-events_unbound]
root 176049 0.0 0.0 0 0 ? I 10:34 0:00 \_ [kworker/0:3-xfs-conv/vdc1]
root 176150 0.0 0.0 0 0 ? I< 10:35 0:00 \_ [kworker/0:1H-xfs-log/vda1]
root 176358 0.0 0.0 0 0 ? I 10:40 0:00 \_ [kworker/0:0-xfs-conv/vdc1]
root 176402 0.0 0.0 0 0 ? I 10:41 0:00 \_ [kworker/u4:0-events_power_efficient]
root 176544 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:1-writeback]
root 176545 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:5-writeback]
root 176546 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:6-events_power_efficient]
root 176549 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:1-xfs-conv/vdc1]
root 176550 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:4-xfs-conv/vdc1]
root 176551 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:5-xfs-conv/vdc1]
root 176552 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:6-xfs-conv/vdc1]
root 176553 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:7-xfs-conv/vdc1]
root 176554 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:8-xfs-conv/vdc1]
root 176555 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:9-xfs-conv/vdc1]
root 176556 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:10-xfs-conv/vdc1]
root 176557 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:11-xfs-conv/vdc1]
root 176558 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:12-kthrotld]
root 176559 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:13-xfs-conv/vdc1]
root 176560 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:14-xfs-conv/vdc1]
root 176561 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:15-xfs-conv/vdc1]
root 176562 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:16-xfs-conv/vdc1]
root 176563 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:17-xfs-conv/vdc1]
root 176564 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:18-xfs-conv/vdc1]
root 176565 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:19-xfs-conv/vdc1]
root 176566 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:20-xfs-conv/vdc1]
root 176567 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:21-xfs-conv/vdc1]
root 176568 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:22-xfs-conv/vdc1]
root 176569 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:23-xfs-conv/vdc1]
root 176570 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:24-xfs-conv/vdc1]
root 176571 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:25-xfs-conv/vdc1]
root 176572 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:26-xfs-conv/vdc1]
root 176573 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:27-xfs-conv/vdc1]
root 176574 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:28-xfs-conv/vdc1]
root 176575 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:29-xfs-conv/vdc1]
root 176576 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:30-xfs-conv/vdc1]
root 176577 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:31-xfs-conv/vdc1]
root 176578 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:32-xfs-conv/vdc1]
root 176579 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:33-xfs-conv/vdc1]
root 176580 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:34-xfs-conv/vdc1]
root 176581 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:35-xfs-conv/vdc1]
root 176582 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:36-xfs-conv/vdc1]
root 176583 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:37-xfs-conv/vdc1]
root 176584 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:38-xfs-conv/vdc1]
root 176585 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:39-xfs-conv/vdc1]
root 176586 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:40-xfs-conv/vdc1]
root 176587 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:41-xfs-buf/vdc1]
root 176588 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:42-xfs-conv/vdc1]
root 176589 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:43-xfs-conv/vdc1]
root 176590 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:44-xfs-conv/vdc1]
root 176591 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:45-xfs-conv/vdc1]
root 176592 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:46-xfs-conv/vdc1]
root 176593 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:47-xfs-conv/vdc1]
root 176594 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:48-xfs-conv/vdc1]
root 176595 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:49-xfs-conv/vdc1]
root 176596 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:50-xfs-conv/vdc1]
root 176597 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:51-xfs-conv/vdc1]
root 176598 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:52-xfs-conv/vdc1]
root 176599 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:53-xfs-conv/vdc1]
root 176600 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:54-xfs-conv/vdc1]
root 176601 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:55-xfs-conv/vdc1]
root 176602 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:56-xfs-conv/vdc1]
root 176603 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:57-xfs-conv/vdc1]
root 176604 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:58-xfs-conv/vdc1]
root 176605 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:59-xfs-conv/vdc1]
root 176606 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:60-xfs-conv/vdc1]
root 176607 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:61-xfs-conv/vdc1]
root 176608 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:62-xfs-conv/vdc1]
root 176609 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:63-xfs-conv/vdc1]
root 176610 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:64-xfs-conv/vdc1]
root 176611 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:65-xfs-conv/vdc1]
root 176612 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:66-xfs-conv/vdc1]
root 176613 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:67-xfs-conv/vdc1]
root 176614 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:68-xfs-conv/vdc1]
root 176615 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:69-xfs-conv/vdc1]
root 176616 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:70-xfs-conv/vdc1]
root 176617 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:71-xfs-conv/vdc1]
root 176618 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:72-xfs-conv/vdc1]
root 176619 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:73-xfs-conv/vdc1]
root 176620 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:74-xfs-conv/vdc1]
root 176621 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:75-xfs-conv/vdc1]
root 176622 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:76-xfs-conv/vdc1]
root 176623 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:77-xfs-conv/vdc1]
root 176624 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:78-xfs-conv/vdc1]
root 176625 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:79-xfs-conv/vdc1]
root 176626 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:80-xfs-conv/vdc1]
root 176627 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:81-xfs-conv/vdc1]
root 176628 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:82-xfs-conv/vdc1]
root 176629 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:83-xfs-conv/vdc1]
root 176630 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:84-xfs-conv/vdc1]
root 176631 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:85-xfs-conv/vdc1]
root 176632 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:86-xfs-conv/vdc1]
root 176633 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:87-xfs-conv/vdc1]
root 176634 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:88-xfs-conv/vdc1]
root 176635 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:89-xfs-conv/vdc1]
root 176636 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:90-xfs-conv/vdc1]
root 176637 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:91-xfs-conv/vdc1]
root 176638 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:92-xfs-conv/vdc1]
root 176639 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:93-xfs-conv/vdc1]
root 176640 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:94-xfs-conv/vdc1]
root 176641 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:95-xfs-conv/vdc1]
root 176642 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:96-xfs-conv/vdc1]
root 176643 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:97-xfs-conv/vdc1]
root 176644 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:98-xfs-conv/vdc1]
root 176645 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:99-xfs-conv/vdc1]
root 176646 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:100-xfs-conv/vdc1]
root 176647 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:101-xfs-conv/vdc1]
root 176648 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:102-xfs-conv/vdc1]
root 176649 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:103-xfs-conv/vdc1]
root 176650 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:104-xfs-conv/vdc1]
root 176651 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:105-xfs-conv/vdc1]
root 176652 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:106-xfs-conv/vdc1]
root 176653 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:107-xfs-conv/vdc1]
root 176654 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:108-xfs-conv/vdc1]
root 176655 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:109-xfs-conv/vdc1]
root 176656 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:110-xfs-conv/vdc1]
root 176657 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:111-xfs-conv/vdc1]
root 176658 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:112-xfs-conv/vdc1]
root 176659 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:113-xfs-conv/vdc1]
root 176660 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:114-xfs-conv/vdc1]
root 176661 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:115-xfs-conv/vdc1]
root 176662 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:116-xfs-conv/vdc1]
root 176663 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:117-xfs-conv/vdc1]
root 176664 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:118-xfs-conv/vdc1]
root 176665 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:119-xfs-conv/vdc1]
root 176666 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:120-xfs-conv/vdc1]
root 176667 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:121-xfs-conv/vdc1]
root 176668 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:122-xfs-conv/vdc1]
root 176669 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:123-xfs-conv/vdc1]
root 176670 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:124-xfs-conv/vdc1]
root 176671 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:125-xfs-conv/vdc1]
root 176672 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:126-xfs-conv/vdc1]
root 176673 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:127-xfs-conv/vdc1]
root 176674 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:128-xfs-conv/vdc1]
root 176675 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:129-xfs-conv/vdc1]
root 176676 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:130-xfs-conv/vdc1]
root 176677 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:131-xfs-conv/vdc1]
root 176678 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:132-xfs-conv/vdc1]
root 176679 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:133-xfs-conv/vdc1]
root 176680 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:134-xfs-conv/vdc1]
root 176681 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:135-xfs-conv/vdc1]
root 176682 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:136-xfs-conv/vdc1]
root 176683 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:137-xfs-conv/vdc1]
root 176684 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:138-xfs-conv/vdc1]
root 176685 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:139-xfs-conv/vdc1]
root 176686 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:140-xfs-conv/vdc1]
root 176687 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:141-xfs-conv/vdc1]
root 176688 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:142-xfs-conv/vdc1]
root 176689 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:143-xfs-conv/vdc1]
root 176690 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:144-xfs-conv/vdc1]
root 176691 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:145-xfs-conv/vdc1]
root 176692 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:146-xfs-conv/vdc1]
root 176693 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:147-xfs-conv/vdc1]
root 176694 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:148-xfs-conv/vdc1]
root 176695 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:149-xfs-conv/vdc1]
root 176696 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:150-xfs-conv/vdc1]
root 176697 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:151-xfs-conv/vdc1]
root 176698 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:152-xfs-conv/vdc1]
root 176699 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:153-xfs-conv/vdc1]
root 176700 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:154-xfs-conv/vdc1]
root 176701 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:155-xfs-conv/vdc1]
root 176702 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:156-xfs-conv/vdc1]
root 176703 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:157-xfs-conv/vdc1]
root 176704 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:158-xfs-buf/vda1]
root 176705 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:159-xfs-conv/vdc1]
root 176706 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:160-xfs-conv/vdc1]
root 176707 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:161-xfs-conv/vdc1]
root 176708 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:162-xfs-conv/vdc1]
root 176709 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:163-xfs-conv/vdc1]
root 176710 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:164-xfs-conv/vdc1]
root 176711 0.2 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:165-xfs-conv/vda1]
root 176712 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:166-xfs-conv/vdc1]
root 176713 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:167-xfs-conv/vdc1]
root 176714 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:168-xfs-conv/vdc1]
root 176715 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:169-xfs-conv/vdc1]
root 176716 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:170-xfs-conv/vdc1]
root 176717 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:171-xfs-conv/vdc1]
root 176718 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:172-xfs-conv/vdc1]
root 176719 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:173-xfs-conv/vdc1]
root 176720 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:174-xfs-conv/vdc1]
root 176721 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:175-xfs-conv/vdc1]
root 176722 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:176-xfs-conv/vdc1]
root 176723 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:177-xfs-conv/vdc1]
root 176724 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:178-xfs-conv/vdc1]
root 176725 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:179-xfs-conv/vdc1]
root 176726 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:180-xfs-conv/vdc1]
root 176727 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:181-xfs-conv/vdc1]
root 176728 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:182-xfs-conv/vdc1]
root 176729 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:183-xfs-conv/vdc1]
root 176730 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:184-xfs-conv/vdc1]
root 176731 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:185-xfs-conv/vdc1]
root 176732 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:186-xfs-conv/vdc1]
root 176733 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:187-xfs-conv/vdc1]
root 176734 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:188-xfs-conv/vdc1]
root 176735 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:189-xfs-conv/vdc1]
root 176736 0.3 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:190-cgroup_destroy]
root 176737 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:191-xfs-conv/vdc1]
root 176738 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:192-xfs-conv/vdc1]
root 176739 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:193-xfs-conv/vdc1]
root 176740 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:194-xfs-conv/vdc1]
root 176741 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:195-xfs-conv/vdc1]
root 176742 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:196-xfs-conv/vdc1]
root 176743 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:197-xfs-conv/vdc1]
root 176744 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:198-xfs-conv/vdc1]
root 176745 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:199-xfs-conv/vdc1]
root 176746 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:200-xfs-conv/vdc1]
root 176747 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:201-xfs-conv/vdc1]
root 176748 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:202-xfs-conv/vdc1]
root 176749 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:203-xfs-conv/vdc1]
root 176750 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:204-xfs-conv/vdc1]
root 176751 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:205-xfs-conv/vdc1]
root 176752 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:206-xfs-buf/vda1]
root 176753 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:207-xfs-conv/vdc1]
root 176754 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:208-xfs-conv/vdc1]
root 176755 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:209-xfs-conv/vdc1]
root 176756 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:210-xfs-conv/vdc1]
root 176757 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:211-xfs-conv/vdc1]
root 176758 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:212-xfs-conv/vdc1]
root 176759 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:213-xfs-conv/vdc1]
root 176760 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:214-xfs-conv/vdc1]
root 176761 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:215-xfs-conv/vdc1]
root 176762 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:216-xfs-conv/vdc1]
root 176763 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:217-xfs-conv/vdc1]
root 176764 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:218-xfs-conv/vdc1]
root 176765 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:219-xfs-conv/vdc1]
root 176766 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:220-xfs-conv/vdc1]
root 176767 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:221-xfs-conv/vdc1]
root 176768 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:222-xfs-conv/vdc1]
root 176769 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:223-xfs-conv/vdc1]
root 176770 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:224-xfs-conv/vdc1]
root 176771 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:225-xfs-conv/vdc1]
root 176772 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:226-xfs-conv/vdc1]
root 176773 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:227-xfs-conv/vdc1]
root 176774 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:228-xfs-conv/vdc1]
root 176775 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:229-xfs-conv/vdc1]
root 176776 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:230-xfs-conv/vdc1]
root 176777 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:231-xfs-conv/vdc1]
root 176778 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:232-xfs-conv/vdc1]
root 176779 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:233-xfs-conv/vdc1]
root 176780 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:234-xfs-conv/vdc1]
root 176781 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:235-xfs-conv/vdc1]
root 176782 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:236-xfs-conv/vdc1]
root 176783 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:237-xfs-conv/vdc1]
root 176784 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:238-xfs-conv/vdc1]
root 176785 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:239-xfs-conv/vdc1]
root 176786 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:240-xfs-conv/vdc1]
root 176787 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:241-xfs-conv/vdc1]
root 176788 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:242-xfs-conv/vdc1]
root 176789 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:243-xfs-conv/vdc1]
root 176790 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:244-xfs-conv/vdc1]
root 176791 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:245-xfs-conv/vdc1]
root 176792 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:246-xfs-conv/vdc1]
root 176793 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:247-xfs-conv/vdc1]
root 176794 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:248-xfs-conv/vdc1]
root 176795 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:249-xfs-conv/vdc1]
root 176796 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:250-xfs-conv/vdc1]
root 176797 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:251-xfs-conv/vdc1]
root 176798 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:252-xfs-conv/vdc1]
root 176799 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:253-xfs-conv/vdc1]
root 176800 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:254-xfs-conv/vdc1]
root 176801 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:255-xfs-conv/vdc1]
root 176802 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:256-xfs-buf/vda1]
root 176803 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:257-xfs-conv/vdc1]
root 176804 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:7-events_unbound]
root 176813 0.0 0.0 0 0 ? I< 10:44 0:00 \_ [kworker/0:2H-kblockd]
root 176814 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/u4:8-events_unbound]
root 176815 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/0:258]
root 1 0.0 0.3 21852 13056 ? Ss Oct08 0:19 /run/current-system/systemd/lib/systemd/systemd
root 399 0.0 1.8 139764 75096 ? Ss Oct08 0:13 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-journald
root 455 0.0 0.2 33848 8168 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-udevd
systemd+ 811 0.0 0.1 16800 6660 ? Ss Oct08 0:10 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-oomd
systemd+ 816 0.0 0.1 91380 7952 ? Ssl Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-timesyncd
root 837 0.0 0.0 80596 3288 ? Ssl Oct08 1:37 /nix/store/ag3xk1l8ij06vx434abk8643f8p7i08c-qemu-host-cpu-only-8.2.6-ga/bin/qemu-ga --statedir /run/qemu-ga
root 840 0.0 0.0 226896 1984 ? Ss Oct08 0:00 /nix/store/k34f0d079arcgfjsq78gpkdbd6l6nnq4-cron-4.1/bin/cron -n
message+ 850 0.0 0.1 13776 6080 ? Ss Oct08 0:05 /nix/store/0hm8vh65m378439kl16xv0p6l7c51asj-dbus-1.14.10/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 876 0.0 0.1 17468 7968 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-logind
nscd 1074 0.0 0.1 555748 6016 ? Ssl Oct08 0:28 /nix/store/zza9hvd6iawqdcxvinf4yxv580av3s9f-nsncd-unstable-2024-01-16/bin/nsncd
telegraf 1092 0.3 3.4 6344672 138484 ? S<Lsl Oct08 13:05 /nix/store/8bnbkyh26j97l0pw02gb7lngh4n6k3r5-telegraf-1.30.3/bin/telegraf -config /nix/store/nh4k7bx1asm0kn1klhbmg52wk1qdcwpw-config.toml -config-directory /nix/store/dj77wnb5j
root 1093 0.0 1.5 1109328 60864 ? Ssl Oct08 2:24 /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/fn9jcsr2kp2kq3m2qd6qrkv6xh7jcj5g-fail2ban-1.0.2/bin/.fail2ban-server-wrapped -xf start
sensucl+ 1094 0.0 0.9 898112 38340 ? Ssl Oct08 1:41 /nix/store/qqc6v89xn0g2w123wx85blkpc4pz2ags-ruby-2.7.8/bin/ruby /nix/store/dpvf0jdq1mbrdc90aapyrn2wvjbpckyv-sensu-check-env/bin/sensu-client -L warn -c /nix/store/ly677hg5b7szz
root 1098 0.0 0.1 11564 7568 ? Ss Oct08 0:00 sshd: /nix/store/1m888byzaqaig6azrrfpmjdyhgfliaga-openssh-9.7p1/bin/sshd -D -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
root 176967 0.0 0.2 14380 9840 ? Ss 10:47 0:00 \_ sshd: ctheune [priv]
ctheune 176988 0.2 0.1 14540 5856 ? S 10:47 0:00 \_ sshd: ctheune@pts/0
ctheune 176992 0.0 0.1 230756 5968 pts/0 Ss 10:47 0:00 \_ -bash
root 176998 0.0 0.0 228796 3956 pts/0 S+ 10:47 0:00 \_ sudo -i
root 177001 0.0 0.0 228796 1604 pts/1 Ss 10:47 0:00 \_ sudo -i
root 177002 0.0 0.1 230892 6064 pts/1 S 10:47 0:00 \_ -bash
root 177041 0.0 0.1 232344 4264 pts/1 R+ 10:48 0:00 \_ ps auxf
root 1101 0.0 0.0 226928 1944 tty1 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login --noclear --keep-baud tty1 115200,38400,9600 linux
root 1102 0.0 0.0 226928 2192 ttyS0 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login ttyS0 --keep-baud vt220
_du4651+ 1105 0.0 2.2 2505204 90824 ? Ssl Oct08 1:15 /nix/store/ff5j2is3di7praysyv232wfvcq7hvkii-filebeat-oss-7.17.16/bin/filebeat -e -c /nix/store/xlb56lv0f3j03l3v34x5jfvq8wng18ww-filebeat-journal-services19.gocept.net.json -pat
mysql 2809 0.3 18.6 4784932 750856 ? Ssl Oct08 11:47 /nix/store/9iq211dy95nqn484nx5z5mv3c7pc2h27-percona-server_lts-8.0.36-28/bin/mysqld --defaults-extra-file=/nix/store/frvxmffp9fpgq06bx89rgczyn6k6i51y-my.cnf --user=mysql --data
root 176527 0.0 0.0 227904 3236 ? SNs 10:43 0:00 /nix/store/516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32/bin/bash /nix/store/s8g5ls9d611hjq5psyd15sqbpqgrlwck-unit-script-fc-agent-start/bin/fc-agent-start
root 176535 0.1 1.1 279068 46452 ? SN 10:43 0:00 \_ /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/gavi1rlv3ja79vl5hg3lgh07absa8yb9-python3.11-fc-agent-1.0/bin/.fc-manage-wrapped --enc-p
root 176536 3.5 1.8 635400 72368 ? DNl 10:43 0:09 \_ nix-build --no-build-output <nixpkgs/nixos> -A system -I https://hydra.flyingcircus.io/build/496886/download/1/nixexprs.tar.xz --out-link /run/fc-agent-built-system
ctheune 176972 0.1 0.2 20028 11856 ? Ss 10:47 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd --user
ctheune 176974 0.0 0.0 20368 3004 ? S 10:47 0:00 \_ (sd-pam)
[218305.88474928]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Oct08 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Oct08 0:00 \_ [pool_workqueue_release]
root 4 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-rcu_gp]
root 5 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-sync_wq]
root 6 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-slub_flushwq]
root 7 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-netns]
root 10 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/0:0H-kblockd]
root 13 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mm_percpu_wq]
root 14 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_kthread]
root 15 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_rude_kthread]
root 16 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_trace_kthread]
root 17 0.0 0.0 0 0 ? S Oct08 0:25 \_ [ksoftirqd/0]
root 18 0.0 0.0 0 0 ? I Oct08 1:12 \_ [rcu_preempt]
root 19 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_par_gp_kthread_worker/0]
root 20 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_gp_kthread_worker]
root 21 0.0 0.0 0 0 ? S Oct08 0:00 \_ [migration/0]
root 22 0.0 0.0 0 0 ? S Oct08 0:00 \_ [idle_inject/0]
root 23 0.0 0.0 0 0 ? S Oct08 0:00 \_ [cpuhp/0]
root 24 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kdevtmpfs]
root 25 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-inet_frag_wq]
root 26 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kauditd]
root 27 0.0 0.0 0 0 ? S Oct08 0:00 \_ [khungtaskd]
root 28 0.0 0.0 0 0 ? S Oct08 0:00 \_ [oom_reaper]
root 29 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-writeback]
root 30 0.0 0.0 0 0 ? S Oct08 0:02 \_ [kcompactd0]
root 31 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [ksmd]
root 32 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [khugepaged]
root 33 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kintegrityd]
root 34 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kblockd]
root 35 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-blkcg_punt_bio]
root 36 0.0 0.0 0 0 ? S Oct08 0:00 \_ [irq/9-acpi]
root 37 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md]
root 38 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md_bitmap]
root 39 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-devfreq_wq]
root 44 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kswapd0]
root 45 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kthrotld]
root 46 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mld]
root 47 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ipv6_addrconf]
root 54 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kstrp]
root 55 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/u5:0]
root 102 0.0 0.0 0 0 ? S Oct08 0:00 \_ [hwrng]
root 109 0.0 0.0 0 0 ? S Oct08 0:00 \_ [watchdogd]
root 149 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ata_sff]
root 150 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_0]
root 151 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_0]
root 152 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_1]
root 153 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_1]
root 184 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfsalloc]
root 185 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs_mru_cache]
root 186 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vda1]
root 187 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vda1]
root 188 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vda1]
root 189 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vda1]
root 190 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vda1]
root 191 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vda1]
root 192 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vda1]
root 193 0.0 0.0 0 0 ? S Oct08 0:20 \_ [xfsaild/vda1]
root 531 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 644 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vdc1]
root 645 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vdc1]
root 646 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vdc1]
root 647 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vdc1]
root 648 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vdc1]
root 649 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vdc1]
root 650 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vdc1]
root 651 0.0 0.0 0 0 ? S Oct08 0:05 \_ [xfsaild/vdc1]
root 723 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ttm]
root 1286 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-tls-strp]
root 2772 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 171717 0.0 0.0 0 0 ? I 09:03 0:00 \_ [kworker/u4:3-events_power_efficient]
root 174477 0.0 0.0 0 0 ? I 10:01 0:00 \_ [kworker/0:2-xfs-conv/vdc1]
root 174683 0.0 0.0 0 0 ? I 10:06 0:00 \_ [kworker/u4:2-events_unbound]
root 175378 0.0 0.0 0 0 ? I 10:20 0:00 \_ [kworker/u4:4-writeback]
root 176049 0.0 0.0 0 0 ? I 10:34 0:00 \_ [kworker/0:3-xfs-conv/vdc1]
root 176150 0.0 0.0 0 0 ? I< 10:35 0:00 \_ [kworker/0:1H-xfs-log/vda1]
root 176358 0.0 0.0 0 0 ? I 10:40 0:00 \_ [kworker/0:0-xfs-conv/vdc1]
root 176402 0.0 0.0 0 0 ? I 10:41 0:00 \_ [kworker/u4:0-events_power_efficient]
root 176544 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:1-writeback]
root 176545 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:5-writeback]
root 176546 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:6-events_power_efficient]
root 176549 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:1-xfs-conv/vdc1]
root 176550 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:4-xfs-conv/vdc1]
root 176551 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:5-xfs-conv/vdc1]
root 176552 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:6-xfs-conv/vdc1]
root 176553 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:7-xfs-conv/vdc1]
root 176554 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:8-xfs-conv/vdc1]
root 176555 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:9-xfs-conv/vdc1]
root 176556 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:10-xfs-conv/vdc1]
root 176557 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:11-xfs-conv/vdc1]
root 176558 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:12-kthrotld]
root 176559 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:13-xfs-conv/vdc1]
root 176560 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:14-xfs-conv/vdc1]
root 176561 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:15-xfs-conv/vdc1]
root 176562 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:16-xfs-conv/vdc1]
root 176563 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:17-xfs-conv/vdc1]
root 176564 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:18-xfs-conv/vdc1]
root 176565 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:19-xfs-conv/vdc1]
root 176566 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:20-xfs-conv/vdc1]
root 176567 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:21-xfs-conv/vdc1]
root 176568 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:22-xfs-conv/vdc1]
root 176569 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:23-xfs-conv/vdc1]
root 176570 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:24-xfs-conv/vdc1]
root 176571 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:25-xfs-conv/vdc1]
root 176572 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:26-xfs-conv/vdc1]
root 176573 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:27-xfs-conv/vdc1]
root 176574 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:28-xfs-conv/vdc1]
root 176575 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:29-xfs-conv/vdc1]
root 176576 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:30-xfs-conv/vdc1]
root 176577 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:31-xfs-conv/vdc1]
root 176578 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:32-xfs-conv/vdc1]
root 176579 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:33-xfs-conv/vdc1]
root 176580 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:34-xfs-conv/vdc1]
root 176581 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:35-xfs-conv/vdc1]
root 176582 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:36-xfs-conv/vdc1]
root 176583 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:37-xfs-conv/vdc1]
root 176584 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:38-xfs-conv/vdc1]
root 176585 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:39-xfs-conv/vdc1]
root 176586 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:40-xfs-conv/vdc1]
root 176587 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:41-xfs-buf/vdc1]
root 176588 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:42-xfs-conv/vdc1]
root 176589 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:43-xfs-conv/vdc1]
root 176590 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:44-xfs-conv/vdc1]
root 176591 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:45-xfs-conv/vdc1]
root 176592 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:46-xfs-conv/vdc1]
root 176593 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:47-xfs-conv/vdc1]
root 176594 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:48-xfs-conv/vdc1]
root 176595 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:49-xfs-conv/vdc1]
root 176596 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:50-xfs-conv/vdc1]
root 176597 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:51-xfs-conv/vdc1]
root 176598 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:52-xfs-conv/vdc1]
root 176599 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:53-xfs-conv/vdc1]
root 176600 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:54-xfs-conv/vdc1]
root 176601 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:55-xfs-conv/vdc1]
root 176602 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:56-xfs-conv/vdc1]
root 176603 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:57-xfs-conv/vdc1]
root 176604 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:58-xfs-conv/vdc1]
root 176605 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:59-xfs-conv/vdc1]
root 176606 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:60-xfs-conv/vdc1]
root 176607 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:61-xfs-conv/vdc1]
root 176608 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:62-xfs-conv/vdc1]
root 176609 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:63-xfs-conv/vdc1]
root 176610 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:64-xfs-conv/vdc1]
root 176611 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:65-xfs-conv/vdc1]
root 176612 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:66-xfs-conv/vdc1]
root 176613 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:67-xfs-conv/vdc1]
root 176614 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:68-xfs-conv/vdc1]
root 176615 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:69-xfs-conv/vdc1]
root 176616 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:70-xfs-conv/vdc1]
root 176617 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:71-xfs-conv/vdc1]
root 176618 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:72-xfs-conv/vdc1]
root 176619 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:73-xfs-conv/vdc1]
root 176620 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:74-xfs-conv/vdc1]
root 176621 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:75-xfs-conv/vdc1]
root 176622 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:76-xfs-conv/vdc1]
root 176623 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:77-xfs-conv/vdc1]
root 176624 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:78-xfs-conv/vdc1]
root 176625 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:79-xfs-conv/vdc1]
root 176626 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:80-xfs-conv/vdc1]
root 176627 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:81-xfs-conv/vdc1]
root 176628 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:82-xfs-conv/vdc1]
root 176629 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:83-xfs-conv/vdc1]
root 176630 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:84-xfs-conv/vdc1]
root 176631 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:85-xfs-conv/vdc1]
root 176632 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:86-xfs-conv/vdc1]
root 176633 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:87-xfs-conv/vdc1]
root 176634 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:88-xfs-conv/vdc1]
root 176635 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:89-xfs-conv/vdc1]
root 176636 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:90-xfs-conv/vdc1]
root 176637 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:91-xfs-conv/vdc1]
root 176638 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:92-xfs-conv/vdc1]
root 176639 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:93-xfs-conv/vdc1]
root 176640 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:94-xfs-conv/vdc1]
root 176641 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:95-xfs-conv/vdc1]
root 176642 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:96-xfs-conv/vdc1]
root 176643 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:97-xfs-conv/vdc1]
root 176644 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:98-xfs-conv/vdc1]
root 176645 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:99-xfs-conv/vdc1]
root 176646 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:100-xfs-conv/vdc1]
root 176647 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:101-xfs-conv/vdc1]
root 176648 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:102-xfs-conv/vdc1]
root 176649 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:103-xfs-conv/vdc1]
root 176650 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:104-xfs-conv/vdc1]
root 176651 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:105-xfs-conv/vdc1]
root 176652 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:106-xfs-conv/vdc1]
root 176653 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:107-xfs-conv/vdc1]
root 176654 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:108-xfs-conv/vdc1]
root 176655 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:109-xfs-conv/vdc1]
root 176656 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:110-xfs-conv/vdc1]
root 176657 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:111-xfs-conv/vdc1]
root 176658 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:112-xfs-conv/vdc1]
root 176659 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:113-xfs-conv/vdc1]
root 176660 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:114-xfs-conv/vdc1]
root 176661 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:115-xfs-conv/vdc1]
root 176662 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:116-xfs-conv/vdc1]
root 176663 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:117-xfs-conv/vdc1]
root 176664 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:118-xfs-conv/vdc1]
root 176665 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:119-xfs-conv/vdc1]
root 176666 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:120-xfs-conv/vdc1]
root 176667 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:121-xfs-conv/vdc1]
root 176668 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:122-xfs-conv/vdc1]
root 176669 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:123-xfs-conv/vdc1]
root 176670 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:124-xfs-conv/vdc1]
root 176671 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:125-xfs-conv/vdc1]
root 176672 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:126-xfs-conv/vdc1]
root 176673 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:127-xfs-conv/vdc1]
root 176674 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:128-xfs-conv/vdc1]
root 176675 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:129-xfs-conv/vdc1]
root 176676 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:130-xfs-conv/vdc1]
root 176677 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:131-xfs-conv/vdc1]
root 176678 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:132-xfs-conv/vdc1]
root 176679 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:133-xfs-conv/vdc1]
root 176680 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:134-xfs-conv/vdc1]
root 176681 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:135-xfs-conv/vdc1]
root 176682 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:136-xfs-conv/vdc1]
root 176683 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:137-xfs-conv/vdc1]
root 176684 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:138-xfs-conv/vdc1]
root 176685 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:139-xfs-conv/vdc1]
root 176686 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:140-xfs-conv/vdc1]
root 176687 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:141-xfs-conv/vdc1]
root 176688 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:142-xfs-conv/vdc1]
root 176689 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:143-xfs-conv/vdc1]
root 176690 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:144-xfs-conv/vdc1]
root 176691 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:145-xfs-conv/vdc1]
root 176692 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:146-xfs-conv/vdc1]
root 176693 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:147-xfs-conv/vdc1]
root 176694 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:148-xfs-conv/vdc1]
root 176695 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:149-xfs-conv/vdc1]
root 176696 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:150-xfs-conv/vdc1]
root 176697 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:151-xfs-conv/vdc1]
root 176698 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:152-xfs-conv/vdc1]
root 176699 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:153-xfs-conv/vdc1]
root 176700 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:154-xfs-conv/vdc1]
root 176701 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:155-xfs-conv/vdc1]
root 176702 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:156-xfs-conv/vdc1]
root 176703 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:157-xfs-conv/vdc1]
root 176704 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:158-xfs-buf/vda1]
root 176705 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:159-xfs-conv/vdc1]
root 176706 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:160-xfs-conv/vdc1]
root 176707 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:161-xfs-conv/vdc1]
root 176708 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:162-xfs-conv/vdc1]
root 176709 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:163-xfs-conv/vdc1]
root 176710 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:164-xfs-conv/vdc1]
root 176711 0.2 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:165-xfs-conv/vda1]
root 176712 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:166-xfs-conv/vdc1]
root 176713 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:167-xfs-conv/vdc1]
root 176714 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:168-xfs-conv/vdc1]
root 176715 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:169-xfs-conv/vdc1]
root 176716 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:170-xfs-conv/vdc1]
root 176717 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:171-xfs-conv/vdc1]
root 176718 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:172-xfs-conv/vdc1]
root 176719 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:173-xfs-conv/vdc1]
root 176720 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:174-xfs-conv/vdc1]
root 176721 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:175-xfs-conv/vdc1]
root 176722 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:176-xfs-conv/vdc1]
root 176723 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:177-xfs-conv/vdc1]
root 176724 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:178-xfs-conv/vdc1]
root 176725 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:179-xfs-conv/vdc1]
root 176726 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:180-xfs-conv/vdc1]
root 176727 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:181-xfs-conv/vdc1]
root 176728 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:182-xfs-conv/vdc1]
root 176729 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:183-xfs-conv/vdc1]
root 176730 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:184-xfs-conv/vdc1]
root 176731 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:185-xfs-conv/vdc1]
root 176732 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:186-xfs-conv/vdc1]
root 176733 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:187-xfs-conv/vdc1]
root 176734 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:188-xfs-conv/vdc1]
root 176735 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:189-xfs-conv/vdc1]
root 176736 0.3 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:190-cgroup_destroy]
root 176737 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:191-xfs-conv/vdc1]
root 176738 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:192-xfs-conv/vdc1]
root 176739 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:193-xfs-conv/vdc1]
root 176740 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:194-xfs-conv/vdc1]
root 176741 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:195-xfs-conv/vdc1]
root 176742 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:196-xfs-conv/vdc1]
root 176743 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:197-xfs-conv/vdc1]
root 176744 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:198-xfs-conv/vdc1]
root 176745 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:199-xfs-conv/vdc1]
root 176746 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:200-xfs-conv/vdc1]
root 176747 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:201-xfs-conv/vdc1]
root 176748 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:202-xfs-conv/vdc1]
root 176749 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:203-xfs-conv/vdc1]
root 176750 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:204-xfs-conv/vdc1]
root 176751 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:205-xfs-conv/vdc1]
root 176752 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:206-xfs-buf/vda1]
root 176753 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:207-xfs-conv/vdc1]
root 176754 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:208-xfs-conv/vdc1]
root 176755 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:209-xfs-conv/vdc1]
root 176756 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:210-xfs-conv/vdc1]
root 176757 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:211-xfs-conv/vdc1]
root 176758 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:212-xfs-conv/vdc1]
root 176759 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:213-xfs-conv/vdc1]
root 176760 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:214-xfs-conv/vdc1]
root 176761 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:215-xfs-conv/vdc1]
root 176762 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:216-xfs-conv/vdc1]
root 176763 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:217-xfs-conv/vdc1]
root 176764 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:218-xfs-conv/vdc1]
root 176765 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:219-xfs-conv/vdc1]
root 176766 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:220-xfs-conv/vdc1]
root 176767 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:221-xfs-conv/vdc1]
root 176768 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:222-xfs-conv/vdc1]
root 176769 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:223-xfs-conv/vdc1]
root 176770 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:224-xfs-conv/vdc1]
root 176771 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:225-xfs-conv/vdc1]
root 176772 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:226-xfs-conv/vdc1]
root 176773 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:227-xfs-conv/vdc1]
root 176774 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:228-xfs-conv/vdc1]
root 176775 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:229-xfs-conv/vdc1]
root 176776 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:230-xfs-conv/vdc1]
root 176777 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:231-xfs-conv/vdc1]
root 176778 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:232-xfs-conv/vdc1]
root 176779 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:233-xfs-conv/vdc1]
root 176780 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:234-xfs-conv/vdc1]
root 176781 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:235-xfs-conv/vdc1]
root 176782 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:236-xfs-conv/vdc1]
root 176783 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:237-xfs-conv/vdc1]
root 176784 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:238-xfs-conv/vdc1]
root 176785 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:239-xfs-conv/vdc1]
root 176786 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:240-xfs-conv/vdc1]
root 176787 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:241-xfs-conv/vdc1]
root 176788 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:242-xfs-conv/vdc1]
root 176789 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:243-xfs-conv/vdc1]
root 176790 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:244-xfs-conv/vdc1]
root 176791 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:245-xfs-conv/vdc1]
root 176792 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:246-xfs-conv/vdc1]
root 176793 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:247-xfs-conv/vdc1]
root 176794 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:248-xfs-conv/vdc1]
root 176795 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:249-xfs-conv/vdc1]
root 176796 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:250-xfs-conv/vdc1]
root 176797 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:251-xfs-conv/vdc1]
root 176798 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:252-xfs-conv/vdc1]
root 176799 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:253-xfs-conv/vdc1]
root 176800 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:254-xfs-conv/vdc1]
root 176801 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:255-xfs-conv/vdc1]
root 176802 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:256-xfs-buf/vda1]
root 176803 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:257-xfs-conv/vdc1]
root 176804 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:7-events_unbound]
root 176813 0.0 0.0 0 0 ? I< 10:44 0:00 \_ [kworker/0:2H-kblockd]
root 176814 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/u4:8-events_unbound]
root 176815 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/0:258]
root 1 0.0 0.3 21852 13056 ? Ss Oct08 0:19 /run/current-system/systemd/lib/systemd/systemd
root 399 0.0 1.8 139764 75096 ? Ss Oct08 0:13 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-journald
root 455 0.0 0.2 33848 8168 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-udevd
systemd+ 811 0.0 0.1 16800 6660 ? Ss Oct08 0:10 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-oomd
systemd+ 816 0.0 0.1 91380 7952 ? Ssl Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-timesyncd
root 837 0.0 0.0 80596 3288 ? Ssl Oct08 1:37 /nix/store/ag3xk1l8ij06vx434abk8643f8p7i08c-qemu-host-cpu-only-8.2.6-ga/bin/qemu-ga --statedir /run/qemu-ga
root 840 0.0 0.0 226896 1984 ? Ss Oct08 0:00 /nix/store/k34f0d079arcgfjsq78gpkdbd6l6nnq4-cron-4.1/bin/cron -n
message+ 850 0.0 0.1 13776 6080 ? Ss Oct08 0:05 /nix/store/0hm8vh65m378439kl16xv0p6l7c51asj-dbus-1.14.10/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 876 0.0 0.1 17468 7968 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-logind
nscd 1074 0.0 0.1 555748 6016 ? Ssl Oct08 0:28 /nix/store/zza9hvd6iawqdcxvinf4yxv580av3s9f-nsncd-unstable-2024-01-16/bin/nsncd
telegraf 1092 0.3 3.4 6344672 138484 ? S<Lsl Oct08 13:05 /nix/store/8bnbkyh26j97l0pw02gb7lngh4n6k3r5-telegraf-1.30.3/bin/telegraf -config /nix/store/nh4k7bx1asm0kn1klhbmg52wk1qdcwpw-config.toml -config-directory /nix/store/dj77wnb5j
root 1093 0.0 1.5 1109328 60864 ? Ssl Oct08 2:24 /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/fn9jcsr2kp2kq3m2qd6qrkv6xh7jcj5g-fail2ban-1.0.2/bin/.fail2ban-server-wrapped -xf start
sensucl+ 1094 0.0 0.9 898112 38340 ? Ssl Oct08 1:41 /nix/store/qqc6v89xn0g2w123wx85blkpc4pz2ags-ruby-2.7.8/bin/ruby /nix/store/dpvf0jdq1mbrdc90aapyrn2wvjbpckyv-sensu-check-env/bin/sensu-client -L warn -c /nix/store/ly677hg5b7szz
root 1098 0.0 0.1 11564 7568 ? Ss Oct08 0:00 sshd: /nix/store/1m888byzaqaig6azrrfpmjdyhgfliaga-openssh-9.7p1/bin/sshd -D -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
root 176967 0.0 0.2 14380 9840 ? Ss 10:47 0:00 \_ sshd: ctheune [priv]
ctheune 176988 0.2 0.1 14540 5856 ? S 10:47 0:00 \_ sshd: ctheune@pts/0
ctheune 176992 0.0 0.1 230756 5968 pts/0 Ss 10:47 0:00 \_ -bash
root 176998 0.0 0.0 228796 3956 pts/0 S+ 10:47 0:00 \_ sudo -i
root 177001 0.0 0.0 228796 1604 pts/1 Ss 10:47 0:00 \_ sudo -i
root 177002 0.0 0.1 230892 6064 pts/1 S 10:47 0:00 \_ -bash
root 177048 0.0 0.0 232344 3944 pts/1 R+ 10:48 0:00 \_ ps auxf
root 1101 0.0 0.0 226928 1944 tty1 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login --noclear --keep-baud tty1 115200,38400,9600 linux
root 1102 0.0 0.0 226928 2192 ttyS0 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login ttyS0 --keep-baud vt220
_du4651+ 1105 0.0 2.2 2505204 90824 ? Ssl Oct08 1:15 /nix/store/ff5j2is3di7praysyv232wfvcq7hvkii-filebeat-oss-7.17.16/bin/filebeat -e -c /nix/store/xlb56lv0f3j03l3v34x5jfvq8wng18ww-filebeat-journal-services19.gocept.net.json -pat
mysql 2809 0.3 18.6 4784932 750856 ? Ssl Oct08 11:47 /nix/store/9iq211dy95nqn484nx5z5mv3c7pc2h27-percona-server_lts-8.0.36-28/bin/mysqld --defaults-extra-file=/nix/store/frvxmffp9fpgq06bx89rgczyn6k6i51y-my.cnf --user=mysql --data
root 176527 0.0 0.0 227904 3236 ? SNs 10:43 0:00 /nix/store/516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32/bin/bash /nix/store/s8g5ls9d611hjq5psyd15sqbpqgrlwck-unit-script-fc-agent-start/bin/fc-agent-start
root 176535 0.1 1.1 279068 46452 ? SN 10:43 0:00 \_ /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/gavi1rlv3ja79vl5hg3lgh07absa8yb9-python3.11-fc-agent-1.0/bin/.fc-manage-wrapped --enc-p
root 176536 3.5 1.8 635400 72368 ? DNl 10:43 0:09 \_ nix-build --no-build-output <nixpkgs/nixos> -A system -I https://hydra.flyingcircus.io/build/496886/download/1/nixexprs.tar.xz --out-link /run/fc-agent-built-system
ctheune 176972 0.1 0.2 20028 11856 ? Ss 10:47 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd --user
ctheune 176974 0.0 0.0 20368 3004 ? S 10:47 0:00 \_ (sd-pam)
[218314.012140606]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Oct08 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Oct08 0:00 \_ [pool_workqueue_release]
root 4 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-rcu_gp]
root 5 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-sync_wq]
root 6 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-slub_flushwq]
root 7 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-netns]
root 10 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/0:0H-kblockd]
root 13 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mm_percpu_wq]
root 14 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_kthread]
root 15 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_rude_kthread]
root 16 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_trace_kthread]
root 17 0.0 0.0 0 0 ? S Oct08 0:25 \_ [ksoftirqd/0]
root 18 0.0 0.0 0 0 ? I Oct08 1:12 \_ [rcu_preempt]
root 19 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_par_gp_kthread_worker/0]
root 20 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_gp_kthread_worker]
root 21 0.0 0.0 0 0 ? S Oct08 0:00 \_ [migration/0]
root 22 0.0 0.0 0 0 ? S Oct08 0:00 \_ [idle_inject/0]
root 23 0.0 0.0 0 0 ? S Oct08 0:00 \_ [cpuhp/0]
root 24 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kdevtmpfs]
root 25 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-inet_frag_wq]
root 26 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kauditd]
root 27 0.0 0.0 0 0 ? S Oct08 0:00 \_ [khungtaskd]
root 28 0.0 0.0 0 0 ? S Oct08 0:00 \_ [oom_reaper]
root 29 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-writeback]
root 30 0.0 0.0 0 0 ? S Oct08 0:02 \_ [kcompactd0]
root 31 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [ksmd]
root 32 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [khugepaged]
root 33 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kintegrityd]
root 34 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kblockd]
root 35 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-blkcg_punt_bio]
root 36 0.0 0.0 0 0 ? S Oct08 0:00 \_ [irq/9-acpi]
root 37 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md]
root 38 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md_bitmap]
root 39 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-devfreq_wq]
root 44 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kswapd0]
root 45 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kthrotld]
root 46 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mld]
root 47 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ipv6_addrconf]
root 54 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kstrp]
root 55 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/u5:0]
root 102 0.0 0.0 0 0 ? S Oct08 0:00 \_ [hwrng]
root 109 0.0 0.0 0 0 ? S Oct08 0:00 \_ [watchdogd]
root 149 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ata_sff]
root 150 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_0]
root 151 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_0]
root 152 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_1]
root 153 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_1]
root 184 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfsalloc]
root 185 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs_mru_cache]
root 186 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vda1]
root 187 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vda1]
root 188 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vda1]
root 189 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vda1]
root 190 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vda1]
root 191 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vda1]
root 192 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vda1]
root 193 0.0 0.0 0 0 ? S Oct08 0:20 \_ [xfsaild/vda1]
root 531 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 644 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vdc1]
root 645 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vdc1]
root 646 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vdc1]
root 647 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vdc1]
root 648 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vdc1]
root 649 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vdc1]
root 650 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vdc1]
root 651 0.0 0.0 0 0 ? S Oct08 0:05 \_ [xfsaild/vdc1]
root 723 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ttm]
root 1286 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-tls-strp]
root 2772 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 171717 0.0 0.0 0 0 ? I 09:03 0:00 \_ [kworker/u4:3-writeback]
root 174477 0.0 0.0 0 0 ? I 10:01 0:00 \_ [kworker/0:2-xfs-conv/vdc1]
root 174683 0.0 0.0 0 0 ? I 10:06 0:00 \_ [kworker/u4:2-events_unbound]
root 175378 0.0 0.0 0 0 ? I 10:20 0:00 \_ [kworker/u4:4-events_power_efficient]
root 176049 0.0 0.0 0 0 ? I 10:34 0:00 \_ [kworker/0:3-xfs-conv/vdc1]
root 176150 0.0 0.0 0 0 ? I< 10:35 0:00 \_ [kworker/0:1H-xfs-log/vda1]
root 176358 0.0 0.0 0 0 ? I 10:40 0:00 \_ [kworker/0:0-xfs-conv/vdc1]
root 176402 0.0 0.0 0 0 ? I 10:41 0:00 \_ [kworker/u4:0-events_power_efficient]
root 176544 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:1-writeback]
root 176545 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:5-writeback]
root 176546 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:6-events_power_efficient]
root 176549 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:1-xfs-conv/vdc1]
root 176550 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:4-xfs-conv/vdc1]
root 176551 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:5-xfs-conv/vdc1]
root 176552 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:6-xfs-conv/vdc1]
root 176553 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:7-xfs-conv/vdc1]
root 176554 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:8-xfs-conv/vdc1]
root 176555 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:9-xfs-conv/vdc1]
root 176556 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:10-xfs-conv/vdc1]
root 176557 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:11-xfs-conv/vdc1]
root 176558 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:12-kthrotld]
root 176559 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:13-xfs-conv/vdc1]
root 176560 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:14-xfs-conv/vdc1]
root 176561 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:15-xfs-conv/vdc1]
root 176562 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:16-xfs-conv/vdc1]
root 176563 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:17-xfs-conv/vdc1]
root 176564 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:18-xfs-conv/vdc1]
root 176565 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:19-xfs-conv/vdc1]
root 176566 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:20-xfs-conv/vdc1]
root 176567 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:21-xfs-conv/vdc1]
root 176568 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:22-xfs-conv/vdc1]
root 176569 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:23-xfs-conv/vdc1]
root 176570 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:24-xfs-conv/vdc1]
root 176571 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:25-xfs-conv/vdc1]
root 176572 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:26-xfs-conv/vdc1]
root 176573 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:27-xfs-conv/vdc1]
root 176574 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:28-xfs-conv/vdc1]
root 176575 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:29-xfs-conv/vdc1]
root 176576 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:30-xfs-conv/vdc1]
root 176577 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:31-xfs-conv/vdc1]
root 176578 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:32-xfs-conv/vdc1]
root 176579 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:33-xfs-conv/vdc1]
root 176580 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:34-xfs-conv/vdc1]
root 176581 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:35-xfs-conv/vdc1]
root 176582 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:36-xfs-conv/vdc1]
root 176583 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:37-xfs-conv/vdc1]
root 176584 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:38-xfs-conv/vdc1]
root 176585 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:39-xfs-conv/vdc1]
root 176586 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:40-xfs-conv/vdc1]
root 176587 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:41-xfs-buf/vdc1]
root 176588 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:42-xfs-conv/vdc1]
root 176589 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:43-xfs-conv/vdc1]
root 176590 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:44-xfs-conv/vdc1]
root 176591 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:45-xfs-conv/vdc1]
root 176592 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:46-xfs-conv/vdc1]
root 176593 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:47-xfs-conv/vdc1]
root 176594 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:48-xfs-conv/vdc1]
root 176595 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:49-xfs-conv/vdc1]
root 176596 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:50-xfs-conv/vdc1]
root 176597 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:51-xfs-conv/vdc1]
root 176598 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:52-xfs-conv/vdc1]
root 176599 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:53-xfs-conv/vdc1]
root 176600 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:54-xfs-conv/vdc1]
root 176601 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:55-xfs-conv/vdc1]
root 176602 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:56-xfs-conv/vdc1]
root 176603 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:57-xfs-conv/vdc1]
root 176604 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:58-xfs-conv/vdc1]
root 176605 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:59-xfs-conv/vdc1]
root 176606 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:60-xfs-conv/vdc1]
root 176607 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:61-xfs-conv/vdc1]
root 176608 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:62-xfs-conv/vdc1]
root 176609 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:63-xfs-conv/vdc1]
root 176610 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:64-xfs-conv/vdc1]
root 176611 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:65-xfs-conv/vdc1]
root 176612 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:66-xfs-conv/vdc1]
root 176613 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:67-xfs-conv/vdc1]
root 176614 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:68-xfs-conv/vdc1]
root 176615 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:69-xfs-conv/vdc1]
root 176616 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:70-xfs-conv/vdc1]
root 176617 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:71-xfs-conv/vdc1]
root 176618 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:72-xfs-conv/vdc1]
root 176619 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:73-xfs-conv/vdc1]
root 176620 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:74-xfs-conv/vdc1]
root 176621 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:75-xfs-conv/vdc1]
root 176622 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:76-xfs-conv/vdc1]
root 176623 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:77-xfs-conv/vdc1]
root 176624 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:78-xfs-conv/vdc1]
root 176625 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:79-xfs-conv/vdc1]
root 176626 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:80-xfs-conv/vdc1]
root 176627 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:81-xfs-conv/vdc1]
root 176628 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:82-xfs-conv/vdc1]
root 176629 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:83-xfs-conv/vdc1]
root 176630 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:84-xfs-conv/vdc1]
root 176631 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:85-xfs-conv/vdc1]
root 176632 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:86-xfs-conv/vdc1]
root 176633 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:87-xfs-conv/vdc1]
root 176634 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:88-xfs-conv/vdc1]
root 176635 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:89-xfs-conv/vdc1]
root 176636 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:90-xfs-conv/vdc1]
root 176637 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:91-xfs-conv/vdc1]
root 176638 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:92-xfs-conv/vdc1]
root 176639 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:93-xfs-conv/vdc1]
root 176640 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:94-xfs-conv/vdc1]
root 176641 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:95-xfs-conv/vdc1]
root 176642 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:96-xfs-conv/vdc1]
root 176643 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:97-xfs-conv/vdc1]
root 176644 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:98-xfs-conv/vdc1]
root 176645 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:99-xfs-conv/vdc1]
root 176646 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:100-xfs-conv/vdc1]
root 176647 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:101-xfs-conv/vdc1]
root 176648 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:102-xfs-conv/vdc1]
root 176649 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:103-xfs-conv/vdc1]
root 176650 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:104-xfs-conv/vdc1]
root 176651 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:105-xfs-conv/vdc1]
root 176652 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:106-xfs-conv/vdc1]
root 176653 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:107-xfs-conv/vdc1]
root 176654 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:108-xfs-conv/vdc1]
root 176655 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:109-xfs-conv/vdc1]
root 176656 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:110-xfs-conv/vdc1]
root 176657 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:111-xfs-conv/vdc1]
root 176658 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:112-xfs-conv/vdc1]
root 176659 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:113-xfs-conv/vdc1]
root 176660 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:114-xfs-conv/vdc1]
root 176661 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:115-xfs-conv/vdc1]
root 176662 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:116-xfs-conv/vdc1]
root 176663 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:117-xfs-conv/vdc1]
root 176664 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:118-xfs-conv/vdc1]
root 176665 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:119-xfs-conv/vdc1]
root 176666 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:120-xfs-conv/vdc1]
root 176667 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:121-xfs-conv/vdc1]
root 176668 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:122-xfs-conv/vdc1]
root 176669 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:123-xfs-conv/vdc1]
root 176670 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:124-xfs-conv/vdc1]
root 176671 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:125-xfs-conv/vdc1]
root 176672 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:126-xfs-conv/vdc1]
root 176673 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:127-xfs-conv/vdc1]
root 176674 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:128-xfs-conv/vdc1]
root 176675 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:129-xfs-conv/vdc1]
root 176676 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:130-xfs-conv/vdc1]
root 176677 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:131-xfs-conv/vdc1]
root 176678 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:132-xfs-conv/vdc1]
root 176679 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:133-xfs-conv/vdc1]
root 176680 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:134-xfs-conv/vdc1]
root 176681 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:135-xfs-conv/vdc1]
root 176682 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:136-xfs-conv/vdc1]
root 176683 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:137-xfs-conv/vdc1]
root 176684 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:138-xfs-conv/vdc1]
root 176685 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:139-xfs-conv/vdc1]
root 176686 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:140-xfs-conv/vdc1]
root 176687 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:141-xfs-conv/vdc1]
root 176688 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:142-xfs-conv/vdc1]
root 176689 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:143-xfs-conv/vdc1]
root 176690 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:144-xfs-conv/vdc1]
root 176691 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:145-xfs-conv/vdc1]
root 176692 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:146-xfs-conv/vdc1]
root 176693 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:147-xfs-conv/vdc1]
root 176694 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:148-xfs-conv/vdc1]
root 176695 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:149-xfs-conv/vdc1]
root 176696 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:150-xfs-conv/vdc1]
root 176697 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:151-xfs-conv/vdc1]
root 176698 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:152-xfs-conv/vdc1]
root 176699 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:153-xfs-conv/vdc1]
root 176700 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:154-xfs-conv/vdc1]
root 176701 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:155-xfs-conv/vdc1]
root 176702 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:156-xfs-conv/vdc1]
root 176703 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:157-xfs-conv/vdc1]
root 176704 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:158-xfs-buf/vda1]
root 176705 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:159-xfs-conv/vdc1]
root 176706 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:160-xfs-conv/vdc1]
root 176707 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:161-xfs-conv/vdc1]
root 176708 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:162-xfs-conv/vdc1]
root 176709 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:163-xfs-conv/vdc1]
root 176710 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:164-xfs-conv/vdc1]
root 176711 0.2 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:165-xfs-conv/vda1]
root 176712 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:166-xfs-conv/vdc1]
root 176713 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:167-xfs-conv/vdc1]
root 176714 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:168-xfs-conv/vdc1]
root 176715 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:169-xfs-conv/vdc1]
root 176716 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:170-xfs-conv/vdc1]
root 176717 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:171-xfs-conv/vdc1]
root 176718 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:172-xfs-conv/vdc1]
root 176719 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:173-xfs-conv/vdc1]
root 176720 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:174-xfs-conv/vdc1]
root 176721 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:175-xfs-conv/vdc1]
root 176722 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:176-xfs-conv/vdc1]
root 176723 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:177-xfs-conv/vdc1]
root 176724 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:178-xfs-conv/vdc1]
root 176725 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:179-xfs-conv/vdc1]
root 176726 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:180-xfs-conv/vdc1]
root 176727 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:181-xfs-conv/vdc1]
root 176728 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:182-xfs-conv/vdc1]
root 176729 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:183-xfs-conv/vdc1]
root 176730 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:184-xfs-conv/vdc1]
root 176731 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:185-xfs-conv/vdc1]
root 176732 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:186-xfs-conv/vdc1]
root 176733 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:187-xfs-conv/vdc1]
root 176734 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:188-xfs-conv/vdc1]
root 176735 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:189-xfs-conv/vdc1]
root 176736 0.3 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:190-cgroup_destroy]
root 176737 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:191-xfs-conv/vdc1]
root 176738 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:192-xfs-conv/vdc1]
root 176739 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:193-xfs-conv/vdc1]
root 176740 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:194-xfs-conv/vdc1]
root 176741 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:195-xfs-conv/vdc1]
root 176742 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:196-xfs-conv/vdc1]
root 176743 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:197-xfs-conv/vdc1]
root 176744 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:198-xfs-conv/vdc1]
root 176745 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:199-xfs-conv/vdc1]
root 176746 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:200-xfs-conv/vdc1]
root 176747 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:201-xfs-conv/vdc1]
root 176748 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:202-xfs-conv/vdc1]
root 176749 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:203-xfs-conv/vdc1]
root 176750 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:204-xfs-conv/vdc1]
root 176751 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:205-xfs-conv/vdc1]
root 176752 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:206-xfs-buf/vda1]
root 176753 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:207-xfs-conv/vdc1]
root 176754 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:208-xfs-conv/vdc1]
root 176755 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:209-xfs-conv/vdc1]
root 176756 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:210-xfs-conv/vdc1]
root 176757 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:211-xfs-conv/vdc1]
root 176758 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:212-xfs-conv/vdc1]
root 176759 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:213-xfs-conv/vdc1]
root 176760 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:214-xfs-conv/vdc1]
root 176761 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:215-xfs-conv/vdc1]
root 176762 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:216-xfs-conv/vdc1]
root 176763 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:217-xfs-conv/vdc1]
root 176764 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:218-xfs-conv/vdc1]
root 176765 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:219-xfs-conv/vdc1]
root 176766 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:220-xfs-conv/vdc1]
root 176767 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:221-xfs-conv/vdc1]
root 176768 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:222-xfs-conv/vdc1]
root 176769 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:223-xfs-conv/vdc1]
root 176770 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:224-xfs-conv/vdc1]
root 176771 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:225-xfs-conv/vdc1]
root 176772 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:226-xfs-conv/vdc1]
root 176773 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:227-xfs-conv/vdc1]
root 176774 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:228-xfs-conv/vdc1]
root 176775 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:229-xfs-conv/vdc1]
root 176776 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:230-xfs-conv/vdc1]
root 176777 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:231-xfs-conv/vdc1]
root 176778 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:232-xfs-conv/vdc1]
root 176779 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:233-xfs-conv/vdc1]
root 176780 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:234-xfs-conv/vdc1]
root 176781 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:235-xfs-conv/vdc1]
root 176782 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:236-xfs-conv/vdc1]
root 176783 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:237-xfs-conv/vdc1]
root 176784 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:238-xfs-conv/vdc1]
root 176785 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:239-xfs-conv/vdc1]
root 176786 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:240-xfs-conv/vdc1]
root 176787 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:241-xfs-conv/vdc1]
root 176788 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:242-xfs-conv/vdc1]
root 176789 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:243-xfs-conv/vdc1]
root 176790 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:244-xfs-conv/vdc1]
root 176791 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:245-xfs-conv/vdc1]
root 176792 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:246-xfs-conv/vdc1]
root 176793 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:247-xfs-conv/vdc1]
root 176794 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:248-xfs-conv/vdc1]
root 176795 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:249-xfs-conv/vdc1]
root 176796 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:250-xfs-conv/vdc1]
root 176797 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:251-xfs-conv/vdc1]
root 176798 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:252-xfs-conv/vdc1]
root 176799 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:253-xfs-conv/vdc1]
root 176800 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:254-xfs-conv/vdc1]
root 176801 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:255-xfs-conv/vdc1]
root 176802 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:256-xfs-buf/vda1]
root 176803 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:257-xfs-conv/vdc1]
root 176804 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:7-events_unbound]
root 176813 0.0 0.0 0 0 ? I< 10:44 0:00 \_ [kworker/0:2H-kblockd]
root 176814 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/u4:8-events_unbound]
root 176815 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/0:258]
root 1 0.0 0.3 21852 13056 ? Ss Oct08 0:19 /run/current-system/systemd/lib/systemd/systemd
root 399 0.0 1.8 139764 75096 ? Ss Oct08 0:13 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-journald
root 455 0.0 0.2 33848 8168 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-udevd
systemd+ 811 0.0 0.1 16800 6660 ? Ss Oct08 0:10 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-oomd
systemd+ 816 0.0 0.1 91380 7952 ? Ssl Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-timesyncd
root 837 0.0 0.0 80596 3288 ? Ssl Oct08 1:37 /nix/store/ag3xk1l8ij06vx434abk8643f8p7i08c-qemu-host-cpu-only-8.2.6-ga/bin/qemu-ga --statedir /run/qemu-ga
root 840 0.0 0.0 226896 1984 ? Ss Oct08 0:00 /nix/store/k34f0d079arcgfjsq78gpkdbd6l6nnq4-cron-4.1/bin/cron -n
message+ 850 0.0 0.1 13776 6080 ? Ss Oct08 0:05 /nix/store/0hm8vh65m378439kl16xv0p6l7c51asj-dbus-1.14.10/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 876 0.0 0.1 17468 7968 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-logind
nscd 1074 0.0 0.1 555748 6016 ? Ssl Oct08 0:28 /nix/store/zza9hvd6iawqdcxvinf4yxv580av3s9f-nsncd-unstable-2024-01-16/bin/nsncd
telegraf 1092 0.3 3.4 6344672 138484 ? S<Lsl Oct08 13:05 /nix/store/8bnbkyh26j97l0pw02gb7lngh4n6k3r5-telegraf-1.30.3/bin/telegraf -config /nix/store/nh4k7bx1asm0kn1klhbmg52wk1qdcwpw-config.toml -config-directory /nix/store/dj77wnb5j
root 1093 0.0 1.5 1109328 60864 ? Ssl Oct08 2:24 /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/fn9jcsr2kp2kq3m2qd6qrkv6xh7jcj5g-fail2ban-1.0.2/bin/.fail2ban-server-wrapped -xf start
sensucl+ 1094 0.0 0.9 898112 38340 ? Ssl Oct08 1:41 /nix/store/qqc6v89xn0g2w123wx85blkpc4pz2ags-ruby-2.7.8/bin/ruby /nix/store/dpvf0jdq1mbrdc90aapyrn2wvjbpckyv-sensu-check-env/bin/sensu-client -L warn -c /nix/store/ly677hg5b7szz
root 1098 0.0 0.1 11564 7568 ? Ss Oct08 0:00 sshd: /nix/store/1m888byzaqaig6azrrfpmjdyhgfliaga-openssh-9.7p1/bin/sshd -D -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
root 176967 0.0 0.2 14380 9840 ? Ss 10:47 0:00 \_ sshd: ctheune [priv]
ctheune 176988 0.2 0.1 14540 5856 ? S 10:47 0:00 \_ sshd: ctheune@pts/0
ctheune 176992 0.0 0.1 230756 5968 pts/0 Ss 10:47 0:00 \_ -bash
root 176998 0.0 0.0 228796 3956 pts/0 S+ 10:47 0:00 \_ sudo -i
root 177001 0.0 0.0 228796 1604 pts/1 Ss 10:47 0:00 \_ sudo -i
root 177002 0.0 0.1 230892 6064 pts/1 S 10:47 0:00 \_ -bash
root 177061 0.0 0.1 232344 4048 pts/1 R+ 10:48 0:00 \_ ps auxf
root 1101 0.0 0.0 226928 1944 tty1 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login --noclear --keep-baud tty1 115200,38400,9600 linux
root 1102 0.0 0.0 226928 2192 ttyS0 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login ttyS0 --keep-baud vt220
_du4651+ 1105 0.0 2.2 2505204 90824 ? Ssl Oct08 1:15 /nix/store/ff5j2is3di7praysyv232wfvcq7hvkii-filebeat-oss-7.17.16/bin/filebeat -e -c /nix/store/xlb56lv0f3j03l3v34x5jfvq8wng18ww-filebeat-journal-services19.gocept.net.json -pat
mysql 2809 0.3 18.6 4784932 750856 ? Ssl Oct08 11:47 /nix/store/9iq211dy95nqn484nx5z5mv3c7pc2h27-percona-server_lts-8.0.36-28/bin/mysqld --defaults-extra-file=/nix/store/frvxmffp9fpgq06bx89rgczyn6k6i51y-my.cnf --user=mysql --data
root 176527 0.0 0.0 227904 3236 ? SNs 10:43 0:00 /nix/store/516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32/bin/bash /nix/store/s8g5ls9d611hjq5psyd15sqbpqgrlwck-unit-script-fc-agent-start/bin/fc-agent-start
root 176535 0.1 1.1 279068 46452 ? SN 10:43 0:00 \_ /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/gavi1rlv3ja79vl5hg3lgh07absa8yb9-python3.11-fc-agent-1.0/bin/.fc-manage-wrapped --enc-p
root 176536 3.3 1.8 635400 72368 ? DNl 10:43 0:09 \_ nix-build --no-build-output <nixpkgs/nixos> -A system -I https://hydra.flyingcircus.io/build/496886/download/1/nixexprs.tar.xz --out-link /run/fc-agent-built-system
ctheune 176972 0.1 0.2 20028 11856 ? Ss 10:47 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd --user
ctheune 176974 0.0 0.0 20368 3004 ? S 10:47 0:00 \_ (sd-pam)
[218321.967537846]
176536 nix-build D
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
stack summary
1 hit:
[<0>] folio_wait_bit_common+0x13f/0x340
[<0>] folio_wait_writeback+0x2b/0x80
[<0>] __filemap_fdatawait_range+0x80/0xe0
[<0>] filemap_write_and_wait_range+0x85/0xb0
[<0>] xfs_setattr_size+0xd9/0x3c0 [xfs]
[<0>] xfs_vn_setattr+0x81/0x150 [xfs]
[<0>] notify_change+0x2ed/0x4f0
[<0>] do_truncate+0x98/0xf0
[<0>] do_ftruncate+0xfe/0x160
[<0>] __x64_sys_ftruncate+0x3e/0x70
[<0>] do_syscall_64+0xb7/0x200
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
-----
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Oct08 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Oct08 0:00 \_ [pool_workqueue_release]
root 4 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-rcu_gp]
root 5 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-sync_wq]
root 6 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-slub_flushwq]
root 7 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-netns]
root 10 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/0:0H-kblockd]
root 13 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mm_percpu_wq]
root 14 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_kthread]
root 15 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_rude_kthread]
root 16 0.0 0.0 0 0 ? I Oct08 0:00 \_ [rcu_tasks_trace_kthread]
root 17 0.0 0.0 0 0 ? S Oct08 0:25 \_ [ksoftirqd/0]
root 18 0.0 0.0 0 0 ? I Oct08 1:12 \_ [rcu_preempt]
root 19 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_par_gp_kthread_worker/0]
root 20 0.0 0.0 0 0 ? S Oct08 0:00 \_ [rcu_exp_gp_kthread_worker]
root 21 0.0 0.0 0 0 ? S Oct08 0:00 \_ [migration/0]
root 22 0.0 0.0 0 0 ? S Oct08 0:00 \_ [idle_inject/0]
root 23 0.0 0.0 0 0 ? S Oct08 0:00 \_ [cpuhp/0]
root 24 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kdevtmpfs]
root 25 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-inet_frag_wq]
root 26 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kauditd]
root 27 0.0 0.0 0 0 ? S Oct08 0:00 \_ [khungtaskd]
root 28 0.0 0.0 0 0 ? S Oct08 0:00 \_ [oom_reaper]
root 29 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-writeback]
root 30 0.0 0.0 0 0 ? S Oct08 0:02 \_ [kcompactd0]
root 31 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [ksmd]
root 32 0.0 0.0 0 0 ? SN Oct08 0:00 \_ [khugepaged]
root 33 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kintegrityd]
root 34 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kblockd]
root 35 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-blkcg_punt_bio]
root 36 0.0 0.0 0 0 ? S Oct08 0:00 \_ [irq/9-acpi]
root 37 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md]
root 38 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-md_bitmap]
root 39 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-devfreq_wq]
root 44 0.0 0.0 0 0 ? S Oct08 0:00 \_ [kswapd0]
root 45 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kthrotld]
root 46 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-mld]
root 47 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ipv6_addrconf]
root 54 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-kstrp]
root 55 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/u5:0]
root 102 0.0 0.0 0 0 ? S Oct08 0:00 \_ [hwrng]
root 109 0.0 0.0 0 0 ? S Oct08 0:00 \_ [watchdogd]
root 149 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ata_sff]
root 150 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_0]
root 151 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_0]
root 152 0.0 0.0 0 0 ? S Oct08 0:00 \_ [scsi_eh_1]
root 153 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-scsi_tmf_1]
root 184 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfsalloc]
root 185 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs_mru_cache]
root 186 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vda1]
root 187 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vda1]
root 188 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vda1]
root 189 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vda1]
root 190 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vda1]
root 191 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vda1]
root 192 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vda1]
root 193 0.0 0.0 0 0 ? S Oct08 0:20 \_ [xfsaild/vda1]
root 531 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 644 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-buf/vdc1]
root 645 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-conv/vdc1]
root 646 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-reclaim/vdc1]
root 647 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-blockgc/vdc1]
root 648 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-inodegc/vdc1]
root 649 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-log/vdc1]
root 650 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-xfs-cil/vdc1]
root 651 0.0 0.0 0 0 ? S Oct08 0:05 \_ [xfsaild/vdc1]
root 723 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-ttm]
root 1286 0.0 0.0 0 0 ? I< Oct08 0:00 \_ [kworker/R-tls-strp]
root 2772 0.0 0.0 0 0 ? S Oct08 0:00 \_ [psimon]
root 171717 0.0 0.0 0 0 ? I 09:03 0:00 \_ [kworker/u4:3-events_power_efficient]
root 174477 0.0 0.0 0 0 ? I 10:01 0:00 \_ [kworker/0:2-xfs-conv/vdc1]
root 174683 0.0 0.0 0 0 ? I 10:06 0:00 \_ [kworker/u4:2-writeback]
root 175378 0.0 0.0 0 0 ? I 10:20 0:00 \_ [kworker/u4:4-events_unbound]
root 176049 0.0 0.0 0 0 ? I 10:34 0:00 \_ [kworker/0:3-xfs-conv/vdc1]
root 176150 0.0 0.0 0 0 ? I< 10:35 0:00 \_ [kworker/0:1H-xfs-log/vda1]
root 176358 0.0 0.0 0 0 ? I 10:40 0:00 \_ [kworker/0:0-xfs-conv/vdc1]
root 176402 0.0 0.0 0 0 ? I 10:41 0:00 \_ [kworker/u4:0-events_power_efficient]
root 176544 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:1-writeback]
root 176545 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:5-writeback]
root 176546 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:6-events_power_efficient]
root 176549 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:1-xfs-conv/vdc1]
root 176550 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:4-xfs-conv/vdc1]
root 176551 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:5-xfs-conv/vdc1]
root 176552 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:6-xfs-conv/vdc1]
root 176553 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:7-xfs-conv/vdc1]
root 176554 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:8-xfs-conv/vdc1]
root 176555 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:9-xfs-conv/vdc1]
root 176556 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:10-xfs-conv/vdc1]
root 176557 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:11-xfs-conv/vdc1]
root 176558 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:12-kthrotld]
root 176559 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:13-xfs-conv/vdc1]
root 176560 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:14-xfs-conv/vdc1]
root 176561 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:15-xfs-conv/vdc1]
root 176562 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:16-xfs-conv/vdc1]
root 176563 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:17-xfs-conv/vdc1]
root 176564 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:18-xfs-conv/vdc1]
root 176565 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:19-xfs-conv/vdc1]
root 176566 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:20-xfs-conv/vdc1]
root 176567 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:21-xfs-conv/vdc1]
root 176568 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:22-xfs-conv/vdc1]
root 176569 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:23-xfs-conv/vdc1]
root 176570 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:24-xfs-conv/vdc1]
root 176571 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:25-xfs-conv/vdc1]
root 176572 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:26-xfs-conv/vdc1]
root 176573 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:27-xfs-conv/vdc1]
root 176574 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:28-xfs-conv/vdc1]
root 176575 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:29-xfs-conv/vdc1]
root 176576 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:30-xfs-conv/vdc1]
root 176577 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:31-xfs-conv/vdc1]
root 176578 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:32-xfs-conv/vdc1]
root 176579 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:33-xfs-conv/vdc1]
root 176580 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:34-xfs-conv/vdc1]
root 176581 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:35-xfs-conv/vdc1]
root 176582 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:36-xfs-conv/vdc1]
root 176583 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:37-xfs-conv/vdc1]
root 176584 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:38-xfs-conv/vdc1]
root 176585 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:39-xfs-conv/vdc1]
root 176586 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:40-xfs-conv/vdc1]
root 176587 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:41-xfs-buf/vdc1]
root 176588 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:42-xfs-conv/vdc1]
root 176589 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:43-xfs-conv/vdc1]
root 176590 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:44-xfs-conv/vdc1]
root 176591 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:45-xfs-conv/vdc1]
root 176592 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:46-xfs-conv/vdc1]
root 176593 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:47-xfs-conv/vdc1]
root 176594 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:48-xfs-conv/vdc1]
root 176595 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:49-xfs-conv/vdc1]
root 176596 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:50-xfs-conv/vdc1]
root 176597 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:51-xfs-conv/vdc1]
root 176598 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:52-xfs-conv/vdc1]
root 176599 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:53-xfs-conv/vdc1]
root 176600 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:54-xfs-conv/vdc1]
root 176601 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:55-xfs-conv/vdc1]
root 176602 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:56-xfs-conv/vdc1]
root 176603 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:57-xfs-conv/vdc1]
root 176604 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:58-xfs-conv/vdc1]
root 176605 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:59-xfs-conv/vdc1]
root 176606 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:60-xfs-conv/vdc1]
root 176607 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:61-xfs-conv/vdc1]
root 176608 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:62-xfs-conv/vdc1]
root 176609 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:63-xfs-conv/vdc1]
root 176610 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:64-xfs-conv/vdc1]
root 176611 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:65-xfs-conv/vdc1]
root 176612 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:66-xfs-conv/vdc1]
root 176613 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:67-xfs-conv/vdc1]
root 176614 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:68-xfs-conv/vdc1]
root 176615 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:69-xfs-conv/vdc1]
root 176616 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:70-xfs-conv/vdc1]
root 176617 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:71-xfs-conv/vdc1]
root 176618 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:72-xfs-conv/vdc1]
root 176619 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:73-xfs-conv/vdc1]
root 176620 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:74-xfs-conv/vdc1]
root 176621 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:75-xfs-conv/vdc1]
root 176622 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:76-xfs-conv/vdc1]
root 176623 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:77-xfs-conv/vdc1]
root 176624 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:78-xfs-conv/vdc1]
root 176625 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:79-xfs-conv/vdc1]
root 176626 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:80-xfs-conv/vdc1]
root 176627 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:81-xfs-conv/vdc1]
root 176628 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:82-xfs-conv/vdc1]
root 176629 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:83-xfs-conv/vdc1]
root 176630 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:84-xfs-conv/vdc1]
root 176631 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:85-xfs-conv/vdc1]
root 176632 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:86-xfs-conv/vdc1]
root 176633 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:87-xfs-conv/vdc1]
root 176634 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:88-xfs-conv/vdc1]
root 176635 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:89-xfs-conv/vdc1]
root 176636 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:90-xfs-conv/vdc1]
root 176637 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:91-xfs-conv/vdc1]
root 176638 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:92-xfs-conv/vdc1]
root 176639 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:93-xfs-conv/vdc1]
root 176640 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:94-xfs-conv/vdc1]
root 176641 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:95-xfs-conv/vdc1]
root 176642 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:96-xfs-conv/vdc1]
root 176643 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:97-xfs-conv/vdc1]
root 176644 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:98-xfs-conv/vdc1]
root 176645 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:99-xfs-conv/vdc1]
root 176646 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:100-xfs-conv/vdc1]
root 176647 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:101-xfs-conv/vdc1]
root 176648 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:102-xfs-conv/vdc1]
root 176649 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:103-xfs-conv/vdc1]
root 176650 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:104-xfs-conv/vdc1]
root 176651 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:105-xfs-conv/vdc1]
root 176652 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:106-xfs-conv/vdc1]
root 176653 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:107-xfs-conv/vdc1]
root 176654 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:108-xfs-conv/vdc1]
root 176655 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:109-xfs-conv/vdc1]
root 176656 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:110-xfs-conv/vdc1]
root 176657 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:111-xfs-conv/vdc1]
root 176658 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:112-xfs-conv/vdc1]
root 176659 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:113-xfs-conv/vdc1]
root 176660 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:114-xfs-conv/vdc1]
root 176661 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:115-xfs-conv/vdc1]
root 176662 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:116-xfs-conv/vdc1]
root 176663 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:117-xfs-conv/vdc1]
root 176664 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:118-xfs-conv/vdc1]
root 176665 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:119-xfs-conv/vdc1]
root 176666 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:120-xfs-conv/vdc1]
root 176667 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:121-xfs-conv/vdc1]
root 176668 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:122-xfs-conv/vdc1]
root 176669 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:123-xfs-conv/vdc1]
root 176670 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:124-xfs-conv/vdc1]
root 176671 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:125-xfs-conv/vdc1]
root 176672 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:126-xfs-conv/vdc1]
root 176673 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:127-xfs-conv/vdc1]
root 176674 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:128-xfs-conv/vdc1]
root 176675 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:129-xfs-conv/vdc1]
root 176676 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:130-xfs-conv/vdc1]
root 176677 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:131-xfs-conv/vdc1]
root 176678 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:132-xfs-conv/vdc1]
root 176679 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:133-xfs-conv/vdc1]
root 176680 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:134-xfs-conv/vdc1]
root 176681 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:135-xfs-conv/vdc1]
root 176682 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:136-xfs-conv/vdc1]
root 176683 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:137-xfs-conv/vdc1]
root 176684 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:138-xfs-conv/vdc1]
root 176685 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:139-xfs-conv/vdc1]
root 176686 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:140-xfs-conv/vdc1]
root 176687 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:141-xfs-conv/vdc1]
root 176688 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:142-xfs-conv/vdc1]
root 176689 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:143-xfs-conv/vdc1]
root 176690 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:144-xfs-conv/vdc1]
root 176691 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:145-xfs-conv/vdc1]
root 176692 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:146-xfs-conv/vdc1]
root 176693 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:147-xfs-conv/vdc1]
root 176694 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:148-xfs-conv/vdc1]
root 176695 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:149-xfs-conv/vdc1]
root 176696 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:150-xfs-conv/vdc1]
root 176697 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:151-xfs-conv/vdc1]
root 176698 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:152-xfs-conv/vdc1]
root 176699 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:153-xfs-conv/vdc1]
root 176700 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:154-xfs-conv/vdc1]
root 176701 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:155-xfs-conv/vdc1]
root 176702 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:156-xfs-conv/vdc1]
root 176703 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:157-xfs-conv/vdc1]
root 176704 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:158-xfs-buf/vda1]
root 176705 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:159-xfs-conv/vdc1]
root 176706 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:160-xfs-conv/vdc1]
root 176707 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:161-xfs-conv/vdc1]
root 176708 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:162-xfs-conv/vdc1]
root 176709 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:163-xfs-conv/vdc1]
root 176710 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:164-xfs-conv/vdc1]
root 176711 0.2 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:165-xfs-conv/vda1]
root 176712 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:166-xfs-conv/vdc1]
root 176713 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:167-xfs-conv/vdc1]
root 176714 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:168-xfs-conv/vdc1]
root 176715 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:169-xfs-conv/vdc1]
root 176716 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:170-xfs-conv/vdc1]
root 176717 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:171-xfs-conv/vdc1]
root 176718 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:172-xfs-conv/vdc1]
root 176719 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:173-xfs-conv/vdc1]
root 176720 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:174-xfs-conv/vdc1]
root 176721 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:175-xfs-conv/vdc1]
root 176722 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:176-xfs-conv/vdc1]
root 176723 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:177-xfs-conv/vdc1]
root 176724 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:178-xfs-conv/vdc1]
root 176725 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:179-xfs-conv/vdc1]
root 176726 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:180-xfs-conv/vdc1]
root 176727 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:181-xfs-conv/vdc1]
root 176728 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:182-xfs-conv/vdc1]
root 176729 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:183-xfs-conv/vdc1]
root 176730 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:184-xfs-conv/vdc1]
root 176731 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:185-xfs-conv/vdc1]
root 176732 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:186-xfs-conv/vdc1]
root 176733 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:187-xfs-conv/vdc1]
root 176734 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:188-xfs-conv/vdc1]
root 176735 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:189-xfs-conv/vdc1]
root 176736 0.3 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:190-cgroup_destroy]
root 176737 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:191-xfs-conv/vdc1]
root 176738 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:192-xfs-conv/vdc1]
root 176739 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:193-xfs-conv/vdc1]
root 176740 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:194-xfs-conv/vdc1]
root 176741 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:195-xfs-conv/vdc1]
root 176742 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:196-xfs-conv/vdc1]
root 176743 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:197-xfs-conv/vdc1]
root 176744 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:198-xfs-conv/vdc1]
root 176745 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:199-xfs-conv/vdc1]
root 176746 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:200-xfs-conv/vdc1]
root 176747 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:201-xfs-conv/vdc1]
root 176748 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:202-xfs-conv/vdc1]
root 176749 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:203-xfs-conv/vdc1]
root 176750 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:204-xfs-conv/vdc1]
root 176751 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:205-xfs-conv/vdc1]
root 176752 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:206-xfs-buf/vda1]
root 176753 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:207-xfs-conv/vdc1]
root 176754 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:208-xfs-conv/vdc1]
root 176755 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:209-xfs-conv/vdc1]
root 176756 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:210-xfs-conv/vdc1]
root 176757 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:211-xfs-conv/vdc1]
root 176758 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:212-xfs-conv/vdc1]
root 176759 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:213-xfs-conv/vdc1]
root 176760 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:214-xfs-conv/vdc1]
root 176761 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:215-xfs-conv/vdc1]
root 176762 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:216-xfs-conv/vdc1]
root 176763 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:217-xfs-conv/vdc1]
root 176764 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:218-xfs-conv/vdc1]
root 176765 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:219-xfs-conv/vdc1]
root 176766 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:220-xfs-conv/vdc1]
root 176767 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:221-xfs-conv/vdc1]
root 176768 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:222-xfs-conv/vdc1]
root 176769 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:223-xfs-conv/vdc1]
root 176770 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:224-xfs-conv/vdc1]
root 176771 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:225-xfs-conv/vdc1]
root 176772 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:226-xfs-conv/vdc1]
root 176773 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:227-xfs-conv/vdc1]
root 176774 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:228-xfs-conv/vdc1]
root 176775 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:229-xfs-conv/vdc1]
root 176776 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:230-xfs-conv/vdc1]
root 176777 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:231-xfs-conv/vdc1]
root 176778 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:232-xfs-conv/vdc1]
root 176779 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:233-xfs-conv/vdc1]
root 176780 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:234-xfs-conv/vdc1]
root 176781 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:235-xfs-conv/vdc1]
root 176782 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:236-xfs-conv/vdc1]
root 176783 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:237-xfs-conv/vdc1]
root 176784 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:238-xfs-conv/vdc1]
root 176785 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:239-xfs-conv/vdc1]
root 176786 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:240-xfs-conv/vdc1]
root 176787 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:241-xfs-conv/vdc1]
root 176788 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:242-xfs-conv/vdc1]
root 176789 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:243-xfs-conv/vdc1]
root 176790 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:244-xfs-conv/vdc1]
root 176791 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:245-xfs-conv/vdc1]
root 176792 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:246-xfs-conv/vdc1]
root 176793 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:247-xfs-conv/vdc1]
root 176794 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:248-xfs-conv/vdc1]
root 176795 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:249-xfs-conv/vdc1]
root 176796 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:250-xfs-conv/vdc1]
root 176797 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:251-xfs-conv/vdc1]
root 176798 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:252-xfs-conv/vdc1]
root 176799 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:253-xfs-conv/vdc1]
root 176800 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:254-xfs-conv/vdc1]
root 176801 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:255-xfs-conv/vdc1]
root 176802 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:256-xfs-buf/vda1]
root 176803 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/0:257-xfs-conv/vdc1]
root 176804 0.0 0.0 0 0 ? I 10:43 0:00 \_ [kworker/u4:7-events_unbound]
root 176813 0.0 0.0 0 0 ? I< 10:44 0:00 \_ [kworker/0:2H-kblockd]
root 176814 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/u4:8-events_unbound]
root 176815 0.0 0.0 0 0 ? I 10:44 0:00 \_ [kworker/0:258]
root 1 0.0 0.3 21852 13056 ? Ss Oct08 0:19 /run/current-system/systemd/lib/systemd/systemd
root 399 0.0 1.8 139764 75096 ? Ss Oct08 0:13 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-journald
root 455 0.0 0.2 33848 8168 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-udevd
systemd+ 811 0.0 0.1 16800 6660 ? Ss Oct08 0:10 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-oomd
systemd+ 816 0.0 0.1 91380 7952 ? Ssl Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-timesyncd
root 837 0.0 0.0 80596 3288 ? Ssl Oct08 1:37 /nix/store/ag3xk1l8ij06vx434abk8643f8p7i08c-qemu-host-cpu-only-8.2.6-ga/bin/qemu-ga --statedir /run/qemu-ga
root 840 0.0 0.0 226896 1984 ? Ss Oct08 0:00 /nix/store/k34f0d079arcgfjsq78gpkdbd6l6nnq4-cron-4.1/bin/cron -n
message+ 850 0.0 0.1 13776 6080 ? Ss Oct08 0:05 /nix/store/0hm8vh65m378439kl16xv0p6l7c51asj-dbus-1.14.10/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root 876 0.0 0.1 17468 7968 ? Ss Oct08 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd-logind
nscd 1074 0.0 0.1 555748 6016 ? Ssl Oct08 0:28 /nix/store/zza9hvd6iawqdcxvinf4yxv580av3s9f-nsncd-unstable-2024-01-16/bin/nsncd
telegraf 1092 0.3 3.4 6344672 138484 ? S<Lsl Oct08 13:05 /nix/store/8bnbkyh26j97l0pw02gb7lngh4n6k3r5-telegraf-1.30.3/bin/telegraf -config /nix/store/nh4k7bx1asm0kn1klhbmg52wk1qdcwpw-config.toml -config-directory /nix/store/dj77wnb5j
root 1093 0.0 1.5 1109328 60864 ? Ssl Oct08 2:24 /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/fn9jcsr2kp2kq3m2qd6qrkv6xh7jcj5g-fail2ban-1.0.2/bin/.fail2ban-server-wrapped -xf start
sensucl+ 1094 0.0 0.9 898112 38340 ? Ssl Oct08 1:41 /nix/store/qqc6v89xn0g2w123wx85blkpc4pz2ags-ruby-2.7.8/bin/ruby /nix/store/dpvf0jdq1mbrdc90aapyrn2wvjbpckyv-sensu-check-env/bin/sensu-client -L warn -c /nix/store/ly677hg5b7szz
root 1098 0.0 0.1 11564 7568 ? Ss Oct08 0:00 sshd: /nix/store/1m888byzaqaig6azrrfpmjdyhgfliaga-openssh-9.7p1/bin/sshd -D -f /etc/ssh/sshd_config [listener] 0 of 10-100 startups
root 176967 0.0 0.2 14380 9840 ? Ss 10:47 0:00 \_ sshd: ctheune [priv]
ctheune 176988 0.2 0.1 14540 5856 ? S 10:47 0:00 \_ sshd: ctheune@pts/0
ctheune 176992 0.0 0.1 230756 5968 pts/0 Ss 10:47 0:00 \_ -bash
root 176998 0.0 0.0 228796 3956 pts/0 S+ 10:47 0:00 \_ sudo -i
root 177001 0.0 0.0 228796 1604 pts/1 Ss 10:47 0:00 \_ sudo -i
root 177002 0.0 0.1 230892 6064 pts/1 S 10:47 0:00 \_ -bash
root 177075 0.0 0.1 232344 4048 pts/1 R+ 10:48 0:00 \_ ps auxf
root 1101 0.0 0.0 226928 1944 tty1 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login --noclear --keep-baud tty1 115200,38400,9600 linux
root 1102 0.0 0.0 226928 2192 ttyS0 Ss+ Oct08 0:00 agetty --login-program /nix/store/gwihsgkd13xmk8vwfn2k1nkdi9bys42x-shadow-4.14.6/bin/login ttyS0 --keep-baud vt220
_du4651+ 1105 0.0 2.2 2505204 90952 ? Ssl Oct08 1:15 /nix/store/ff5j2is3di7praysyv232wfvcq7hvkii-filebeat-oss-7.17.16/bin/filebeat -e -c /nix/store/xlb56lv0f3j03l3v34x5jfvq8wng18ww-filebeat-journal-services19.gocept.net.json -pat
mysql 2809 0.3 18.6 4784932 750856 ? Ssl Oct08 11:47 /nix/store/9iq211dy95nqn484nx5z5mv3c7pc2h27-percona-server_lts-8.0.36-28/bin/mysqld --defaults-extra-file=/nix/store/frvxmffp9fpgq06bx89rgczyn6k6i51y-my.cnf --user=mysql --data
root 176527 0.0 0.0 227904 3236 ? SNs 10:43 0:00 /nix/store/516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32/bin/bash /nix/store/s8g5ls9d611hjq5psyd15sqbpqgrlwck-unit-script-fc-agent-start/bin/fc-agent-start
root 176535 0.0 1.1 279068 46452 ? SN 10:43 0:00 \_ /nix/store/h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9/bin/python3.11 /nix/store/gavi1rlv3ja79vl5hg3lgh07absa8yb9-python3.11-fc-agent-1.0/bin/.fc-manage-wrapped --enc-p
root 176536 3.2 1.8 635400 72368 ? DNl 10:43 0:09 \_ nix-build --no-build-output <nixpkgs/nixos> -A system -I https://hydra.flyingcircus.io/build/496886/download/1/nixexprs.tar.xz --out-link /run/fc-agent-built-system
ctheune 176972 0.0 0.2 20028 11856 ? Ss 10:47 0:00 /nix/store/nswmyag3qi9ars0mxw5lp8zm0wv5zxld-systemd-255.9/lib/systemd/systemd --user
ctheune 176974 0.0 0.0 20368 3004 ? S 10:47 0:00 \_ (sd-pam)
[218342.027043] systemd[1]: fc-agent.service: Deactivated successfully.
[218342.027658] systemd[1]: Finished Flying Circus Management Task.
[218342.028479] systemd[1]: fc-agent.service: Consumed 17.942s CPU time, received 28.8M IP traffic, sent 133.3K IP traffic.
[218331.821045432] (no further output from walker.py)
[-- Attachment #10: Type: text/plain, Size: 281 bytes --]
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-11 9:08 ` Christian Theune
@ 2024-10-11 13:06 ` Chris Mason
2024-10-11 13:50 ` Christian Theune
2024-10-12 17:01 ` Linus Torvalds
0 siblings, 2 replies; 81+ messages in thread
From: Chris Mason @ 2024-10-11 13:06 UTC (permalink / raw)
To: Christian Theune
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
On 10/11/24 5:08 AM, Christian Theune wrote:
>
>> On 11. Oct 2024, at 09:27, Christian Theune <ct@flyingcircus.io> wrote:
>>
>> I’m going to gather a few more instances during the day and will post them as a batch later.
>
> I’ve received 8 alerts in the last hours and managed to get detailed, repeated walker output from two of them:
>
> - FC-41287.log
> - FC-41289.log
These are really helpful.
If io throttling were the cause, the traces should also have a process
that's waiting to submit the IO, but that's not present here.
Another common pattern is hung tasks with a process stuck in the kernel
burning CPU, but holding a lock or being somehow responsible for waking
the hung task. Your process listings don't have that either.
One part I wanted to mention:
[820710.974122] Future hung task reports are suppressed, see sysctl
kernel.hung_task_warnings
By default you only get 10 or so hung task notifications per boot, and
after that they are suppressed. So for example, if you're watching a
count of hung task messages across a lot of machines and thinking that
things are pretty stable because you're not seeing hung task messages
anymore...the kernel might have just stopped complaining.
This isn't exactly new kernel behavior, but it can be a surprise.
Anyway, this leaves me with ~3 theories:
- Linus's starvation observation. It doesn't feel like there's enough
load to cause this, especially given us sitting in truncate, where it
should be pretty unlikely to have multiple procs banging on the page in
question.
- Willy's folio->mapping check idea. I _think_ this is also wrong, the
reference counts we have in the truncate path check folio->mapping
before returning, and we shouldn't be able to reuse the folio in a
different mapping while we have the reference held.
If this is the problem it would mean our original bug is slightly
unfixed. But the fact that you're not seeing other problems, and these
hung tasks do resolve should mean we're ok. We can add a printk or just
run a drgn script to check.
- It's actually taking the IO a long time to finish. We can poke at the
pending requests, how does the device look in the VM? (virtio, scsi etc).
-chris
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-11 13:06 ` Chris Mason
@ 2024-10-11 13:50 ` Christian Theune
2024-10-12 17:01 ` Linus Torvalds
1 sibling, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-10-11 13:50 UTC (permalink / raw)
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
Hi,
> On 11. Oct 2024, at 15:06, Chris Mason <clm@meta.com> wrote:
>
> - It's actually taking the IO a long time to finish. We can poke at the
> pending requests, how does the device look in the VM? (virtio, scsi etc).
I _think_ that’s not it. This is a Qemu w/ virtio-block + Ceph stack with 2x10G and fully SSD backed. The last 24 hours show operation latency at less than 0.016ms. Ceph’s slow request warning (30s limit) has not triggered in the last 24 hours.
Also, aside from a VM that was exhausting its Qemu io throttling for a minute (and stuck in completely different tracebacks) the only blocked task reports from the last 48 hours was this specific process.
I’d expect that we’d see a lot more reports about IO issues from multiple VMs and multiple loads at the same time when the storage misbehaves (we did experience those in the long long past in older Ceph versions and with spinning rust, so I’m pretty confident (at the moment) this isn’t a storage issue per se).
Incidentally this now reminds me of a different (maybe not?) issue that I’ve been trying to track down with mdraid/xfs:
https://marc.info/?l=linux-raid&m=172295385102939&w=2
This is only tested on an older kernel so far (5.15.138) and we ended up seeing IOPS stuck in the md device but not below it. However, MD isn’t involved here. I made the connection because the original traceback also shows it stuck in “wait_on_page_writeback”, but maybe that’s a red herring:
[Aug 6 09:35] INFO: task .backy-wrapped:2615 blocked for more than 122 seconds.
[ +0.008130] Not tainted 5.15.138 #1-NixOS
[ +0.005194] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.008895] task:.backy-wrapped state:D stack: 0 pid: 2615 ppid: 1 flags:0x00000002
[ +0.000005] Call Trace:
[ +0.000002] <TASK>
[ +0.000004] __schedule+0x373/0x1580
[ +0.000009] ? xlog_cil_commit+0x559/0x880 [xfs]
[ +0.000041] schedule+0x5b/0xe0
[ +0.000001] io_schedule+0x42/0x70
[ +0.000001] wait_on_page_bit_common+0x119/0x380
[ +0.000005] ? __page_cache_alloc+0x80/0x80
[ +0.000002] wait_on_page_writeback+0x22/0x70
[ +0.000001] truncate_inode_pages_range+0x26f/0x6d0
[ +0.000006] evict+0x15f/0x180
[ +0.000003] __dentry_kill+0xde/0x170
[ +0.000001] dput+0x15b/0x330
[ +0.000002] do_renameat2+0x34e/0x5b0
[ +0.000003] __x64_sys_rename+0x3f/0x50
[ +0.000002] do_syscall_64+0x3a/0x90
[ +0.000002] entry_SYSCALL_64_after_hwframe+0x62/0xcc
[ +0.000003] RIP: 0033:0x7fdd1885275b
[ +0.000002] RSP: 002b:00007ffde643ad18 EFLAGS: 00000246 ORIG_RAX: 0000000000000052
[ +0.000002] RAX: ffffffffffffffda RBX: 00007ffde643adb0 RCX: 00007fdd1885275b
[ +0.000001] RDX: 0000000000000000 RSI: 00007fdd09a3d3d0 RDI: 00007fdd098549d0
[ +0.000001] RBP: 00007ffde643ad60 R08: 00000000ffffffff R09: 0000000000000000
[ +0.000001] R10: 00007ffde643af90 R11: 0000000000000246 R12: 00000000ffffff9c
[ +0.000000] R13: 00000000ffffff9c R14: 000000000183cab0 R15: 00007fdd0b128810
[ +0.000001] </TASK>
[ +0.000011] INFO: task kworker/u64:0:2380262 blocked for more than 122 seconds.
[ +0.008309] Not tainted 5.15.138 #1-NixOS
[ +0.005190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ +0.008895] task:kworker/u64:0 state:D stack: 0 pid:2380262 ppid: 2 flags:0x00004000
[ +0.000004] Workqueue: kcryptd/253:4 kcryptd_crypt [dm_crypt]
[ +0.000006] Call Trace:
[ +0.000001] <TASK>
[ +0.000001] __schedule+0x373/0x1580
[ +0.000003] schedule+0x5b/0xe0
[ +0.000001] md_bitmap_startwrite+0x177/0x1e0
[ +0.000004] ? finish_wait+0x90/0x90
[ +0.000004] add_stripe_bio+0x449/0x770 [raid456]
[ +0.000005] raid5_make_request+0x1cf/0xbd0 [raid456]
[ +0.000003] ? kmem_cache_alloc_node_trace+0x391/0x3e0
[ +0.000004] ? linear_map+0x44/0x90 [dm_mod]
[ +0.000005] ? finish_wait+0x90/0x90
[ +0.000001] ? __blk_queue_split+0x516/0x580
[ +0.000003] md_handle_request+0x122/0x1b0
[ +0.000003] md_submit_bio+0x6e/0xb0
[ +0.000001] __submit_bio+0x18f/0x220
[ +0.000002] ? crypt_page_alloc+0x46/0x60 [dm_crypt]
[ +0.000002] submit_bio_noacct+0xbe/0x2d0
[ +0.000001] kcryptd_crypt+0x392/0x550 [dm_crypt]
[ +0.000002] process_one_work+0x1d6/0x360
[ +0.000003] worker_thread+0x4d/0x3b0
[ +0.000002] ? process_one_work+0x360/0x360
[ +0.000001] kthread+0x118/0x140
[ +0.000001] ? set_kthread_struct+0x50/0x50
[ +0.000001] ret_from_fork+0x22/0x30
[ +0.000004] </TASK>
…(more md kworker tasks pile up here)
Christian
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-11 13:06 ` Chris Mason
2024-10-11 13:50 ` Christian Theune
@ 2024-10-12 17:01 ` Linus Torvalds
2024-12-02 10:44 ` Christian Theune
1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2024-10-12 17:01 UTC (permalink / raw)
To: Chris Mason
Cc: Christian Theune, Dave Chinner, Matthew Wilcox, Jens Axboe,
linux-mm, linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao,
regressions, regressions
[-- Attachment #1: Type: text/plain, Size: 3012 bytes --]
On Fri, 11 Oct 2024 at 06:06, Chris Mason <clm@meta.com> wrote:
>
> - Linus's starvation observation. It doesn't feel like there's enough
> load to cause this, especially given us sitting in truncate, where it
> should be pretty unlikely to have multiple procs banging on the page in
> question.
Yeah, I think the starvation can only possibly happen in
fdatasync-like paths where it's waiting for existing writeback without
holding the page lock. And while Christian has had those backtraces
too, the truncate path is not one of them.
That said, just because I wanted to see how nasty it is, I looked into
changing the rules for folio_wake_bit().
Christian, just to clarify, this is not for you to test - this is
very experimental - but maybe Willy has comments on it.
Because it *might* be possible to do something like the attached,
where we do the page flags changes atomically but without any locks if
there are no waiters, but if there is a waiter on the page, we always
clear the page flag bit atomically under the waitqueue lock as we wake
up the waiter.
I changed the name (and the return value) of the
folio_xor_flags_has_waiters() function to just not have any
possibility of semantic mixup, but basically instead of doing the xor
atomically and unconditionally (and returning whether we had waiters),
it now does it conditionally only if we do *not* have waiters, and
returns true if successful.
And if there were waiters, it moves the flag clearing into the wakeup function.
That in turn means that the "while whiteback" loop can go back to be
just a non-looping "if writeback", and folio_wait_writeback() can't
get into any starvation with new writebacks always showing up.
The reason I say it *might* be possible to do something like this is
that it changes __folio_end_writeback() to no longer necessarily clear
the writeback bit under the XA lock. If there are waiters, we'll clear
it later (after releasing the lock) in the caller.
Willy? What do you think? Clearly this now makes PG_writeback not
synchronized with the PAGECACHE_TAG_WRITEBACK tag, but the reason I
think it might be ok is that the code that *sets* the PG_writeback bit
in __folio_start_writeback() only ever starts with a page that isn't
under writeback, and has a
VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
at the top of the function even outside the XA lock. So I don't think
these *need* to be synchronized under the XA lock, and I think the
folio flag wakeup atomicity might be more important than the XA
writeback tag vs folio writeback bit.
But I'm not going to really argue for this patch at all - I wanted to
look at how bad it was, I wrote it, I'm actually running it on my
machine now and it didn't *immediately* blow up in my face, so it
*may* work just fine.
The patch is fairly simple, and apart from the XA tagging issue is
seems very straightforward. I'm just not sure it's worth synchronizing
one part just to at the same time de-synchronize another..
Linus
[-- Attachment #2: 0001-Test-atomic-folio-bit-waiting.patch --]
[-- Type: text/x-patch, Size: 5519 bytes --]
From 9d4f0d60abc4dce5b7cfbad4576a2829832bb838 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sat, 12 Oct 2024 09:34:24 -0700
Subject: [PATCH] Test atomic folio bit waiting
---
include/linux/page-flags.h | 26 ++++++++++++++++----------
mm/filemap.c | 28 ++++++++++++++++++++++++++--
mm/page-writeback.c | 6 +++---
3 files changed, 45 insertions(+), 15 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 1b3a76710487..b30a73e1c2c7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -730,22 +730,28 @@ TESTPAGEFLAG_FALSE(Ksm, ksm)
u64 stable_page_flags(const struct page *page);
/**
- * folio_xor_flags_has_waiters - Change some folio flags.
+ * folio_xor_flags_no_waiters - Change folio flags if no waiters
* @folio: The folio.
- * @mask: Bits set in this word will be changed.
+ * @mask: Which flags to change.
*
- * This must only be used for flags which are changed with the folio
- * lock held. For example, it is unsafe to use for PG_dirty as that
- * can be set without the folio lock held. It can also only be used
- * on flags which are in the range 0-6 as some of the implementations
- * only affect those bits.
+ * This does the optimistic fast-case of changing page flag bits
+ * that has no waiters. Only flags in the first word can be modified,
+ * and the old value must be stable (typically this clears the
+ * locked or writeback bit or similar).
*
- * Return: Whether there are tasks waiting on the folio.
+ * Return: true if it succeeded
*/
-static inline bool folio_xor_flags_has_waiters(struct folio *folio,
+static inline bool folio_xor_flags_no_waiters(struct folio *folio,
unsigned long mask)
{
- return xor_unlock_is_negative_byte(mask, folio_flags(folio, 0));
+ const unsigned long waiter_mask = 1ul << PG_waiters;
+ unsigned long *flags = folio_flags(folio, 0);
+ unsigned long val = READ_ONCE(*flags);
+ do {
+ if (val & waiter_mask)
+ return false;
+ } while (!try_cmpxchg_release(flags, &val, val ^ mask));
+ return true;
}
/**
diff --git a/mm/filemap.c b/mm/filemap.c
index 664e607a71ea..5fbaf6cea964 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1164,6 +1164,14 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
return (flags & WQ_FLAG_EXCLUSIVE) != 0;
}
+/*
+ * Clear the folio bit and wake waiters atomically under
+ * the folio waitqueue lock.
+ *
+ * Note that the fast-path alternative to calling this is
+ * to atomically clear the bit and check that the PG_waiters
+ * bit was not set.
+ */
static void folio_wake_bit(struct folio *folio, int bit_nr)
{
wait_queue_head_t *q = folio_waitqueue(folio);
@@ -1175,6 +1183,7 @@ static void folio_wake_bit(struct folio *folio, int bit_nr)
key.page_match = 0;
spin_lock_irqsave(&q->lock, flags);
+ clear_bit_unlock(bit_nr, folio_flags(folio, 0));
__wake_up_locked_key(q, TASK_NORMAL, &key);
/*
@@ -1507,7 +1516,7 @@ void folio_unlock(struct folio *folio)
BUILD_BUG_ON(PG_waiters != 7);
BUILD_BUG_ON(PG_locked > 7);
VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
- if (folio_xor_flags_has_waiters(folio, 1 << PG_locked))
+ if (!folio_xor_flags_no_waiters(folio, 1 << PG_locked))
folio_wake_bit(folio, PG_locked);
}
EXPORT_SYMBOL(folio_unlock);
@@ -1535,10 +1544,25 @@ void folio_end_read(struct folio *folio, bool success)
VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio);
+ /*
+ * Try to clear 'locked' at the same time as setting 'uptodate'
+ *
+ * Note that if we have lock bit waiters and this fast-case fails,
+ * we'll have to clear the lock bit atomically under the folio wait
+ * queue lock, so then we'll set 'update' separately.
+ *
+ * Note that this is purely a "avoid multiple atomics in the
+ * common case" - while the locked bit needs to be cleared
+ * synchronously wrt waiters, the uptodate bit has no such
+ * requirements.
+ */
if (likely(success))
mask |= 1 << PG_uptodate;
- if (folio_xor_flags_has_waiters(folio, mask))
+ if (!folio_xor_flags_no_waiters(folio, mask)) {
+ if (success)
+ set_bit(PG_uptodate, folio_flags(folio, 0));
folio_wake_bit(folio, PG_locked);
+ }
}
EXPORT_SYMBOL(folio_end_read);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index fcd4c1439cb9..3277bc3ceff9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -3081,7 +3081,7 @@ bool __folio_end_writeback(struct folio *folio)
unsigned long flags;
xa_lock_irqsave(&mapping->i_pages, flags);
- ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback);
+ ret = !folio_xor_flags_no_waiters(folio, 1 << PG_writeback);
__xa_clear_mark(&mapping->i_pages, folio_index(folio),
PAGECACHE_TAG_WRITEBACK);
if (bdi->capabilities & BDI_CAP_WRITEBACK_ACCT) {
@@ -3099,7 +3099,7 @@ bool __folio_end_writeback(struct folio *folio)
xa_unlock_irqrestore(&mapping->i_pages, flags);
} else {
- ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback);
+ ret = !folio_xor_flags_no_waiters(folio, 1 << PG_writeback);
}
lruvec_stat_mod_folio(folio, NR_WRITEBACK, -nr);
@@ -3184,7 +3184,7 @@ EXPORT_SYMBOL(__folio_start_writeback);
*/
void folio_wait_writeback(struct folio *folio)
{
- while (folio_test_writeback(folio)) {
+ if (folio_test_writeback(folio)) {
trace_folio_wait_writeback(folio, folio_mapping(folio));
folio_wait_bit(folio, PG_writeback);
}
--
2.46.1.608.gc56f2c11c8
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
2024-10-12 17:01 ` Linus Torvalds
@ 2024-12-02 10:44 ` Christian Theune
0 siblings, 0 replies; 81+ messages in thread
From: Christian Theune @ 2024-12-02 10:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Chris Mason, Dave Chinner, Matthew Wilcox, Jens Axboe, linux-mm,
linux-xfs, linux-fsdevel, linux-kernel, Daniel Dao, regressions,
regressions
Hi,
waking this thread up again: we’ve been running the original fix on top of 6.11 for roughly 8 weeks now and have not had a single occurence of this. I’d be willing to call this as fixed.
@Linus: we didn’t specify an actual deadline, but I guess 8 week without any hit is good enough?
My plan would be to migrate our fleet to 6.6 now. AFAICT the relevant patch series is the one in
https://lore.kernel.org/all/20240415171857.19244-4-ryncsn@gmail.com/T/#u and was released in 6.6.54.
I’d like to revive the discussion on the second issue, though, as it ended with Linus’ last post
and I couldn’t find whether this may have been followed up elsewhere or still needs to be worked on?
Christian
> On 12. Oct 2024, at 19:01, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Fri, 11 Oct 2024 at 06:06, Chris Mason <clm@meta.com> wrote:
>>
>> - Linus's starvation observation. It doesn't feel like there's enough
>> load to cause this, especially given us sitting in truncate, where it
>> should be pretty unlikely to have multiple procs banging on the page in
>> question.
>
> Yeah, I think the starvation can only possibly happen in
> fdatasync-like paths where it's waiting for existing writeback without
> holding the page lock. And while Christian has had those backtraces
> too, the truncate path is not one of them.
>
> That said, just because I wanted to see how nasty it is, I looked into
> changing the rules for folio_wake_bit().
>
> Christian, just to clarify, this is not for you to test - this is
> very experimental - but maybe Willy has comments on it.
>
> Because it *might* be possible to do something like the attached,
> where we do the page flags changes atomically but without any locks if
> there are no waiters, but if there is a waiter on the page, we always
> clear the page flag bit atomically under the waitqueue lock as we wake
> up the waiter.
>
> I changed the name (and the return value) of the
> folio_xor_flags_has_waiters() function to just not have any
> possibility of semantic mixup, but basically instead of doing the xor
> atomically and unconditionally (and returning whether we had waiters),
> it now does it conditionally only if we do *not* have waiters, and
> returns true if successful.
>
> And if there were waiters, it moves the flag clearing into the wakeup function.
>
> That in turn means that the "while whiteback" loop can go back to be
> just a non-looping "if writeback", and folio_wait_writeback() can't
> get into any starvation with new writebacks always showing up.
>
> The reason I say it *might* be possible to do something like this is
> that it changes __folio_end_writeback() to no longer necessarily clear
> the writeback bit under the XA lock. If there are waiters, we'll clear
> it later (after releasing the lock) in the caller.
>
> Willy? What do you think? Clearly this now makes PG_writeback not
> synchronized with the PAGECACHE_TAG_WRITEBACK tag, but the reason I
> think it might be ok is that the code that *sets* the PG_writeback bit
> in __folio_start_writeback() only ever starts with a page that isn't
> under writeback, and has a
>
> VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
>
> at the top of the function even outside the XA lock. So I don't think
> these *need* to be synchronized under the XA lock, and I think the
> folio flag wakeup atomicity might be more important than the XA
> writeback tag vs folio writeback bit.
>
> But I'm not going to really argue for this patch at all - I wanted to
> look at how bad it was, I wrote it, I'm actually running it on my
> machine now and it didn't *immediately* blow up in my face, so it
> *may* work just fine.
>
> The patch is fairly simple, and apart from the XA tagging issue is
> seems very straightforward. I'm just not sure it's worth synchronizing
> one part just to at the same time de-synchronize another..
>
> Linus
> <0001-Test-atomic-folio-bit-waiting.patch>
Liebe Grüße,
Christian Theune
--
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
^ permalink raw reply [flat|nested] 81+ messages in thread
end of thread, other threads:[~2024-12-02 10:44 UTC | newest]
Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-12 21:18 Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards) Christian Theune
2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11 ` Christian Theune
2024-09-12 22:12 ` Jens Axboe
2024-09-12 22:25 ` Linus Torvalds
2024-09-12 22:30 ` Jens Axboe
2024-09-12 22:56 ` Linus Torvalds
2024-09-13 3:44 ` Matthew Wilcox
2024-09-13 13:23 ` Christian Theune
2024-09-13 12:11 ` Christian Brauner
2024-09-16 13:29 ` Matthew Wilcox
2024-09-18 9:51 ` Christian Brauner
2024-09-13 15:30 ` Chris Mason
2024-09-13 15:51 ` Matthew Wilcox
2024-09-13 16:33 ` Chris Mason
2024-09-13 18:15 ` Matthew Wilcox
2024-09-13 21:24 ` Linus Torvalds
2024-09-13 21:30 ` Matthew Wilcox
2024-09-13 16:04 ` David Howells
2024-09-13 16:37 ` Chris Mason
2024-09-16 0:00 ` Dave Chinner
2024-09-16 4:20 ` Linus Torvalds
2024-09-16 8:47 ` Chris Mason
2024-09-17 9:32 ` Matthew Wilcox
2024-09-17 9:36 ` Chris Mason
2024-09-17 10:11 ` Christian Theune
2024-09-17 11:13 ` Chris Mason
2024-09-17 13:25 ` Matthew Wilcox
2024-09-18 6:37 ` Jens Axboe
2024-09-18 9:28 ` Chris Mason
2024-09-18 12:23 ` Chris Mason
2024-09-18 13:34 ` Matthew Wilcox
2024-09-18 13:51 ` Linus Torvalds
2024-09-18 14:12 ` Matthew Wilcox
2024-09-18 14:39 ` Linus Torvalds
2024-09-18 17:12 ` Matthew Wilcox
2024-09-18 16:37 ` Chris Mason
2024-09-19 1:43 ` Dave Chinner
2024-09-19 3:03 ` Linus Torvalds
2024-09-19 3:12 ` Linus Torvalds
2024-09-19 3:38 ` Jens Axboe
2024-09-19 4:32 ` Linus Torvalds
2024-09-19 4:42 ` Jens Axboe
2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 4:46 ` Jens Axboe
2024-09-19 5:20 ` Jens Axboe
2024-09-19 4:46 ` Linus Torvalds
2024-09-20 13:54 ` Chris Mason
2024-09-24 15:58 ` Matthew Wilcox
2024-09-24 17:16 ` Sam James
2024-09-25 16:06 ` Kairui Song
2024-09-25 16:42 ` Christian Theune
2024-09-27 14:51 ` Sam James
2024-09-27 14:58 ` Jens Axboe
2024-10-01 21:10 ` Kairui Song
2024-09-24 19:17 ` Chris Mason
2024-09-24 19:24 ` Linus Torvalds
2024-09-19 6:34 ` Christian Theune
2024-09-19 6:57 ` Linus Torvalds
2024-09-19 10:19 ` Christian Theune
2024-09-30 17:34 ` Christian Theune
2024-09-30 18:46 ` Linus Torvalds
2024-09-30 19:25 ` Christian Theune
2024-09-30 20:12 ` Linus Torvalds
2024-09-30 20:56 ` Matthew Wilcox
2024-09-30 22:42 ` Davidlohr Bueso
2024-09-30 23:00 ` Davidlohr Bueso
2024-09-30 23:53 ` Linus Torvalds
2024-10-01 0:56 ` Chris Mason
2024-10-01 7:54 ` Christian Theune
2024-10-10 6:29 ` Christian Theune
2024-10-11 7:27 ` Christian Theune
2024-10-11 9:08 ` Christian Theune
2024-10-11 13:06 ` Chris Mason
2024-10-11 13:50 ` Christian Theune
2024-10-12 17:01 ` Linus Torvalds
2024-12-02 10:44 ` Christian Theune
2024-10-01 2:22 ` Dave Chinner
2024-09-16 7:14 ` Christian Theune
2024-09-16 12:16 ` Matthew Wilcox
2024-09-18 8:31 ` Christian Theune
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox