Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic into __wb_calc_thresh

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Jan Kara <jack@suse.cz>
To: Joshua Watt <jpewhacker@gmail.com>
Cc: jimzhao.ai@gmail.com, akpm@linux-foundation.org, jack@suse.cz,
	 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,  willy@infradead.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic into __wb_calc_thresh
Date: Wed, 8 Oct 2025 13:14:37 +0200	[thread overview]
Message-ID: <ywwhwyc4el6vikghnd5yoejteld6dudemta7lsrtacvecshst5@avvpac27felp> (raw)
In-Reply-To: <20251007161711.468149-1-JPEWhacker@gmail.com>

Hello!

On Tue 07-10-25 10:17:11, Joshua Watt wrote:
> From: Joshua Watt <jpewhacker@gmail.com>
> 
> This patch strangely breaks NFS 4 clients for me. The behavior is that a
> client will start getting an I/O error which in turn is caused by the client
> getting a NFS3ERR_BADSESSION when attempting to write data to the server. I
> bisected the kernel from the latest master
> (9029dc666353504ea7c1ebfdf09bc1aab40f6147) to this commit (log below). Also,
> when I revert this commit on master the bug disappears.
> 
> The server is running kernel 5.4.161, and the client that exhibits the
> behavior is running in qemux86, and has mounted the server with the options
> rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=52049,timeo=600,retrans=2,sec=null,clientaddr=172.16.6.90,local_lock=none,addr=172.16.6.0
> 
> The program that I wrote to reproduce this is pretty simple; it does a file
> lock over NFS, then writes data to the file once per second. After about 32
> seconds, it receives the I/O error, and this reproduced every time. I can
> provide the sample program if necessary.

This is indeed rather curious.

> I also captured the NFS traffic both in the passing case and the failure case,
> and can provide them if useful.
> 
> I did look at the two dumps and I'm not exactly sure what the difference is,
> other than with this patch the client tries to write every 30 seconds (and
> fails), where as without it attempts to write back every 5 seconds. I have no
> idea why this patch would cause this problem.

So the change in writeback behavior is not surprising. The commit does
modify the logic computing dirty limits in some corner cases and your
description matches the fact that previously the computed limits were lower
so we've started writeback after 5s (dirty_writeback_interval) while with
the patch we didn't cross the threshold and thus started writeback only
once the dirty data was old enough, which is 30s (dirty_expire_interval).

But that's all, you should be able to observe exactly the same writeback
behavior if you write less even without this patch. So I suspect that the
different writeback behavior is just triggering some bug in the NFS (either
on the client or the server side). The NFS3ERR_BADSESSION error you're
getting back sounds like something times out somewhere, falls out of cache
and reports this error (which doesn't happen if we writeback after 5s
instead of 30s). NFS guys maybe have better idea what's going on here.

You could possibly workaround this problem (and verify my theory) by tuning
/proc/sys/vm/dirty_expire_centisecs to a lower value (say 500). This will
make inode writeback start earlier and thus should effectively mask the
problem again.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

next prev parent reply	other threads:[~2025-10-08 11:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-21 10:05 Jim Zhao
2025-01-13 23:05 ` Guenter Roeck
2025-01-14 13:19   ` Jan Kara
2025-01-14 15:01     ` Guenter Roeck
2025-01-15 16:07       ` Jan Kara
2025-01-15 16:28         ` Jan Kara
2025-01-15 16:52           ` Guenter Roeck
2025-01-15 16:41         ` Guenter Roeck
2025-01-16 14:56           ` Jan Kara
2025-01-16 16:05             ` Guenter Roeck
2025-10-07 16:17 ` Joshua Watt
2025-10-08 11:14   ` Jan Kara [this message]
2025-10-08 14:49     ` Joshua Watt
2025-10-08 23:14       ` Joshua Watt
2025-10-09  8:38         ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ywwhwyc4el6vikghnd5yoejteld6dudemta7lsrtacvecshst5@avvpac27felp \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=jimzhao.ai@gmail.com \
    --cc=jpewhacker@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox