linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: linux-mm@kvack.org
Cc: Hugh Dickins <hughd@google.com>,
	Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	William Kucharski <william.kucharski@oracle.com>
Subject: Re: [PATCH RFC] shmem: don't trim whole folio loop boundaries on partial truncate
Date: Mon, 3 Nov 2025 15:11:26 -0500	[thread overview]
Message-ID: <aQkMbsqh_9cYGsvV@bfoster> (raw)
In-Reply-To: <20251030165121.1131002-1-bfoster@redhat.com>

On Thu, Oct 30, 2025 at 12:51:21PM -0400, Brian Foster wrote:
> An fstests fsx test run in a low memory cgroup environment on a
> huge=always tmpfs mount occasionally reproduces a livelock in
> shmem_undo_range() (via a truncate operation). The process ends up
> spinning indefinitely in the whole_folios loop.
> 
> The root cause of this appears to be that earlier in the function, a
> large folio is handled at the start boundary of the truncate. The
> truncate_inode_partial_folio() splits the large folio such that the
> currently referenced folio now ends before the range of whole folios
> to truncate in the second loop. The start index is pushed back and
> the second loop finds/splits some folios in this range.
> 
> Ultimately what appears to happen is that we settle in a sequence
> where a large and dirty folio sits at the updated start index, the
> truncate returns false and doesn't actually toss the folio because
> it's not fully within the logical offset boundaries, and hence the
> loop restarts indefinitely because the index had moved forward since
> the previous batch lookup.
> 
> To avoid this problem and simplify the code, remove the start/end
> boundary updates in the partial folio handling at range boundaries.
> My understanding is that there is potentially some overlap and
> duplicate work here, but that this is relatively harmless and safer
> than the alternative. Also since the folio size can change as a
> result of the partial truncate, update the same_folio value based on
> the post-truncate folio.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Hi all,
> 
> So I've been (finally) getting back to my tmpfs zeroing fixes and
> running some testing and hit this issue. The test is fsx running in a
> 10MB memory.max cgroup against a 4MB file on a tmpfs mount with
> huge=always. Note that I first tried to bisect this since I don't recall
> hitting it before and that landed on 69e0a3b49003 ("mm: shmem: fix the
> strategy for the tmpfs 'huge=' options"). I take this as a behavior
> change to something that precedes my initial testing and not a root
> cause, so I moved away from that in favor of throwing in some tracing to
> characterize behavior.
> 
> I'm sending this as an RFC because even though it seems to resolve the
> issue in my (limited so far) testing, I'm not familiar enough with all
> the complexities around large folio management and whatnot to be totally
> sure it's the right fix. For example, I'm not quite sure if the test
> constraints here are circumstantial or something more. FWIW, I started
> with a more simple fix to just prevent start from moving backwards. That
> prevents the issue as well, but this seemed a little more explicit on
> further thought.
> 

Self-NAK. After further testing I think I've reproduced a case where the
undo range races with swapout such that this change causes the
whole_folios loop to toss a full swap entry that extends outside the
range (where the existing code would have trimmed the range start),
which is wrong. I'll fall back to testing out the incremental fix
mentioned above and follow up when I have something more generally
sorted out...

Brian

> I notice the same boundary tweaking logic in
> truncate_inode_pages_range() (via the same commit [1]), though it looks
> like that path is not susceptible to a livelock as it will just toss
> folios. Again, I'm not totally sure if there are outside circumstances
> that make this less relevant there than for tmpfs, so this is at worst a
> starting point for discussion.. Thanks.
> 
> Brian
> 
> [1] b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
> 
>  mm/shmem.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index b9081b817d28..133a7d8213c5 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1133,13 +1133,9 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
>  	same_folio = (lstart >> PAGE_SHIFT) == (lend >> PAGE_SHIFT);
>  	folio = shmem_get_partial_folio(inode, lstart >> PAGE_SHIFT);
>  	if (folio) {
> -		same_folio = lend < folio_pos(folio) + folio_size(folio);
>  		folio_mark_dirty(folio);
> -		if (!truncate_inode_partial_folio(folio, lstart, lend)) {
> -			start = folio_next_index(folio);
> -			if (same_folio)
> -				end = folio->index;
> -		}
> +		truncate_inode_partial_folio(folio, lstart, lend);
> +		same_folio = lend < folio_pos(folio) + folio_size(folio);
>  		folio_unlock(folio);
>  		folio_put(folio);
>  		folio = NULL;
> @@ -1149,8 +1145,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
>  		folio = shmem_get_partial_folio(inode, lend >> PAGE_SHIFT);
>  	if (folio) {
>  		folio_mark_dirty(folio);
> -		if (!truncate_inode_partial_folio(folio, lstart, lend))
> -			end = folio->index;
> +		truncate_inode_partial_folio(folio, lstart, lend);
>  		folio_unlock(folio);
>  		folio_put(folio);
>  	}
> -- 
> 2.51.0
> 
> 



      reply	other threads:[~2025-11-03 20:07 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-30 16:51 Brian Foster
2025-11-03 20:11 ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQkMbsqh_9cYGsvV@bfoster \
    --to=bfoster@redhat.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=linux-mm@kvack.org \
    --cc=william.kucharski@oracle.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox