From: Lance Yang <ioworker0@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Linux-MM <linux-mm@kvack.org>,
Ryan Roberts <ryan.roberts@arm.com>,
David Hildenbrand <david@redhat.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: All MADV_FREE mTHPs are fully subjected to deferred_split_folio()
Date: Mon, 30 Dec 2024 10:14:04 +0800
Message-ID: <CAK1f24=aY3n72EgQR6MAXPVwabNMhbKzT=b2zVEG68MwHy1BCw@mail.gmail.com>
In-Reply-To: <CAGsJ_4wOL6TLa3FKQASdrGfuqqu=14EuxAtpKmnebiGLm0dnfA@mail.gmail.com>
Hi Barry,
On Mon, Dec 30, 2024 at 5:13 AM Barry Song <21cnbao@gmail.com> wrote:
>
> Hi Lance,
>
> Along with Ryan, David, Baolin, and anyone else who might be interested,
>
> We’ve noticed an unexpectedly high number of deferred splits. The root
> cause appears to be the changes introduced in commit dce7d10be4bbd3
> ("mm/madvise: optimize lazyfreeing with mTHP in madvise_free"). Since
> that commit, split_folio is no longer called in mm/madvise.c.
>
> However, we are still performing deferred_split_folio for all
> MADV_FREE mTHPs, even for those that are fully aligned with mTHP.
> This happens because we execute a goto discard in
> try_to_unmap_one(), which eventually leads to
> folio_remove_rmap_pte() adding all folios to deferred_split when we
> scan the 1st pte in try_to_unmap_one().
>
> discard:
>         if (unlikely(folio_test_hugetlb(folio)))
>                 hugetlb_remove_rmap(folio);
>         else
>                 folio_remove_rmap_pte(folio, subpage, vma);
>
> This could lead to a race condition with the shrinker,
> deferred_split_scan(). The shrinker might call folio_try_get(folio), and
> while we are scanning the second PTE of this folio in
> try_to_unmap_one(), the entire mTHP could be transitioned back to
> swap-backed because the reference count is incremented.
>
>         /*
>          * The only page refs must be one from isolation
>          * plus the rmap(s) (dropped by discard:).
>          */
>         if (ref_count == 1 + map_count &&
>             (!folio_test_dirty(folio) ||
>              ...
>              (vma->vm_flags & VM_DROPPABLE))) {
>                 dec_mm_counter(mm, MM_ANONPAGES);
>                 goto discard;
>         }
>
> It also significantly increases contention on
> ds_queue->split_queue_lock during memory reclamation and could
> potentially introduce other race conditions with the shrinker as well.
Good catch!
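To make sure I read the race the same way, here is a toy userspace model of
the check quoted above. It is only a sketch and not kernel code: struct
toy_folio and every toy_* helper are made-up names, and the elided part of
the dirty/VM_DROPPABLE clause is reduced to a plain "not dirty" test, which
is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the fields the check reads; not the kernel's struct folio. */
struct toy_folio {
	int ref_count;	/* all references currently held on the folio */
	int map_count;	/* PTEs still mapping the folio */
	bool dirty;
};

/* Same shape as the quoted condition, with the dirty clause simplified. */
static bool toy_can_discard(const struct toy_folio *f)
{
	return f->ref_count == 1 + f->map_count && !f->dirty;
}

/* Models the discard: path for one PTE: one rmap and its reference go away. */
static void toy_discard_one(struct toy_folio *f)
{
	f->map_count--;
	f->ref_count--;
}

/* Models a shrinker-side folio_try_get(): one extra transient reference. */
static void toy_shrinker_get(struct toy_folio *f)
{
	f->ref_count++;
}

/* Walks the reported sequence for a 16-page mTHP. */
static void toy_race_demo(void)
{
	/* One reference from isolation plus one per mapping PTE. */
	struct toy_folio f = { .ref_count = 1 + 16, .map_count = 16, .dirty = false };

	assert(toy_can_discard(&f));	/* first PTE passes the check */
	toy_discard_one(&f);

	toy_shrinker_get(&f);		/* shrinker pins the now-queued folio */
	assert(!toy_can_discard(&f));	/* second PTE fails: ref_count is off by one */
}
```

In this model the second PTE fails the check only because of the transient
reference, which matches the "transitioned back to swap-backed" outcome
described in the report.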
>
> I’m curious if anyone has suggestions for resolving this issue. My idea
> is to use folio_remove_rmap_ptes to drop all PTEs at once, rather than
> folio_remove_rmap_pte, which processes PTEs one by one for an mTHP. This
> approach would require some changes, such as checking the dirty state of
> PTEs and performing a TLB flush for the entire mTHP as a whole in
> try_to_unmap_one().
Yeah, IMHO, it would also be beneficial to reclaim entire mTHPs as a whole
in real-world scenarios where MADV_FREE mTHPs are typically no longer
written ;)
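As a rough illustration of why batching should help, here is a toy userspace
model comparing one-by-one and batched rmap removal. It is a sketch under
stated assumptions, not the kernel implementation: struct toy_mthp and the
toy_* helpers are hypothetical, and the only rule modelled is "a folio is
queued for deferred split the first time a removal leaves it partially
mapped".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an anonymous mTHP; not the kernel's struct folio. */
struct toy_mthp {
	int mapped;		/* PTEs still mapping the folio */
	bool queued;		/* on the toy deferred-split list */
	int queue_events;	/* times the toy split queue was touched */
};

/* Assumed rule: queue the folio once, when a removal leaves it partially mapped. */
static void toy_remove_rmap(struct toy_mthp *f, int nr)
{
	f->mapped -= nr;
	if (f->mapped > 0 && !f->queued) {
		f->queued = true;
		f->queue_events++;
	}
}

/* Per-PTE removal, in the style of folio_remove_rmap_pte(). */
static int toy_unmap_one_by_one(int nr_pages)
{
	struct toy_mthp f = { .mapped = nr_pages };

	while (f.mapped > 0)
		toy_remove_rmap(&f, 1);
	return f.queue_events;
}

/* Whole-folio removal, in the style of folio_remove_rmap_ptes(). */
static int toy_unmap_batched(int nr_pages)
{
	struct toy_mthp f = { .mapped = nr_pages };

	toy_remove_rmap(&f, nr_pages);
	return f.queue_events;
}

/* Compares both strategies for a 16-page mTHP. */
static void toy_batch_demo(void)
{
	assert(toy_unmap_one_by_one(16) == 1);	/* queued after the first PTE */
	assert(toy_unmap_batched(16) == 0);	/* never seen as partially mapped */
}
```

Under that assumption, the batched path never exposes a partially mapped
state, so a fully covered mTHP would not touch the split queue at all.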
>
> Please let me know if you have any objections or alternative suggestions.
Let's hear suggestions from other folks as well ~
Thanks,
Lance
>
> Thanks
> Barry