From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Yang Shi <shy828301@gmail.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Barry Song <21cnbao@gmail.com>, Lance Yang <ioworker0@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/rmap: do not add fully unmapped large folio to deferred split list
Date: Thu, 25 Apr 2024 17:15:07 +0200
Message-ID: <e1e3d368-4861-470f-b7ca-b222712adef5@redhat.com>
In-Reply-To: <8BBD1B75-135D-42AA-8937-53B259803AA7@nvidia.com>

On 25.04.24 16:53, Zi Yan wrote:
> On 25 Apr 2024, at 3:19, David Hildenbrand wrote:
>
>> On 25.04.24 00:46, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> In __folio_remove_rmap(), a large folio is added to deferred split list
>>> if any page in a folio loses its final mapping. It is possible that
>>> the folio is unmapped fully, but it is unnecessary to add the folio
>>> to deferred split list at all. Fix it by checking folio->_nr_pages_mapped
>>> before adding a folio to deferred split list. If the folio is already
>>> on the deferred split list, it will be skipped. This issue applies to
>>> both PTE-mapped THP and mTHP.
>>>
>>> Commit 98046944a159 ("mm: huge_memory: add the missing
>>> folio_test_pmd_mappable() for THP split statistics") tried to exclude
>>> mTHP deferred split stats from THP_DEFERRED_SPLIT_PAGE, but it does not
>>> fix the above issue. A fully unmapped PTE-mapped order-9 THP was still
>>
>> Once again: your patch won't fix it either.
>>
>>> added to deferred split list and counted as THP_DEFERRED_SPLIT_PAGE,
>>> since nr is 512 (non zero), level is RMAP_LEVEL_PTE, and inside
>>> deferred_split_folio() the order-9 folio is folio_test_pmd_mappable().
>>> However, this miscount was present even earlier due to implementation,
>>> since PTEs are unmapped individually and first PTE unmapping adds the THP
>>> into the deferred split list.
>>
>> It will still be present. Just less frequently.
>
> OK. Let me reread the email exchanges between you and Yang and clarify
> the details in the commit log.

Likely something like:

--
In __folio_remove_rmap(), a large folio is added to deferred split list
if any page in a folio loses its final mapping. But it is possible that
the folio is now fully unmapped and adding it to the deferred split list
is unnecessary.

For PMD-mapped THPs, that was not really an issue, because removing the
last PMD mapping in the absence of PTE mappings would not have added the
folio to the deferred split queue.

However, for PTE-mapped THPs, which are now more prominent due to mTHP,
we will always end up adding them to the deferred split queue.

One side effect of this is that we will frequently increase the
THP_DEFERRED_SPLIT_PAGE stat for PTE-mapped THP, making it look like we
frequently get many partially mapped folios -- although we are simply
unmapping the whole thing stepwise.

Core-mm will now try batch-unmapping consecutive PTEs of PTE-mapped THPs
where possible. If we're lucky, we unmap the whole thing in one go and
can avoid adding the folio to the deferred split queue, reducing the
THP_DEFERRED_SPLIT_PAGE noise.

But there will still be noise when we cannot batch-unmap a complete
PTE-mapped folio in one go -- or where this type of batching is not
implemented yet.
--
Feel free to reuse what you consider reasonable.
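
To make the "less frequently, but still present" point concrete, here is
a toy user-space model of the accounting described above. It is only a
sketch, not kernel code: struct toy_folio, toy_deferred_split(),
toy_remove_rmap_ptes(), toy_run() and the check_partial flag are all
made-up names for illustration, and the single counter merely stands in
for THP_DEFERRED_SPLIT_PAGE.

```c
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, none is a kernel API. */
struct toy_folio {
	int nr_pages_mapped;	/* pages that still have a mapping */
	bool on_deferred_list;	/* already queued for deferred split? */
};

/* Stands in for the THP_DEFERRED_SPLIT_PAGE counter. */
static int toy_deferred_split_events;

/* Queue the folio once; a folio already on the list is skipped. */
static void toy_deferred_split(struct toy_folio *folio)
{
	if (folio->on_deferred_list)
		return;
	folio->on_deferred_list = true;
	toy_deferred_split_events++;
}

/*
 * Unmap nr PTE-mapped pages of the folio in one call.
 * With check_partial set, a folio that just became fully unmapped
 * is not queued (the behaviour the patch aims for).
 */
static void toy_remove_rmap_ptes(struct toy_folio *folio, int nr,
				 bool check_partial)
{
	folio->nr_pages_mapped -= nr;
	if (check_partial && folio->nr_pages_mapped == 0)
		return;
	toy_deferred_split(folio);
}

/* Unmap a whole folio in chunks of batch pages; return the event count. */
int toy_run(int nr_pages, int batch, bool check_partial)
{
	struct toy_folio folio = {
		.nr_pages_mapped = nr_pages,
		.on_deferred_list = false,
	};

	toy_deferred_split_events = 0;
	while (folio.nr_pages_mapped > 0) {
		int nr = batch < folio.nr_pages_mapped ?
			 batch : folio.nr_pages_mapped;

		toy_remove_rmap_ptes(&folio, nr, check_partial);
	}
	return toy_deferred_split_events;
}
```

In this model, toy_run(512, 512, false) counts one event even though the
folio was never partially mapped from the caller's point of view, while
toy_run(512, 512, true) counts none. toy_run(512, 1, true) still counts
one event, because the first single-PTE unmap leaves the folio partially
mapped: that is the remaining noise when we cannot batch-unmap the whole
folio in one go.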
--
Cheers,
David / dhildenb