From: Dev Jain <dev.jain@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
kirill.shutemov@linux.intel.com
Cc: anshuman.khandual@arm.com, catalin.marinas@arm.com,
cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
haowenchao22@gmail.com, hughd@google.com,
aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
peterx@redhat.com, ioworker0@gmail.com,
wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
zhengqi.arch@bytedance.com, jhubbard@nvidia.com,
21cnbao@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 12/12] selftests/mm: khugepaged: Enlighten for mTHP collapse
Date: Wed, 18 Dec 2024 15:20:00 +0530 [thread overview]
Message-ID: <07b40938-029b-42ef-a273-46b504e67eda@arm.com> (raw)
In-Reply-To: <7098654a-776d-413b-8aca-28f811620df7@arm.com>
On 18/12/24 2:33 pm, Ryan Roberts wrote:
> On 16/12/2024 16:51, Dev Jain wrote:
>> One of the testcases triggers a CoW on the 255th page (0-indexing) with
>> max_ptes_shared = 256. This leads to 0-254 pages (255 in number) being unshared,
>> and 257 pages shared, exceeding the constraint. Suppose we run the test as
>> ./khugepaged -s 2. Therefore, khugepaged starts collapsing the range to order-2
>> folios, since PMD-collapse will fail due to the constraint.
>> When the scan reaches 254-257 PTE range, because at least one PTE in this range
>> is writable, with other 3 being read-only, khugepaged collapses this into an
>> order-2 mTHP, resulting in 3 extra PTEs getting unshared. After this, we encounter
>> a 4-sized chunk of read-only PTEs, and mTHP collapse stops according to the scaled
>> constraint, but the number of shared PTEs have now come under the constraint for
>> PMD-sized THPs. Therefore, the next scan of khugepaged will be able to collapse
>> this range into a PMD-mapped hugepage, leading to failure of this subtest. Fix
>> this by reducing the CoW range.
> Is this description essentially saying that it's now possible to creep towards
> collapsing to a full PMD-size block over successive scans due to rounding errors
> in the scaling? Or is this just trying an edge case and the problem doesn't
> generalize?
For this case, max_ptes_shared for order-2 is 256 >> (9 - 2) = 2, without rounding errors. We cannot
really get a rounding problem because we are rounding down, essentially either keep the restriction
same, or making it stricter, as we go down the orders.
But thinking again, this behaviour may generalize: essentially, let us say that the distribution
of none ptes vs filled ptes is very skewed for the PMD case. In a local region, this distribution
may not be skewed, and then an mTHP collapse will occur, making this entire region uniform. Over time this
may keep happening and then the region will become globally uniform to come under the PMD-constraint
on max_ptes_none, and eventually PMD-collapse will occur. Which may beg the question whether we want
to detach khugepaged orders from mTHP sysfs settings.
>
>> Note: The only objective of this patch is to make the test work for the PMD-case;
>> no extension has been made for testing for mTHPs.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> tools/testing/selftests/mm/khugepaged.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
>> index 8a4d34cce36b..143c4ad9f6a1 100644
>> --- a/tools/testing/selftests/mm/khugepaged.c
>> +++ b/tools/testing/selftests/mm/khugepaged.c
>> @@ -981,6 +981,7 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o
>> static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops *ops)
>> {
>> int max_ptes_shared = thp_read_num("khugepaged/max_ptes_shared");
>> + int fault_nr_pages = is_anon(ops) ? 1 << anon_order : 1;
>> int wstatus;
>> void *p;
>>
>> @@ -997,8 +998,8 @@ static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops
>> fail("Fail");
>>
>> printf("Trigger CoW on page %d of %d...",
>> - hpage_pmd_nr - max_ptes_shared - 1, hpage_pmd_nr);
>> - ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - 1) * page_size);
>> + hpage_pmd_nr - max_ptes_shared - fault_nr_pages, hpage_pmd_nr);
>> + ops->fault(p, 0, (hpage_pmd_nr - max_ptes_shared - fault_nr_pages) * page_size);
>> if (ops->check_huge(p, 0))
>> success("OK");
>> else
next prev parent reply other threads:[~2024-12-18 9:50 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-16 16:50 [RFC PATCH 00/12] khugepaged: Asynchronous " Dev Jain
2024-12-16 16:50 ` [RFC PATCH 01/12] khugepaged: Rename hpage_collapse_scan_pmd() -> ptes() Dev Jain
2024-12-17 4:18 ` Matthew Wilcox
2024-12-17 5:52 ` Dev Jain
2024-12-17 6:43 ` Ryan Roberts
2024-12-17 18:11 ` Zi Yan
2024-12-17 19:12 ` Ryan Roberts
2024-12-16 16:50 ` [RFC PATCH 02/12] khugepaged: Generalize alloc_charge_folio() Dev Jain
2024-12-17 2:51 ` Baolin Wang
2024-12-17 6:08 ` Dev Jain
2024-12-17 4:17 ` Matthew Wilcox
2024-12-17 7:09 ` Ryan Roberts
2024-12-17 13:00 ` Zi Yan
2024-12-20 17:41 ` Christoph Lameter (Ampere)
2024-12-20 17:45 ` Ryan Roberts
2024-12-20 18:47 ` Christoph Lameter (Ampere)
2025-01-02 11:21 ` Ryan Roberts
2024-12-17 6:53 ` Ryan Roberts
2024-12-17 9:06 ` Dev Jain
2024-12-16 16:50 ` [RFC PATCH 03/12] khugepaged: Generalize hugepage_vma_revalidate() Dev Jain
2024-12-17 4:21 ` Matthew Wilcox
2024-12-17 16:58 ` Ryan Roberts
2024-12-16 16:50 ` [RFC PATCH 04/12] khugepaged: Generalize __collapse_huge_page_swapin() Dev Jain
2024-12-17 4:24 ` Matthew Wilcox
2024-12-16 16:50 ` [RFC PATCH 05/12] khugepaged: Generalize __collapse_huge_page_isolate() Dev Jain
2024-12-17 4:32 ` Matthew Wilcox
2024-12-17 6:41 ` Dev Jain
2024-12-17 17:14 ` Ryan Roberts
2024-12-17 17:09 ` Ryan Roberts
2024-12-16 16:50 ` [RFC PATCH 06/12] khugepaged: Generalize __collapse_huge_page_copy_failed() Dev Jain
2024-12-17 17:22 ` Ryan Roberts
2024-12-18 8:49 ` Dev Jain
2024-12-16 16:51 ` [RFC PATCH 07/12] khugepaged: Scan PTEs order-wise Dev Jain
2024-12-17 18:15 ` Ryan Roberts
2024-12-18 9:24 ` Dev Jain
2025-01-06 10:04 ` Usama Arif
2025-01-07 7:17 ` Dev Jain
2024-12-16 16:51 ` [RFC PATCH 08/12] khugepaged: Abstract PMD-THP collapse Dev Jain
2024-12-17 19:24 ` Ryan Roberts
2024-12-18 9:26 ` Dev Jain
2024-12-16 16:51 ` [RFC PATCH 09/12] khugepaged: Introduce vma_collapse_anon_folio() Dev Jain
2024-12-16 17:06 ` David Hildenbrand
2024-12-16 19:08 ` Yang Shi
2024-12-17 10:07 ` Dev Jain
2024-12-17 10:32 ` David Hildenbrand
2024-12-18 8:35 ` Dev Jain
2025-01-02 10:08 ` Dev Jain
2025-01-02 11:33 ` David Hildenbrand
2025-01-03 8:17 ` Dev Jain
2025-01-02 11:22 ` David Hildenbrand
2024-12-18 15:59 ` Dev Jain
2025-01-06 10:17 ` Usama Arif
2025-01-07 8:12 ` Dev Jain
2024-12-16 16:51 ` [RFC PATCH 10/12] khugepaged: Skip PTE range if a larger mTHP is already mapped Dev Jain
2024-12-18 7:36 ` Ryan Roberts
2024-12-18 9:34 ` Dev Jain
2024-12-19 3:40 ` John Hubbard
2024-12-19 3:51 ` Zi Yan
2024-12-19 7:59 ` Dev Jain
2024-12-19 8:07 ` Dev Jain
2024-12-20 11:57 ` Ryan Roberts
2024-12-16 16:51 ` [RFC PATCH 11/12] khugepaged: Enable sysfs to control order of collapse Dev Jain
2024-12-16 16:51 ` [RFC PATCH 12/12] selftests/mm: khugepaged: Enlighten for mTHP collapse Dev Jain
2024-12-18 9:03 ` Ryan Roberts
2024-12-18 9:50 ` Dev Jain [this message]
2024-12-20 11:05 ` Ryan Roberts
2024-12-30 7:09 ` Dev Jain
2024-12-30 16:36 ` Zi Yan
2025-01-02 11:43 ` Ryan Roberts
2025-01-03 10:10 ` Dev Jain
2025-01-03 10:11 ` Dev Jain
2024-12-16 17:31 ` [RFC PATCH 00/12] khugepaged: Asynchronous " Dev Jain
2025-01-02 21:58 ` Nico Pache
2025-01-03 7:04 ` Dev Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=07b40938-029b-42ef-a273-46b504e67eda@arm.com \
--to=dev.jain@arm.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@kernel.org \
--cc=anshuman.khandual@arm.com \
--cc=apopple@nvidia.com \
--cc=baohua@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=cl@gentwo.org \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=haowenchao22@gmail.com \
--cc=hughd@google.com \
--cc=ioworker0@gmail.com \
--cc=jack@suse.cz \
--cc=jglisse@google.com \
--cc=jhubbard@nvidia.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=peterx@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=srivatsa@csail.mit.edu \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=vishal.moola@gmail.com \
--cc=wangkefeng.wang@huawei.com \
--cc=will@kernel.org \
--cc=willy@infradead.org \
--cc=yang@os.amperecomputing.com \
--cc=zhengqi.arch@bytedance.com \
--cc=ziy@nvidia.com \
--cc=zokeefe@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox