From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Dev Jain <dev.jain@arm.com>, akpm@linux-foundation.org, shuah@kernel.org
Cc: ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
linux-kernel@vger.kernel.org, ryan.roberts@arm.com,
anshuman.khandual@arm.com,
Sarthak Sharma <sarthak.sharma@arm.com>
Subject: Re: [PATCH] selftests/mm: Simplify byte pattern checking in mremap_test
Date: Tue, 14 Apr 2026 10:01:57 +0200 [thread overview]
Message-ID: <134c372e-5c9e-493d-b954-d9954546beaf@kernel.org> (raw)
In-Reply-To: <8b5544eb-5ec0-4c85-a2da-7a454fa606dc@arm.com>
On 4/14/26 07:09, Dev Jain wrote:
>
>
> On 14/04/26 12:57 am, David Hildenbrand (Arm) wrote:
>> On 4/10/26 16:30, Dev Jain wrote:
>>> The original version of mremap_test (7df666253f26: "kselftests: vm: add
>>> mremap tests") validated remapped contents byte-by-byte and printed a
>>> mismatch index in case the bytes streams are not equal. That made
>>> validation expensive in both cases: for "no mismatch" (the common case when
>>> mremap is not buggy), it still walked all bytes in C; for "mismatch", it
>>> broke out of the loop after printing the mismatch index.
>>>
>>> Later, my commit 7033c6cc9620 ("selftests/mm: mremap_test: optimize
>>> execution time from minutes to seconds using chunkwise memcmp") tried to
>>> optimize both cases by using chunk-wise memcmp() and only scanning bytes
>>> within a range which has been determined by memcmp as mismatching.
>>>
>>> But get_sqrt() in that commit is buggy: `high = mid - 1` is applied
>>> unconditionally. This makes the speed of checking the mismatch index
>>> suboptimal.
>>
>> So is that the only problem with 7033c6cc9620: the speed?
>
> Yes.
>
> I'll explain the algorithm in 7033c6cc9620.
>
> The problem statement is: given two buffers of equal length n, find the
> first mismatch index.
>
> Algorithm: Divide the buffers into sqrt(n) chunks. Do a memcmp() over
> each chunk. If all of them succeed, the buffers are equal, giving the
> result in O(sqrt(n)) * t, where t = time taken by memcmp().
>
> Otherwise, worst case is that we find the mismatch in the last chunk.
> Now brute-force iterate this chunk to find the mismatch. Since chunk
> size is sqrt(n), complexity is again
> sqrt(n) * t + sqrt(n) = O(sqrt(n)) * t.
>
> So if get_sqrt() computes a wrong square root, we lose this time
> complexity.
Ah, thanks for clarifying.
>
> Maybe there is an optimal value of x = #number of chunks of the buffer,
> which may not be sqrt(n).
>
> But given the information we have, a CS course on algorithms will
> say this is one of the optimal ways to do it.
>
>>
>>>
>>> The mismatch index does not provide useful debugging value here: if
>>> validation fails, we know mremap behavior is wrong, and the specific byte
>>> offset does not make root-causing easier.
>>
>> Fully agreed.
>>
>>>
>>> So instead of fixing get_sqrt(), bite the bullet, drop mismatch index
>>> scanning and just compare the two byte streams with memcmp().
>>
>> How does this affect the execution time of the test?
>
> I just checked with ./mremap_test -t 0, the variance is very high on my
> system.
>
> In the common case of the test passing:
>
> before patch, there are multiple sub-length calls to memcmp.
> after patch, there is a single full-length call to memcmp.
>
> So the time should reduce but may not be very distinguishable.
Okay, so doesn't matter. I agree that we should simplify all that.
The exact index is irrelevant. Whoever wants to debug the test failure
could modify the test to find that out. It's one of the tests we don't
really expect to fail (often).
>
>>
>>>
>>> Reported-by: Sarthak Sharma <sarthak.sharma@arm.com>
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>
>> Fixes: 7033c6cc9620 ("selftests/mm: mremap_test: optimize execution time
>> from minutes to seconds using chunkwise memcmp")
>>
>> ?
>
> Not needed. 7033c6cc9620 does not create any incorrectness in the checking
> of mismatch index.
Yes, agreed.
I would suggest to rewrite/simplify/clarify the patch description, not
talking about "buggy" etc, focusing on the simplification.
"
The original version of mremap_test (7df666253f26: "kselftests: vm: add
mremap tests") validated remapped contents byte-by-byte and printed a
mismatch index in case the bytes streams didn't match. That was rather
inefficient, especially also if the test passed.
Later, commit 7033c6cc9620 ("selftests/mm: mremap_test: optimize
execution time from minutes to seconds using chunkwise memcmp") used
memcmp() on bigger chunks, to fallback to byte-wise scanning to detect
the problematic index only if it discovered a problem.
However, the implementation is overly complicated (e.g., get_sqrt() is
currently not optimal) and we don't really have to report the exact
index: whoever debugs the failing test can figure that out.
Let's simplify by just comparing both byte streams with memcmp() and not
detecting the exact failed index.
"
--
Cheers,
David
next prev parent reply other threads:[~2026-04-14 8:02 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-10 14:30 Dev Jain
2026-04-13 19:27 ` David Hildenbrand (Arm)
2026-04-14 5:09 ` Dev Jain
2026-04-14 7:31 ` Ryan Roberts
2026-04-14 11:57 ` Dev Jain
2026-04-14 8:01 ` David Hildenbrand (Arm) [this message]
2026-04-14 9:47 ` David Laight
2026-04-14 9:53 ` David Hildenbrand (Arm)
2026-04-14 11:53 ` Dev Jain
2026-04-14 11:38 ` Sarthak Sharma
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=134c372e-5c9e-493d-b954-d9954546beaf@kernel.org \
--to=david@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=dev.jain@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mhocko@suse.com \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=sarthak.sharma@arm.com \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox