linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: David Laight <david.laight.linux@gmail.com>
Cc: Dev Jain <dev.jain@arm.com>,
	akpm@linux-foundation.org, shuah@kernel.org, ljs@kernel.org,
	Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	ryan.roberts@arm.com, anshuman.khandual@arm.com,
	Sarthak Sharma <sarthak.sharma@arm.com>
Subject: Re: [PATCH] selftests/mm: Simplify byte pattern checking in mremap_test
Date: Tue, 14 Apr 2026 11:53:06 +0200	[thread overview]
Message-ID: <c44d5afc-2936-486c-9806-01c942a73f70@kernel.org> (raw)
In-Reply-To: <20260414104712.2b634741@pumpkin>

On 4/14/26 11:47, David Laight wrote:
> On Tue, 14 Apr 2026 10:01:57 +0200
> "David Hildenbrand (Arm)" <david@kernel.org> wrote:
> 
>> On 4/14/26 07:09, Dev Jain wrote:
>>>
>>>
>>>
>>> Yes.
>>>
>>> I'll explain the algorithm in 7033c6cc9620.
>>>
>>> The problem statement is: given two buffers of equal length n, find the
>>> first mismatch index.
>>>
>>> Algorithm: Divide the buffers into sqrt(n) chunks. Do a memcmp() over
>>> each chunk. If all of them succeed, the buffers are equal, giving the
>>> result in O(sqrt(n)) * t, where t = time taken by memcmp().
>>>
>>> Otherwise, worst case is that we find the mismatch in the last chunk.
>>> Now brute-force iterate this chunk to find the mismatch. Since chunk
>>> size is sqrt(n), complexity is again
>>> sqrt(n) * t + sqrt(n) = O(sqrt(n)) * t.
>>>
>>> So if get_sqrt() computes a wrong square root, we lose this time
>>> complexity.  
>>
>> Ah, thanks for clarifying.
>>
>>>
>>> Maybe there is an optimal value of x = #number of chunks of the buffer,
>>> which may not be sqrt(n).
>>>
>>> But given the information we have, a CS course on algorithms will
>>> say this is one of the optimal ways to do it.
>>>   
>>>
>>> I just checked with ./mremap_test -t 0, the variance is very high on my
>>> system.
>>>
>>> In the common case of the test passing:
>>>
>>> before patch, there are multiple sub-length calls to memcmp.
>>> after patch, there is a single full-length call to memcmp.
>>>
>>> So the time should reduce but may not be very distinguishable.  
>>
>> Okay, so doesn't matter. I agree that we should simplify all that.
>>
>> The exact index is irrelevant. Whoever wants to debug the test failure
>> could modify the test to find that out. It's one of the tests we don't
>> really expect to fail (often).
>>
>>>   
>>>
>>> Not needed. 7033c6cc9620 does not create any incorrectness in the checking
>>> of mismatch index.  
>>
>> Yes, agreed.
>>
>>
>> I would suggest to rewrite/simplify/clarify the patch description, not
>> talking about "buggy" etc, focusing on the simplification.
>>
>> "
>> The original version of mremap_test (7df666253f26: "kselftests: vm: add
>> mremap tests") validated remapped contents byte-by-byte and printed a
>> mismatch index in case the bytes streams didn't match. That was rather
>> inefficient, especially also if the test passed.
>>
>> Later, commit 7033c6cc9620 ("selftests/mm: mremap_test: optimize
>> execution time from minutes to seconds using chunkwise memcmp") used
>> memcmp() on bigger chunks, to fallback to byte-wise scanning to detect
>> the problematic index only if it discovered a problem.
>>
>> However, the implementation is overly complicated (e.g., get_sqrt() is
>> currently not optimal) and we don't really have to report the exact
>> index: whoever debugs the failing test can figure that out.
>>
>> Let's simplify by just comparing both byte streams with memcmp() and not
>> detecting the exact failed index.
>> "
> 
> ISTM that if you get a failure it doesn't really matter how long a
> second scan takes.
> So a simple byte loop that reports the offset and data of the first
> error and counts the number of errors is more than enough.

That could also be done. But if the stars align and the test actually
fails, I am not even sure if the exact index is really helpful. So I'd
just drop that index reporting entirely.

-- 
Cheers,

David


  reply	other threads:[~2026-04-14  9:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10 14:30 Dev Jain
2026-04-13 19:27 ` David Hildenbrand (Arm)
2026-04-14  5:09   ` Dev Jain
2026-04-14  7:31     ` Ryan Roberts
2026-04-14 11:57       ` Dev Jain
2026-04-14  8:01     ` David Hildenbrand (Arm)
2026-04-14  9:47       ` David Laight
2026-04-14  9:53         ` David Hildenbrand (Arm) [this message]
2026-04-14 12:00         ` Dev Jain
2026-04-14 11:53       ` Dev Jain
2026-04-14 11:38 ` Sarthak Sharma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c44d5afc-2936-486c-9806-01c942a73f70@kernel.org \
    --to=david@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=david.laight.linux@gmail.com \
    --cc=dev.jain@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=sarthak.sharma@arm.com \
    --cc=shuah@kernel.org \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox