linux-mm.kvack.org archive mirror
From: Ankur Arora <ankur.a.arora@oracle.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	akpm@linux-foundation.org, david@kernel.org, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, mingo@redhat.com,
	luto@kernel.org, peterz@infradead.org, tglx@linutronix.de,
	willy@infradead.org, raghavendra.kt@amd.com,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH v9 4/7] x86/mm: Simplify clear_page_*
Date: Wed, 26 Nov 2025 21:28:26 -0800
Message-ID: <87ikew6o3p.fsf@oracle.com>
In-Reply-To: <CAGudoHFgDEEBgQK5PrEUAJsb=iFpsT5OJ8+7W8PV0CGNePR4JQ@mail.gmail.com>


Mateusz Guzik <mjguzik@gmail.com> writes:

> On Fri, Nov 21, 2025 at 9:24 PM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>> + * Switch between three implementations of page clearing based on CPU
>> + * capabilities:
>> + *
>> + *  - __clear_pages_unrolled(): the oldest, slowest and universally
>> + *    supported method. Zeroes via 8-byte MOV instructions unrolled 8x
>> + *    to write a 64-byte cacheline in each loop iteration.
>> + *
>> + *  - "REP; STOSQ": really old CPUs had crummy REP implementations.
>> + *    Vendor CPU setup code sets 'REP_GOOD' on CPUs where REP can be
>> + *    trusted. The instruction writes 8-byte per REP iteration but
>> + *    CPUs can internally batch these together and do larger writes.
>> + *
>> + *  - "REP; STOSB": CPUs that enumerate 'ERMS' have an improved STOS
>> + *    implementation that is less picky about alignment and where
>> + *    STOSB (1-byte at a time) is actually faster than STOSQ (8-bytes
>> + *    at a time.)
>> + *
>
> I think this is somewhat odd commentary in this context.
>
> Note about "crummy REP implementations" should be in description of
> __clear_pages_unrolled as it justifies its existence (I think the
> routine would be best whacked btw, but I'm not going to argue about it
> in this thread).
> The description of STOSQ notes the CPU can do more than 8 bytes at a
> time, while the description of STOSB makes no such clarification.
> At the same time, the note about being less picky about alignment has
> no significance in the context of page clearing as the pages are, well,
> page aligned.

Good point. I'll rework the comment a little bit to align things better
(maybe reusing some of what you suggest below).
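
For reference, the selection order being discussed can be sketched in
user-space C. This is only an illustration under stated assumptions, not
the kernel code: the kernel patches the call site via alternatives rather
than branching at run time, and pick_clear_variant(),
clear_page_unrolled_sketch() and SKETCH_PAGE_SIZE are hypothetical names
introduced here. The unrolled routine is a portable stand-in for the
8-byte MOV loop, one 64-byte cacheline per iteration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

enum clear_variant {
	CLEAR_UNROLLED,		/* 8-byte stores, unrolled 8x */
	CLEAR_REP_STOSQ,	/* REP_GOOD set, ERMS not enumerated */
	CLEAR_REP_STOSB,	/* ERMS enumerated */
};

/* Hypothetical helper: ERMS wins, then REP_GOOD, else the unrolled loop. */
static enum clear_variant pick_clear_variant(bool rep_good, bool erms)
{
	if (erms)
		return CLEAR_REP_STOSB;
	if (rep_good)
		return CLEAR_REP_STOSQ;
	return CLEAR_UNROLLED;
}

/*
 * Portable stand-in for the unrolled variant: eight 8-byte stores per
 * iteration, i.e. one 64-byte cacheline per loop trip. 'page' must be
 * 8-byte aligned and SKETCH_PAGE_SIZE bytes long.
 */
static void clear_page_unrolled_sketch(void *page)
{
	uint64_t *p = page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(*p); i += 8) {
		p[i + 0] = 0;
		p[i + 1] = 0;
		p[i + 2] = 0;
		p[i + 3] = 0;
		p[i + 4] = 0;
		p[i + 5] = 0;
		p[i + 6] = 0;
		p[i + 7] = 0;
	}
}
```

With this ordering, a CPU (or guest) that reports REP_GOOD but not ERMS
ends up on the STOSQ path, which is the case raised below.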

> There is a fucky real-world problem with ERMS worth noting: there are
> hypervisor setups out there which *hide* the bit by default (no
> really, see Proxmox for example -- you get a bare bones pre-ERMS
> cpuid)
>
> With all this in mind, modulo poor grammar on my end, I would suggest
> something like this:
>
> <quote>
> There are 3 variants implemented:
> - REP; STOSB: used if the CPU supports "Enhanced REP MOVSB/STOSB" (aka
> ERMS), which is true for the majority of microarchitectures today
> - REP; STOSQ: fallback if the ERMS bit is not present
> - __clear_pages_unrolled: code for CPUs which are determined to have
> poor REP support, only concerns long obsolete uarchs.
>
> Warning: some hypervisors are configured to expose a very limited set
> of capabilities in the guest, filtering out ERMS even if present. As
> such, the STOSQ variant is still in active use on some setups even when
> the hardware does not need it.
> </quote>

The last bit is useful context though maybe some of it fits better in
the commit message.
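
To see which case a given machine or guest falls into, the ERMS bit can be
read from user space. The following is a minimal sketch, assuming GCC or
Clang (for <cpuid.h> and __get_cpuid_count()); cpu_enumerates_erms() is a
hypothetical name. ERMS is reported in CPUID leaf 7, subleaf 0, EBX bit 9.
Inside a guest this reflects what the hypervisor exposes, so a 0 here does
not prove the underlying hardware lacks ERMS.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 if ERMS is enumerated, 0 if it is not, and -1 if CPUID
 * leaf 7 cannot be queried (non-x86 build or leaf unsupported).
 */
static int cpu_enumerates_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when the leaf is not supported. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return -1;

	return (ebx >> 9) & 1;	/* CPUID.(EAX=7,ECX=0):EBX[9] = ERMS */
#else
	return -1;
#endif
}
```

On Linux the same information shows up as the "erms" flag in
/proc/cpuinfo.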

Thanks
ankur



Thread overview: 23+ messages
2025-11-21 20:23 [PATCH v9 0/7] mm: folio_zero_user: clear contiguous pages Ankur Arora
2025-11-21 20:23 ` [PATCH v9 1/7] treewide: provide a generic clear_user_page() variant Ankur Arora
2025-11-23 11:53   ` Christophe Leroy (CS GROUP)
2025-11-24 10:17     ` David Hildenbrand (Red Hat)
2025-11-24 14:02       ` David Hildenbrand (Red Hat)
2025-11-25  7:52       ` Ankur Arora
2025-11-27 23:57         ` Ankur Arora
2025-11-28  7:39           ` Christophe Leroy (CS GROUP)
2025-11-28 22:19             ` Ankur Arora
2025-11-21 20:23 ` [PATCH v9 2/7] mm: introduce clear_pages() and clear_user_pages() Ankur Arora
2025-11-23 13:17   ` Christophe Leroy (CS GROUP)
2025-11-24 10:26     ` David Hildenbrand (Red Hat)
2025-11-28 10:13       ` Lance Yang
2025-11-28 21:59         ` Ankur Arora
2025-11-21 20:23 ` [PATCH v9 3/7] mm/highmem: introduce clear_user_highpages() Ankur Arora
2025-11-21 20:23 ` [PATCH v9 4/7] x86/mm: Simplify clear_page_* Ankur Arora
2025-11-25 13:47   ` Borislav Petkov
2025-11-25 19:01     ` Ankur Arora
2025-11-26 10:01   ` Mateusz Guzik
2025-11-27  5:28     ` Ankur Arora [this message]
2025-11-21 20:23 ` [PATCH v9 5/7] x86/clear_page: Introduce clear_pages() Ankur Arora
2025-11-21 20:23 ` [PATCH v9 6/7] mm, folio_zero_user: support clearing page ranges Ankur Arora
2025-11-21 20:23 ` [PATCH v9 7/7] mm: folio_zero_user: cache neighbouring pages Ankur Arora
