From: Mateusz Guzik <mjguzik@gmail.com>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org,  linux-mm@kvack.org,
	x86@kernel.org, torvalds@linux-foundation.org,
	 akpm@linux-foundation.org, bp@alien8.de,
	dave.hansen@linux.intel.com,  hpa@zytor.com, mingo@redhat.com,
	luto@kernel.org, paulmck@kernel.org,  rostedt@goodmis.org,
	tglx@linutronix.de, willy@infradead.org,  jon.grimm@amd.com,
	bharata@amd.com, raghavendra.kt@amd.com,
	 boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH v3 1/4] x86/clear_page: extend clear_page*() for multi-page clearing
Date: Tue, 15 Apr 2025 22:32:08 +0200
Message-ID: <CAGudoHEMfM+ZnAiF6enrmsMZHU64XWXxU5tu1bH5LSBbCNsO9g@mail.gmail.com>
In-Reply-To: <87cyddxkgl.fsf@oracle.com>

On Tue, Apr 15, 2025 at 10:02 PM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
>
> Mateusz Guzik <mjguzik@gmail.com> writes:
>
> > On Tue, Apr 15, 2025 at 8:14 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
> >>
> >>
> >> Mateusz Guzik <mjguzik@gmail.com> writes:
> >> > With that sucker out of the way, an optional quest is to figure out if
> >> > rep stosq vs rep stosb makes any difference for pages -- for all I know
> >> > rep stosq is the way. This would require testing on quite a few uarchs
> >> > and I'm not going to blame anyone for not being interested.
> >>
> >> IIRC some recent AMD models (Rome?) did expose REP_GOOD but not ERMS.
> >>
> >
> > The uarch does not have it or the bit magically fails to show up?
> > Worst case, should rep stosb be faster on that uarch, the kernel can
> > pretend the bit is set.
>
> It's a synthetic bit so the uarch has both. I think REP STOSB is optimized
> post FSRS (AIUI Zen3)
>
>         if (c->x86 >= 0x10)
>                 set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>
>         /* AMD FSRM also implies FSRS */
>         if (cpu_has(c, X86_FEATURE_FSRM))
>                 set_cpu_cap(c, X86_FEATURE_FSRS);
>
>
> >> > Let's say nobody bothered OR rep stosb provides a win. In that case this
> >> > can trivially ALTERNATIVE between rep stosb and rep stosq based on ERMS,
> >> > no func calls necessary.
> >>
> >> We shouldn't need any function calls for ERMS and REP_GOOD.
> >>
> >> I think something like this untested code should work:
> >>
> >>         asm volatile(
> >>             ALTERNATIVE_2("call clear_pages_orig",
> >>                           "shrl $3,%%ecx; rep stosq", X86_FEATURE_REP_GOOD,
> >>                           "rep stosb", X86_FEATURE_ERMS)
> >>                           : "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
> >>                           : "a" (0));
> >>
> >
> > That's what I'm suggesting, with one difference: whack
> > clear_pages_orig altogether.
>
> What do we gain by getting rid of it? Maybe there's old hardware with
> unoptimized rep; stos*.
>

The string routines (memset, memcpy et al) need a lot of love and
preferably nobody would bother spending time placating non-rep users
while sorting them out.

According to Wikipedia, AMD CPUs have had REP_GOOD since 2007, so you
would need something even older than that to lack it. Intel is
presumably in a similar boat.

It so happens that gcc spent several years emitting inlined rep stosq
and rep movsq, so either users don't care or there are no users
(realistically, someone somewhere has a machine like that in the
garage, but fringe cases are not an argument).
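
For reference, the inlined form is tiny. Below is a minimal userspace
sketch (not kernel code) of clearing a buffer with rep stosb and with
rep stosq. The names clear_rep_stosb, clear_rep_stosq and
DEMO_PAGE_SIZE are made up for illustration, and the non-x86-64 branch
is only there so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Clear len bytes at addr using rep stosb (byte-sized stores). */
static inline void clear_rep_stosb(void *addr, size_t len)
{
#if defined(__x86_64__)
	/* rcx = byte count, rdi = destination, al = fill value (0). */
	__asm__ volatile("rep stosb"
			 : "+c" (len), "+D" (addr)
			 : "a" (0UL)
			 : "memory");
#else
	/* Portable fallback, only here so the sketch builds off x86-64. */
	volatile unsigned char *p = addr;

	while (len--)
		*p++ = 0;
#endif
}

/* Clear len bytes at addr using rep stosq; len must be a multiple of 8. */
static inline void clear_rep_stosq(void *addr, size_t len)
{
	assert((len & 7) == 0);
#if defined(__x86_64__)
	size_t qwords = len >> 3;	/* rep stosq counts 8-byte words */

	__asm__ volatile("rep stosq"
			 : "+c" (qwords), "+D" (addr)
			 : "a" (0UL)
			 : "memory");
#else
	clear_rep_stosb(addr, len);
#endif
}
```

Calling either helper with DEMO_PAGE_SIZE clears one page. Which one is
faster on a given uarch is exactly the open question and would have to
be measured.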

rep_movs_alternative has been punting to rep movs regardless of
REP_GOOD for some time now (admittedly, I removed the non-rep support
:P) and again there are no pitchforks (that I have seen).

So I think it would be best for everyone in the long run to rip out
the REP_GOOD thing completely. For all I know the kernel stopped booting
on machines with such uarchs a long time ago for unrelated reasons.
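
With REP_GOOD gone, the choice would hinge on ERMS alone. A rough
userspace sketch of that decision follows, assuming the compiler ships
<cpuid.h> with __get_cpuid_count() (GCC and clang do on x86). The names
cpu_has_erms, pick_clear_variant and the enum are hypothetical. The
kernel would patch the instruction in at boot via ALTERNATIVE rather
than branch at runtime.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 9 advertises ERMS. */
#define DEMO_ERMS_BIT (1u << 9)

/* Report whether the CPU advertises Enhanced REP MOVSB/STOSB. */
static bool cpu_has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Returns 0 when leaf 7 is not supported by this CPU. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return (ebx & DEMO_ERMS_BIT) != 0;
#else
	return false;	/* not x86: no ERMS to report */
#endif
}

enum clear_variant { CLEAR_REP_STOSQ, CLEAR_REP_STOSB };

/* Without a non-rep fallback there are only two cases left. */
static enum clear_variant pick_clear_variant(bool has_erms)
{
	return has_erms ? CLEAR_REP_STOSB : CLEAR_REP_STOSQ;
}
```

pick_clear_variant(cpu_has_erms()) then names the instruction to use,
with no third case for pre-REP_GOOD hardware.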

As far as this specific patchset goes, keeping the routine just means
wasted effort testing that it still works, but I can't *insist* on
removing it. I guess it is the x86 maintainers' call whether to whack this.
-- 
Mateusz Guzik <mjguzik gmail.com>

