* RFC: Faster memtest (possibly bypassing data cache)
From: Marc Gonzalez @ 2023-07-05 15:41 UTC
  To: LKML, linux-mm
  Cc: Vladimir Murzin, Will Deacon, Mark Rutland, Thomas Gleixner,
	Ingo Molnar, Tomas Mudrunka, H. Peter Anvin, Arnd Bergmann,
	Ard Biesheuvel

Hello,

When dealing with a few million devices (x86 and arm64),
it is statistically expected that "a few" devices will have
at least one bad RAM cell. (How many?)

For one particular model, we've determined that ~0.1% have
at least one bad RAM cell (ergo, a few thousand devices).

I've been wondering if someone more experienced knows:
Are these RAM cells bad from the start, or do they become bad
with time? (I assume both failure modes exist.)

Once the first bad cell is detected, is it more likely
to detect other bad cells as time goes by?
In other words, what are the failure modes of ageing RAM?


Closing the HW tangent, focusing on the SW side of things:

Since these bad RAM cells wreak havoc on the device's user,
especially with ASLR (different stuff crashes across reboots),
I've been experimenting with mm/memtest.c as a first line
of defense against bad RAM cells.

However, I have run into a few issues.

Even though early_memtest is called, well... early, memory has
already been mapped as regular *cached* memory.

This means that when we test an area smaller than L3 cache, we're
not even hitting RAM, we're just testing the cache hierarchy.
I suppose it /might/ make sense to test the cache hierarchy,
as it could(?) have errors as well?
However, I suspect defects in cache are much more rare
(and thus detection might not be worth the added run-time).

On x86, I ran a few tests using SIMD non-temporal stores
(to bypass the cache on stores), and got 30% reduction in run-time.
(Minimal run-time is critical for being able to deploy the code
to millions of devices for the benefit of a few thousand users.)
AFAIK, there are no non-temporal loads, so the normal loads probably
thrashed the data cache.
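
For the curious, the fill loop I tried was roughly of this shape
(simplified, user-space-style sketch with SSE2 intrinsics; not the
exact code I ran, and fill_pattern_nt() is just a made-up name):

  #include <emmintrin.h>  /* SSE2: _mm_set1_epi64x, _mm_stream_si128 */
  #include <stdint.h>
  #include <stddef.h>

  /* Fill [buf, buf+len) with a test pattern using MOVNTDQ
   * non-temporal stores, so the stores bypass the cache.
   * Assumes buf is 16-byte aligned and len a multiple of 16. */
  static void fill_pattern_nt(void *buf, uint64_t pattern, size_t len)
  {
          __m128i v = _mm_set1_epi64x(pattern);
          char *p = buf;
          size_t i;

          for (i = 0; i < len; i += 16)
                  _mm_stream_si128((__m128i *)(p + i), v);

          _mm_sfence();  /* order the NT stores before the readback pass */
  }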

I was hoping to be able to test a different implementation:

When we enter early_memtest(), we remap [start, end]
as UC (or maybe WC?) so as to entirely bypass the cache.
We read/write using the largest size available for stores/loads,
e.g. entire cache lines on recent x86 HW.
Then when we leave, we remap as was done originally.
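
In code, the rough shape I have in mind is something like this
(pure sketch: I am assuming set_memory_uc()/set_memory_wb() are
already usable when early_memtest() runs, which I have not verified,
and the comment stands in for the existing pattern write/readback):

  #include <linux/init.h>
  #include <linux/mm.h>
  #include <linux/pfn.h>
  #include <linux/set_memory.h>

  static void __init memtest_range_uncached(phys_addr_t start, phys_addr_t end)
  {
          unsigned long vaddr = (unsigned long)__va(start);
          int nr_pages = PFN_UP(end - start);

          if (set_memory_uc(vaddr, nr_pages))
                  return;          /* fall back to the cached path */

          /* ... existing pattern write + readback over [start, end) ... */

          set_memory_wb(vaddr, nr_pages);
  }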

Is that possible?

Hopefully, the other cores are not started at this point?
(Otherwise this whole charade would be pointless.)

To summarize: is it possible to tweak memtest to make it
run faster while testing RAM in all cases?

Regards,

Marc Gonzalez



* Re: RFC: Faster memtest (possibly bypassing data cache)
From: Marc Gonzalez @ 2023-07-11  9:01 UTC
  To: LKML, linux-mm, Linux ARM
  Cc: Vladimir Murzin, Will Deacon, Mark Rutland, Robin Murphy,
	Thomas Gleixner, Tomas Mudrunka, H. Peter Anvin, Ingo Molnar,
	Arnd Bergmann, Ard Biesheuvel

On 05/07/2023 17:41, Marc Gonzalez wrote:

> Hello,
> 
> When dealing with a few million devices (x86 and arm64),
> it is statistically expected that "a few" devices will have
> at least one bad RAM cell. (How many?)
> 
> For one particular model, we've determined that ~0.1% have
> at least one bad RAM cell (ergo, a few thousand devices).
> 
> I've been wondering if someone more experienced knows:
> Are these RAM cells bad from the start, or do they become bad
> with time? (I assume both failure modes exist.)
> 
> Once the first bad cell is detected, is it more likely
> to detect other bad cells as time goes by?
> In other words, what are the failure modes of ageing RAM?
> 
> 
> Closing the HW tangent, focusing on the SW side of things:
> 
> Since these bad RAM cells wreak havoc on the device's user,
> especially with ASLR (different stuff crashes across reboots),
> I've been experimenting with mm/memtest.c as a first line
> of defense against bad RAM cells.
> 
> However, I have run into a few issues.
> 
> Even though early_memtest is called, well... early, memory has
> already been mapped as regular *cached* memory.
> 
> This means that when we test an area smaller than L3 cache, we're
> not even hitting RAM, we're just testing the cache hierarchy.
> I suppose it /might/ make sense to test the cache hierarchy,
> as it could(?) have errors as well?
> However, I suspect defects in cache are much more rare
> (and thus detection might not be worth the added run-time).
> 
> On x86, I ran a few tests using SIMD non-temporal stores
> (to bypass the cache on stores), and got 30% reduction in run-time.
> (Minimal run-time is critical for being able to deploy the code
> to millions of devices for the benefit of a few thousand users.)
> AFAIK, there are no non-temporal loads, so the normal loads probably
> thrashed the data cache.
> 
> I was hoping to be able to test a different implementation:
> 
> When we enter early_memtest(), we remap [start, end]
> as UC (or maybe WC?) so as to entirely bypass the cache.
> We read/write using the largest size available for stores/loads,
> e.g. entire cache lines on recent x86 HW.
> Then when we leave, we remap as was done originally.
> 
> Is that possible?
> 
> Hopefully, the other cores are not started at this point?
> (Otherwise this whole charade would be pointless.)
> 
> To summarize: is it possible to tweak memtest to make it
> run faster while testing RAM in all cases?

Hello again,

I had a short chat with Robin on IRC.

He said trying to bypass the cache altogether was a bad idea(TM)
performance-wise. Do others agree with this assessment? :)

I would like to read people's thoughts on the whole thing.

What is the kernel API to flush (write back) a kernel memory range to DRAM?

  int flush_cache_to_memory(void *va_start, void *va_end);
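
On the x86 side, clflush_cache_range() looks like the closest existing
primitive I could find; something along these lines (sketch only, and
the arm64 counterpart is exactly what I am asking about):

  #include <asm/cacheflush.h>

  /* x86-only sketch: clflush_cache_range() writes back and
   * invalidates every cache line covering the range. */
  static int flush_cache_to_memory(void *va_start, void *va_end)
  {
          clflush_cache_range(va_start, (char *)va_end - (char *)va_start);
          return 0;
  }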

On aarch64, I would test LDNP/STNP. Possibly also LD4/ST4.
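
For STNP, I am thinking of an inline-asm loop of roughly this shape
(untested sketch; assumes a 16-byte-aligned buffer and a size that is
a multiple of 16):

  #include <linux/types.h>

  static void fill_pattern_stnp(u64 *p, u64 pattern, size_t bytes)
  {
          u64 *end = p + bytes / sizeof(u64);

          /* STNP stores a register pair with a non-temporal hint. */
          while (p < end) {
                  asm volatile("stnp %1, %1, [%0]"
                               : : "r" (p), "r" (pattern) : "memory");
                  p += 2;
          }
  }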

Regards,

Marc


