From: Nancy Yuen <yuenn@google.com>
To: Randy Dunlap <rdunlap@xenotime.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Stefan Assmann <sassmann@kpanic.de>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
tony.luck@intel.com, andi@firstfloor.org, mingo@elte.hu,
hpa@zytor.com, rick@vanrein.org,
Michael Ditto <mditto@google.com>
Subject: Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
Date: Wed, 22 Jun 2011 11:11:51 -0700
Message-ID: <BANLkTim=N=8G+Q9HJ6BaMO8L3oZouanxvtsf99fVxYGquTewDg@mail.gmail.com>
In-Reply-To: <20110622110910.c8e11eb7.rdunlap@xenotime.net>
I haven't had time to submit the patches, though it's on my todo list.
----------
Nancy
On Wed, Jun 22, 2011 at 11:09, Randy Dunlap <rdunlap@xenotime.net> wrote:
> On Wed, 22 Jun 2011 11:00:34 -0700 Andrew Morton wrote:
>
>> On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:
>>
>> > Following the RFC for the BadRAM feature here's the updated version with
>> > spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
>> > as requested by Andi Kleen.
>> > v2 with even more spelling fixes suggested by Randy.
>> > Patches are against vanilla 2.6.39.
>> >
>> > The idea is to allow the user to specify RAM addresses that shouldn't be
>> > touched by the OS, because they are broken in some way. Not all machines have
>> > hardware support for hwpoison, ECC RAM, etc, so here's a solution that uses
>> > bitmasks to describe address patterns, passed via the new "badram" kernel
>> > command line parameter.
>> > Memtest86 has an option to generate these patterns since v2.3 so the only thing
>> > for the user to do should be:
>> > - run Memtest86
>> > - note down the pattern
>> > - add badram=<pattern> to the kernel command line
>> >
>> > The affected pages are then marked with the hwpoison flag and thus won't be
>> > used by the memory management system.
>>
>> The google kernel has a similar capability. I asked Nancy to comment
>> on these patches and she said:
>>
>> : One, the bad addresses are passed via the kernel command line, which
>> : has a limited length. It's okay if the addresses can be fit into a
>> : pattern, but that's not necessarily the case in the google kernel. And
>> : even with patterns, the limit on the command line length limits the
>> : number of patterns that a user can specify. Instead we use lilo to pass
>> : a file containing the bad pages in e820 format to the kernel.
>> :
>> : Second, the BadRAM patch expands the address patterns from the command
>> : line into individual entries in the kernel's e820 table. The e820
>> : table is a fixed buffer that supports a very small, hard coded number
>> : of entries (128). We require a much larger number of entries (on
>> : the order of a few thousand), so much of the google kernel patch deals
>> : with expanding the e820 table. Also, with the BadRAM patch, entries
>> : that don't fit in the table are silently dropped and this isn't
>> : appropriate for us.
>> :
>> : Another caveat concerns mapping out too much bad memory in general. If too
>> : much memory is removed from low memory, a system may not boot. We
>> : solve this by generating good maps. Our userspace tools do not map out
>> : memory below a certain limit, and they verify against a system's iomap
>> : that only addresses from memory are mapped out.
>>
>> I have a couple of thoughts here:
>>
>> - If this patchset is merged and a major user such as google is
>> unable to use it and has to continue to carry a separate patch then
>> that's a regrettable situation for the upstream kernel.
>>
>> - Google's is, afaik, the largest use case we know of: zillions of
>> machines for a number of years. And this real-world experience tells
>> us that the badram patchset has shortcomings. Shortcomings which we
>> can expect other users to experience.
>>
>> So. What are your thoughts on these issues?
>
>
> Good comments, so where is google's patch submittal?
>
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
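
For readers unfamiliar with the address/mask patterns discussed in the quoted
cover letter, here is a minimal sketch of how one pattern can expand into many
table entries, which is the scaling concern raised above. It assumes the
commonly documented BadRAM convention that an address A is faulty when
(A & mask) == (addr & mask). The 4 KiB page size, the function names and the
merging of adjacent pages into ranges are illustrative assumptions, not code
taken from the patch.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def expand_pattern(pattern_addr, mask, mem_limit):
    """Yield page frame numbers below mem_limit that the pattern marks bad.

    Assumed rule: address A is bad when (A & mask) == (pattern_addr & mask).
    A page counts as bad if any address inside it matches, so the in-page
    offset bits of the mask are ignored.
    """
    page_size = 1 << PAGE_SHIFT
    page_mask = mask & ~(page_size - 1)
    for page in range(0, mem_limit, page_size):
        if (page & page_mask) == (pattern_addr & page_mask):
            yield page >> PAGE_SHIFT


def to_ranges(pfns):
    """Merge ascending page frame numbers into half-open (start, end) ranges."""
    ranges = []
    for pfn in pfns:
        if ranges and ranges[-1][1] == pfn:
            ranges[-1][1] = pfn + 1
        else:
            ranges.append([pfn, pfn + 1])
    return [tuple(r) for r in ranges]


# A mask with only bit 12 set matches every other page, so each bad page
# becomes its own range.
sparse = to_ranges(expand_pattern(0x1000, 0x1000, 0x8000))

# A mask covering all bits above bit 12 matches one aligned two-page block.
dense = to_ranges(expand_pattern(0x2000, 0xFFFFE000, 0x8000))
```

With a sparse mask the number of ranges grows linearly with the amount of
memory scanned, so a table with a small fixed capacity (128 entries is the
figure quoted above) can fill up from a single pattern.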