From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>,
	Jun'ichi Nomura <j-nomura@ce.jp.nec.com>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 9/9] hugetlb: add corrupted hugepage counter
Date: Tue, 24 Aug 2010 12:01:33 +0900
Message-ID: <20100824030133.GB12507@spritzera.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20100819015752.GB5762@localhost>

On Thu, Aug 19, 2010 at 09:57:52AM +0800, Wu Fengguang wrote:
> > +void increment_corrupted_huge_page(struct page *page);
> > +void decrement_corrupted_huge_page(struct page *page);
>
> nitpick: increment/decrement are not verbs.

OK, I'll use increase/decrease instead.


> > +void increment_corrupted_huge_page(struct page *hpage)
> > +{
> > +   struct hstate *h = page_hstate(hpage);
> > +   spin_lock(&hugetlb_lock);
> > +   h->corrupted_huge_pages++;
> > +   spin_unlock(&hugetlb_lock);
> > +}
> > +
> > +void decrement_corrupted_huge_page(struct page *hpage)
> > +{
> > +   struct hstate *h = page_hstate(hpage);
> > +   spin_lock(&hugetlb_lock);
> > +   BUG_ON(!h->corrupted_huge_pages);
>
> There is no point to have BUG_ON() here:
>
> /*
>  * Don't use BUG() or BUG_ON() unless there's really no way out; one
>  * example might be detecting data structure corruption in the middle
>  * of an operation that can't be backed out of.  If the (sub)system
>  * can somehow continue operating, perhaps with reduced functionality,
>  * it's probably not BUG-worthy.
>  *
>  * If you're tempted to BUG(), think again:  is completely giving up
>  * really the *only* solution?  There are usually better options, where
>  * users don't need to reboot ASAP and can mostly shut down cleanly.
>  */

OK. I understand.
BUG_ON() is too severe for just a counter.
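
For illustration only, a counter helper that warns and refuses to
underflow, instead of calling BUG_ON(), could look roughly like the
userspace sketch below. This is not the kernel patch: struct
hstate_sketch and hugetlb_lock_sketch are hypothetical stand-ins for
struct hstate and hugetlb_lock, a pthread mutex replaces the spinlock,
and fprintf() replaces a kernel warning.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for struct hstate; only the counter matters here. */
struct hstate_sketch {
	unsigned long corrupted_huge_pages;
};

/* Hypothetical stand-in for hugetlb_lock (a spinlock in the kernel). */
static pthread_mutex_t hugetlb_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

void increase_corrupted_huge_page(struct hstate_sketch *h)
{
	pthread_mutex_lock(&hugetlb_lock_sketch);
	h->corrupted_huge_pages++;
	pthread_mutex_unlock(&hugetlb_lock_sketch);
}

/* Returns 0 on success, -1 if the counter was already zero. */
int decrease_corrupted_huge_page(struct hstate_sketch *h)
{
	int ret = 0;

	pthread_mutex_lock(&hugetlb_lock_sketch);
	if (h->corrupted_huge_pages == 0) {
		/* Report the inconsistency and keep running; do not underflow. */
		fprintf(stderr, "corrupted hugepage counter is already zero\n");
		ret = -1;
	} else {
		h->corrupted_huge_pages--;
	}
	pthread_mutex_unlock(&hugetlb_lock_sketch);
	return ret;
}
```

With this shape, a spurious decrease leaves the counter at zero and the
caller can decide what to do with the error return.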

>
> And there is a race case that (corrupted_huge_pages==0)!
> Suppose the user space calls unpoison_memory() on a good pfn, and the page
> happen to be hwpoisoned between lock_page() and TestClearPageHWPoison(),
> corrupted_huge_pages will go negative.

I see.
When this race happens, unpoison runs and decreases HugePages_Crpt,
but the racing memory failure returns without increasing it.
Yes, this is a problem we need to fix.
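
The ordering can be replayed with a small single-threaded simulation.
This is only a sketch under simplifying assumptions: struct sim and
the three helper functions are hypothetical, the real
memory_failure() and unpoison_memory() paths are more involved, and
the counter is a signed long here so that the imbalance is visible
as a negative value.

```c
#include <stdbool.h>

/* Hypothetical model of one hugepage's poison flag and the counter. */
struct sim {
	bool hwpoison;
	long corrupted_huge_pages;
};

/* Memory failure, first half: mark the page as poisoned. */
static void failure_set_flag(struct sim *s)
{
	s->hwpoison = true;
}

/* Unpoison: test-and-clear the flag, decreasing the counter if it was set. */
static void unpoison(struct sim *s)
{
	if (s->hwpoison) {
		s->hwpoison = false;
		s->corrupted_huge_pages--;
	}
}

/*
 * Memory failure, second half: recheck the flag and account for the
 * page only if it is still poisoned. Returns true if it was counted.
 */
static bool failure_account(struct sim *s)
{
	if (!s->hwpoison)
		return false;	/* unpoisoned in the meantime, bail out */
	s->corrupted_huge_pages++;
	return true;
}

/* Replay the racy interleaving and return the final counter value. */
long simulate_race(void)
{
	struct sim s = { false, 0 };

	failure_set_flag(&s);	/* memory failure sets the flag */
	unpoison(&s);		/* unpoison clears it and decreases */
	failure_account(&s);	/* memory failure returns without increasing */
	return s.corrupted_huge_pages;
}
```

In this model simulate_race() ends with the counter at -1, because the
decrease has no matching increase.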

Moreover, for hugepages we should pay attention to the possibility of
an mce_bad_pages mismatch, which can occur through a race between
unpoison and multiple memory failures, where each failure increases
mce_bad_pages by the number of pages in a hugepage.

I think counting corrupted hugepages is not directly related to
hugepage migration, and this problem only affects the counter,
not other behavior, so I'll separate the hugepage counter patch
from this patch set and post it as a separate series. Is this OK?

Thanks,
Naoya Horiguchi


