linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@kernel.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: [PATCH 1/1] mm:hugetlbfs: Fix hwpoison reserve accounting
Date: Tue, 24 Oct 2017 00:46:05 +0000	[thread overview]
Message-ID: <20171024004605.GA19663@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <26945734-ac7e-f71e-dbfa-0b0f0fdaff32@oracle.com>

On Mon, Oct 23, 2017 at 11:20:02AM -0700, Mike Kravetz wrote:
> On 10/23/2017 12:32 AM, Naoya Horiguchi wrote:
> > On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote:
> >> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
> >>> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
> >>>
> >>> Thank you for addressing this. The patch itself looks good to me, but
> >>> the reported issue (negative reserve count) doesn't reproduce in my trial
> >>> with v4.14-rc5, so could you share the exact procedure for this issue?
> >>
> >> Sure, but first one question on your test scenario below.
> >>
> >>>
> >>> When error handler runs over a huge page, the reserve count is incremented
> >>> so I'm not sure why the reserve count goes negative.
> >>
> >> I'm not sure I follow.  What specific code is incrementing the reserve
> >> count?
> > 
> > The call path is like below:
> > 
> >   hugetlbfs_error_remove_page
> >     hugetlb_fix_reserve_counts
> >       hugepage_subpool_get_pages(spool, 1)
> >         hugetlb_acct_memory(h, 1);
> >           gather_surplus_pages
> >             h->resv_huge_pages += delta;
> > 
> 
> Ah OK.  This is a result of call to hugetlb_fix_reserve_counts which
> I believe is incorrect in most instances, and is unlikely to happen 
> with my patch.
> 
> >>
> >> Remove the file (rm /var/opt/oracle/hugepool/foo)
> >> -------------------------------------------------
> >> HugePages_Total:       1
> >> HugePages_Free:        0
> >> HugePages_Rsvd:    18446744073709551615
> >> HugePages_Surp:        0
> >> Hugepagesize:       2048 kB
> >>
> >> I am still confused about how your test maintains a reserve count after
> >> poisoning.  It may be a good idea for you to test my patch with your
> >> test scenario as I can not recreate here.
> > 
> > Interestingly, I found that this reproduces if all hugetlb pages are
> > reserved when poisoning.
> > Your testing meets the condition, and mine doesn't.
> > 
> > In gather_surplus_pages() we determine whether we extend hugetlb pool
> > with surplus pages like below:
> > 
> >     needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
> >     if (needed <= 0) {
> >             h->resv_huge_pages += delta;
> >             return 0;
> >     }
> >     ...
> > 
> > needed is 1 if h->resv_huge_pages == h->free_huge_pages, and then
> > the reserve count gets inconsistent.
> > I confirmed that your patch fixes the issue, so I'm OK with it.
> 
> Thanks.  That now makes sense to me.
> 
> hugetlb_fix_reserve_counts (which results in gather_surplus_pages being
> called), is only designed to be called in the extremely rare cases when
> we have free'ed a huge page but are unable to free the reservation entry.
> 
> Just curious, when the hugetlb_fix_reserve_counts call was added to
> hugetlbfs_error_remove_page, was the intention to preserve the original
> reservation? 

No, the intention was to remove the reservation of the error hugepage
which was unmapped and isolated from normal hugepage's lifecycle.
The error hugepage is not freed back to hugepage pool, but it should be
handled in the same manner as freeing from the perspective of reserve count.

When I was writing commit 78bb920344b8, I experienced some reserve count
mismatch, and wrongly borrowed the code from truncation code.

> I remember thinking hard about that for the hole punch
> case and came to the conclusion that it was easier and less error prone
> to remove the reservation as well.  That will also happen in the error
> case with the patch I provided.

Yes, hole punching seems sililar to poisoning except that the final destination
of the target page differs. So we can make the same conclusion here.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2017-10-24  0:46 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-19 23:00 [PATCH 0/1] " Mike Kravetz
2017-10-19 23:00 ` [PATCH 1/1] " Mike Kravetz
2017-10-20  2:30   ` Naoya Horiguchi
2017-10-20 17:49     ` Mike Kravetz
2017-10-23  7:32       ` Naoya Horiguchi
2017-10-23 18:20         ` Mike Kravetz
2017-10-24  0:46           ` Naoya Horiguchi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171024004605.GA19663@hori1.linux.bs1.fc.nec.co.jp \
    --to=n-horiguchi@ah.jp.nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox