From: Mike Kravetz <mike.kravetz@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>,
James Houghton <jthoughton@google.com>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
Peter Xu <peterx@redhat.com>, Yosry Ahmed <yosryahmed@google.com>,
linux-mm@kvack.org, Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
David Rientjes <rientjes@google.com>,
Axel Rasmussen <axelrasmussen@google.com>,
lsf-pc@lists.linux-foundation.org,
Jiaqi Yan <jiaqiyan@google.com>
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] HGM for hugetlbfs
Date: Thu, 8 Jun 2023 15:35:43 -0700
Message-ID: <20230608223543.GB88798@monkey>
In-Reply-To: <64824e07ba371_142af829493@dwillia2-xfh.jf.intel.com.notmuch>

On 06/08/23 14:54, Dan Williams wrote:
> Mike Kravetz wrote:
> > On 06/07/23 10:13, David Hildenbrand wrote:
> [..]
> > I am struggling with how to support existing hugetlb users that are running
> > into issues like memory errors on hugetlb pages today. And, yes that is a
> > source of real customer issues. They are not really happy with the current
> > design that a single error will take out a 1G page, and their VM or
> > application. Moving to THP is not likely as they really want a pre-allocated
> > pool of 1G pages. I just don't have a good answer for them.
>
> Is it the reporting interface, or the fact that the page gets offlined
> too quickly?

Somewhat both.

Reporting says the error starts at the beginning of the huge page with a
length of the huge page size. So, the actual error is not really
isolated. In a way, this is 'desired' since hugetlb pages are treated as
a single page.

Once a page is marked with poison, we prevent subsequent faults of the
page. Since a hugetlb page is treated as a single page, the 'good data'
cannot be accessed as there is no way to fault in smaller pieces (4K
pages) of the page. Jiaqi Yan actually put together patches to 'read'
the good 4K pages within the hugetlb page [1], but we will not always
have a file handle.

[1] https://lore.kernel.org/linux-mm/20230517160948.811355-1-jiaqiyan@google.com/
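To illustrate, with those patches applied something like the following
untested loop could salvage the good 4K pieces. It assumes (as in [1])
that pread() on the hugetlbfs file fails with EIO only for the poisoned
4K chunks, so the caller can simply skip over them:

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#define CHUNK	4096

/* Copy the readable 4K pieces of one huge page out of a hugetlbfs
 * file, skipping chunks whose read fails with EIO (hwpoison).
 */
static void salvage(int fd, int out_fd, off_t hpage_off, size_t hpage_size)
{
	char buf[CHUNK];
	off_t off;

	for (off = hpage_off; off < hpage_off + (off_t)hpage_size; off += CHUNK) {
		ssize_t n = pread(fd, buf, CHUNK, off);

		if (n < 0 && errno == EIO) {
			fprintf(stderr, "lost 4K at offset %lld\n",
				(long long)off);
			continue;	/* poisoned chunk, skip it */
		}
		if (n <= 0)
			break;		/* hard error or EOF */
		pwrite(out_fd, buf, n, off);
	}
}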
> I.e. if the 1GB page was unmapped from userspace per usual
> memory-failure, but the application had an opportunity to record what
> got clobbered on a smaller granularity and then ask the kernel to repair
> the page, would that relieve some pain?

Sounds interesting.
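FWIW, the SIGBUS siginfo already carries an address and granularity
(si_addr/si_addr_lsb) that user space can record. The catch for hugetlb
is that si_addr_lsb is the huge page shift, so the application is told
the whole 1G is gone. Untested sketch of such a recorder (fprintf from a
signal handler is for illustration only, it is not async-signal-safe):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>

/* Record the clobbered range reported with the SIGBUS. For a 1G
 * hugetlb page today si_addr_lsb is 30, i.e. the whole page.
 */
static void mce_handler(int sig, siginfo_t *si, void *ctx)
{
	if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO)
		fprintf(stderr, "hwpoison at %p, granularity 2^%d bytes\n",
			si->si_addr, (int)si->si_addr_lsb);
}

int main(void)
{
	struct sigaction sa = {
		.sa_sigaction	= mce_handler,
		.sa_flags	= SA_SIGINFO,
	};

	sigaction(SIGBUS, &sa, NULL);
	/* ... fault in and use hugetlb memory ... */
	return 0;
}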
> Where repair is atomically
> writing a full cacheline of zeroes,

Excuse my hardware ignorance ... In this case, I assume writing zeroes
will repair the error on the original memory? This would then result in
the data being lost (zeroed), BUT the memory could be accessed without
error. So, the original 1G page could be used by the application (with
data missing, of course).
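Note that for file-backed mappings we already have a coarse version of
this today: punch a hole over the poisoned huge page and let the next
fault pull in a fresh, zeroed page. All 1G of data is lost rather than
one cacheline, and whether the poisoned page itself is properly
dissolved on that path would need double checking. Untested sketch:

#define _GNU_SOURCE
#include <fcntl.h>

/* Coarse 'repair' of a file-backed hugetlb page: drop the whole
 * poisoned huge page from the file so the next fault allocates a
 * fresh one. Every byte of the huge page is lost.
 */
static int drop_poisoned_hpage(int fd, off_t hpage_off, off_t hpage_size)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 hpage_off, hpage_size);
}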
> or copying around the poison to a
> new page and returning the old one broken down so that only the
> single 4K page with the error is quarantined.

I suppose we could do that within the kernel; however, user space would
have the ability to do this IF it could access the good 4K pages. That
is essentially what we do with THP by splitting the huge page and
marking only the single 4K page with poison. That is the functionality
proposed by HGM.

It seems like asking the kernel to 'repair the page' would be a new
hugetlb-specific interface. Or, could there be other users?
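Just to make the question concrete, I was imagining something
madvise()-like. To be clear, nothing below exists; the advice value is
made up purely for illustration:

#include <sys/mman.h>

/* Hypothetical advice value -- not in any kernel. The idea: zero and
 * unpoison the bad cachelines so the surrounding huge page becomes
 * usable again (with those bytes lost).
 */
#define MADV_MCE_REPAIR	0x100

static int repair_range(void *addr, size_t len)
{
	return madvise(addr, len, MADV_MCE_REPAIR);
}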
--
Mike Kravetz