linux-mm.kvack.org archive mirror
From: Jiaqi Yan <jiaqiyan@google.com>
To: Yang Shi <shy828301@gmail.com>
Cc: "Tony Luck" <tony.luck@intel.com>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Miaohe Lin" <linmiaohe@huawei.com>, "Jue Wang" <juew@google.com>,
	"Linux MM" <linux-mm@kvack.org>
Subject: Re: [RFC v1 0/2] Memory poison recovery in khugepaged
Date: Fri, 25 Mar 2022 16:07:05 -0700
Message-ID: <CACw3F53iL4eZUmSgRzx_++bzfo7Wm0CJZrwgTJDeATZ48ezPpQ@mail.gmail.com>
In-Reply-To: <CAHbLzkrfTNaJX384-HqyxTXVqa=zuOPcPwVUMBhFHnUDMHMjrQ@mail.gmail.com>

On Fri, Mar 25, 2022 at 2:42 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Fri, Mar 25, 2022 at 2:11 PM Jiaqi Yan <jiaqiyan@google.com> wrote:
> >
> > On Thu, Mar 24, 2022 at 7:51 PM Yang Shi <shy828301@gmail.com> wrote:
> > >
> > > On Wed, Mar 23, 2022 at 4:29 PM Jiaqi Yan <jiaqiyan@google.com> wrote:
> > > >
> > > > Problem
> > > > =======
> > > > Memory DIMMs are subject to multi-bit flips, i.e. memory errors.
> > > > As memory size and density increase, the chances of and number of
> > > > memory errors increase. The increasing size and density of server
> > > > RAM in the data center and cloud have shown increased uncorrectable
> > > > memory errors. There are already mechanisms in the kernel to recover
> > > > from uncorrectable memory errors. This series of patches provides
> > > > the recovery mechanism for the particular kernel agent khugepaged.
> > > >
> > > > Impact
> > > > ======
> > > > The main reason we chose to make khugepaged tolerant of memory failures
> > > > was its high possibility of accessing poisoned memory while performing
> > > > functionally optional compaction actions. Standard applications
> > > > typically don't have strict requirements on the size of their pages,
> > > > so they are given 4K pages by the kernel. The kernel is able to improve
> > > > application performance by either 1) giving applications 2M pages
> > > > to begin with, or 2) collapsing 4K pages into 2M pages when possible.
> > > > This collapsing operation is done by khugepaged, a kernel agent that
> > > > is constantly scanning memory. When collapsing 4K pages into a 2M page,
> > > > it must copy the data from the 4K pages into a physically contiguous
> > > > 2M page. Therefore, as long as there exists one poisoned cache line in
> > > > collapsible 4K pages, khugepaged will eventually access it. The current
> > > > impact to users is a kernel panic triggered by a machine check exception.
> > > > However, khugepaged’s compaction operations are not functionally required
> > > > kernel actions. Therefore making khugepaged tolerant to poisoned memory
> > > > will greatly improve user experience.
> > > >
> > > > Solution
> > > > ========
> > > > As stated before, it is less desirable to crash the system only because
> > > > khugepaged accesses poisoned pages while it is collapsing 4K pages.
> > > > The high level idea of this patch series is to skip the group of pages
> > > > (usually 512 4K-size pages) once khugepaged finds one of them is poisoned,
> > > > as these pages have become ineligible to be collapsed.
> > > >
> > > > We are also careful to unwind operations khugepaged has performed before
> > > > it detects memory failures. For example, before copying and collapsing
> > > > a group of anonymous pages into a huge page, the source pages will be
> > > > isolated and their page table is unlinked from their PMD. These operations
> > > > need to be undone in order to ensure these pages are not changed/lost from
> > > > the perspective of other threads (both user and kernel space). As for
> > > > file backed memory pages, there already exists a rollback case. This
> > > > patch just extends it so that khugepaged also correctly rolls back when
> > > > it fails to copy poisoned 4K pages.
> > >
> > > Actually I should have asked the question in the first place before diving
> > > into the implementation details, if uncorrectable memory error
> > > happens, kernel will pin the poisoned page and set hwpoison flag, the
> > > bumped page refcount would prevent the page from being collapsed IIUC.
> >
> > This patch series is for cases where khugepaged is the first guy that detects
> > the memory errors on these poisoned pages. IOW, the pages are not known to
> > have memory errors when khugepaged collapsing gets to them.
> > In our observation, this happens frequently when the huge page ratio of
> > the system is relatively low, which is fairly common in cloud VMs.
>
> Thanks, this is very important information that needs to be captured
> in the 1st patch's commit log.

Thanks for this valuable feedback. I will add this in the commit msg of v2,
but I will wait for your comments on patch 2/2 before sending out v2.

>
> >
> >
> >
> > > So I'm wondering why we need this?
> > >
> > > >
> > > > Jiaqi Yan (2):
> > > >   mm: khugepaged: recover from poisoned anonymous memory
> > > >   mm: khugepaged: recover from poisoned file-backed memory
> > > >
> > > >  include/linux/highmem.h |  37 +++++++
> > > >  mm/khugepaged.c         | 211 +++++++++++++++++++++++++++++-----------
> > > >  2 files changed, 189 insertions(+), 59 deletions(-)
> > > >
> > > > --
> > > > 2.35.1.894.gb6a874cedc-goog
> > > >
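
For readers following the thread, the skip-on-poison idea from the quoted
Solution section can be illustrated with a small user-space sketch. This is
not code from the patch series. The names small_page, copy_page_checked and
collapse_pages are hypothetical, the "poisoned" flag merely simulates an
uncorrectable error that is only discovered when the page is read, and the
kernel-side unwinding (re-linking the page table, releasing isolated pages)
is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SMALL_PAGE_SIZE 4096u
#define PAGES_PER_HUGE  512u    /* 512 * 4K = 2M */

/* Hypothetical stand-in for a 4K source page. */
struct small_page {
	unsigned char data[SMALL_PAGE_SIZE];
	bool poisoned;          /* simulated uncorrectable memory error */
};

/*
 * Hypothetical stand-in for a copy that reports a memory error instead
 * of crashing: returns 0 on success, -1 if the source is poisoned.
 */
static int copy_page_checked(unsigned char *dst, const struct small_page *src)
{
	if (src->poisoned)
		return -1;
	memcpy(dst, src->data, SMALL_PAGE_SIZE);
	return 0;
}

/*
 * Copy n small pages into one contiguous buffer. On the first poisoned
 * page, stop, record its index in *bad_index and return -1. The source
 * pages are never modified, so the caller can keep using them and only
 * has to discard the partially filled destination.
 */
static int collapse_pages(unsigned char *huge, const struct small_page *pages,
			  size_t n, size_t *bad_index)
{
	assert(bad_index != NULL);

	for (size_t i = 0; i < n; i++) {
		if (copy_page_checked(huge + i * SMALL_PAGE_SIZE, &pages[i]) != 0) {
			*bad_index = i;
			return -1;
		}
	}
	return 0;
}
```

A caller would treat a -1 return as "skip this group of pages" and move on,
rather than treating it as fatal.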

