linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jiaqi Yan <jiaqiyan@google.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 shy828301@gmail.com
Cc: kirill.shutemov@linux.intel.com, kirill@shutemov.name,
	 tongtiangen@huawei.com, tony.luck@intel.com,
	naoya.horiguchi@nec.com,  linmiaohe@huawei.com,
	linux-mm@kvack.org, osalvador@suse.de
Subject: Re: [PATCH v8 1/2] mm/khugepaged: recover from poisoned anonymous memory
Date: Fri, 2 Dec 2022 09:29:23 -0800	[thread overview]
Message-ID: <CACw3F50wgfxDreGdDC+FPRJVGM1QaLB1qWaOvV++15rNZbgiag@mail.gmail.com> (raw)
In-Reply-To: <83c0ee14-75d3-c2ef-c6d0-040bf5e2fc7e@huawei.com>

On Thu, Dec 1, 2022 at 6:25 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
> On 2022/12/2 7:09, Andrew Morton wrote:
> > On Wed, 30 Nov 2022 16:59:30 -0800 Jiaqi Yan <jiaqiyan@google.com> wrote:
> >
> >> Make __collapse_huge_page_copy return whether copying anonymous pages
> >> succeeded, and make collapse_huge_page handle the return status.
> >>
> >> Break existing PTE scan loop into two for-loops. The first loop copies
> >> source pages into target huge page, and can fail gracefully when running
> >> into memory errors in source pages. If copying all pages succeeds, the
> >> second loop releases and clears up these normal pages. Otherwise, the
> >> second loop rolls back the page table and page states by:
> >> - re-establishing the original PTEs-to-PMD connection.
> >> - releasing source pages back to their LRU list.
> >>
> >> Tested manually:
> >> 0. Enable khugepaged on system under test.
> >> 1. Start a two-thread application. Each thread allocates a chunk of
> >>     non-huge anonymous memory buffer.
> >> 2. Pick 4 random buffer locations (2 in each thread) and inject
> >>     uncorrectable memory errors at corresponding physical addresses.
> >> 3. Signal both threads to make their memory buffer collapsible, i.e.
> >>     calling madvise(MADV_HUGEPAGE).
> >> 4. Wait and check kernel log: khugepaged is able to recover from poisoned
> >>     pages and skips collapsing them.
> >> 5. Signal both threads to inspect their buffer contents and make sure no
> >>     data corruption.
> > Looks like a nice patchset.  I'd like to give it a run in linux-next
> > but we're at -rc7 and we have no review/ack tags.  So it should be a
> > post-6.2-rc1 thing.
> >
> > I have a quibble.
> >
> >> --- a/include/linux/highmem.h
> >> +++ b/include/linux/highmem.h
> >> @@ -361,6 +361,27 @@ static inline void copy_highpage(struct page *to, struct page *from)
> >>
> >>   #endif
> >>
> >> +/*
> >> + * Machine check exception handled version of copy_highpage. Return number
> >> + * of bytes not copied if there was an exception; otherwise 0 for success.
> >> + * Note handling #MC requires arch opt-in.
> >> + */
> >> +static inline int copy_mc_highpage(struct page *to, struct page *from)
> >> +{
> >> +    char *vfrom, *vto;
> >> +    unsigned long ret;
> >> +
> >> +    vfrom = kmap_local_page(from);
> >> +    vto = kmap_local_page(to);
> >> +    ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
> >> +    if (ret == 0)
> >> +            kmsan_copy_page_meta(to, from);
> >> +    kunmap_local(vto);
> >> +    kunmap_local(vfrom);
> >> +
> >> +    return ret;
> >> +}
> > Why inlined?  It's large, it's slow, it's called only from
> > khugepaged.c.  A regular out-of-line function which is static to
> > khugepaged.c seems more appropriate?
>
> There is a similar function copy_mc_user_highpage(),  could we reuse
> it , see commit a873dfe1032a mm, hwpoison: try to recover from copy-on
> write faults
>
>

To Kefeng: As I explained in v7, besides `to` and `from` pages,
copy_mc_user_highpage takes `struct vm_area_struct *vma` and `vaddr`.
While it fits __collapse_huge_page_copy, it doesn't really fit well
for collapse_file (needed for the 2nd commit). When Shi Yang reviewed
my patches, we agreed that we should borrow this opportunity to unify
the copying routines in khugepaged.c (for both file-backed and anon
memory), and copy_highpage fits both (alternatively we can use
copy_user_highpage and passing vaddr=null and vma=null, but I don't
like that). So I choose to make copy_highpage to be MC recoverable.
Does this make sense to you?

To Andrew: I think it is a reasonable "quibble". I will prepare the
update in v9 while waiting for more reviews on v8 if there is.


  reply	other threads:[~2022-12-02 17:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-01  0:59 [PATCH v8 0/2] Memory poison recovery in khugepaged collapsing Jiaqi Yan
2022-12-01  0:59 ` [PATCH v8 1/2] mm/khugepaged: recover from poisoned anonymous memory Jiaqi Yan
2022-12-01 23:09   ` Andrew Morton
2022-12-02  2:25     ` Kefeng Wang
2022-12-02 17:29       ` Jiaqi Yan [this message]
2022-12-04  2:57         ` Kefeng Wang
2022-12-01  0:59 ` [PATCH v8 2/2] mm/khugepaged: recover from poisoned file-backed memory Jiaqi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACw3F50wgfxDreGdDC+FPRJVGM1QaLB1qWaOvV++15rNZbgiag@mail.gmail.com \
    --to=jiaqiyan@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linmiaohe@huawei.com \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=shy828301@gmail.com \
    --cc=tongtiangen@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=wangkefeng.wang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox