From: Jiaqi Yan <jiaqiyan@google.com>
Date: Wed, 23 Mar 2022 23:29:29 +0000
Subject: [RFC v1 2/2] mm: khugepaged: recover from poisoned file-backed memory
Message-Id: <20220323232929.3035443-3-jiaqiyan@google.com>
In-Reply-To: <20220323232929.3035443-1-jiaqiyan@google.com>
References: <20220323232929.3035443-1-jiaqiyan@google.com>
Mime-Version: 1.0
X-Mailer: git-send-email 2.35.1.894.gb6a874cedc-goog
To: shy828301@gmail.com
Cc: tony.luck@intel.com, naoya.horiguchi@nec.com,
	kirill.shutemov@linux.intel.com, linmiaohe@huawei.com, juew@google.com,
	jiaqiyan@google.com, linux-mm@kvack.org
Content-Type: text/plain; charset="UTF-8"

Make collapse_file() roll back when copying pages fails. More concretely:
* extract the copy operations into a separate loop
* postpone the updates for nr_none until both scan and copy succeeded
* postpone joining small xarray entries until both scan and copy succeeded
* for the update operations to NR_XXX_THPS:
  * for a SHMEM file, postpone until both scan and copy succeeded
  * for other files, roll back if scan succeeded but copy failed

Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
---
 include/linux/highmem.h | 18 ++++++++++
 mm/khugepaged.c         | 75 +++++++++++++++++++++++++++--------------
 2 files changed, 67 insertions(+), 26 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 15d0aa4d349c..fc5aa221bdb5 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -315,6 +315,24 @@ static inline void copy_highpage(struct page *to, struct page *from)
 	kunmap_local(vfrom);
 }
 
+/*
+ * Machine check exception handled version of copy_highpage.
+ * Return true if copying page content failed; otherwise false.
+ */
+static inline bool copy_highpage_mc(struct page *to, struct page *from)
+{
+	char *vfrom, *vto;
+	unsigned long ret;
+
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
+	ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
+
+	return ret > 0;
+}
+
 #endif
 
 static inline void memcpy_page(struct page *dst_page, size_t dst_off,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 84ed177f56ff..ed2b1cd4bbc6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1708,12 +1708,13 @@ static void collapse_file(struct mm_struct *mm,
 {
 	struct address_space *mapping = file->f_mapping;
 	gfp_t gfp;
-	struct page *new_page;
+	struct page *new_page, *page, *tmp;
 	pgoff_t index, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
 	int nr_none = 0, result = SCAN_SUCCEED;
 	bool is_shmem = shmem_file(file);
+	bool copy_failed = false;
 	int nr;
 
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
@@ -1936,9 +1937,7 @@ static void collapse_file(struct mm_struct *mm,
 	}
 	nr = thp_nr_pages(new_page);
 
-	if (is_shmem)
-		__mod_lruvec_page_state(new_page, NR_SHMEM_THPS, nr);
-	else {
+	if (!is_shmem) {
 		__mod_lruvec_page_state(new_page, NR_FILE_THPS, nr);
 		filemap_nr_thps_inc(mapping);
 		/*
@@ -1956,34 +1955,39 @@ static void collapse_file(struct mm_struct *mm,
 		}
 	}
 
-	if (nr_none) {
-		__mod_lruvec_page_state(new_page, NR_FILE_PAGES, nr_none);
-		if (is_shmem)
-			__mod_lruvec_page_state(new_page, NR_SHMEM, nr_none);
-	}
-
-	/* Join all the small entries into a single multi-index entry */
-	xas_set_order(&xas, start, HPAGE_PMD_ORDER);
-	xas_store(&xas, new_page);
 xa_locked:
 	xas_unlock_irq(&xas);
 xa_unlocked:
 
 	if (result == SCAN_SUCCEED) {
-		struct page *page, *tmp;
-
 		/*
 		 * Replacing old pages with new one has succeeded, now we
-		 * need to copy the content and free the old pages.
+		 * attempt to copy the contents.
 		 */
 		index = start;
-		list_for_each_entry_safe(page, tmp, &pagelist, lru) {
+		list_for_each_entry(page, &pagelist, lru) {
 			while (index < page->index) {
 				clear_highpage(new_page + (index % HPAGE_PMD_NR));
 				index++;
 			}
-			copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
-				      page);
+			if (copy_highpage_mc(new_page + (page->index % HPAGE_PMD_NR), page)) {
+				copy_failed = true;
+				break;
+			}
+			index++;
+		}
+		while (!copy_failed && index < end) {
+			clear_highpage(new_page + (index % HPAGE_PMD_NR));
+			index++;
+		}
+	}
+
+	if (result == SCAN_SUCCEED && !copy_failed) {
+		/*
+		 * Copying old pages to the huge one has succeeded, now we
+		 * need to free the old pages.
+		 */
+		list_for_each_entry_safe(page, tmp, &pagelist, lru) {
 			list_del(&page->lru);
 			page->mapping = NULL;
 			page_ref_unfreeze(page, 1);
@@ -1991,12 +1995,20 @@ static void collapse_file(struct mm_struct *mm,
 			ClearPageUnevictable(page);
 			unlock_page(page);
 			put_page(page);
-			index++;
 		}
-		while (index < end) {
-			clear_highpage(new_page + (index % HPAGE_PMD_NR));
-			index++;
+
+		xas_lock_irq(&xas);
+		if (is_shmem)
+			__mod_lruvec_page_state(new_page, NR_SHMEM_THPS, nr);
+		if (nr_none) {
+			__mod_lruvec_page_state(new_page, NR_FILE_PAGES, nr_none);
+			if (is_shmem)
+				__mod_lruvec_page_state(new_page, NR_SHMEM, nr_none);
 		}
+		/* Join all the small entries into a single multi-index entry. */
+		xas_set_order(&xas, start, HPAGE_PMD_ORDER);
+		xas_store(&xas, new_page);
+		xas_unlock_irq(&xas);
 
 		SetPageUptodate(new_page);
 		page_ref_add(new_page, HPAGE_PMD_NR - 1);
@@ -2012,9 +2024,11 @@ static void collapse_file(struct mm_struct *mm,
 
 		khugepaged_pages_collapsed++;
 	} else {
-		struct page *page;
-
-		/* Something went wrong: roll back page cache changes */
+		/*
+		 * Something went wrong: either result != SCAN_SUCCEED
+		 * or copy_failed is set.
+		 * Roll back the page cache changes.
+		 */
 		xas_lock_irq(&xas);
 		mapping->nrpages -= nr_none;
 
@@ -2047,6 +2061,15 @@ static void collapse_file(struct mm_struct *mm,
 			xas_lock_irq(&xas);
 		}
 		VM_BUG_ON(nr_none);
+		/*
+		 * Undo the NR_FILE_THPS and filemap_nr_thps updates done
+		 * earlier for non-SHMEM files; SHMEM THP counters are not
+		 * updated yet. These undos are not needed if result is not
+		 * SCAN_SUCCEED.
+		 */
+		if (!is_shmem && result == SCAN_SUCCEED) {
+			__mod_lruvec_page_state(new_page, NR_FILE_THPS, -nr);
+			filemap_nr_thps_dec(mapping);
+		}
 		xas_unlock_irq(&xas);
 
 		new_page->mapping = NULL;
-- 
2.35.1.894.gb6a874cedc-goog
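
[Editorial illustration, not part of the patch.] The new copy_highpage_mc() helper relies on the copy_mc_to_kernel() convention that the return value is the number of bytes it could not copy, so any non-zero return means the copy was aborted (for example by a machine check on a poisoned source page) and the destination contents must not be trusted. The sketch below is a minimal userspace model of the two-phase scheme the patch adopts in collapse_file(): copy everything first, and only free the old pages and update counters once every copy succeeded. fallible_copy() and the failure-injection flag are hypothetical stand-ins, not kernel APIs.

/*
 * Minimal userspace model of the rollback scheme above: copy every
 * source "page" into the destination buffer first, and only commit
 * (free sources, bump counters) if no copy failed.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NR_PAGES  8

/* Stand-in for copy_mc_to_kernel(): returns bytes NOT copied (0 on success). */
static unsigned long fallible_copy(void *dst, const void *src, size_t len,
				   bool inject_fail)
{
	if (inject_fail)
		return len;	/* pretend a machine check aborted the copy */
	memcpy(dst, src, len);
	return 0;
}

int main(void)
{
	static char src[NR_PAGES][PAGE_SIZE];
	static char huge[NR_PAGES * PAGE_SIZE];
	bool copy_failed = false;
	int i;

	for (i = 0; i < NR_PAGES; i++)
		memset(src[i], 'a' + i, PAGE_SIZE);

	/* Phase 1: copy only; the source pages are left untouched. */
	for (i = 0; i < NR_PAGES; i++) {
		/* Set the last argument to true to simulate poison in page i. */
		if (fallible_copy(huge + i * PAGE_SIZE, src[i], PAGE_SIZE, false)) {
			copy_failed = true;
			break;
		}
	}

	/* Phase 2: commit or roll back, now that the overall outcome is known. */
	if (!copy_failed)
		printf("copy succeeded: safe to free old pages and update counters\n");
	else
		printf("copy failed at page %d: old pages stay in place\n", i);

	return copy_failed ? 1 : 0;
}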