From: Jue Wang
Date: Tue, 24 May 2022 07:05:37 -0700
Subject: Re: [PATCH v3 1/2] mm: khugepaged: recover from poisoned anonymous memory
To: Oscar Salvador
Cc: Jiaqi Yan, shy828301@gmail.com, tongtiangen@huawei.com, tony.luck@intel.com, naoya.horiguchi@nec.com, kirill.shutemov@linux.intel.com, linmiaohe@huawei.com, linux-mm@kvack.org
References: <20220524025352.1381911-1-jiaqiyan@google.com> <20220524025352.1381911-2-jiaqiyan@google.com>
On Tue, May 24, 2022 at 3:48 AM Oscar Salvador wrote:
>
> On Mon, May 23, 2022 at 07:53:51PM -0700, Jiaqi Yan wrote:
> > Make __collapse_huge_page_copy return whether
> > collapsing/copying anonymous pages succeeded,
> > and make collapse_huge_page handle the return status.
> >
> > Break existing PTE scan loop into two for-loops.
> > The first loop copies source pages into target huge page,
> > and can fail gracefully when running into memory errors in
> > source pages. Roll back the page table and page states
> > when copying failed:
> > 1) re-establish the PTEs-to-PMD connection.
> > 2) release pages back to their LRU list.
>
> If you spell out what the first loop does, just spell out
> what the second loop does as well; it gets easier to follow.
>
> > +static bool __collapse_huge_page_copy(pte_t *pte,
> > +				      struct page *page,
> > +				      pmd_t *pmd,
> > +				      pmd_t rollback,
> > +				      struct vm_area_struct *vma,
> > +				      unsigned long address,
> > +				      spinlock_t *pte_ptl,
> > +				      struct list_head *compound_pagelist)
> >  {
> >  	struct page *src_page, *tmp;
> >  	pte_t *_pte;
> > -	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
> > -	     _pte++, page++, address += PAGE_SIZE) {
> > -		pte_t pteval = *_pte;
> > +	pte_t pteval;
> > +	unsigned long _address;
> > +	spinlock_t *pmd_ptl;
> > +	bool copy_succeeded = true;
> >
> > -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> > +	/*
> > +	 * Copying pages' contents is subject to memory poison at any iteration.
> > +	 */
> > +	for (_pte = pte, _address = address;
> > +	     _pte < pte + HPAGE_PMD_NR;
> > +	     _pte++, page++, _address += PAGE_SIZE) {
> > +		pteval = *_pte;
> > +
> > +		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval)))
> >  			clear_user_highpage(page, address);
> > -		add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
> > -		if (is_zero_pfn(pte_pfn(pteval))) {
> > -			/*
> > -			 * ptl mostly unnecessary.
> > -			 */
> > -			spin_lock(ptl);
> > -			ptep_clear(vma->vm_mm, address, _pte);
> > -			spin_unlock(ptl);
> > +		else {
> > +			src_page = pte_page(pteval);
> > +			if (copy_highpage_mc(page, src_page)) {
> > +				copy_succeeded = false;
> > +				trace_mm_collapse_huge_page_copy(pte_page(*pte),
> > +						src_page, SCAN_COPY_MC);
>
> You seem to assume that if there is an error, it will always happen on
> the page we are copying from. What if the page we are copying to is the
> faulty one? Can that happen? Can that be detected by copy_mc_to_kernel?
> And if so, can that be differentiated?

It's possible that the copy-to page has some uncorrectable memory errors.
Yet only read transactions signal machine check exceptions; write
transactions do not. It's reading from a source page with uncorrectable
errors that causes system panics in most cases (since khugepaged is a
kernel-context thread).

Thanks,
-Jue

>
> --
> Oscar Salvador
> SUSE Labs
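[Editorial note: the two-pass copy/rollback shape the patch gives __collapse_huge_page_copy() can be sketched in plain userspace C. Everything below is illustrative only: copy_page_mc() and the poisoned[] flags are hypothetical stand-ins for copy_highpage_mc()/copy_mc_to_kernel() and real hardware poison, and the "release sources" pass is emulated by zeroing.]

```c
#include <stdbool.h>
#include <string.h>

#define NPAGES  4
#define PAGE_SZ 16

/* Hypothetical stand-in for copy_highpage_mc()/copy_mc_to_kernel():
 * returns 0 on success, non-zero when reading the source "page"
 * hits a simulated uncorrectable memory error. Only the read side
 * faults, matching the MCE behavior described in the thread. */
static int copy_page_mc(char *dst, const char *src, bool src_poisoned)
{
	if (src_poisoned)
		return -1;	/* machine check raised on the read */
	memcpy(dst, src, PAGE_SZ);
	return 0;
}

/* Two-pass collapse copy: pass 1 copies each small page into the
 * huge-page buffer and may fail gracefully; pass 2 ("release" the
 * sources, emulated by zeroing them) runs only if every copy
 * succeeded. On failure the sources are left untouched, mirroring
 * the PTEs-to-PMD and LRU rollback in the patch. */
static bool collapse_copy(char *huge, char src[NPAGES][PAGE_SZ],
			  const bool poisoned[NPAGES])
{
	for (int i = 0; i < NPAGES; i++)
		if (copy_page_mc(huge + (size_t)i * PAGE_SZ, src[i],
				 poisoned[i]))
			return false;	/* caller rolls back PTEs/LRU */
	for (int i = 0; i < NPAGES; i++)
		memset(src[i], 0, PAGE_SZ);
	return true;
}
```

The point of the structure is that no source page is modified until every copy has succeeded, so a machine check mid-copy leaves the original mapping fully recoverable.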