From: Jiaqi Yan
Date: Thu, 10 Nov 2022 12:25:03 -0800
Subject: Re: [RFC] Kernel Support of Memory Error Detection.
To: "Luck, Tony", HORIGUCHI NAOYA(堀口 直也)
Cc: "dave.hansen@linux.intel.com", "david@redhat.com", "Aktas, Erdem",
    "pgonda@google.com", "rientjes@google.com", "Hsiao, Duen-wen",
    "Vilas.Sridharan@amd.com", "Malvestuto, Mike", "gthelen@google.com",
    "linux-mm@kvack.org", "jthoughton@google.com", "Ghannam, Yazen",
    Sean Christopherson
References: <20221103155029.2451105-1-jiaqiyan@google.com>
    <20221109052908.GB527418@hori.linux.bs1.fc.nec.co.jp>

On Wed, Nov 9, 2022 at 8:16 AM Luck, Tony wrote:
>
> > I think that another viewpoint of how we prioritize memory type to scan
> > is kernel vs userspace memory. Current hwpoison mechanism does little to
> > recover from errors in kernel pages (slab, reserved), so there seems
> > little benefit to detect such errors proactively and beforehand. If the
> > resource for scanning is limited, the user might think of focusing on
> > scanning userspace memory.
>
> Page cache is (in some many use cases) a large user of kernel memory, and there
> would be options for recovery if errors were pre-emptively found: clean page ->
> re-read from storage, modified page -> mark in some way to force EIO for read()
> and fail(?) mmap().
>
> -Tony

Bringing the page cache into the discussion, I would like to separate the
memory scanner from mm's recovery mechanism.

We want to build a memory-type-agnostic in-kernel scanner that safely
detects memory errors in physical memory (e.g. on Intel x86, all usable
physical pages in the e820 map), ideally without needing to know the
"memory type" (owned by user vs kernel? free vs allocated? page cache
dirty vs clean? owned by virtualization guest vs host?).

After the scanner detects that a PFN has a memory error, it reports the
PFN to the memory-failure module, which classifies the type of the page
and takes recovery actions accordingly. (For example, page cache pages
will be handled by me_pagecache_dirty/clean; I believe that's basically
what Tony described.) So the proactive scanner should always improve the
kernel's memory reliability, both by recovering more error pages and by
recovering them proactively (not waiting until someone accesses them).

That being said, prioritizing the scanning of a certain type of memory is
then hard (if not impossible).
That is because the in-kernel background thread design treats all memory
as the same type, physical memory, to keep things simple.

The alternative is to assume there is a caller that drives the scanner.
This caller can be in either userspace or kernel space (our RFC chooses
userspace). The caller can then prioritize, or scan only, a certain type
of memory, but it has to secure the memory regions before passing them to
the scanner. The "How to Scan" section in the RFC has more details.

Please do share your opinion/preference for the two designs.
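
To make the difference between the two designs a bit more concrete, here
is a rough userspace C sketch. It is not kernel code and is not taken from
the RFC. Every name in it (sim_page, pfn_range, handle_failure,
scan_range, scan_ranges, and the PT_*/RC_* values) is made up for
illustration, and the "poisoned" flag only stands in for an error the
hardware would report. The clean/dirty page cache actions follow what Tony
described above; the actions for free and user pages are my assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page types. Only the "memory-failure" side looks at them. */
enum page_type {
    PT_FREE,
    PT_USER,
    PT_PAGECACHE_CLEAN,
    PT_PAGECACHE_DIRTY,
    PT_KERNEL,            /* slab, reserved, ... */
};

/* Hypothetical recovery actions, for illustration only. */
enum recovery {
    RC_NOT_REPORTED = 0,  /* the scanner never reported this PFN */
    RC_ISOLATE,           /* assumed: take a free page out of use */
    RC_UNMAP,             /* assumed: unmap from the owning process */
    RC_REREAD,            /* clean page cache: re-read from storage */
    RC_FORCE_EIO,         /* dirty page cache: later reads get EIO */
    RC_IGNORED,           /* kernel page: little can be done today */
};

struct sim_page {
    enum page_type type;
    int poisoned;         /* stands in for a hardware-detected error */
};

struct pfn_range {
    size_t start;         /* inclusive */
    size_t end;           /* exclusive */
};

/* "memory-failure" side: classify the page and pick an action. */
enum recovery handle_failure(const struct sim_page *page)
{
    switch (page->type) {
    case PT_FREE:            return RC_ISOLATE;
    case PT_USER:            return RC_UNMAP;
    case PT_PAGECACHE_CLEAN: return RC_REREAD;
    case PT_PAGECACHE_DIRTY: return RC_FORCE_EIO;
    case PT_KERNEL:          return RC_IGNORED;
    }
    return RC_IGNORED;
}

/*
 * Design 1: type-agnostic scan of [start_pfn, end_pfn). The scanner never
 * reads page->type; it only reports bad PFNs to handle_failure().
 * Returns the number of PFNs reported.
 */
size_t scan_range(const struct sim_page *mem, size_t start_pfn,
                  size_t end_pfn, enum recovery *actions)
{
    size_t found = 0;

    assert(start_pfn <= end_pfn);
    for (size_t pfn = start_pfn; pfn < end_pfn; pfn++) {
        if (!mem[pfn].poisoned)
            continue;
        actions[pfn] = handle_failure(&mem[pfn]);
        found++;
    }
    return found;
}

/*
 * Design 2: caller-driven scan. The caller decides which ranges to scan
 * and in what order (its priority); the scanner itself stays type-agnostic.
 */
size_t scan_ranges(const struct sim_page *mem, const struct pfn_range *ranges,
                   size_t nr_ranges, enum recovery *actions)
{
    size_t found = 0;

    for (size_t i = 0; i < nr_ranges; i++)
        found += scan_range(mem, ranges[i].start, ranges[i].end, actions);
    return found;
}
```

In this sketch a background thread would simply call scan_range() over all
of memory, while a caller that only cares about, say, page cache would
build its own pfn_range list and call scan_ranges(). Securing those ranges
before handing them over is the caller's job and is not modeled here.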