From: Jiaqi Yan
Date: Mon, 7 Nov 2022 18:24:04 -0800
Subject: Re: [RFC] Kernel Support of Memory Error Detection.
To: Nadav Amit
Cc: "Luck, Tony", "naoya.horiguchi@nec.com", "dave.hansen@linux.intel.com",
 David Hildenbrand, "Aktas, Erdem", "pgonda@google.com",
 "rientjes@google.com", "Hsiao, Duen-wen", "Vilas.Sridharan@amd.com",
 "Malvestuto, Mike", "gthelen@google.com", "linux-mm@kvack.org",
 "jthoughton@google.com"
In-Reply-To: <7E670362-C29E-4626-B546-26530D54F937@gmail.com>
References: <20221103155029.2451105-1-jiaqiyan@google.com>
 <7E670362-C29E-4626-B546-26530D54F937@gmail.com>

On Thu, Nov 3, 2022 at 9:40 AM Nadav Amit wrote:
>
> On Nov 3, 2022, at 9:27 AM, Luck, Tony wrote:
>
> >> - HPS usually doesn’t consume CPU cores but does consume memory
> >> controller cycles and memory bandwidth. SW consumes both CPU cycles
> >> and memory bandwidth, but is only a problem if administrators opt into
> >> the scanning after weighing the cost benefit.
> >
> > Maybe there is a middle ground on platforms that support some s/w programmable
> > DMA engine that can detect memory errors in a way that doesn't signal a
> > fatal system error. Your s/w scanner can direct that DMA engine to read from
> > the regions of memory that you want to scan, at a frequency that is compatible
> > with your system load requirements and risk assessments.
> >
> > If your idea gets traction, maybe structure the code so that it can either use
> > a CPU core to scan a block of memory, or pass requests to a platform driver that can
> > use a DMA engine to perform the scan.
>
> That’s exactly what I was about to write. :)
>
> Quickassist can be perfect for that. The IOMMU can be programmed to make the
> memory uncachable.
>

Agreed. The kernel code will hide the part that does the actual memory
scanning behind an internal "API", so that we can plug in different
scanners, e.g. a CPU core or a DMA device. If hardware vendors make the
patrol scrubber programmable in the future, we can even direct the
scanning to the patrol scrubber.
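
To make the pluggable-scanner idea a bit more concrete, here is a rough
userspace sketch of what such an internal "API" might look like: a small
ops table that the scan loop calls through, with one implementation per
scanning mechanism. This is only an illustration, not the proposed patch.
Every name in it (mem_scanner_ops, scan_region, cpu_scanner, dma_scanner,
SCAN_ERR_UNAVAILABLE) is hypothetical, the DMA backend is a stub, and the
CPU backend only performs plain loads, whereas a real kernel version would
need a machine-check-safe read to survive hitting a poisoned line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error code; a kernel version would return a real errno. */
#define SCAN_ERR_UNAVAILABLE (-1)

/* Hypothetical backend interface: one instance per scanning mechanism. */
struct mem_scanner_ops {
	const char *name;
	/*
	 * Read every byte in [start, start + len). Returns 0 on success or
	 * a negative value if this backend cannot perform the scan.
	 */
	int (*scan_range)(const void *start, size_t len);
};

/* CPU backend: a plain load of every byte in the range. */
static int cpu_scan_range(const void *start, size_t len)
{
	const volatile unsigned char *p = start;
	size_t i;

	for (i = 0; i < len; i++)
		(void)p[i];	/* volatile access forces an actual load */
	return 0;
}

/* DMA backend: stub only, since this sketch has no DMA engine to drive. */
static int dma_scan_range(const void *start, size_t len)
{
	(void)start;
	(void)len;
	return SCAN_ERR_UNAVAILABLE;
}

static const struct mem_scanner_ops cpu_scanner = { "cpu", cpu_scan_range };
static const struct mem_scanner_ops dma_scanner = { "dma", dma_scan_range };

/*
 * Scan [start, start + len) in pieces of at most `chunk` bytes using the
 * given backend. Returns the number of chunks scanned, or the backend's
 * negative error value if it fails.
 */
static long scan_region(const struct mem_scanner_ops *ops,
			const void *start, size_t len, size_t chunk)
{
	const unsigned char *p = start;
	long chunks = 0;
	size_t off;

	assert(ops && ops->scan_range && chunk > 0);

	for (off = 0; off < len; off += chunk) {
		size_t n = len - off < chunk ? len - off : chunk;
		int ret = ops->scan_range(p + off, n);

		if (ret < 0)
			return ret;
		chunks++;
		/* A real scanner could yield or rate-limit between chunks. */
	}
	return chunks;
}
```

A caller would pick a backend and invoke, for example,
scan_region(&cpu_scanner, buf, len, 4096). Keeping the chunk loop outside
the backends is one possible way to let the same pacing policy apply to
the CPU, a DMA engine, or a programmable patrol scrubber.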