Date: Wed, 27 Oct 2021 08:27:36 +0900
From: Naoya Horiguchi
To: David Hildenbrand
Cc: linux-mm@kvack.org, Andrew Morton, Alistair Popple, Peter Xu,
 Mike Kravetz, Konstantin Khlebnikov, Bin Wang, Yang Shi,
 Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm, pagemap: expose hwpoison entry
Message-ID: <20211026232736.GA2704541@u2004>
References: <20211004115001.1544259-1-naoya.horiguchi@linux.dev>
 <258d0ddb-6c82-0c95-a15e-b085b59d2142@redhat.com>
 <20211004143228.GA1545442@u2004>
In-Reply-To: <20211004143228.GA1545442@u2004>

On Mon, Oct 04, 2021 at 11:32:28PM +0900, Naoya Horiguchi wrote:
> On Mon, Oct 04, 2021 at 01:55:30PM +0200, David Hildenbrand wrote:
> > On 04.10.21 13:50, Naoya Horiguchi wrote:
...
> > >
> > > Hwpoison entry for hugepage is also exposed by this patch.  The below
> > > example shows how pagemap is visible in the case where a memory error
> > > hit a hugepage mapped to a process.
> > >
> > >     $ ./page-types --no-summary --pid $PID --raw --list --addr 0x700000000+0x400
> > >     voffset         offset  len     flags
> > >     700000000       12fa00  1       ___U_______Ma__H_G_________________f_______1
> > >     700000001       12fa01  1ff     ___________Ma___TG_________________f_______1
> > >     700000200       12f800  1       __________B________X_______________f______w_
> > >     700000201       12f801  1       ___________________X_______________f______w_   // memory failure hit this page
> > >     700000202       12f802  1fe     __________B________X_______________f______w_
> > >
> > > The entries with both of "X" flag (hwpoison flag) and "w" flag (swap
> > > flag) are considered as hwpoison entries.  So all pages in 2MB range
> > > are inaccessible from the process.  We can get actual error location
> > > by page-types in physical address mode.
> > >
> > >     $ ./page-types --no-summary --addr 0x12f800+0x200 --raw --list
> > >     offset  len     flags
> > >     12f800  1       __________B_________________________________
> > >     12f801  1       ___________________X________________________
> > >     12f802  1fe     __________B_________________________________
> > >
> > > Signed-off-by: Naoya Horiguchi
> > > ---
> > >  fs/proc/task_mmu.c      | 41 ++++++++++++++++++++++++++++++++---------
> > >  include/linux/swapops.h | 13 +++++++++++++
> > >  tools/vm/page-types.c   |  7 ++++++-
> > >  3 files changed, 51 insertions(+), 10 deletions(-)
> >
> >
> > Please also update the documentation located at
> >
> > Documentation/admin-guide/mm/pagemap.rst
>
> I will do this in the next post.

Reading the document, I found that the swap type is already exported, so
we could identify a hwpoison entry with it (without a new PM_HWPOISON bit).
One problem is that the numbering of swap types (like SWP_HWPOISON)
depends on a few config macros like CONFIG_DEVICE_PRIVATE and
CONFIG_MIGRATION, so we also need to export how the swap type field is
interpreted.
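
To illustrate the config dependency, here is a rough sketch of the
arithmetic.  It assumes the special swap types are packed at the top of
the 5-bit type space in the order hwpoison, migration, device, which is
how include/linux/swap.h appears to lay them out around v5.15.  The
function name and dictionary keys are made up for illustration and are
not kernel identifiers.

```python
MAX_SWAPFILES_SHIFT = 5  # assumed width of the swap type field, in bits


def swap_type_layout(device_private=True, migration=True, memory_failure=True):
    """Return {name: swap type number} for the special (non-swapfile) types.

    The three arguments stand in for CONFIG_DEVICE_PRIVATE,
    CONFIG_MIGRATION and CONFIG_MEMORY_FAILURE.
    """
    hwpoison_num = 1 if memory_failure else 0
    migration_num = 2 if migration else 0
    device_num = 4 if device_private else 0

    # Real swap devices use types 0 .. max_swapfiles - 1; the special
    # types take whatever is left at the top of the range.
    max_swapfiles = ((1 << MAX_SWAPFILES_SHIFT)
                     - device_num - migration_num - hwpoison_num)

    layout = {}
    next_type = max_swapfiles
    if memory_failure:
        layout["hwpoison"] = next_type
    next_type += hwpoison_num
    if migration:
        layout["migration_read"] = next_type
        layout["migration_write"] = next_type + 1
    next_type += migration_num
    if device_private:
        device_names = ("device_write", "device_read",
                        "device_exclusive_write", "device_exclusive_read")
        for i, name in enumerate(device_names):
            layout[name] = next_type + i
    return layout


print(swap_type_layout()["hwpoison"])                      # 25
print(swap_type_layout(device_private=False)["hwpoison"])  # 29
```

Under these assumptions the hwpoison type moves from 25 to 29 when
CONFIG_DEVICE_PRIVATE is turned off, so a userspace tool cannot hardcode
the number.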

I thought of adding new interfaces, for example under
/sys/kernel/mm/swap/type_format/, which show info like below (assuming
that all of CONFIG_{DEVICE_PRIVATE,MIGRATION,MEMORY_FAILURE} are enabled):

  $ ls /sys/kernel/mm/swap/type_format/
  hwpoison
  migration_read
  migration_write
  device_write
  device_read
  device_exclusive_write
  device_exclusive_read
  $ cat /sys/kernel/mm/swap/type_format/hwpoison
  25
  $ cat /sys/kernel/mm/swap/type_format/device_write
  28

Does this make sense, or is there a better approach?

Thanks,
Naoya Horiguchi
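
A minimal sketch of how a userspace tool might use such an interface
follows.  It assumes the pagemap entry layout described in
Documentation/admin-guide/mm/pagemap.rst (bits 0-4 swap type and bits
5-54 swap offset when bit 62 "swapped" is set).  The type_format files
are only a proposal, so the path passed to read_swap_type() is
hypothetical, and the helper names and the example entry are made up for
illustration.

```python
from pathlib import Path

PM_SWAP = 1 << 62                  # "page swapped" bit of a pagemap entry
SWAP_TYPE_BITS = 5                 # bits 0-4: swap type (if swapped)
SWAP_TYPE_MASK = (1 << SWAP_TYPE_BITS) - 1
SWAP_OFFSET_MASK = (1 << 50) - 1   # bits 5-54: swap offset (if swapped)


def read_swap_type(path):
    """Read one swap type number from a file such as the proposed
    /sys/kernel/mm/swap/type_format/hwpoison (hypothetical path)."""
    return int(Path(path).read_text().strip())


def is_hwpoison_entry(entry, hwpoison_type):
    """True if the pagemap entry is a swap entry of the hwpoison type."""
    if not entry & PM_SWAP:
        return False
    return (entry & SWAP_TYPE_MASK) == hwpoison_type


def swap_offset(entry):
    """Extract the swap offset field from a swapped pagemap entry."""
    return (entry >> SWAP_TYPE_BITS) & SWAP_OFFSET_MASK


# Made-up entry: swapped bit set, swap type 25, swap offset 0x12f801.
example = PM_SWAP | (0x12f801 << SWAP_TYPE_BITS) | 25
print(is_hwpoison_entry(example, 25))   # True
print(hex(swap_offset(example)))        # 0x12f801
```

The hwpoison type is passed in as a parameter rather than hardcoded, so
the same check works whatever value the running kernel reports.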