From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 Mar 2024 14:17:02 +0800
From: Baoquan He
To: "Zhijian Li (Fujitsu)"
Cc: "linux-kernel@vger.kernel.org", "Yasunori Gotou (Fujitsu)",
    Alison Schofield, Andrew Morton, Borislav Petkov, Dan Williams,
    Dave Hansen, Dave Jiang, Greg Kroah-Hartman, "hpa@zytor.com",
    Ingo Molnar, Ira Weiny, Thomas Gleixner, Vishal Verma,
    "linux-cxl@vger.kernel.org", "linux-mm@kvack.org",
    "nvdimm@lists.linux.dev", "x86@kernel.org",
    "kexec@lists.infradead.org"
Subject: Re: [RFC PATCH v3 0/7] device backed vmemmap crash dump support
References: <20240306102846.1020868-1-lizhijian@fujitsu.com>
    <92644ab5-6467-484c-b8f3-05cba2164cc1@fujitsu.com>
In-Reply-To: <92644ab5-6467-484c-b8f3-05cba2164cc1@fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 03/21/24 at 05:40am, Zhijian Li (Fujitsu) wrote:
> ping
>
>
> Any comment is welcome.

I will have a look at this from kdump side. How do you test your code?
By the way, there's an issue reported by the test robot.

Thanks
Baoquan

>
>
> On 06/03/2024 18:28, Li Zhijian wrote:
> > Hello folks,
> >
> > Compared with the V2[1] I posted a long time ago, this time it is a
> > completely new proposal design.
> >
> > ### Background and motivation overview ###
> > ---
> > Crash dump is an important feature for troubleshooting the kernel. It is
> > the final way to chase what happened at a kernel panic, slowdown, and so
> > on. It is one of the most important tools for customer support.
> >
> > Currently, there are 2 syscalls (kexec_file_load(2) and kexec_load(2)) to
> > configure the dumpable regions. Generally, (A) iomem resources registered
> > with flags (IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY) for
> > kexec_file_load(2) or (B) iomem resources registered with the
> > "System RAM" name prefix for kexec_load(2) are dumpable.
> >
> > The pmem use cases, including fsdax and devdax, can map their vmemmap to
> > their own devices. In this case, this part of the vmemmap will not be
> > dumped when a crash happens, since these regions satisfy neither (A) nor
> > (B) above.
> >
> > In fsdax, the vmemmap (struct page array) becomes very important: it is
> > one of the key data for finding the status of the reverse map. Lacking
> > this information may make it difficult to analyze trouble around pmem
> > (especially Filesystem-DAX). That means troubleshooters are unable to
> > check more details about pmem from the dumpfile.
> >
> > ### Proposal ###
> > ---
> > In this proposal, register the device backed vmemmap as a separate
> > resource. This resource has its own new flag and name, and
> > kexec_file_load(2) and kexec_load(2) are then taught to mark it as
> > dumpable.
> >
> > Proposed flag: IORESOURCE_DEVICE_BACKED_VMEMMAP
> > Proposed name: "Device Backed Vmemmap"
> >
> > NOTE: crash-utils also needs to adapt to this new name for kexec_load()
> >
> > With the current proposal, /proc/iomem should show the following for
> > device backed vmemmap:
> > # cat /proc/iomem
> > ...
> > fffc0000-ffffffff : Reserved
> > 100000000-13fffffff : Persistent Memory
> >   100000000-10fffffff : namespace0.0
> >     100000000-1005fffff : Device Backed Vmemmap  # fsdax
> > a80000000-b7fffffff : CXL Window 0
> >   a80000000-affffffff : Persistent Memory
> >     a80000000-affffffff : region1
> >       a80000000-a811fffff : namespace1.0
> >         a80000000-a811fffff : Device Backed Vmemmap  # devdax
> >       a81200000-abfffffff : dax1.0
> > b80000000-c7fffffff : CXL Window 1
> > c80000000-147fffffff : PCI Bus 0000:00
> >   c80000000-c801fffff : PCI Bus 0000:01
> > ...
> >
> > ### Kdump service reloading ###
> > ---
> > Once the kdump service is loaded, if changes to CPUs or memory occur,
> > either by hot un/plug or off/onlining, the crash elfcorehdr should also
> > be updated. There are 2 approaches to make the reloading more efficient.
> > 1) Use udev rules to watch CPU and memory events, then reload kdump
> > 2) Enable kernel crash hotplug to automatically reload elfcorehdr (>= 6.5)
> >
> > This reloading is also needed when device backed vmemmap layouts change.
> > Similar to what 1) does now, one could add the following as the first
> > lines to the RHEL udev rule file /usr/lib/udev/rules.d/98-kexec.rules:
> >
> > # namespace updated: watch daxX.Y(devdax) and pfnX.Y(fsdax) of nd
> > SUBSYSTEM=="nd", KERNEL=="[dp][af][xn][0-9].*", ACTION=="bind", GOTO="kdump_reload"
> > SUBSYSTEM=="nd", KERNEL=="[dp][af][xn][0-9].*", ACTION=="unbind", GOTO="kdump_reload"
> > # devdax <-> system-ram updated: watch daxX.Y of dax
> > SUBSYSTEM=="dax", KERNEL=="dax[0-9].*", ACTION=="bind", GOTO="kdump_reload"
> > SUBSYSTEM=="dax", KERNEL=="dax[0-9].*", ACTION=="unbind", GOTO="kdump_reload"
> >
> > Regarding 2), my idea is that it would need to call memory_notify() in
> > devm_memremap_pages_release() and devm_memremap_pages() to trigger the
> > crash hotplug. This part is not yet mature, but it does not affect the
> > whole feature because we can still use method 1) as an alternative.
> >
> > [1] https://lore.kernel.org/lkml/02066f0f-dbc0-0388-4233-8e24b6f8435b@fujitsu.com/T/
> > --------------------------------------------
> > changes from V2[1]
> > - new proposal design
> >
> > CC: Alison Schofield
> > CC: Andrew Morton
> > CC: Baoquan He
> > CC: Borislav Petkov
> > CC: Dan Williams
> > CC: Dave Hansen
> > CC: Dave Jiang
> > CC: Greg Kroah-Hartman
> > CC: "H. Peter Anvin"
> > CC: Ingo Molnar
> > CC: Ira Weiny
> > CC: Thomas Gleixner
> > CC: Vishal Verma
> > CC: linux-cxl@vger.kernel.org
> > CC: linux-mm@kvack.org
> > CC: nvdimm@lists.linux.dev
> > CC: x86@kernel.org
> >
> > Li Zhijian (7):
> >   mm: memremap: register/unregister altmap region to a separate resource
> >   mm: memremap: add pgmap_parent_resource() helper
> >   nvdimm: pmem: assign a parent resource for vmemmap region for the
> >     fsdax
> >   dax: pmem: assign a parent resource for vmemmap region for the devdax
> >   resource: Introduce walk device_backed_vmemmap res() helper
> >   x86/crash: make device backed vmemmap dumpable for kexec_file_load
> >   nvdimm: set force_raw=1 in kdump kernel
> >
> >  arch/x86/kernel/crash.c         |  5 +++++
> >  drivers/dax/pmem.c              |  8 ++++++--
> >  drivers/nvdimm/namespace_devs.c |  3 +++
> >  drivers/nvdimm/pmem.c           |  9 ++++++---
> >  include/linux/ioport.h          |  4 ++++
> >  include/linux/memremap.h        |  4 ++++
> >  kernel/resource.c               | 13 +++++++++++++
> >  mm/memremap.c                   | 30 +++++++++++++++++++++++++++++-
> >  8 files changed, 70 insertions(+), 6 deletions(-)
> >
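
For illustration only (this is not code from the series): the name-based
selection described in (B) of the cover letter, extended with the proposed
"Device Backed Vmemmap" name, could be sketched in userspace roughly as
below. The function name dumpable_ranges(), the regular expression, and the
first "System RAM" line of the sample are assumptions made up for this
example; the remaining sample lines are taken from the /proc/iomem listing
quoted above. How the real userspace tooling walks /proc/iomem may well
differ.

```python
import re

# Names treated as dumpable in this sketch: the existing "System RAM"
# prefix (rule (B)) plus the name proposed in the cover letter.
DUMPABLE_PREFIXES = ("System RAM", "Device Backed Vmemmap")

# One /proc/iomem entry: optional indentation, "start-end : name" in hex.
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def dumpable_ranges(iomem_text):
    """Return (start, end, name) for entries whose name has a dumpable prefix."""
    ranges = []
    for line in iomem_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip "..." and any line that is not a range entry
        start, end, name = match.groups()
        if name.startswith(DUMPABLE_PREFIXES):
            ranges.append((int(start, 16), int(end, 16), name))
    return ranges


# The first line is a made-up System RAM range added for contrast; the rest
# follows the listing quoted in the cover letter.
SAMPLE = """\
00100000-bfffffff : System RAM
fffc0000-ffffffff : Reserved
100000000-13fffffff : Persistent Memory
  100000000-10fffffff : namespace0.0
    100000000-1005fffff : Device Backed Vmemmap
a80000000-b7fffffff : CXL Window 0
  a80000000-affffffff : Persistent Memory
    a80000000-affffffff : region1
      a80000000-a811fffff : namespace1.0
        a80000000-a811fffff : Device Backed Vmemmap
      a81200000-abfffffff : dax1.0
"""

if __name__ == "__main__":
    for start, end, name in dumpable_ranges(SAMPLE):
        print(f"{start:#x}-{end:#x} : {name}")
```

With the sample above, only the System RAM range and the two device backed
vmemmap ranges are selected; the namespace, region and dax entries are
skipped because their names match neither prefix.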