From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 13 Oct 2022 10:57:28 +0800
From: Baoquan He <bhe@redhat.com>
To: Borislav Petkov
Cc: Eric DeVolder, Oscar Salvador, Andrew Morton, david@redhat.com,
	linux-kernel@vger.kernel.org, x86@kernel.org, kexec@lists.infradead.org,
	ebiederm@xmission.com, dyoung@redhat.com, vgoyal@redhat.com,
	tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com,
	hpa@zytor.com, nramas@linux.microsoft.com, thomas.lendacky@amd.com,
	robh@kernel.org, efault@gmx.de, rppt@kernel.org,
	sourabhjain@linux.ibm.com, linux-mm@kvack.org
Subject: Re: [PATCH v12 7/7] x86/crash: Add x86 crash hotplug support
References: <53aed03e-2eed-09b1-9532-fe4e497ea47d@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
On 10/12/22 at 10:41pm, Borislav Petkov wrote:
> On Wed, Oct 12, 2022 at 03:19:19PM -0500, Eric DeVolder wrote:
> > We run here QEMU with the ability for 1024 DIMM slots.
>
> QEMU, haha.
>
> What is the highest count of DIMM slots which are hotpluggable on a
> real, *physical* system today? Are you saying you can have 1K DIMM slots
> on a board?

The concern about the range number is mainly for virtualized guest
systems. On bare metal, basically only very high-end servers support
memory hotplug. I once visited a customer's lab and saw such a server:
it has 8 slots, and each slot takes a box containing about 20 CPUs and
at most 2T of memory at one time. So people won't build in too many
hotpluggable slots, since they are very expensive.
I checked the user-space kexec code: the maximum memory range number was
raised on x86_64 because of an HPE SGI system, and nobody has complained
about it since. Please see the kexec-tools commit below, from
https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git

The memory ranges may not all be made by different DIMM slots; some
could be firmware reservations, e.g. physical memory carved out by
EFI/BIOS, or pieces of the CPU's logical address space occupied by PCI
or other devices. I don't have an HPE SGI system at hand to check.

commit 4a6d67d9e938a7accf128aff23f8ad4bda67f729
Author: Xunlei Pang
Date:   Thu Mar 23 19:16:59 2017 +0800

    x86: Support large number of memory ranges

    We got a problem on one SGI 64TB machine: the current kexec-tools
    failed to work due to the insufficient number of ranges
    (MAX_MEMORY_RANGES) allowed, which is defined as 1024 (less than
    the ranges on the machine). The kcore header is insufficient for
    the same reason as well.

    To solve this, this patch simply doubles "MAX_MEMORY_RANGES" and
    "KCORE_ELF_HEADERS_SIZE".

    Signed-off-by: Xunlei Pang
    Tested-by: Frank Ramsay
    Signed-off-by: Simon Horman

diff --git a/kexec/arch/i386/kexec-x86.h b/kexec/arch/i386/kexec-x86.h
index 33df3524f4e2..51855f8db762 100644
--- a/kexec/arch/i386/kexec-x86.h
+++ b/kexec/arch/i386/kexec-x86.h
@@ -1,7 +1,7 @@
 #ifndef KEXEC_X86_H
 #define KEXEC_X86_H

-#define MAX_MEMORY_RANGES 1024
+#define MAX_MEMORY_RANGES 2048

> I hardly doubt that.

The questioning is reasonable; 32K truly looks like too much. Currently
CONFIG_NR_CPUS has a maximum of 8192, and user-space kexec-tools has a
maximum memory range number of 2048. We can conservatively take
8192 + 2048 = 10K as the default value, or 8192 + 2048 * 2 = 12K, which
allows twice the maximum memory range number in kexec-tools. What do
you think?

> > So, for example, 1TiB requires 1024 DIMMs of 1GiB each with 128MiB
> > memblocks, that results in 8K possible memory regions. So just going
> > to 4TiB reaches 32K memory regions.
> Lemme see if I understand this correctly: when a system like that
> crashes, you want to kdump *all* those 4TiB in a vmcore? How long would
> that dump take to complete? A day?

That is not a problem. The time a vmcore dump takes mainly depends on
the actual memory size, not on the number of memory ranges. When dumping
a vmcore, people use makedumpfile to filter out zero pages, free pages,
cache pages, or user data pages according to the configuration. If
memory is huge, they can use nr_cpus=x on the crash kernel command line
to bring up multiple CPUs for multi-threaded dumping. Kdump now supports
dumping vmcores of more than 10TB.
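To make the filtering concrete, a typical capture step in a kdump
environment looks roughly like this (the paths and thread count are
examples, not taken from this thread; see makedumpfile(8) for the
flags):

```shell
# Dump level 31 = 1+2+4+8+16: exclude zero pages, non-private cache
# pages, private cache pages, user data pages, and free pages.
# -l compresses the remaining pages with LZO; --num-threads spreads the
# filtering/compression work across the CPUs the crash kernel brought
# up via its nr_cpus= parameter.
makedumpfile -l -d 31 --num-threads 4 /proc/vmcore /var/crash/vmcore
```

This is why dump time tracks the amount of memory actually written out
rather than the count of memory ranges in the elfcorehdr.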
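Going back to the range-count arithmetic earlier in the thread, here is
a back-of-envelope sketch (the variable names are illustrative, not
actual kernel symbols): the proposed default combines the maximum
CONFIG_NR_CPUS with twice kexec-tools' MAX_MEMORY_RANGES, and each
entry costs one 56-byte Elf64_Phdr in the preallocated elfcorehdr, on
top of the 64-byte Elf64_Ehdr.

```shell
NR_CPUS_MAX=8192              # current CONFIG_NR_CPUS maximum
KEXEC_MAX_MEMORY_RANGES=2048  # kexec-tools MAX_MEMORY_RANGES since 2017
DEFAULT_RANGES=$((NR_CPUS_MAX + 2 * KEXEC_MAX_MEMORY_RANGES))
echo "$DEFAULT_RANGES"                # 12288, the "12K" option
# Worst-case elfcorehdr size: one Elf64_Ehdr (64 bytes) plus one
# Elf64_Phdr (56 bytes) per entry.
echo $((64 + DEFAULT_RANGES * 56))    # 688192 bytes, i.e. ~672 KiB
```

So even the generous 12K figure keeps the preallocated elfcorehdr well
under a megabyte.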