To: Baoquan He
Cc: Andrew Morton, andreyknvl@google.com, christian.brauner@ubuntu.com,
 colin.king@canonical.com, corbet@lwn.net, dyoung@redhat.com,
 frederic@kernel.org, gpiccoli@canonical.com, john.p.donnelly@oracle.com,
 jpoimboe@redhat.com, keescook@chromium.org, linux-mm@kvack.org,
 masahiroy@kernel.org, mchehab+huawei@kernel.org, mike.kravetz@oracle.com,
 mingo@kernel.org, mm-commits@vger.kernel.org, paulmck@kernel.org,
 peterz@infradead.org, rdunlap@infradead.org, rostedt@goodmis.org,
 rppt@kernel.org, saeed.mirzamohammadi@oracle.com, samitolvanen@google.com,
 sboyd@kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org,
 vgoyal@redhat.com, yifeifz2@illinois.edu
References: <20210507010432.IN24PudKT%akpm@linux-foundation.org>
 <889c6b90-7335-71ce-c955-3596e6ac7c5a@redhat.com>
 <20210508085133.GA2946@localhost.localdomain>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
Message-ID: <2d0f53d9-51ca-da57-95a3-583dc81f35ef@redhat.com>
Date: Sat, 8 May 2021 11:22:18 +0200
In-Reply-To: <20210508085133.GA2946@localhost.localdomain>

>> Let me take a look .... oh, there it is from 2009
>>
>> https://marc.info/?t=125006512600002&r=1&w=2
>>
>> and then we had it in 2018
>>
>> https://lkml.org/lkml/2018/5/20/262
>
> Thanks for digging these two out, otherwise I may need do for people to
> know the history better.

Sure, I stumbled over this myself recently when wondering about what
fadump is.

>> The issue I have with this: it's just plain wrong when you take memory
>> hotplug into serious account as we see it quite heavily in VMs. You don't
>> know what you'll need when building a kernel. Just pass it via the cmdline
>
> Hmm, kdump may have no issue with memory hotplug in crashkernel
> reservation aspect. The system RAM size is not correlated to
> crashkernel size directly, that's why the default value in this patch is

"Not correlated directly" ...

"1G-64G:128M,64G-1T:256M,1T-:512M"

Am I still asleep and dreaming? :)

> not linear related to system RAM size. The proportion of crashkernel
> size to the total RAM size is thing we take into account. Usually
> crashkernel 160M is enough on most of systems. If system RAM size is
> larger, extra memory can be added just in case, and not bring much
> impact to system.

So, all the rules we have are essentially broken because they rely
completely on the system RAM during boot.
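
To make the point about the quoted default concrete, here is a minimal sketch (not taken from the patch) of how a range-based value like the one above maps system RAM to a reservation. It assumes the documented "start-[end]:size" syntax with an exclusive upper bound and ignores the optional "@offset" part; the names parse_size and reservation_for are made up for this illustration.

```python
import re

# Binary size suffixes; only the ones needed for this illustration.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
MIB = 1 << 20
GIB = 1 << 30


def parse_size(text):
    """Convert a size such as '128M' or '1T' into bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if not match:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def reservation_for(spec, system_ram):
    """Return the reservation in bytes that a 'start-[end]:size,...' spec
    selects for system_ram bytes, or 0 if no range matches."""
    for entry in spec.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else float("inf")  # open-ended last range
        if low <= system_ram < high:
            return parse_size(size)
    return 0


SPEC = "1G-64G:128M,64G-1T:256M,1T-:512M"

if __name__ == "__main__":
    for ram in (4 * GIB, 128 * GIB, 2048 * GIB):
        print(ram // GIB, "GiB ->", reservation_for(SPEC, ram) // MIB, "MiB")
```

The result is a step function of the RAM size passed in. Since that size is the RAM seen during boot, a VM that boots with 4 GiB and later hotplugs up to 128 GiB keeps the reservation chosen for 4 GiB.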

>
> With our investigation, PCIe devices impact the crashkernel size, and
> cpu number. There are always pci devices which driver require tens of KB
> memory, even MB. E.g in below patch, my colleague Coiby found out the
> i40e network card even cost 1.5G memory to initialize its ringbuffer on
> ppc, and 85M on x86_64.
>
> [PATCH v1 0/3] Reducing memory usage of i40e for kdump
> http://lists.infradead.org/pipermail/kexec/2021-March/022117.html
>
> Even though not all pci devices need surprisingly large memory like
> i40e, system with hundreds of pci devices can also cost more memory than
> expected. This kind of system usually is high end server, specified
> crashkernel value need be set manually.
>
> So system RAM size is the least important part to influence crashkernel

Aehm, not with fadump, no?

> costing. Say my x1 laptop, even though I extended the RAM to 100TB, 160M
> crashkernel is still enough. Just we would like to get a tiny extra part
> to add to crashkernel if the total RAM is very large, that's the rule
> for crashkernel=auto. As for VMs, given their very few devices, virtio
> disk, NAT nic, etc, no matter how much memory is deployed and hot
> added/removed, crashkernel size won't be influenced very much. My
> personal understanding about it.

That's an interesting observation. But you're telling me that we end up
wasting memory for the crashkernel because "crashkernel=auto", which is
supposed to do something magically good automatically, does something very
suboptimal? Oh my ... this is broken.

Long story short: crashkernel=auto is pure ugliness.

Why can't we construct a crashkernel value in user space when
installing/activating kdump, and require a reboot for kdump to become
active as long as that crashkernel setting is not properly respected?

Just have a look at the system properties (is_qemu(), #PCI, ...) and
propose a value for "crashkernel=". Check that at least that value is
active when activating kdump.
Otherwise don't enable kdump and fail.

Yes, it can be difficult with some newer/older kernels having some
different demands, but things should change drastically, and a distro
can always update its advice along with the kernel, no?

You could even have a kernel interface that gives you the current
crashkernel size (maybe already there) vs. the recommended crashkernel
size. Make kdump or *whoever* activate that in the cmdline and let kdump
check if both values are satisfied when booting up.

Also: this approach here doesn't make any sense when you want to do
something dependent on other cmdline parameters. Take "fadump=on" vs
"fadump=off" as an example. You just cannot handle it properly as
proposed in this patch. To me the approach in this patch makes the least
sense TBH.

-- 
Thanks,

David / dhildenb
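
A rough sketch of the user-space check proposed above, under loud assumptions: the sizing policy (160M base plus a per-PCI-device allowance on bare metal) uses placeholder numbers chosen only for illustration, and every function name here is hypothetical. The only real interfaces used are /sys/kernel/kexec_crash_size, which reports the currently reserved crash kernel memory in bytes, and /sys/bus/pci/devices, which has one entry per PCI device.

```python
from pathlib import Path

MIB = 1 << 20


def read_reserved(path="/sys/kernel/kexec_crash_size"):
    """Bytes currently reserved for the crash kernel, as reported by sysfs."""
    return int(Path(path).read_text().strip())


def count_pci_devices(sysfs="/sys/bus/pci/devices"):
    """Number of PCI devices visible in sysfs (0 if the directory is absent)."""
    directory = Path(sysfs)
    return len(list(directory.iterdir())) if directory.is_dir() else 0


def recommended_crashkernel(is_vm, pci_devices,
                            base=160 * MIB, per_device=1 * MIB):
    """Hypothetical policy with placeholder numbers: a fixed base, plus a
    per-device allowance on bare metal. A distro would ship its own table."""
    if is_vm:
        return base
    return base + pci_devices * per_device


def kdump_can_activate(reserved, recommended):
    """Refuse to enable kdump unless at least the recommended size is reserved."""
    return reserved >= recommended


if __name__ == "__main__":
    wanted = recommended_crashkernel(is_vm=False,
                                     pci_devices=count_pci_devices())
    have = read_reserved()
    if kdump_can_activate(have, wanted):
        print("kdump can be enabled")
    else:
        print(f"set crashkernel={wanted // MIB}M on the cmdline and reboot")
```

The point of the split is that the policy lives in user space, where a distro can update it along with the kernel, while the kernel only has to report what is actually reserved.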