linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Alexander Egorenkov <egorenar@linux.ibm.com>
Cc: agordeev@linux.ibm.com, akpm@linux-foundation.org,
	borntraeger@linux.ibm.com, cohuck@redhat.com, corbet@lwn.net,
	eperezma@redhat.com, frankja@linux.ibm.com, gor@linux.ibm.com,
	hca@linux.ibm.com, imbrenda@linux.ibm.com, jasowang@redhat.com,
	kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-s390@vger.kernel.org, mcasquer@redhat.com, mst@redhat.com,
	svens@linux.ibm.com, thuth@redhat.com,
	virtualization@lists.linux.dev, xuanzhuo@linux.alibaba.com,
	zaslonko@linux.ibm.com
Subject: Re: [PATCH v2 1/7] s390/kdump: implement is_kdump_kernel()
Date: Wed, 16 Oct 2024 17:54:56 +0200	[thread overview]
Message-ID: <87956f31-472d-4091-8061-1e55fea7a3d7@redhat.com> (raw)
In-Reply-To: <76f4ed45-5a40-4ac4-af24-a40effe7725c@redhat.com>

On 16.10.24 17:47, David Hildenbrand wrote:
>>>
>>> When I wrote that code I was rather convinced that the variant in this patch
>>> is the right thing to do.
>>
>> A short explanation about what a stand-alone kdump is.
>>
>> * First, it's not really a _regular_ kdump activated with kexec-tools and
>>     executed by Linux itself but a regular stand-alone dump (SCSI) from the
>>     FW's perspective (one has to use HMC or dumpconf to execute it and not
>>     with kexec-tools like for the _regular_ kdump).
> 
> Ah, that makes sense.
> 
>> * One has to reserve crashkernel memory region in the old crashed kernel
>>     even if it remains unused until the dump starts.
>> * zipl uses regular kdump kernel and initramfs to create stand-alone
>>     dumper images and to write them to a dump disk which is used for
>>     IPLIng the stand-alone dumper.
>> * The zipl bootloader takes care of transferring the old kernel memory
>>     saved in HSA by the FW to the crashkernel memory region reserved by the old
>>     crashed kernel before it enters the dumper. The HSA memory is released
>>     by the zipl bootloader _before_ the dumper image is entered,
>>     therefore, we cannot use HSA to read old kernel memory, and instead
>>     use memory from crashkernel region, just like the regular kdump.
>> * is_ipl_type_dump() will be true for a stand-alone kdump because we IPL
>>     the dumper like a regular stand-alone dump (e.g. zfcpdump).
>> * Summarized, zipl bootloader prepares an environment which is expected by
>>     the regular kdump for a stand-alone kdump dumper before it is entered.
> 
> Thanks for the details!
> 
>>
>> In my opinion, the correct version of is_kdump_kernel() would be
>>
>> bool is_kdump_kernel(void)
>> {
>>           return oldmem_data.start;
>> }
>>
>> because Linux kernel doesn't differentiate between both the regular
>> and the stand-alone kdump where it matters while performing dumper
>> operations (e.g. reading saved old kernel memory from crashkernel memory region).
>>
> 
> Right, but if we consider "/proc/vmcore is available", a better version
> would IMHO be:
> 
> bool is_kdump_kernel(void)
> {
>             return dump_available();
> }
> 
> Because that is mostly (not completely) how is_kdump_kernel() would have
> worked right now *after* we had the elfcorehdr_alloc() during the
> fs_init call.
> 
> 
>> Furthermore, if i'm not mistaken then the purpose of is_kdump_kernel()
>> is to tell us whether Linux kernel runs in a kdump like environment and not
>> whether the current mode is identical to the proper and true kdump,
>> right ? And if stand-alone kdump swims like a duck, quacks like one, then it
>> is one, regardless how it was started, by kexecing or IPLing
>> from a disk.
> 
> Same thinking here.
> 
>>
>> The stand-alone kdump has a very special use case which most users will
>> never encounter. And usually, one just takes zfcpdump instead which is
>> more robust and much smaller considering how big kdump initrd can get.
>> stand-alone kdump dumper images cannot exceed HSA memory limit on a Z machine.
> 
> Makes sense, so it boils down to either
> 
> bool is_kdump_kernel(void)
> {
>            return oldmem_data.start;
> }
> 
> Which means is_kdump_kernel() can be "false" even though /proc/vmcore is
> available or
> 
> bool is_kdump_kernel(void)
> {
>            return dump_available();
> }
> 
> Which means is_kdump_kernel() can never be "false" if /proc/vmcore is
> available. There is the chance of is_kdump_kernel() being "true" if
> "elfcorehdr_alloc()" fails with -ENODEV.
> 
> 
> You're call :) Thanks!
> 

What I think we should do is the following (improved comment + patch
description), but I'll do whatever you think is better:


 From e86194b5195c743eff33f563796b9c725fecc65f Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@redhat.com>
Date: Wed, 4 Sep 2024 14:57:10 +0200
Subject: [PATCH] s390/kdump: provide custom is_kdump_kernel()

s390 currently always results in is_kdump_kernel() == false until
vmcore_init()->elfcorehdr_alloc() ran, because it sets
"elfcorehdr_addr = ELFCORE_ADDR_MAX;" early during setup_arch to deactivate
any elfcorehdr= kernel parameter.

Let's follow the powerpc example and implement our own logic. Let's use
"dump_available()", because this is mostly (with one exception when
elfcorehdr_alloc() fails with -ENODEV) when we would create /proc/vmcore
and when is_kdump_kernel() would have returned "true" after
vmcore_init().

This is required for virtio-mem to reliably identify a kdump
environment before vmcore_init() was called to not try hotplugging memory.

Update the documentation above dump_available().

Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
  arch/s390/include/asm/kexec.h |  4 ++++
  arch/s390/kernel/crash_dump.c |  6 ++++++
  arch/s390/kernel/smp.c        | 16 ++++++++--------
  3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
index 1bd08eb56d5f..bd20543515f5 100644
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -94,6 +94,9 @@ void arch_kexec_protect_crashkres(void);
  
  void arch_kexec_unprotect_crashkres(void);
  #define arch_kexec_unprotect_crashkres arch_kexec_unprotect_crashkres
+
+bool is_kdump_kernel(void);
+#define is_kdump_kernel is_kdump_kernel
  #endif
  
  #ifdef CONFIG_KEXEC_FILE
@@ -107,4 +110,5 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
  int arch_kimage_file_post_load_cleanup(struct kimage *image);
  #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
  #endif
+
  #endif /*_S390_KEXEC_H */
diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index 51313ed7e617..43bbaf534dd2 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -237,6 +237,12 @@ int remap_oldmem_pfn_range(struct vm_area_struct *vma, unsigned long from,
  						       prot);
  }
  
+bool is_kdump_kernel(void)
+{
+	return dump_available();
+}
+EXPORT_SYMBOL_GPL(is_kdump_kernel);
+
  static const char *nt_name(Elf64_Word type)
  {
  	const char *name = "LINUX";
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 4df56fdb2488..bd41e35a27a0 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -574,7 +574,7 @@ int smp_store_status(int cpu)
  
  /*
   * Collect CPU state of the previous, crashed system.
- * There are four cases:
+ * There are three cases:
   * 1) standard zfcp/nvme dump
   *    condition: OLDMEM_BASE == NULL && is_ipl_type_dump() == true
   *    The state for all CPUs except the boot CPU needs to be collected
@@ -587,16 +587,16 @@ int smp_store_status(int cpu)
   *    with sigp stop-and-store-status. The firmware or the boot-loader
   *    stored the registers of the boot CPU in the absolute lowcore in the
   *    memory of the old system.
- * 3) kdump and the old kernel did not store the CPU state,
- *    or stand-alone kdump for DASD
- *    condition: OLDMEM_BASE != NULL && !is_kdump_kernel()
+ * 3) kdump or stand-alone kdump for DASD
+ *    condition: OLDMEM_BASE != NULL && !is_ipl_type_dump() == false
   *    The state for all CPUs except the boot CPU needs to be collected
   *    with sigp stop-and-store-status. The kexec code or the boot-loader
   *    stored the registers of the boot CPU in the memory of the old system.
- * 4) kdump and the old kernel stored the CPU state
- *    condition: OLDMEM_BASE != NULL && is_kdump_kernel()
- *    This case does not exist for s390 anymore, setup_arch explicitly
- *    deactivates the elfcorehdr= kernel parameter
+ *
+ * Note that the old kdump mode where the old kernel stored the CPU state
+ * does no longer exist: setup_arch explicitly deactivates the elfcorehdr=
+ * kernel parameter. The is_kdump_kernel() implementation on s390 is independent
+ * of the elfcorehdr= parameter, and is purely based on dump_available().
   */
  static bool dump_available(void)
  {
-- 
2.46.1


-- 
Cheers,

David / dhildenb



  reply	other threads:[~2024-10-16 15:55 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-14 14:46 [PATCH v2 0/7] virtio-mem: s390 support David Hildenbrand
2024-10-14 14:46 ` [PATCH v2 1/7] s390/kdump: implement is_kdump_kernel() David Hildenbrand
2024-10-14 18:20   ` Heiko Carstens
2024-10-14 19:26     ` David Hildenbrand
2024-10-15  8:30       ` Heiko Carstens
2024-10-15  8:41         ` David Hildenbrand
2024-10-15  8:53           ` David Hildenbrand
2024-10-15  8:56             ` David Hildenbrand
2024-10-15 10:08           ` Heiko Carstens
2024-10-15 10:40             ` David Hildenbrand
2024-10-16 13:35               ` Alexander Egorenkov
2024-10-16 15:47                 ` David Hildenbrand
2024-10-16 15:54                   ` David Hildenbrand [this message]
2024-10-21 12:46                   ` Alexander Egorenkov
2024-10-21 14:45                     ` David Hildenbrand
2024-10-23  7:42                       ` Heiko Carstens
2024-10-23  7:45                         ` David Hildenbrand
2024-10-23 11:17                       ` Alexander Egorenkov
2024-10-14 14:46 ` [PATCH v2 2/7] Documentation: s390-diag.rst: make diag500 a generic KVM hypercall David Hildenbrand
2024-10-14 18:04   ` Heiko Carstens
2024-10-14 19:35     ` David Hildenbrand
2024-10-15  8:12       ` Heiko Carstens
2024-10-15  8:16         ` David Hildenbrand
2024-10-15  8:21           ` Heiko Carstens
2024-10-15  8:32             ` David Hildenbrand
2024-10-15  8:46               ` Heiko Carstens
2024-10-15  8:48                 ` David Hildenbrand
2024-10-14 14:46 ` [PATCH v2 3/7] Documentation: s390-diag.rst: document diag500(STORAGE LIMIT) subfunction David Hildenbrand
2024-10-14 14:46 ` [PATCH v2 4/7] s390/physmem_info: query diag500(STORAGE LIMIT) to support QEMU/KVM memory devices David Hildenbrand
2024-10-14 18:43   ` Heiko Carstens
2024-10-14 19:42     ` David Hildenbrand
2024-10-15 15:01     ` Eric Farman
2024-10-15 15:20       ` Heiko Carstens
2024-10-25 10:52         ` David Hildenbrand
2024-10-16 10:37       ` Halil Pasic
2024-10-17  7:36   ` Alexander Gordeev
2024-10-17  8:19     ` David Hildenbrand
2024-10-17  9:53       ` Alexander Gordeev
2024-10-17 10:00         ` David Hildenbrand
2024-10-17 12:07           ` David Hildenbrand
2024-10-17 14:32             ` Alexander Gordeev
2024-10-17 14:36               ` David Hildenbrand
2024-10-30 14:30   ` Alexander Gordeev
2024-10-30 14:33     ` Alexander Gordeev
2024-10-14 14:46 ` [PATCH v2 5/7] virtio-mem: s390 support David Hildenbrand
2024-10-14 18:48   ` Heiko Carstens
2024-10-14 19:16     ` David Hildenbrand
2024-10-15  8:37       ` Heiko Carstens
2024-10-21  6:33         ` Christian Borntraeger
2024-10-21 12:19           ` David Hildenbrand
2024-10-14 14:46 ` [PATCH v2 6/7] lib/Kconfig.debug: default STRICT_DEVMEM to "y" on s390 David Hildenbrand
2024-10-14 18:53   ` Heiko Carstens
2024-10-14 14:46 ` [PATCH v2 7/7] s390/sparsemem: reduce section size to 128 MiB David Hildenbrand
2024-10-14 17:53   ` Heiko Carstens
2024-10-14 19:47     ` David Hildenbrand
2024-10-14 18:56 ` [PATCH v2 0/7] virtio-mem: s390 support Heiko Carstens
2024-10-14 19:17   ` David Hildenbrand
2024-10-15  7:57   ` Claudio Imbrenda
2024-10-25 10:54     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87956f31-472d-4091-8061-1e55fea7a3d7@redhat.com \
    --to=david@redhat.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=borntraeger@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=corbet@lwn.net \
    --cc=egorenar@linux.ibm.com \
    --cc=eperezma@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mcasquer@redhat.com \
    --cc=mst@redhat.com \
    --cc=svens@linux.ibm.com \
    --cc=thuth@redhat.com \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=zaslonko@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox