From: "Eric W. Biederman"
To: Brian Mak
Cc: Alexander Viro, Christian Brauner, Jan Kara, Kees Cook, "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC PATCH] binfmt_elf: Dump smaller VMAs first in ELF cores
Date: Wed, 31 Jul 2024 21:52:07 -0500
In-Reply-To: (Brian Mak's message of "Wed, 31 Jul 2024 22:14:15 +0000")
Message-ID: <877cd1ymy0.fsf@email.froward.int.ebiederm.org>

Brian Mak writes:

> Large cores may be truncated in some scenarios, such as daemons with stop
> timeouts that are not large enough or lack of disk space. This impacts
> debuggability with large core dumps since critical information necessary to
> form a usable backtrace, such as stacks and shared library information, are
> omitted. We can mitigate the impact of core dump truncation by dumping
> smaller VMAs first, which may be more likely to contain memory for stacks
> and shared library information, thus allowing a usable backtrace to be
> formed.

This sounds theoretical.
Do you happen to have a description of a motivating case? A situation
that bit someone and resulted in a core file that wasn't usable?

A concrete situation would help us imagine what possible caveats there
are with sorting vmas this way.

The most common case I am aware of is distributions setting the core
file size to 0 (ulimit -c 0).

One practical concern with this approach is that I think the ELF
specification says that program headers should be written in memory
order. So a comment on your testing to see if gdb or rr or any of
the other debuggers that read core dumps care would be appreciated.

> We implement this by sorting the VMAs by dump size and dumping in that
> order.

Since your concern is about stacks, and the kernel has information about
stacks, it might be worth using that information explicitly when sorting
vmas, instead of just assuming stacks will be small.

I expect the priorities would look something like jit generated
executable code segments, stacks, and then heap data.

I don't have enough information about what is causing your truncated
core dumps, so I can't guess what the actual problem you are fighting
is, and I could be wrong on priorities.

Though I do wonder if this might be a buggy interaction between
core dumps and something like signals, or io_uring. If it is something
other than a shortage of storage space causing your truncated core
dumps, I expect we should first debug why the coredumps are being
truncated rather than proceed directly to working around truncation.

Eric

> Signed-off-by: Brian Mak
> ---
>
> Hi all,
>
> My initial testing with a program that spawns several threads and allocates heap
> memory shows that this patch does indeed prioritize information such as stacks,
> which is crucial to forming a backtrace and debugging core dumps.
>
> Requesting for comments on the following:
>
> Are there cases where this might not necessarily prioritize dumping VMAs
> needed to obtain a usable backtrace?
>
> Thanks,
> Brian Mak
>
>  fs/binfmt_elf.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 62 insertions(+), 2 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 19fa49cd9907..d45240b0748d 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -13,6 +13,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -37,6 +38,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -1990,6 +1992,22 @@ static void fill_extnum_info(struct elfhdr *elf, struct elf_shdr *shdr4extnum,
>  	shdr4extnum->sh_info = segs;
>  }
>
> +static int cmp_vma_size(const void *vma_meta_lhs_ptr, const void *vma_meta_rhs_ptr)
> +{
> +	const struct core_vma_metadata *vma_meta_lhs = *(const struct core_vma_metadata **)
> +		vma_meta_lhs_ptr;
> +	const struct core_vma_metadata *vma_meta_rhs = *(const struct core_vma_metadata **)
> +		vma_meta_rhs_ptr;
> +
> +	if (vma_meta_lhs->dump_size < vma_meta_rhs->dump_size)
> +		return -1;
> +	if (vma_meta_lhs->dump_size > vma_meta_rhs->dump_size)
> +		return 1;
> +	return 0;
> +}
> +
> +static bool sort_elf_core_vmas = true;
> +
>  /*
>   * Actual dumper
>   *
> @@ -2008,6 +2026,7 @@ static int elf_core_dump(struct coredump_params *cprm)
>  	struct elf_shdr *shdr4extnum = NULL;
>  	Elf_Half e_phnum;
>  	elf_addr_t e_shoff;
> +	struct core_vma_metadata **sorted_vmas = NULL;
>
>  	/*
>  	 * The number of segs are recored into ELF header as 16bit value.
> @@ -2071,11 +2090,27 @@ static int elf_core_dump(struct coredump_params *cprm)
>  	if (!dump_emit(cprm, phdr4note, sizeof(*phdr4note)))
>  		goto end_coredump;
>
> +	/* Allocate memory to sort VMAs and sort if needed. */
> +	if (sort_elf_core_vmas)
> +		sorted_vmas = kvmalloc_array(cprm->vma_count, sizeof(*sorted_vmas), GFP_KERNEL);
> +
> +	if (!ZERO_OR_NULL_PTR(sorted_vmas)) {
> +		for (i = 0; i < cprm->vma_count; i++)
> +			sorted_vmas[i] = cprm->vma_meta + i;
> +
> +		sort(sorted_vmas, cprm->vma_count, sizeof(*sorted_vmas), cmp_vma_size, NULL);
> +	}
> +
>  	/* Write program headers for segments dump */
>  	for (i = 0; i < cprm->vma_count; i++) {
> -		struct core_vma_metadata *meta = cprm->vma_meta + i;
> +		struct core_vma_metadata *meta;
>  		struct elf_phdr phdr;
>
> +		if (ZERO_OR_NULL_PTR(sorted_vmas))
> +			meta = cprm->vma_meta + i;
> +		else
> +			meta = sorted_vmas[i];
> +
>  		phdr.p_type = PT_LOAD;
>  		phdr.p_offset = offset;
>  		phdr.p_vaddr = meta->start;
> @@ -2111,7 +2146,12 @@ static int elf_core_dump(struct coredump_params *cprm)
>  	dump_skip_to(cprm, dataoff);
>
>  	for (i = 0; i < cprm->vma_count; i++) {
> -		struct core_vma_metadata *meta = cprm->vma_meta + i;
> +		struct core_vma_metadata *meta;
> +
> +		if (ZERO_OR_NULL_PTR(sorted_vmas))
> +			meta = cprm->vma_meta + i;
> +		else
> +			meta = sorted_vmas[i];
>
>  		if (!dump_user_range(cprm, meta->start, meta->dump_size))
>  			goto end_coredump;
> @@ -2128,10 +2168,26 @@ static int elf_core_dump(struct coredump_params *cprm)
>  end_coredump:
>  	free_note_info(&info);
>  	kfree(shdr4extnum);
> +	kvfree(sorted_vmas);
>  	kfree(phdr4note);
>  	return has_dumped;
>  }
>
> +#ifdef CONFIG_DEBUG_FS
> +
> +static struct dentry *elf_core_debugfs;
> +
> +static int __init init_elf_core_debugfs(void)
> +{
> +	elf_core_debugfs = debugfs_create_dir("elf_core", NULL);
> +	debugfs_create_bool("sort_elf_core_vmas", 0644, elf_core_debugfs, &sort_elf_core_vmas);
> +	return 0;
> +}
> +
> +fs_initcall(init_elf_core_debugfs);
> +
> +#endif /* CONFIG_DEBUG_FS */
> +
>  #endif /* CONFIG_ELF_CORE */
>
>  static int __init init_elf_binfmt(void)
> @@ -2144,6 +2200,10 @@ static void __exit exit_elf_binfmt(void)
>  {
>  	/* Remove the COFF and ELF loaders. */
>  	unregister_binfmt(&elf_format);
> +
> +#if defined(CONFIG_ELF_CORE) && defined(CONFIG_DEBUG_FS)
> +	debugfs_remove(elf_core_debugfs);
> +#endif
>  }
>
>  core_initcall(init_elf_binfmt);
>
> base-commit: 94ede2a3e9135764736221c080ac7c0ad993dc2d
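
To make the suggestion of sorting on explicit VMA type (rather than on size alone) concrete, here is a minimal userspace sketch. It is not kernel code and not part of the patch: `struct vma_sketch`, `enum vma_kind`, `cmp_vma_priority()`, `sort_vmas_for_dump()` and `vmas_in_address_order()` are hypothetical names, and the kernel's `struct core_vma_metadata` is only assumed to carry `start` and `dump_size` as used in the patch. The priority order (executable code, then stacks, then everything else) follows the guess in the reply and may well be wrong for a given workload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical classification; lower value means "dump earlier". */
enum vma_kind { VMA_EXEC = 0, VMA_STACK = 1, VMA_OTHER = 2 };

/* Hypothetical stand-in for the per-VMA metadata used when dumping. */
struct vma_sketch {
	unsigned long start;      /* start address of the mapping */
	unsigned long dump_size;  /* bytes that would be written to the core */
	enum vma_kind kind;       /* assumed to be filled in by the caller */
};

/* Order by kind first, then by dump size, then by address as a tie-break. */
static int cmp_vma_priority(const void *lhs, const void *rhs)
{
	const struct vma_sketch *a = lhs;
	const struct vma_sketch *b = rhs;

	if (a->kind != b->kind)
		return a->kind < b->kind ? -1 : 1;
	if (a->dump_size != b->dump_size)
		return a->dump_size < b->dump_size ? -1 : 1;
	if (a->start != b->start)
		return a->start < b->start ? -1 : 1;
	return 0;
}

/* Sort the array in place into the order the segments would be dumped. */
static void sort_vmas_for_dump(struct vma_sketch *v, size_t n)
{
	assert(v != NULL || n == 0);
	if (n > 1)
		qsort(v, n, sizeof(*v), cmp_vma_priority);
}

/* Return 1 if entries are in ascending address order, 0 otherwise. */
static int vmas_in_address_order(const struct vma_sketch *v, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (v[i - 1].start > v[i].start)
			return 0;
	return 1;
}
```

After `sort_vmas_for_dump()` the array is generally no longer in ascending address order, which `vmas_in_address_order()` makes easy to observe. That is the same property the program-header ordering concern is about, so any real change along these lines would still need to be checked against gdb, rr and other core readers.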