From: Kent Overstreet
To: Matthew Wilcox
Cc: "H. Peter Anvin", Pasha Tatashin, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, x86@kernel.org,
 bp@alien8.de, brauner@kernel.org, bristot@redhat.com, bsegall@google.com,
 dave.hansen@linux.intel.com, dianders@chromium.org,
 dietmar.eggemann@arm.com, eric.devolder@oracle.com, hca@linux.ibm.com,
 hch@infradead.org, jacob.jun.pan@linux.intel.com, jgg@ziepe.ca,
 jpoimboe@kernel.org, jroedel@suse.de, juri.lelli@redhat.com,
 kinseyho@google.com, kirill.shutemov@linux.intel.com, lstoakes@gmail.com,
 luto@kernel.org, mgorman@suse.de, mic@digikod.net,
 michael.christie@oracle.com, mingo@redhat.com, mjguzik@gmail.com,
 mst@redhat.com, npiggin@gmail.com, peterz@infradead.org, pmladek@suse.com,
 rick.p.edgecombe@intel.com, rostedt@goodmis.org, surenb@google.com,
 tglx@linutronix.de, urezki@gmail.com, vincent.guittot@linaro.org,
 vschneid@redhat.com
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks
Date: Thu, 14 Mar 2024 15:58:06 -0400
References: <20240311164638.2015063-1-pasha.tatashin@soleen.com>
 <2cb8f02d-f21e-45d2-afe2-d1c6225240f3@zytor.com>
 <2qp4uegb4kqkryihqyo6v3fzoc2nysuhltc535kxnh6ozpo5ni@isilzw7nth42>

On Thu, Mar 14, 2024 at 07:57:22PM +0000, Matthew Wilcox wrote:
> On Thu, Mar 14, 2024 at 03:53:39PM -0400, Kent Overstreet wrote:
> > On Thu, Mar 14, 2024 at 07:43:06PM +0000, Matthew Wilcox wrote:
> > > On Tue, Mar 12, 2024 at 10:18:10AM -0700, H. Peter Anvin wrote:
> > > > Second, non-dynamic kernel memory is one of the core design decisions in
> > > > Linux from early on. This means there are lot of deeply embedded assumptions
> > > > which would have to be untangled.
> > >
> > > I think there are other ways of getting the benefit that Pasha is seeking
> > > without moving to dynamically allocated kernel memory. One icky thing
> > > that XFS does is punt work over to a kernel thread in order to use more
> > > stack! That breaks a number of things including lockdep (because the
> > > kernel thread doesn't own the lock, the thread waiting for the kernel
> > > thread owns the lock).
> > >
> > > If we had segmented stacks, XFS could say "I need at least 6kB of stack",
> > > and if less than that was available, we could allocate a temporary
> > > stack and switch to it. I suspect Google would also be able to use this
> > > API for their rare cases when they need more than 8kB of kernel stack.
> > > Who knows, we might all be able to use such a thing.
> > >
> > > I'd been thinking about this from the point of view of allocating more
> > > stack elsewhere in kernel space, but combining what Pasha has done here
> > > with this idea might lead to a hybrid approach that works better; allocate
> > > 32kB of vmap space per kernel thread, put 12kB of memory at the top of it,
> > > rely on people using this "I need more stack" API correctly, and free the
> > > excess pages on return to userspace. No complicated "switch stacks" API
> > > needed, just an "ensure we have at least N bytes of stack remaining" API.
> >
> > Why would we need an "I need more stack" API? Pasha's approach seems
> > like everything we need for what you're talking about.
> Because double faults are hard, possibly impossible, and the FRED approach
> Peter described has extra overhead? This was all described up-thread.

*nod*