From: "H. Peter Anvin"
Date: Sat, 16 Mar 2024 17:47:18 -0700
To: Matthew Wilcox
CC: Pasha Tatashin, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 akpm@linux-foundation.org, x86@kernel.org, bp@alien8.de, brauner@kernel.org,
 bristot@redhat.com, bsegall@google.com, dave.hansen@linux.intel.com,
 dianders@chromium.org, dietmar.eggemann@arm.com, eric.devolder@oracle.com,
 hca@linux.ibm.com, hch@infradead.org, jacob.jun.pan@linux.intel.com,
 jgg@ziepe.ca, jpoimboe@kernel.org, jroedel@suse.de, juri.lelli@redhat.com,
 kent.overstreet@linux.dev, kinseyho@google.com,
 kirill.shutemov@linux.intel.com, lstoakes@gmail.com, luto@kernel.org,
 mgorman@suse.de, mic@digikod.net, michael.christie@oracle.com,
 mingo@redhat.com, mjguzik@gmail.com, mst@redhat.com, npiggin@gmail.com,
 peterz@infradead.org, pmladek@suse.com, rick.p.edgecombe@intel.com,
 rostedt@goodmis.org, surenb@google.com, tglx@linutronix.de, urezki@gmail.com,
 vincent.guittot@linaro.org, vschneid@redhat.com
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks
Message-ID: <0EE22907-1D81-4FA6-823B-13F7A94D3F85@zytor.com>
References: <20240311164638.2015063-1-pasha.tatashin@soleen.com> <2cb8f02d-f21e-45d2-afe2-d1c6225240f3@zytor.com>

On March 14, 2024 12:43:06 PM PDT, Matthew Wilcox wrote:
>On Tue, Mar 12, 2024 at 10:18:10AM -0700, H. Peter Anvin wrote:
>> Second, non-dynamic kernel memory is one of the core design decisions in
>> Linux from early on. This means there are a lot of deeply embedded assumptions
>> which would have to be untangled.
>
>I think there are other ways of getting the benefit that Pasha is seeking
>without moving to dynamically allocated kernel memory. One icky thing
>that XFS does is punt work over to a kernel thread in order to use more
>stack! That breaks a number of things including lockdep (because the
>kernel thread doesn't own the lock, the thread waiting for the kernel
>thread owns the lock).
>
>If we had segmented stacks, XFS could say "I need at least 6kB of stack",
>and if less than that was available, we could allocate a temporary
>stack and switch to it. I suspect Google would also be able to use this
>API for their rare cases when they need more than 8kB of kernel stack.
>Who knows, we might all be able to use such a thing.
>
>I'd been thinking about this from the point of view of allocating more
>stack elsewhere in kernel space, but combining what Pasha has done here
>with this idea might lead to a hybrid approach that works better; allocate
>32kB of vmap space per kernel thread, put 12kB of memory at the top of it,
>rely on people using this "I need more stack" API correctly, and free the
>excess pages on return to userspace. No complicated "switch stacks" API
>needed, just an "ensure we have at least N bytes of stack remaining" API.

This is basically what stack probes do. They provide a very cheap "API" that
goes via the #PF (not #DF!) path in the slow case, synchronously and at a
well-defined point, and is virtually free in the common case. As a side
benefit, they can be compiler-generated, as some operating systems require
them.
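
To illustrate the stack-probe idea, here is a minimal user-space sketch in C. It is an analogy only, not kernel code and not anything taken from the RFC series. The name probe_stack and the 4096-byte PROBE_PAGE_SIZE are assumptions made for this example. The function reserves the requested number of bytes in its own frame and touches one byte per page, walking from the page nearest the caller towards lower addresses (assuming a downward-growing stack, as on x86). Any page fault needed to back those pages is therefore taken here, at a known point, rather than at some arbitrary later store.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; real code would query the platform. */
#define PROBE_PAGE_SIZE 4096u

/*
 * Hypothetical helper: make sure `bytes` of stack below the current frame
 * are backed by memory by touching one byte per page.
 * Returns the number of probes issued, i.e. ceil(bytes / PROBE_PAGE_SIZE).
 */
size_t probe_stack(size_t bytes)
{
    if (bytes == 0)
        return 0;               /* a zero-length VLA would be undefined */

    /* volatile keeps the compiler from discarding the probe stores */
    volatile char region[bytes];
    size_t touched = 0;
    size_t off = bytes;

    /*
     * Walk from the high end of the region (nearest the caller's frame)
     * down to offset 0, never stepping more than one page at a time, so
     * every page in the region is touched in order.
     */
    while (off > 0) {
        off = (off > PROBE_PAGE_SIZE) ? off - PROBE_PAGE_SIZE : 0;
        region[off] = 0;        /* slow case: page fault; common case: one store */
        touched++;
    }

    assert(touched == (bytes + PROBE_PAGE_SIZE - 1) / PROBE_PAGE_SIZE);
    return touched;
}
```

A caller would invoke something like probe_stack(6 * 1024) before entering a stack-hungry path. In the common case, where the pages are already present, the cost is a handful of stores. Compilers can emit equivalent probes on their own (GCC and Clang offer -fstack-clash-protection, for example), which is the compiler-generated variant mentioned above.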