From: Pasha Tatashin
Date: Sun, 17 Mar 2024 10:19:10 -0400
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks
To: Matthew Wilcox
Cc: "H. Peter Anvin", Kent Overstreet, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, x86@kernel.org,
 bp@alien8.de, brauner@kernel.org, bristot@redhat.com, bsegall@google.com,
 dave.hansen@linux.intel.com, dianders@chromium.org,
 dietmar.eggemann@arm.com, eric.devolder@oracle.com, hca@linux.ibm.com,
 hch@infradead.org, jacob.jun.pan@linux.intel.com, jgg@ziepe.ca,
 jpoimboe@kernel.org, jroedel@suse.de, juri.lelli@redhat.com,
 kinseyho@google.com, kirill.shutemov@linux.intel.com, lstoakes@gmail.com,
 luto@kernel.org, mgorman@suse.de, mic@digikod.net,
 michael.christie@oracle.com, mingo@redhat.com, mjguzik@gmail.com,
 mst@redhat.com, npiggin@gmail.com, peterz@infradead.org, pmladek@suse.com,
 rick.p.edgecombe@intel.com, rostedt@goodmis.org, surenb@google.com,
 tglx@linutronix.de, urezki@gmail.com, vincent.guittot@linaro.org,
 vschneid@redhat.com
References: <20240311164638.2015063-1-pasha.tatashin@soleen.com>
 <2cb8f02d-f21e-45d2-afe2-d1c6225240f3@zytor.com>
 <2qp4uegb4kqkryihqyo6v3fzoc2nysuhltc535kxnh6ozpo5ni@isilzw7nth42>
 <39F17EC4-7844-4111-BF7D-FFC97B05D9FA@zytor.com>

On Sat, Mar 16, 2024 at 8:41 PM Matthew Wilcox wrote:
>
> On Sat, Mar 16, 2024 at 03:17:57PM -0400, Pasha Tatashin wrote:
> > Expanding on Mathew's idea of an interface for dynamic kernel stack
> > sizes, here's what I'm thinking:
> >
> > - Kernel Threads: Create all kernel threads with a fully populated
> > THREAD_SIZE stack. (i.e. 16K)
> > - User Threads: Create all user threads with THREAD_SIZE kernel stack
> > but only the top page mapped. (i.e. 4K)
> > - In enter_from_user_mode(): Expand the thread stack to 16K by mapping
> > three additional pages from the per-CPU stack cache. This function is
> > called early in kernel entry points.
> > - exit_to_user_mode(): Unmap the extra three pages and return them to
> > the per-CPU cache. This function is called late in the kernel exit
> > path.
> >
> > Both of the above hooks are called with IRQ disabled on all kernel
> > entries whether through interrupts and syscalls, and they are called
> > early/late enough that 4K is enough to handle the rest of entry/exit.
>
> At what point do we replenish the per-CPU stash of pages? If we're
> 12kB deep in the stack and call mutex_lock(), we can be scheduled out,
> and then the new thread can make a syscall. Do we just assume that
> get_free_page() can sleep at kernel entry (seems reasonable)? I don't
> think this is an infeasible problem, I'd just like it to be described.

Once IRQs are enabled, it is perfectly OK to sleep and wait for the
stack pages to become available.

The following user entry paths enable interrupts:

do_user_addr_fault()
    local_irq_enable()

do_syscall_64()
    syscall_enter_from_user_mode()
        local_irq_enable()

__do_fast_syscall_32()
    syscall_enter_from_user_mode_prepare()
        local_irq_enable()

exc_debug_user()
    local_irq_enable()

do_int3_user()
    cond_local_irq_enable()

On these paths it is perfectly OK to sleep and wait for pages to become
available when the per-CPU cache is empty and alloc_page(GFP_NOWAIT)
does not succeed.

The other interrupts from userland never enable IRQs. We can keep three
pages per CPU reserved specifically for those never-enable-IRQ cases,
as no more than one such set can ever be needed at a time.

Pasha
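
P.S. To make the fallback order described above concrete, here is a rough
user-space C model of the decision. It is only a sketch under stated
assumptions: struct cpu_stack_pool, get_stack_pages(), put_stack_pages()
and the nowait_avail counter are hypothetical names invented for
illustration and are not part of the patch set. The order in which the
reserve is consulted relative to the cache is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define EXTRA_PAGES 3 /* pages mapped below the always-present top page */

/* Hypothetical per-CPU state; not taken from the RFC patches. */
struct cpu_stack_pool {
	int cache;        /* pages in the per-CPU stack cache */
	int reserve;      /* pages held back for entries that never enable IRQs */
	int nowait_avail; /* stand-in for what alloc_page(GFP_NOWAIT) could return */
};

enum stack_src { SRC_CACHE, SRC_NOWAIT, SRC_SLEEP, SRC_RESERVE };

/*
 * Decide where the three extra stack pages for one kernel entry come from.
 * may_enable_irqs is true for the entry paths that later enable IRQs
 * (page fault, syscall, debug, int3) and false for all other interrupts.
 */
static enum stack_src get_stack_pages(struct cpu_stack_pool *p,
				      bool may_enable_irqs)
{
	if (p->cache >= EXTRA_PAGES) {
		p->cache -= EXTRA_PAGES;
		return SRC_CACHE;
	}
	if (p->nowait_avail >= EXTRA_PAGES) {
		p->nowait_avail -= EXTRA_PAGES;
		return SRC_NOWAIT;
	}
	if (may_enable_irqs)
		return SRC_SLEEP; /* caller would enable IRQs and block for pages */

	/* IRQs stay disabled, so only one such entry per CPU can be live. */
	assert(p->reserve >= EXTRA_PAGES);
	p->reserve -= EXTRA_PAGES;
	return SRC_RESERVE;
}

/* On exit to user mode, hand the pages back to where they belong. */
static void put_stack_pages(struct cpu_stack_pool *p, enum stack_src src)
{
	if (src == SRC_RESERVE)
		p->reserve += EXTRA_PAGES; /* refill the reserve first */
	else
		p->cache += EXTRA_PAGES;   /* everything else goes to the cache */
}
```

In this model only the SRC_SLEEP case may block, and it is reachable only
from entry paths that enable IRQs. The reserve is touched only when both
the cache and the non-blocking allocation come up empty on a path that
must keep IRQs disabled.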