From: Yang Shi
Date: Wed, 11 Sep 2024 18:08:23 -0700
Subject: Re: [PATCH] [RFC] mm: mmap: Allow mmap(MAP_STACK) to map growable stack
To: Helge Deller
Cc: "Liam R. Howlett", Helge Deller, linux-kernel@vger.kernel.org,
 Andrew Morton, linux-mm@kvack.org, linux-parisc@vger.kernel.org
References: <95c4efe9-e92a-46fe-bf41-9141e125332d@gmx.de>
In-Reply-To: <95c4efe9-e92a-46fe-bf41-9141e125332d@gmx.de>

On
Wed, Sep 11, 2024 at 5:50 PM Helge Deller wrote:
>
> On 9/12/24 01:05, Liam R. Howlett wrote:
> > * Yang Shi [240911 18:16]:
> >> On Wed, Sep 11, 2024 at 12:49 PM Liam R. Howlett
> >> wrote:
> >>>
> >>> * Helge Deller [240911 15:20]:
> >>>> This is a RFC to change the behaviour of mmap(MAP_STACK) to be
> >>>> sufficient to map memory for usage as stack on all architectures.
> >>>> Currently MAP_STACK is a no-op on Linux, and instead MAP_GROWSDOWN
> >>>> has to be used.
> >>>> To clarify, here is the relevant info from the mmap() man page:
> >>>>
> >>>> MAP_GROWSDOWN
> >>>>     This flag is used for stacks. It indicates to the kernel virtual
> >>>>     memory system that the mapping should extend downward in memory. The
> >>>>     return address is one page lower than the memory area that is
> >>>>     actually created in the process's virtual address space. Touching an
> >>>>     address in the "guard" page below the mapping will cause the mapping
> >>>>     to grow by a page. This growth can be repeated until the mapping
> >>>>     grows to within a page of the high end of the next lower mapping,
> >>>>     at which point touching the "guard" page will result in a SIGSEGV
> >>>>     signal.
> >>>>
> >>>> MAP_STACK (since Linux 2.6.27)
> >>>>     Allocate the mapping at an address suitable for a process or thread
> >>>>     stack.
> >>>>
> >>>>     This flag is currently a no-op on Linux. However, by employing this
> >>>>     flag, applications can ensure that they transparently obtain support
> >>>>     if the flag is implemented in the future. Thus, it is used in the
> >>>>     glibc threading implementation to allow for the fact that
> >>>>     some architectures may (later) require special treatment for
> >>>>     stack allocations. A further reason to employ this flag is
> >>>>     portability: MAP_STACK exists (and has an effect) on some
> >>>>     other systems (e.g., some of the BSDs).
> >>>>
> >>>> The reason to suggest this change is, that on the parisc architecture the
> >>>> stack grows upwards. As such, using solely the MAP_GROWSDOWN flag will not
> >>>> work. Note that there exists no MAP_GROWSUP flag.
> >>>> By changing the behaviour of MAP_STACK to mark the memory area with the
> >>>> VM_STACK bit (which is VM_GROWSUP or VM_GROWSDOWN depending on the
> >>>> architecture) the MAP_STACK flag does exactly what people would expect on
> >>>> all platforms.
> >>>>
> >>>> This change should have no negative side-effect, as all code which
> >>>> used mmap(MAP_GROWSDOWN | MAP_STACK) still work as before.
> >>>>
> >>>> Signed-off-by: Helge Deller
> >>>>
> >>>> diff --git a/include/linux/mman.h b/include/linux/mman.h
> >>>> index bcb201ab7a41..66bc72a0cb19 100644
> >>>> --- a/include/linux/mman.h
> >>>> +++ b/include/linux/mman.h
> >>>> @@ -156,6 +156,7 @@ calc_vm_flag_bits(unsigned long flags)
> >>>>         return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
> >>>>                _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
> >>>>                _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
> >>>> +              _calc_vm_trans(flags, MAP_STACK, VM_STACK ) |
> >>>
> >>> Right now MAP_STACK can be used to set VM_NOHUGEPAGE, but this will
> >>> change the user interface to create a vma that will grow. I'm not
> >>> entirely sure this is okay?
> >>
> >> AFAICT, I don't see this is a problem. Currently huge page also skips
> >> the VMAs with VM_GROWS* flags set. See vma_is_temporary_stack().
> >> __thp_vma_allowable_orders() returns 0 if the vma is a temporary
> >> stack.
> >
> > If someone is using MAP_STACK to avoid having a huge page, they will
> > also get a mapping that grows - which is different than what happens
> > today.
> >
> > I'm not saying that's right, but someone could be abusing the existing
> > flag and this will change the behaviour.
>
> Wouldn't a plain mmap() followed by madvise(MADV_NOHUGEPAGE) do exactly that?
> Why abusing MAP_STACK for that?

Various sources and reports have shown that using huge pages for stack
mappings hurts performance. Many applications, for example the pthread
library, allocate their stacks with MAP_STACK and do not call
madvise(MADV_NOHUGEPAGE) on the stack mapping.

>
> Helge
>
> >>> That is mmap(MAP_STACK) would set VM_NOHUGEPAGE right now, with this
> >>> change you'd get VM_NOHUGEPAGE | VM_GROWS
> >>>
> >>>> _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
> >>>> arch_calc_vm_flag_bits(flags);
> >>>> }
>
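
To make the two userspace patterns discussed in this thread concrete,
here is a minimal sketch. The helper names alloc_stack_map_stack() and
alloc_stack_madvise() are hypothetical and only for illustration; this
is not the actual glibc code. The first helper relies on MAP_STACK
alone, which per the calc_vm_flag_bits() context quoted above currently
translates to VM_NOHUGEPAGE. The second is the plain mmap() plus
madvise(MADV_NOHUGEPAGE) variant.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request a thread stack the way a threading
 * library might, relying on MAP_STACK alone. Assumption taken from the
 * quoted diff context: MAP_STACK currently maps to VM_NOHUGEPAGE.
 */
static void *alloc_stack_map_stack(size_t size)
{
	void *p;

	assert(size > 0);
	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: plain anonymous mapping, then an explicit
 * opt-out of transparent huge pages via madvise().
 */
static void *alloc_stack_madvise(size_t size)
{
	void *p;

	assert(size > 0);
	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/*
	 * The hint is best effort here: a failing madvise() call is
	 * ignored and the mapping is still returned.
	 */
	(void)madvise(p, size, MADV_NOHUGEPAGE);
	return p;
}
```

Both helpers return NULL on failure and the caller releases the region
with munmap(). Only callers that use the first pattern would start
getting a growable mapping if MAP_STACK were changed to also set
VM_STACK.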