From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D496CEE32D for ; Wed, 9 Oct 2024 17:03:31 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AF79D6B00C4; Wed, 9 Oct 2024 13:03:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AA66A6B00C5; Wed, 9 Oct 2024 13:03:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9475C6B00C8; Wed, 9 Oct 2024 13:03:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 74FBE6B00C4 for ; Wed, 9 Oct 2024 13:03:30 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id C9674818A1 for ; Wed, 9 Oct 2024 17:03:27 +0000 (UTC) X-FDA: 82654684938.09.B376616 Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) by imf23.hostedemail.com (Postfix) with ESMTP id 18849140025 for ; Wed, 9 Oct 2024 17:03:27 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=aXnbnZZn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of jannh@google.com designates 209.85.128.52 as permitted sender) smtp.mailfrom=jannh@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1728493381; a=rsa-sha256; cv=none; b=YG4Uun+Z4i9X/5zS7+gyIMveRBa+s0x5xF+kefFeo6LU4/a1Z67byWiyPXiQKffHtxuE80 hU1ZYgz6G/syDC8jJaLRmBHrHkNvrEm+sema4ICnp20Woe8QfOp4Ya4nGRDEEurP6J3fIo tXHyPFENHcmTSXPDWmzs9tF2b7Yd/HI= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=aXnbnZZn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of jannh@google.com designates 209.85.128.52 as permitted sender) smtp.mailfrom=jannh@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1728493381; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8ec63M9Z2mnZ2BK+SUFLWuoDF9eppHWtyWp+65Wr+50=; b=w1njdyzybFy04Wic3D/CeAJUGvgUaOz5rgS8H/0mhsYoU8vE8KhzatCohvJ/kq+s/86yDp JICg0WzuQ5yvxhvkY4dnFLHu5WjxoB5qEH4UlVOdijAqjpQkwcq8aI7tYnamVDIti2+8mP kpmfAJGxR9qhTUo3Y/0WnyxiIXott4E= Received: by mail-wm1-f52.google.com with SMTP id 5b1f17b1804b1-431157f7e80so6635e9.1 for ; Wed, 09 Oct 2024 10:03:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1728493406; x=1729098206; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=8ec63M9Z2mnZ2BK+SUFLWuoDF9eppHWtyWp+65Wr+50=; b=aXnbnZZno1dfVsWQ4yIct/DAhodlvNKMEYZNUEN1vu+WRAP5ZwAO+CWRfISt3xSm6I wwqCMgsQOJ4ZX45NVzVff0hXK+e9/ZAH5adPZBpPj/js8AXEcDSOpQLqTMcjPVWWhqE8 E/Zb2aOOCgeDL4vZ8fdwNR/ClyJoPDCdrhMz/4Qv57pH1UMNzhCfNmk78qUEAA5U2VnH GlZ8p+r+lMAeR/UPpPX9cU22wf63/fS3NrY8Wx8T9yuIa2Ut50ag4rkE0mhDNKm6zcj/ 4dcnchlyjEsVSX8r/bWpUQhyhefyerqlrjhVRW844obbEhj3YJVEcVo/DPJsACTTR0xq ZycA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728493406; x=1729098206; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8ec63M9Z2mnZ2BK+SUFLWuoDF9eppHWtyWp+65Wr+50=; b=wCQMTBZGAcbbRp57oip0gY/8R1xc+UfNveBd4YruTCPB2wO5KHkbk/CtEXvcPQmZyb 849CKsxo/4rs1PcvWv2md7Jns1glj/PCXQj/XzbJRBjPVWs/XcbnXAy2NmcDoru4FXa/ 145wYit41KyeSDGt5PqqiVSLMcTgSE4zqgM/kyBt/8HKFOI0Glsp4c19Lk2Clti0R7uX wdYUUrNvW+o2qUtQDDZzfsDHh/38AZu1rfIpOTuzIH60+a4N4E5Mav3MbKrGYnc6s2eR 0P9XdqqmmA8HLLEN0JkKSmf6JGH9XvdO/L+KroNeC/uICctvyeOjWXKzvm1ecklnJ9qV VzJA== X-Forwarded-Encrypted: i=1; AJvYcCXZLVf+q+iMoEszBy2TxlU21BWtHV5pP5G59EBCRPpNETpW9ljFhOCgDiyrslKb9pX5S7WnOGTpow==@kvack.org X-Gm-Message-State: AOJu0YzFh6nEi2pTwp+1LpqrFQieHWsAc5cGPJZuKyEKHGOhC15fHXAr 5KtgmuZYarZzAXJ5aly8HhqIxXJnFgaYWHOKSheCgtAmD6ppPCzpldJawJwViRPekhZiAJ3dMVO VRiVk8S9Mjt+/b9m2JzaJ5srbafoT7ygTBcc8 X-Google-Smtp-Source: AGHT+IH3gl8juZOigw+MxQlp2jNC2mCofepON7jhbWcSWQ/mk8khIIP1Uw/HQRJPdHyuneyy1GivHmlFb8tRVjjzCKM= X-Received: by 2002:a05:600c:b8a:b0:42c:b0b0:513a with SMTP id 5b1f17b1804b1-43058cf488cmr5665585e9.2.1728493405368; Wed, 09 Oct 2024 10:03:25 -0700 (PDT) MIME-Version: 1.0 References: <20241008-stack-gap-inaccessible-v1-1-848d4d891f21@google.com> <1eda6e90-b8d1-4053-9757-8f59c3a6e7ee@lucifer.local> <3d334639-feb9-474d-a4ed-6aa6d2afb33d@lucifer.local> In-Reply-To: <3d334639-feb9-474d-a4ed-6aa6d2afb33d@lucifer.local> From: Jann Horn Date: Wed, 9 Oct 2024 19:02:47 +0200 Message-ID: Subject: Re: [PATCH] mm: Enforce a minimal stack gap even against inaccessible VMAs To: Lorenzo Stoakes Cc: Andrew Morton , Hugh Dickins , Oleg Nesterov , Michal Hocko , Helge Deller , Vlastimil Babka , Ben Hutchings , Willy Tarreau , Rik van Riel , linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Liam Howlett Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Stat-Signature: 5hwo7c3dxiogd4dpo4rkxrna7ojrdyfs X-Rspamd-Queue-Id: 18849140025 X-Rspamd-Server: rspam02 X-HE-Tag: 1728493407-927348 X-HE-Meta: U2FsdGVkX1/RjjNDAbj2KnZ4xnERUPT8jCU5IYc3zQyC1uzUE8W3F41he1YfvO5I/7q551+cpiX/+adwoCgUeaZQkpmUU4lm8CCtjIGwBFwrjvXZBhgYAXBGaCk0SGJaHl7rPHxjFB0bJVefcmS1h3RDTEiYzDAXAmfj16GTpS8s7QaoOBiWW21kq3GDpEpG5gJq89d2HNgZeJ5Mk9LBDbbcuOVwhS5Y/hQYa0CkN48CMjsoHizd2USIVFkHaLb+J86kguhhQw4ClccJkYgMTkarLkeyr562DuPZglzf9lgvB52F4zcsVnKBsK3tmG1NQHDS2sp7zTGSeRvr2lwwi34EcqT8BLdJFapiTjDCgzuNWnB+0OOwziy51A25eF2stB+L1fLYKJ3HIpDg4UOsAMSRPlDgjf/b+4Y38YRJ/zHnYPOsRZNCwWlHvBOV/9F0/m1rIHfi91asGvDVR0URO3fp6rwOaDKvzs4sN8xCWPH+NeWH2dQFIX770Np9ngYxdq/gR6OoI+xb8uwqua2ByJ783WFpyJc3BtxCekKfo0LU/9LDqfGAmiafiNyM4RrlEVQwhY0Wd4im6FKz3mZnZTAQlNqBdG3duZqaS2MJOAfm+pWKH5rDnJtkrp3+VEa3sitiSOOZZLPUwuT3tWsgiY6KbTqSFS530+uutp2FxudQAWnqVFazYAwCX4gfBwEGnz5VI/ry1TewYRJy0vcB+EHZseH1MPlQFeqwgUWWhZA0/FEjnujTI9hsmLEbCe405UPpxOVPsGZtLR1uRdAR7Gq3PrGiqg+kUh2z6kDcdajykxF0j6xHcEmD6GOjotIr/6bmsRViq0wZBMggqA20JMuWVnDczloijarJIkzwPSXWz7pgnoCYasaTF8te4UsTSmlQYoh/92pfpZpUSQp0fTSQI9L4/qupKJw0YnECJKlDo8pcPP9QsjkOtsov83dz3qXg/EtsM/c+JHLwZsz JC5BZMKk CPcZQeRrmVA/laBRpSaSqvDcOUZeFuWB6ElWeHR5uawHuD10QDwwVrrIu/x+dKYRYBd/5jUgGjt2XSa/VNp++pCh8B64i/RHaC8Fn7GYw4WWfELQ4qb4Z6a8HqyDWlO9cabaQCHwKYcNs2HI647fJSTmpkSmwmR0WcB3fbfPKifily94dse5JPHcm5e9JugWAnnHKQMhc/kyadh5c52lioKZp1Zv1LRdO8VydAAcvwB01A84tdt5/JMM+jGc0USQee1MTbX+s7PPeq9zW0ALvZ2WuE0bdDgTgbuJh7posAQDS7oY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Oct 9, 2024 at 4:53=E2=80=AFPM Lorenzo Stoakes wrote: > On Tue, Oct 08, 2024 at 04:20:13PM +0200, Jann Horn wrote: > > Though, while writing the above reproducer, I noticed another dodgy > > scenario regarding the stack gap: MAP_FIXED_NOREPLACE apparently > > ignores the stack guard region, because it only checks for VMA > > intersection, see this example: [snip] > > That could also be bad: MAP_FIXED_NOREPLACE exists, from what I > > understand, partly so that malloc implementations can use it to grow > > heap memory chunks (though glibc doesn't use it, I'm not sure who > > actually uses it that way). We wouldn't want a malloc implementation > > to grow a heap memory chunk until it is directly adjacent to a stack. > > It seems... weird to use it that way as you couldn't be sure you weren't > overwriting another VMA. Here I'm talking about MAP_FIXED_NOREPLACE, not MAP_FIXED. MAP_FIXED_NOREPLACE is supposed to be sort of like calling mmap() with an address hint, except that if creating the VMA at the provided hint is not possible, it fails. I remember Daniel Micay talking about using it in his memory allocator at some point... > > > > @@ -1155,10 +1157,47 @@ int expand_downwards(struct vm_area_struct = *vma, unsigned long address) > > > > /* Enforce stack_guard_gap */ > > > > prev =3D vma_prev(&vmi); > > > > /* Check that both stack segments have the same anon_vma? */ > > > > - if (prev) { > > > > - if (!(prev->vm_flags & VM_GROWSDOWN) && > > > > - vma_is_accessible(prev) && > > > > - (address - prev->vm_end < stack_guard_gap)) > > > > + if (prev && !(prev->vm_flags & VM_GROWSDOWN) && > > > > + (address - prev->vm_end < stack_guard_gap)) { > > > > + /* > > > > + * If the previous VMA is accessible, this is the nor= mal case > > > > + * where the main stack is growing down towards some = unrelated > > > > + * VMA. Enforce the full stack guard gap. > > > > + */ > > > > + if (vma_is_accessible(prev)) > > > > + return -ENOMEM; > > > > + > > > > + /* > > > > + * If the previous VMA is not accessible, we have a p= roblem: > > > > + * We can't tell what userspace's intent is. > > > > + * > > > > + * Case A: > > > > + * Maybe userspace wants to use the previous VMA as a > > > > + * "guard region" at the bottom of the main stack, in= which case > > > > + * userspace wants us to grow the stack until it is a= djacent to > > > > + * the guard region. Apparently some Java runtime env= ironments > > > > + * and Rust do that? > > > > + * That is kind of ugly, and in that case userspace r= eally ought > > > > + * to ensure that the stack is fully expanded immedia= tely, but > > > > + * we have to handle this case. > > > > > > Yeah we can't break userspace on this, no doubt somebody is relying o= n this > > > _somewhere_. > > > > It would have to be a new user who appeared after commit 1be7107fbe18. > > And they'd have to install a "guard vma" somewhere below the main > > stack, and they'd have to care so much about the size of the stack > > that a single page makes a difference. > > You did say 'Apparently some Java runtime environments and Rust do that' > though right? Or am I misunderstanding? Ah, sorry, the context for this is in the commit message of commit 561b5e0709e4, and the upstream discussion leading up to it (https://lore.kernel.org/all/1499126133.2707.20.camel@decadent.org.uk/T/). So before commit 1be7107fbe18, these workloads worked fine despite the kernel unconditionally enforcing a single-page gap; and only when 1be7107fbe18 changed that gap to be 1MB, people started seeing issues, which 561b5e0709e4 was supposed to address. So my idea with this patch was to revert the behavior for such workloads to the pre-1be7107fbe18 situation.