From: Paul Moore
Date: Wed, 19 Apr 2023 10:38:29 -0400
Subject: Re: [PATCH] mm/mmap: Map MAP_STACK to VM_STACK
To: Matthew Wilcox
Cc: Waiman Long, Hugh Dickins, Andrew Morton, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Joe Mario, Barry Marson, Rafael Aquini,
 Stephen Smalley, Eric Paris, selinux@vger.kernel.org, James Morris,
 "Serge E. Hallyn", linux-security-module@vger.kernel.org
References: <20230418210230.3495922-1-longman@redhat.com>
 <20230418141852.75e551e57e97f4b522957c5c@linux-foundation.org>
 <6c3c68b1-c4d4-dd82-58e8-f7013fb6c8e5@redhat.com>
 <22aee5ea-dd6b-ac2b-0b28-a25ee6602b48@redhat.com>

On Tue, Apr 18, 2023 at
11:24 PM Matthew Wilcox wrote:
> On Tue, Apr 18, 2023 at 09:45:34PM -0400, Waiman Long wrote:
> > On 4/18/23 21:36, Hugh Dickins wrote:
> > > On Tue, 18 Apr 2023, Waiman Long wrote:
> > > > On 4/18/23 17:18, Andrew Morton wrote:
> > > > > On Tue, 18 Apr 2023 17:02:30 -0400 Waiman Long wrote:
> > > > >
> > > > > > One of the flags of mmap(2) is MAP_STACK to request a memory segment
> > > > > > suitable for a process or thread stack. The kernel currently ignores
> > > > > > this flags. Glibc uses MAP_STACK when mmapping a thread stack. However,
> > > > > > selinux has an execstack check in selinux_file_mprotect() which disallows
> > > > > > a stack VMA to be made executable.
> > > > > >
> > > > > > Since MAP_STACK is a noop, it is possible for a stack VMA to be merged
> > > > > > with an adjacent anonymous VMA. With that merging, using mprotect(2)
> > > > > > to change a part of the merged anonymous VMA to make it executable may
> > > > > > fail. This can lead to sporadic failure of applications that need to
> > > > > > make those changes.
> > > > > "Sporadic failure of applications" sounds quite serious. Can you
> > > > > provide more details?
> > > > The problem boils down to the fact that it is possible for user code to mmap a
> > > > region of memory and then for the kernel to merge the VMA for that memory with
> > > > the VMA for one of the application's thread stacks. This is causing random
> > > > SEGVs with one of our large customer application.
> > > >
> > > > At a high level, this is what's happening:
> > > >
> > > > 1) App runs creating lots of threads.
> > > > 2) It mmap's 256K pages of anonymous memory.
> > > > 3) It writes executable code to that memory.
> > > > 4) It calls mprotect() with PROT_EXEC on that memory so
> > > > it can subsequently execute the code.
> > > >
> > > > The above mprotect() will fail if the mmap'd region's VMA gets merged with the
> > > > VMA for one of the thread stacks. That's because the default RHEL SELinux
> > > > policy is to not allow executable stacks.
> > > Then wouldn't the bug be at the SELinux end? VMAs may have been merged
> > > already, but the mprotect() with PROT_EXEC of the good non-stack range
> > > will then split that area off from the stack again - maybe the SELinux
> > > check does not understand that must happen?
> >
> > The SELinux check is done per VMA, not a region within a VMA. After VMA
> > merging, SELinux is probably not able to determine which part of a VMA is a
> > stack unless we keep that information somewhere and provide an API for
> > SELinux to query. That can be quite a lot of work. So the easiest way to
> > prevent this problem is to avoid merging a stack VMA with a regular
> > anonymous VMA.
>
> To paraphrase you, "Yes, SELinux is buggy, but we don't want to fix it".
>
> Cc'ing the SELinux people so it can be fixed properly.

SELinux needs some way to determine what memory region is currently
being used by an application's stacks.  The current logic can be found
in selinux_file_mprotect(); the relevant snippet is below:

  int selinux_file_mprotect(struct vm_area_struct *vma, ...)
  {
    ...
    } else if (!vma->vm_file &&
               ((vma->vm_start <= vma->vm_mm->start_stack &&
                 vma->vm_end >= vma->vm_mm->start_stack) ||
                vma_is_stack_for_current(vma))) {
      rc = avc_has_perm(&selinux_state, sid, sid, SECCLASS_PROCESS,
                        PROCESS__EXECSTACK, NULL);
    }
    ...
  }

If someone has a better, more foolproof way to determine an
application's stack please let us know, or better yet submit a patch
:)

-- 
paul-moore.com
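
For anyone following along, the stack test in the snippet quoted above
can be modelled in user space.  What follows is a minimal sketch, not
kernel code: struct model_vma, model_vma_covers(),
model_needs_execstack(), model_merge_example() and every address in it
are made up for illustration, and it assumes that
vma_is_stack_for_current() amounts to "the calling thread's stack
pointer lies inside the VMA".  It covers only the one branch quoted
above and nothing else in selinux_file_mprotect().  The point is to
show why an anonymous mapping that has been merged with a thread stack
would hit the execstack check, while the same mapping on its own would
not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the few VMA fields the check reads. */
struct model_vma {
	uintptr_t vm_start;	/* first address covered by the mapping */
	uintptr_t vm_end;	/* end of the mapping */
	bool has_file;		/* models vma->vm_file != NULL */
};

/* True when addr lies within the VMA, using the same bounds as the snippet. */
bool model_vma_covers(const struct model_vma *vma, uintptr_t addr)
{
	return vma->vm_start <= addr && vma->vm_end >= addr;
}

/*
 * Models the quoted condition: an anonymous VMA that covers either the
 * process's start_stack or the calling thread's stack pointer (the
 * assumed meaning of vma_is_stack_for_current()) needs execstack.
 */
bool model_needs_execstack(const struct model_vma *vma,
			   uintptr_t start_stack, uintptr_t cur_sp)
{
	if (vma->has_file)
		return false;
	return model_vma_covers(vma, start_stack) ||
	       model_vma_covers(vma, cur_sp);
}

/* Walks through the merge scenario from the thread with made-up addresses. */
void model_merge_example(void)
{
	const uintptr_t start_stack = 0x7ffd0000; /* main stack, far away */
	const uintptr_t cur_sp = 0x10004800;	  /* inside thread_stack */
	struct model_vma jit = { 0x10000000, 0x10004000, false };
	struct model_vma thread_stack = { 0x10004000, 0x10005000, false };
	/* What the two adjacent anonymous VMAs look like once merged. */
	struct model_vma merged = { jit.vm_start, thread_stack.vm_end, false };

	/* On its own, the code mapping is not treated as a stack. */
	assert(!model_needs_execstack(&jit, start_stack, cur_sp));
	/* The thread stack is, because the stack pointer lies inside it. */
	assert(model_needs_execstack(&thread_stack, start_stack, cur_sp));
	/* The merged VMA inherits that, since the test is made per VMA. */
	assert(model_needs_execstack(&merged, start_stack, cur_sp));
}
```

Calling model_merge_example() from a small main() runs the three
cases; the last assert is the situation described earlier in the
thread, where the merged VMA as a whole is what gets tested.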