From: Michael Ellerman
To: Lorenzo Stoakes, "Liam R. Howlett", linux-mm@kvack.org, Andrew Morton, Suren Baghdasaryan, Vlastimil Babka, Matthew Wilcox, sidhartha.kumar@oracle.com, "Paul E . McKenney", Bert Karwatzki, Jiri Olsa, linux-kernel@vger.kernel.org, Kees Cook, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v3 16/16] mm/mmap: Move may_expand_vm() check in mmap_region()
In-Reply-To: <0998f05b-9d5f-4b24-9030-22421e1dd859@lucifer.local>
References: <20240704182718.2653918-1-Liam.Howlett@oracle.com> <20240704182718.2653918-17-Liam.Howlett@oracle.com> <8fbb424d-a781-4e61-af7a-904e281eba8c@lucifer.local> <0998f05b-9d5f-4b24-9030-22421e1dd859@lucifer.local>
Date: Wed, 10 Jul 2024 22:28:01 +1000
Message-ID: <874j8x5t4e.fsf@mail.lhotse>

Lorenzo Stoakes writes:
> On Mon, Jul 08, 2024 at 04:43:15PM GMT, Liam R. Howlett wrote:
>> ...
>> The functionality here has changed
>> --- from ---
>> may_expand_vm() check
>> can_modify_mm() check
>> arch_unmap()
>> vms_gather_munmap_vmas()
>> ...
>>
>> --- to ---
>> can_modify_mm() check
>> arch_unmap()
>> vms_gather_munmap_vmas()
>> may_expand_vm() check
>> ...
>>
>> vms_gather_munmap_vmas() does nothing but figure out what to do later,
>> but could use memory and can fail.
>>
>> The user implications are:
>>
>> 1. The return type on the error may change to -EPERM from -ENOMEM, if
>> you are not allowed to expand and are trying to overwrite mseal()'ed
>> VMAs. That seems so very rare that I'm not sure it's worth mentioning.
>>
>>
>> 2. arch_unmap() called prior to may_expand_vm().
>> powerpc uses this to set mm->context.vdso = NULL if mm->context.vdso is
>> within the unmap range. User implication of this means that an
>> application may set the vdso to NULL prior to hitting the -ENOMEM case in
>> may_expand_vm() due to the address space limit.
>>
>> Assuming the removal of the vdso does not cause the application to seg
>> fault, then the user visible change is that any vdso call after a failed
>> mmap(MAP_FIXED) call would result in a seg fault. The only reason it
>> would fail is if the mapping process was attempting to map a large
>> enough area over the vdso (which is accounted and in the vma tree,
>> afaict) and ran out of memory.
>> Note that this situation could arise already since we could run out of
>> memory (not accounting) after the arch_unmap() call within the kernel.
>>
>> The code today can suffer the same fate, but not by the accounting
>> failure. It can happen due to failure to allocate a new vma,
>> do_vmi_munmap() failure after the arch_unmap() call, or any of the other
>> failure scenarios later in the mmap_region() function.
>>
>> At the very least, this requires an expanded change log.
>
> Indeed, also (as mentioned on IRC) I feel like we need to look at whether
> we _truly_ need this arch_unmap() call for a single, rather antiquated,
> architecture.

You can call it "niche" or "irrelevant" or "fringe", but "antiquated" is
factually wrong :) Power10 came out of the fab just a few years ago at
7nm.

> I mean why are they unmapping the VDSO, why is that valid, why does it need
> that field to be set to NULL, is it possible to signify that in some other
> way etc.?

It was originally for CRIU. So a niche workload on a niche architecture.

But from the commit that added it, it sounds like CRIU was using mremap,
which should be handled these days by vdso_mremap(). So it could be that
arch_unmap() is not actually needed for CRIU anymore.

Then I guess we have to decide if removing our arch_unmap() would be an
ABI break, regardless of whether CRIU needs it or not.

cheers