From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12D67C3DA7F for ; Mon, 12 Aug 2024 17:38:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A1CD56B0095; Mon, 12 Aug 2024 13:38:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CD176B0098; Mon, 12 Aug 2024 13:38:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 894A56B009A; Mon, 12 Aug 2024 13:38:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 6B4B26B0095 for ; Mon, 12 Aug 2024 13:38:55 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id CD12C1A031A for ; Mon, 12 Aug 2024 17:38:54 +0000 (UTC) X-FDA: 82444303788.21.C96E6D6 Received: from mail-ed1-f43.google.com (mail-ed1-f43.google.com [209.85.208.43]) by imf22.hostedemail.com (Postfix) with ESMTP id 036B9C002C for ; Mon, 12 Aug 2024 17:38:52 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fC+ORG5v; spf=pass (imf22.hostedemail.com: domain of jeffxu@google.com designates 209.85.208.43 as permitted sender) smtp.mailfrom=jeffxu@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723484264; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=iprzuAC7oR9BJI+gBUXU0X3c3yh4bHJSPdkmpXidPj0=; b=5QTDEHWiXn7eCAgjMTBTHuv1I/bdl3nnbJu9rX5btAbBRqFwmGekZw66UEkOmzxki63ag8 Bql6hQeaTcW4v5Nx4jwkGUz60zYVHJ1FPXxOLjZdt01B6R7ZkDM7rOHpvd9h/bjGVQ+eJ3 brnhxpnpJZmlisULTnSUpBIeoX5sn4g= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723484264; a=rsa-sha256; cv=none; b=MwONBotH8yfE+6gUgie5La96g+a849qDKobxLx4/XjXICU/42plQzSVaoFa/VZh+he5k/6 JkBkfMwcqcdk5F+v2MlCVtw9ie3NP0HAkoahXl7EuseoGkBwMeJPmHkAcGTmKk+P56czpx 5lLHS8vYPIOWmBBHh3EhgzfbiaZHknA= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fC+ORG5v; spf=pass (imf22.hostedemail.com: domain of jeffxu@google.com designates 209.85.208.43 as permitted sender) smtp.mailfrom=jeffxu@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-ed1-f43.google.com with SMTP id 4fb4d7f45d1cf-5a1b073d7cdso1173a12.0 for ; Mon, 12 Aug 2024 10:38:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1723484331; x=1724089131; darn=kvack.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=iprzuAC7oR9BJI+gBUXU0X3c3yh4bHJSPdkmpXidPj0=; b=fC+ORG5vybZyWXbLJljoXgKlgIbxvYrQLHRLwXf7Qlubf2c9LHRfTioQisHh1WZp3e +XyM3sr52Z4cpDWm/XWmO/ev/cw1gzeGzGIChqSSomFYJE7bMwiAz1y9DtHCyAoErX82 h98LMWIZ/P9qr2nN51f6p/EDromjAHLzEeBrd0jeQ+BjE/02u7gZ/h1iLuGuXpVyqBcm aT9BiNw8bh+9mVnBC36458XHQ+eQZ5DCBWcYcTXW+LDFgBUeDjbXunklMYL8SC556Ih7 KxROOgsSMeOAt7iXp90WOs5MWPLv/hE2xXwyoCNBP4B072dA07kUQQ96jGdbGZsDKfXy 0teQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723484331; x=1724089131; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iprzuAC7oR9BJI+gBUXU0X3c3yh4bHJSPdkmpXidPj0=; b=Xa4hDdIEYo9YLckTmj2rpLsq/Eq/uLENGRKqCO5ophEb6plzWMhdEl8Y1mkOoiBGJl bsWra2bf/yPbRA99Xq4Akcn4+lhwld98+P0PboDfVUGzvg1gBzubDxHv6+8jCxPGFsfa zCyG9Eo2nfqzwLeXzMLaV3VjNkl/odtI+EuAqZIp8FmIE29cZr5mEj2GeNHlZ52Wsotl V2LbTc1dxPNFTOSBXSuVvu6eNBPSUmhCRyCjpU585LGKruuRcwaUAEuPU3kUUvOCPxBe Wu4w30szhLa6oDSCGU2gfc8hBZUW4sZHhFdvKviUyNPSNkysTUaE6x0G6F7dKNC1atjT RlBA== X-Forwarded-Encrypted: i=1; AJvYcCWONg1jTJOeXEBGibvK3P2oTleAQYR0TPHRtgn7Y8KQQm4BGLwkLxZ1WtMbfsxt/VvkbHQyykYNsZzlFwL/O1YRMME= X-Gm-Message-State: AOJu0Yxiz95d+r6+MXxYPAN+UY0pGqnACCQFZpc/GvyZRTSCdPpxcFTG QaaIdBQ9H4+uKxs2ldXe7syjHVf2O8OVgvtbvDmBBhMTbO6YpjQecafykvhnSHeaO/fSrYqsdDM fE2waMIaK00v+sbLnK8hbuGXxtOO87LYmNbCI X-Google-Smtp-Source: AGHT+IHQyiPoaLvh95T/Ee6BxrCTzWeICyjqvFWYxngktv84QD/h42ujbUo/+38BhKZ0wfTKWHhTPsc4CGsp2V2mTzo= X-Received: by 2002:a05:6402:3591:b0:5a7:7f0f:b70b with SMTP id 4fb4d7f45d1cf-5bd4719d47emr1307a12.0.1723484330426; Mon, 12 Aug 2024 10:38:50 -0700 (PDT) MIME-Version: 1.0 References: <20240807211309.2729719-1-pedro.falcato@gmail.com> <20240807211309.2729719-3-pedro.falcato@gmail.com> <3hzwtm7jw25ng5gemkp42k5ypkfky25fxeevccnk2d6gcpft32@qwkwofgauqna> In-Reply-To: From: Jeff Xu Date: Mon, 12 Aug 2024 10:38:11 -0700 Message-ID: Subject: Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma To: "Liam R. Howlett" , Jeff Xu , Pedro Falcato , Andrew Morton , Vlastimil Babka , Lorenzo Stoakes , linux-mm@kvack.org, linux-kernel@vger.kernel.org, oliver.sang@intel.com, torvalds@linux-foundation.org, Michael Ellerman , Kees Cook Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 036B9C002C X-Stat-Signature: 6fmpj7yq15b963t4d19odmtjzybj3dm7 X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1723484332-779482 X-HE-Meta: U2FsdGVkX1+eMBIWaVHJJ0/xk68sblI0ch/+OjXFDVXMXz1yyt2a8hMyP0epA9WGOJQQHz5tDKs4Fijwib3/tFzaLJz63qYR0KPJtHIPqR9LC/UfjMCE2i34T8Ck8wzyJTHlsiAXqwYsJXCfz6e2xCVp5T/OFMSo0tYJbm8kg9mw2X9zsLOyxpPmGUknUgDPaC3g1sk+zR5wk4Med7oXlyyfx+WocQ6ua1UClg1kZ2D0ZPEM7hOTxSJc6+nl8vwKEtNNUCEZZMm+bS4lKAGpacEyizq5xFKwPn0v+SuPEnfPPpsd9a0T/O2CqvVHJmeoEXq6404+2tRQAUQ3YYLBw/LpTsAwiSI9dBDYsq7/D/hVdlF6sW9aFI92VtvP6bTC3NYOffmBvXcCUlWbSYB3lOrz6ZizE9l5s+tpF0S+zQnK2ZImbhEYSMyUIB8vFNZxvfWH5NYNOvoSklQ9XVtTN+ZzNad1vQS70A6A/lknjvNZ21c5n1XK1GqR5z9p/TIgf6BcL5D3q0NxtACdX6iGPU+dYE1ltuvRc1YyR3ntjQqecKkafm00quzaXdg3LrXE4ZKJ22/HIzuYQct4B9gJHFrZp2TdCXamxmP/E5kxiyiCZrc5ccOoKf0muuNtcgJI6xgqqJALFgwLVbjCdjenobDuM3lYQ2Q47KeVPKbW1uSblHOaiVabAgZEGD1RMA717jG4Vi7FeY63pkoGkKlhJKS3WQhzoCWsporGnLtdUzVFTc7aAqQ4SnIPbCVZHPEcuCGlbASP343RHuFCEhOCe5jieCbOJNs5n76wsqbkehSV9/+fRjWs9/AqOgMiw4lI3Is4BXLGWh4hO17zxC6tUs8ohHmpTrMDJyTTsbbHMN6PHta8+JE6pOm/IINC47XH1JLsfsQTntSxTds81skxUiQ3dvyW7fiJHT0YFJblNwHs5BpDc7mTApfv+ibdDpGnxqiVM0kKkeCAUOHRFFa l911RZXL xSXn6fXjt2OFioQ2NtFnXq6keeNol+ige1SAXtvlWJlLE4D21mbLiJ4tgGxOxf9cO18lsqiS+QbudG5pPA0yjYfRkAKL4VotXWwHCwAPVeWuMvaSYzfu0USwJqzLyksLyi9puCTkGyJkhV895OLLrgqqUq+OU8irVvgpPiLuhb4/U6PcRAqYrGHbfyEvj0sGS3lqHSbPh3K2/GoTcCiQU/G5u/RhG19xagMeszZBmwm2sL6VLAKYIVeIrVYRXD2BE3eWq6DVRZ6yb03w+UYsv2LKT8UOxBvSrvCSx4Pk2qae7C9478f7feUZicOXlZyoD3C9KHEutbjj/ua/JtRT1+ZsayMx3D2MWbaLCE7qBglONL3k= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Aug 12, 2024 at 9:58=E2=80=AFAM Liam R. Howlett wrote: > > * Jeff Xu [240812 10:30]: > > + Kees who commented on mseal() series before. Please keep Kees in the > > cc for future response to this series. > > > > On Fri, Aug 9, 2024 at 12:25=E2=80=AFPM Liam R. Howlett wrote: > > > > > > * Pedro Falcato [240809 14:53]: > > > > On Fri, Aug 9, 2024 at 5:48=E2=80=AFPM Liam R. Howlett wrote: > > > > > > > > > > * Liam R. Howlett [240809 12:15]: > > > > > > * Pedro Falcato [240807 17:13]: > > > > > > > We were doing an extra mmap tree traversal just to check if t= he entire > > > > > > > range is modifiable. This can be done when we iterate through= the VMAs > > > > > > > instead. > > > > > > > > > > > > > > Signed-off-by: Pedro Falcato > > > > > > > --- > > > > > > > mm/mmap.c | 13 +------------ > > > > > > > mm/vma.c | 23 ++++++++++++----------- > > > > > > > 2 files changed, 13 insertions(+), 23 deletions(-) > > > > > > > > > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > > > > > index 4a9c2329b09..c1c7a7d00f5 100644 > > > > > > > --- a/mm/mmap.c > > > > > > > +++ b/mm/mmap.c > > > > > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator = *vmi, struct vm_area_struct *vma, > > > > > > > unsigned long start, unsigned long end, struct li= st_head *uf, > > > > > > > bool unlock) > > > > > > > { > > > > > > > - struct mm_struct *mm =3D vma->vm_mm; > > > > > > > - > > > > > > > - /* > > > > > > > - * Check if memory is sealed before arch_unmap. > > > > > > > - * Prevent unmapping a sealed VMA. > > > > > > > - * can_modify_mm assumes we have acquired the lock on MM. > > > > > > > - */ > > > > > > > - if (unlikely(!can_modify_mm(mm, start, end))) > > > > > > > - return -EPERM; > > > > > > > - > > > > > > > - arch_unmap(mm, start, end); > > > > > > > - return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, = unlock); > > > > > > > + return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, e= nd, uf, unlock); > > > > > > > } > > > > > > > > > > > > > > /* > > > > > > > diff --git a/mm/vma.c b/mm/vma.c > > > > > > > index bf0546fe6ea..7a121bcc907 100644 > > > > > > > --- a/mm/vma.c > > > > > > > +++ b/mm/vma.c > > > > > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator = *vmi, struct vm_area_struct *vma, > > > > > > > if (end < vma->vm_end && mm->map_count >=3D sysct= l_max_map_count) > > > > > > > goto map_count_exceeded; > > > > > > > > > > > > > > + /* Don't bother splitting the VMA if we can't unm= ap it anyway */ > > > > > > > + if (!can_modify_vma(vma)) { > > > > > > > + error =3D -EPERM; > > > > > > > + goto start_split_failed; > > > > > > > + } > > > > > > > + > > > > > > > > > > > > Would this check be better placed in __split_vma()? It could r= eplace > > > > > > both this and the next chunk of code. > > > > > > > > > > not quite. > > > > > > > > Yeah, I was going to say that splitting a sealed VMA is okay (and w= e > > > > allow it on mlock and madvise). > > > > > > splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's > > > feature. Jeff? > > > > > Splitting a sealed VMA is wrong. > > Whoever wants to split a sealed VMA should answer this question > > first: what is the use case for splitting a sealed VMA? > > If we lower the checks to __split_vma() and anywhere that does entire > vma modifications, then we would avoid modifications of the vma. This > particular loop breaks on the final vma, if it needs splitting, so we'd > still need the check in the main loop to ensure the full vma isn't > mseal()'ed. Splitting the start/end could be covered by the > __split_vma() function. > > > > > The V2 series doesn't have selftest change which indicates lack of > > testing. The out-of-loop check is positioned nearer to the API entry > > point and separated from internal business logic, thereby minimizing > > the testing requirements. However, as we move the sealing check > > further inward and intertwine it with business logic, greater test > > coverage becomes necessary to ensure the correctness of sealing > > is preserved. > > I would have expected more complete test coverage and not limited to > what is expected to happen with an up front test. Changes may happen > that you aren't Cc'ed on (or when you are away, etc) that could cause a > silent failure which could go undetected for a prolonged period of time. > > If splitting a vma isn't okay, then it would be safer to test at least > in some scenarios in the upstream tests. Ideally, all tests are > upstream so everyone can run the testing. > We will want to be careful about our expectation of test coverage offered in selftest. When adding mseal, I added 40+ test cases to ensure mseal didn't regress on existing mm api, i.e. you can take the mseal test , make a small modification (removing seal=3D1) and run on an old version of kernel and they will pass. I think it is wrong to expect the selftest is all it takes to find a regression if the dev is doing a *** major design/feature change ***. While it is possible to write test cases to guide all future changes, doing so requires much bigger effort with diminishing returns, essentially it is testing an "impossible to reach cases" in existing code. Thanks -Jeff > Thanks, > Liam