From: Jeff Xu
Date: Wed, 21 Aug 2024 09:15:52 -0700
Subject: Re: [PATCH v3 2/7] mm/munmap: Replace can_modify_mm with can_modify_vma
To: Pedro Falcato
Cc: Andrew Morton, "Liam R. Howlett", Vlastimil Babka, Lorenzo Stoakes,
 Shuah Khan, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, oliver.sang@intel.com,
 torvalds@linux-foundation.org, Michael Ellerman, Kees Cook
References: <20240817-mseal-depessimize-v3-0-d8d2e037df30@gmail.com>
 <20240817-mseal-depessimize-v3-2-d8d2e037df30@gmail.com>
In-Reply-To: <20240817-mseal-depessimize-v3-2-d8d2e037df30@gmail.com>

On Fri, Aug 16,
2024 at 5:18 PM Pedro Falcato wrote:
>
> We were doing an extra mmap tree traversal just to check if the entire
> range is modifiable. This can be done when we iterate through the VMAs
> instead.
>
> Signed-off-by: Pedro Falcato
> ---
>  mm/mmap.c | 11 +----------
>  mm/vma.c  | 19 ++++++++++++-------
>  2 files changed, 13 insertions(+), 17 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3af256bacef3..30ae4cb5cec9 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1740,16 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>                 unsigned long start, unsigned long end, struct list_head *uf,
>                 bool unlock)
>  {
> -       struct mm_struct *mm = vma->vm_mm;
> -
> -       /*
> -        * Check if memory is sealed, prevent unmapping a sealed VMA.
> -        * can_modify_mm assumes we have acquired the lock on MM.
> -        */
> -       if (unlikely(!can_modify_mm(mm, start, end)))
> -               return -EPERM;

Another approach to improve perf is to clone the vmi (since it already
points to the first VMA) and pass the cloned vmi/vma into the
can_modify_mm check; that removes the cost of re-finding the first VMA.
can_modify_mm then continues from the cloned vmi/vma to the end of the
address range, so there is still some perf cost there. However, most
address ranges in the real world fall within a single VMA, so in
practice, in the 99.9% case, the cost is the same as checking a single
VMA.
This will help preserve the nice sealing feature (if one of the VMAs is
sealed, the entire address range is left unmodified).

> -
> -       return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> +       return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
>  }
>
>  /*
> diff --git a/mm/vma.c b/mm/vma.c
> index 84965f2cd580..5850f7c0949b 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>                 if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
>                         goto map_count_exceeded;
>
> +               /* Don't bother splitting the VMA if we can't unmap it anyway */
> +               if (!can_modify_vma(vma)) {
> +                       error = -EPERM;
> +                       goto start_split_failed;
> +               }
> +
>                 error = __split_vma(vmi, vma, start, 1);
>                 if (error)
>                         goto start_split_failed;
> @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>          */
>         next = vma;
>         do {
> +               if (!can_modify_vma(next)) {
> +                       error = -EPERM;
> +                       goto modify_vma_failed;
> +               }
> +
>                 /* Does it split the end? */
>                 if (next->vm_end > end) {
>                         error = __split_vma(vmi, next, end, 0);
> @@ -815,6 +826,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>         __mt_destroy(&mt_detach);
>         return 0;
>
> +modify_vma_failed:
>  clear_tree_failed:
>  userfaultfd_error:
>  munmap_gather_failed:
> @@ -860,13 +872,6 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
>         if (end == start)
>                 return -EINVAL;
>
> -       /*
> -        * Check if memory is sealed, prevent unmapping a sealed VMA.
> -        * can_modify_mm assumes we have acquired the lock on MM.
> -        */
> -       if (unlikely(!can_modify_mm(mm, start, end)))
> -               return -EPERM;
> -
>         /* Find the first overlapping VMA */
>         vma = vma_find(vmi, end);
>         if (!vma) {
>
> --
> 2.46.0
>
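
For illustration only, the "clone the iterator, check the whole range first"
idea from the reply can be modeled in userspace roughly as below. This is a
sketch under stated assumptions, not kernel code: struct toy_vma, struct
toy_iter, toy_can_modify_range() and toy_munmap() are hypothetical names, the
iterator is just an array index, and -1 stands in for -EPERM. It only shows
the property being argued for: if any VMA in the range is sealed, the call
fails before anything is modified, and the pre-check starts from the
already-found first VMA rather than searching for it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a VMA; not the kernel's struct. */
struct toy_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	bool sealed;
};

/* Toy iterator: an index into a sorted, non-overlapping array of VMAs. */
struct toy_iter {
	const struct toy_vma *vmas;
	size_t count;
	size_t pos;		/* already positioned at the first VMA in range */
};

/*
 * Walk a copy of the iterator from its current position up to `end` and
 * report whether every VMA on the way is unsealed. The iterator is passed
 * by value, so the caller's position is left untouched (the "clone").
 */
static bool toy_can_modify_range(struct toy_iter it, unsigned long end)
{
	for (; it.pos < it.count && it.vmas[it.pos].start < end; it.pos++) {
		if (it.vmas[it.pos].sealed)
			return false;
	}
	return true;
}

/*
 * Returns 0 on success, or -1 (standing in for -EPERM) if any VMA in the
 * range is sealed. On failure nothing has been "unmapped" and the caller's
 * iterator has not moved.
 */
static int toy_munmap(struct toy_iter *it, unsigned long end, size_t *unmapped)
{
	assert(it != NULL && unmapped != NULL);

	*unmapped = 0;
	if (!toy_can_modify_range(*it, end))
		return -1;

	/* Counting VMAs is a placeholder for the real split/unmap work. */
	for (; it->pos < it->count && it->vmas[it->pos].start < end; it->pos++)
		(*unmapped)++;
	return 0;
}
```

With three adjacent toy VMAs where only the middle one is sealed, a request
covering all three fails with nothing touched, while a request covering only
the first one succeeds. When the range lies within a single VMA, the
pre-check loop runs exactly once, which is the common case the reply refers
to.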