From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CBD1CD8CAE for ; Tue, 10 Oct 2023 18:11:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E44838E0003; Tue, 10 Oct 2023 14:11:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DF4A58D0002; Tue, 10 Oct 2023 14:11:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CE2EC8E0003; Tue, 10 Oct 2023 14:11:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id BDA1E8D0002 for ; Tue, 10 Oct 2023 14:11:16 -0400 (EDT) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 7699FC03BD for ; Tue, 10 Oct 2023 18:11:15 +0000 (UTC) X-FDA: 81330343710.03.E944662 Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) by imf17.hostedemail.com (Postfix) with ESMTP id 898AE40014 for ; Tue, 10 Oct 2023 18:11:13 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=iT2dcJvA; spf=pass (imf17.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.128.51 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1696961473; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CMKJqRCieFxS9Yt+I9ZmXS+e5dIH6+uWGrTvHacyvUw=; b=GeL13PXSGqRauSdjCMYKPVqoAj6XUaB15fQtajJoM7U5Gi+2LXqSRRdq5bfvAQa3FYb4RD nJhLD3hr4Y/+iZrXw5cmxiJzF+LzD3SuapXX9nQxUWJ66JvjLJJ8OYFzassQaAe+WOqtY2 GlfOAWjXStNW0rfvxgnosr+EEk1LmRo= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1696961473; a=rsa-sha256; cv=none; b=7a/6+Okm+2vBWhRJlvb8j5CJ/CaJBj1gPPBRV5yI8vquc47g4J5OB83kgQ6jOqYzGDz/fz k21HEp0QT8ZgQHraQPaqIKPSMAx9E1Axr1XbQjx4ulLyJrVLjLbt5CVohT27KL0EAqB1a+ PwoT8I+PjNINYlZ/iEIyQeH2vlYSVC0= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=iT2dcJvA; spf=pass (imf17.hostedemail.com: domain of lstoakes@gmail.com designates 209.85.128.51 as permitted sender) smtp.mailfrom=lstoakes@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-406650da82bso55236625e9.3 for ; Tue, 10 Oct 2023 11:11:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696961472; x=1697566272; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=CMKJqRCieFxS9Yt+I9ZmXS+e5dIH6+uWGrTvHacyvUw=; b=iT2dcJvAifgOOVAPmERKvOkndvxd1e1lNdc2MNCFLeOIzbs30pluwFcSx8EzcdE51+ +QPXd8LVUyVBZo0OZq/JMTjIZMfnyAFtR4+U5kF5dOBnw4na2T5SshVMpzLjR8rY3Ikb DhgOleJzoXRCv7OkaqlcJGSe1D3EoFSWYa0fwDwJa1senF4p+WFooq5kgpiVV6avaQAA RJBeh+Ngonso1UaoWfVO5bHYlxNuZrHMiPPxFMxZ7s9ZCRlGQ5DjgKSfYBlmF5u2YYsN N5GjRNDSTL6ybzbZ0+/0sTy+yiBr9Q/lkAdF+W1KJMLu5cqUee876QNq3VQfz2Irg2yu hv8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696961472; x=1697566272; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=CMKJqRCieFxS9Yt+I9ZmXS+e5dIH6+uWGrTvHacyvUw=; b=iK8LDxzcdr3dMnaeneGe/u+z6w0rETrO4RAxDgFwYP4n3w1XNowB0ac2ASkENcGz8P x56MMiE3xTgEdWanqXULKPOoIRK3dAx8v7vWWCuAUADvpBmNwD9bDzcTGQhRiss/c7XS siE7/ZM1rcGC18hjRNqs5cde8HW6GBA0XZPEoyWJxgcbycI6MU4I2qxaQ/ijRi4ezHiN EORZH+reXa67sM47zhYMiR8gD0As8PMsEodKOuTg/b31fXQry1aqPf2FZSXE3PEkZe+g gSBlDOCDkTaMUv2AjA2Vbyiu36R5wX/xQUutSLXfn+s1dstvtQ0MyCJde6tp75I6PZn4 4sVQ== X-Gm-Message-State: AOJu0YzCLSHGMORiTFrxyhhPUku+GfYT1Ruw7wTLG+D1gg1Vdu+CAGZz 6bhuIazhsepeiZgFKjJ8v5U= X-Google-Smtp-Source: AGHT+IEJFrBCYlgJDwaTTRTyrKvS3QdTooKYqgrvhUV60Y68KFy9TuvSrFzFRjMUjCSS7M24CuRAlQ== X-Received: by 2002:a1c:4b12:0:b0:405:4a8c:d4f8 with SMTP id y18-20020a1c4b12000000b004054a8cd4f8mr17490235wma.30.1696961471450; Tue, 10 Oct 2023 11:11:11 -0700 (PDT) Received: from localhost ([2a00:23c5:dc8c:8701:1663:9a35:5a7b:1d76]) by smtp.gmail.com with ESMTPSA id s26-20020a7bc39a000000b004064cd71aa8sm14712788wmj.34.2023.10.10.11.11.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Oct 2023 11:11:10 -0700 (PDT) Date: Tue, 10 Oct 2023 19:11:09 +0100 From: Lorenzo Stoakes To: Vlastimil Babka Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Alexander Viro , Christian Brauner , "=Liam R . Howlett" , linux-fsdevel@vger.kernel.org Subject: Re: [PATCH v2 2/5] mm: abstract the vma_merge()/split_vma() pattern for mprotect() et al. Message-ID: References: <4b8ffa8e-2d34-e51e-504f-9aa43ded70eb@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4b8ffa8e-2d34-e51e-504f-9aa43ded70eb@suse.cz> X-Rspamd-Queue-Id: 898AE40014 X-Rspam-User: X-Stat-Signature: gk7ga36be5kpf9cjekt64k41a94dh83u X-Rspamd-Server: rspam03 X-HE-Tag: 1696961473-179521 X-HE-Meta: U2FsdGVkX18HdfU3SNy3SiIetzoUO7Qju9pQihzgtP2zHYgi7X3e+IGL5oW/sgqFX+xQtQRS+lEiJ4MXUk9nHvgjMO7UEX84PH7cZlWCUVBD8prGQ6PaY4IJ3zWGgGvHOiQoTpqFa8yMQNFhaqRuKM0jveNn598SJuiWH3Hd9UnbL0fEp5SvhGFA1kCKDV9nprYzSXNb59ILamM2/zaa1/OC3tHoDuDfJwD+ixXI9tsOX/63IhAnwUEq+I/+1lH0q85eyQQvLRB/wjuMcLN9NVvrGNQbWKLpu4DGqzn6HWulSg2sa7sTv9mBXiXrBpWJ58kL359NFtxifOKVrDOL5Dq6k0BvhalYUVsATv7hgeGe/BZbJ6OVzAsowTe+mRiBQg6kVLsMPbPE787TGwHt5coEJlciJDDQTOn8FPQbBydlcznZvIeTEXIJ1TDcllBvua1rH87KBQxmAzXlph3D3T6OaKf3+A3hr5L7y07TS2gxdtqoa0tNJXNu6ahhxYNQxXL9F9MQG/S94jgqOpsP7n+uNCvOpMLmlKrRkGeiNgTXktiWuzU7C4GoH1waMFq//yoQRI4fKpfAOdgwB36oi4bh6bgXcvF9iTWyv2hg63CVbawPO3r7P2vRyQ7/CFiQi0Ham8TZixRMYUA6Im6l8/KIerecRTxyDNrCr3ku414VyB8CyKxpJ2ky1R1wtSl63LCdS58C/HtbxLlJ4zLnxIJYZX7c2YEkxdCNtB37GaoJszGQsSRoWSmzmZa3jGASXat4N1Vqcv4mgJ7zUEWJ9nBAqCSYygp4TCrzwmXjrVAtTiWnqncVaccYyEEG580eyjZq4LOzGvE8JkeAg7WDSSQ62sYB4R3yD+qxUKddWAjtPiLQdo88srYTr2/MWagy2M95bRBkk3B4rEM06pnCDTiHb+eL7Ok19v8IMBegrZRMMPL0wu7Z7wPjfuG10tUL2joDtAQv4Om5IgcfhO4 u4yi0rsz 9vlBrix5UKPbHFaqQfNPcLRO1TJdyobCBnDlnKrMN6kkmIbb0xrMjph4Dq35OH3HarFZQEFtwoL6g+KBJYGzyCDelLkrxrZ8a8g7rpGc6kV3yAQ9SZ1zNpB1YtVLqw0DG1HgCUYIxonfchZmqNHakO8WdZfZIhzr5G1tTY+K+xk7KFEM0D4jWy4yVn+EIOuk6s7e0lXykeHvoOq8b1OZlM4lC4oyDyANd9+viTtVXKPE99wRfwM0ex9R7gfthik9CK5XQ40E+MPFEKY72BapPMLUmKsdBNMRn8ogzFjQLtT13//bPsCrUwoG8qR8gAUIGJm9kzHNW30yMfxK8hp/xNkiH0xrK9ZEQhAVub3gEn+GjjmKokP+7PlQbnsmaH96XRkFSvflGjY8Sgz63BGldAvjxqAMQ391Hv0DkLAj80Nw2dAij2iqDFjMnw1b+IJ4TySrwPkXVfUcz2mqlUe9lMchRIA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Oct 10, 2023 at 09:12:21AM +0200, Vlastimil Babka wrote: > On 10/9/23 22:53, Lorenzo Stoakes wrote: > > mprotect() and other functions which change VMA parameters over a range > > each employ a pattern of:- > > > > 1. Attempt to merge the range with adjacent VMAs. > > 2. If this fails, and the range spans a subset of the VMA, split it > > accordingly. > > > > This is open-coded and duplicated in each case. Also in each case most of > > the parameters passed to vma_merge() remain the same. > > > > Create a new function, vma_modify(), which abstracts this operation, > > accepting only those parameters which can be changed. > > > > To avoid the mess of invoking each function call with unnecessary > > parameters, create inline wrapper functions for each of the modify > > operations, parameterised only by what is required to perform the action. > > > > Note that the userfaultfd_release() case works even though it does not > > split VMAs - since start is set to vma->vm_start and end is set to > > vma->vm_end, the split logic does not trigger. > > > > In addition, since we calculate pgoff to be equal to vma->vm_pgoff + (start > > - vma->vm_start) >> PAGE_SHIFT, and start - vma->vm_start will be 0 in this > > instance, this invocation will remain unchanged. > > > > Signed-off-by: Lorenzo Stoakes > > Reviewed-by: Vlastimil Babka > > some nits below: > > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -2437,6 +2437,51 @@ int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, > > return __split_vma(vmi, vma, addr, new_below); > > } > > > > +/* > > + * We are about to modify one or multiple of a VMA's flags, policy, userfaultfd > > + * context and anonymous VMA name within the range [start, end). > > + * > > + * As a result, we might be able to merge the newly modified VMA range with an > > + * adjacent VMA with identical properties. > > + * > > + * If no merge is possible and the range does not span the entirety of the VMA, > > + * we then need to split the VMA to accommodate the change. > > + */ > > This could describe the return value too? It's not entirely trivial. > But I also wonder if we could just return 'vma' for the split_vma() cases > and the callers could simply stop distinguishing whether there was a merge > or split, and their code would become even simpler? > It seems to me most callers don't care, except mprotect, see below... What a great idea, thanks! I have worked through and implemented this and it does indeed work and simplify things even further, cheers! > > > +struct vm_area_struct *vma_modify(struct vma_iterator *vmi, > > + struct vm_area_struct *prev, > > + struct vm_area_struct *vma, > > + unsigned long start, unsigned long end, > > + unsigned long vm_flags, > > + struct mempolicy *policy, > > + struct vm_userfaultfd_ctx uffd_ctx, > > + struct anon_vma_name *anon_name) > > +{ > > + pgoff_t pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); > > + struct vm_area_struct *merged; > > + > > + merged = vma_merge(vmi, vma->vm_mm, prev, start, end, vm_flags, > > + vma->anon_vma, vma->vm_file, pgoff, policy, > > + uffd_ctx, anon_name); > > + if (merged) > > + return merged; > > + > > + if (vma->vm_start < start) { > > + int err = split_vma(vmi, vma, start, 1); > > + > > + if (err) > > + return ERR_PTR(err); > > + } > > + > > + if (vma->vm_end > end) { > > + int err = split_vma(vmi, vma, end, 0); > > + > > + if (err) > > + return ERR_PTR(err); > > + } > > + > > + return NULL; > > +} > > + > > /* > > * do_vmi_align_munmap() - munmap the aligned region from @start to @end. > > * @vmi: The vma iterator > > diff --git a/mm/mprotect.c b/mm/mprotect.c > > index b94fbb45d5c7..6f85d99682ab 100644 > > --- a/mm/mprotect.c > > +++ b/mm/mprotect.c > > @@ -581,7 +581,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb, > > long nrpages = (end - start) >> PAGE_SHIFT; > > unsigned int mm_cp_flags = 0; > > unsigned long charged = 0; > > - pgoff_t pgoff; > > + struct vm_area_struct *merged; > > int error; > > > > if (newflags == oldflags) { > > @@ -625,34 +625,19 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb, > > } > > } > > > > - /* > > - * First try to merge with previous and/or next vma. > > - */ > > - pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); > > - *pprev = vma_merge(vmi, mm, *pprev, start, end, newflags, > > - vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), > > - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); > > - if (*pprev) { > > - vma = *pprev; > > - VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); > > - goto success; > > + merged = vma_modify_flags(vmi, *pprev, vma, start, end, newflags); > > + if (IS_ERR(merged)) { > > + error = PTR_ERR(merged); > > + goto fail; > > } > > > > - *pprev = vma; > > - > > - if (start != vma->vm_start) { > > - error = split_vma(vmi, vma, start, 1); > > - if (error) > > - goto fail; > > - } > > - > > - if (end != vma->vm_end) { > > - error = split_vma(vmi, vma, end, 0); > > - if (error) > > - goto fail; > > + if (merged) { > > + vma = *pprev = merged; > > + VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); > > This VM_WARN_ON() is AFAICS the only piece of code that cares about merged > vs split. Would it be ok to call it for the split vma cases as well, or > maybe remove it? This is simply asserting a fundamental requirement of vma_merge() in general, i.e. that the flags of what was merged match those of the VMA that is being merged. This is already checked in the VMA merge implementation, so this feels super redundant, so I think we're good to simply remove it. > > > + } else { > > + *pprev = vma; > > } > > > > -success: > > /* > > * vm_flags and vm_page_prot are protected by the mmap_lock > > * held in write mode. >