From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1035DCD68ED for ; Tue, 10 Oct 2023 07:12:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 75CCA8D00AD; Tue, 10 Oct 2023 03:12:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 70D508D006D; Tue, 10 Oct 2023 03:12:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5D5548D00AD; Tue, 10 Oct 2023 03:12:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 4F4F78D006D for ; Tue, 10 Oct 2023 03:12:26 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 14425801A4 for ; Tue, 10 Oct 2023 07:12:26 +0000 (UTC) X-FDA: 81328683492.18.161E102 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by imf12.hostedemail.com (Postfix) with ESMTP id B4D4840017 for ; Tue, 10 Oct 2023 07:12:23 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b="gp6OEPo/"; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="l6cXuIk/"; dmarc=none; spf=pass (imf12.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.28 as permitted sender) smtp.mailfrom=vbabka@suse.cz ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1696921944; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6BrX/uXOL7D1lmY2XvIGdfkkJkG5aTexZ1MApRXO43s=; b=HtTjka4rEp/D55YpQ6Ra7j6VAhCGPV2ERKGslI/v9vWCPzYa3ld1ZjXt8uC4WqlYx8WLB2 pe3aTBdEbPgVRATsGD/mdR7MY8+xTrnl4HkopXEc110MmCeBUXDPOyN44Mqb9W/ApkzerD DoexZV2thBz2iO5+086pX8jhTo1wNEk= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b="gp6OEPo/"; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="l6cXuIk/"; dmarc=none; spf=pass (imf12.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.28 as permitted sender) smtp.mailfrom=vbabka@suse.cz ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1696921944; a=rsa-sha256; cv=none; b=7YnBC0ySP2ILaBl/bIMQGKm8vKSRwAmwtn+yxwHwj1H4OiH9gy9SWyx9UrQyMv//vcDWe2 CPsyPNSQ1AiIzxmP6xbxPnhUINhWfnP6ANnmiyuEzPODci7eUVgZTcgf2PVqqnIbRzEn2I nW1ScROAvhx9udaToPPkz9XVm81X1jw= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id DD5802188D; Tue, 10 Oct 2023 07:12:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1696921941; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6BrX/uXOL7D1lmY2XvIGdfkkJkG5aTexZ1MApRXO43s=; b=gp6OEPo/ofBTxZnCA6Ex4rXYzAHkJ35kX9i44gUB/3hR6WKUQZk640Q+IODk+07eOJzt4W EH87+lIfZZbgsv2ZljfxoUPtYKYLCkTsx+rJXD+FRqbgnmPD73BgkVcT5CoD+5HCpaJCkh NZQtxf+Pd8owA9/5IONe27uJ0QoVfcM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1696921941; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6BrX/uXOL7D1lmY2XvIGdfkkJkG5aTexZ1MApRXO43s=; b=l6cXuIk/aTWTnfxOtNw1OCvx1Wbspdss4ceLQYDb6EzkSExgmLtHsyPeBT9bNdg8YEJy+6 Oj8QoCWQsVSu8oBA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id BCEA01358F; Tue, 10 Oct 2023 07:12:21 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 2OyQLFX5JGXZMgAAMHmgww (envelope-from ); Tue, 10 Oct 2023 07:12:21 +0000 Message-ID: <4b8ffa8e-2d34-e51e-504f-9aa43ded70eb@suse.cz> Date: Tue, 10 Oct 2023 09:12:21 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: Re: [PATCH v2 2/5] mm: abstract the vma_merge()/split_vma() pattern for mprotect() et al. Content-Language: en-US To: Lorenzo Stoakes , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Alexander Viro , Christian Brauner Cc: "=Liam R . Howlett" , linux-fsdevel@vger.kernel.org References: From: Vlastimil Babka In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: B4D4840017 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: 7qwzbb1kxs4mbxtgdesjowc5op1yuney X-HE-Tag: 1696921943-27349 X-HE-Meta: U2FsdGVkX1/5UHBfGk7l8HyLVk6r0i63v93Uh2Xmya+Wl7KSn2XqzJWBxlPL2FuN0ZS5WvhJV/0WrD/nnVA5gw+NitX5PCHpN511YBk6+cnoWlWsELwsr1HOFvaBdPILCP9qKk3jxTZoEn6kyMqFUkD++iTmyN0zKhVSDWnalX2jenZKTrF20bNIprhKUwjcVCwDE3t2WQD9DFCukwbC4ApxuvAMfHl6+ah8dpkZjtlxTNKrVNQKJ1GP9H9cjIVzzC7CKIiyT2PJUoQGjQvoiCL951UJLYz8jkW/6PiSoliDQFAxgzvtGdDfCRM7zQF74j6TX8M/Ig2YlMQvgfY3505+RIrey+TISZ0/jTOD3XgocG7aNPDY9WMyWpefOrvKLfstnbv58BAcE7XSyF/4R9Wv4tWMEsPe8SJ86G5g+HFXejO5TIX+bmwawV4qScDbYMJFsHgScfApc9EsAAE3ph1yV23bpeA/e9sXrLfO3Rbk425dIIfMb031K9cO1BDtO1izjCr/RYgjaygkOSrDWY94euKasRNPjdd1maEhK9khBLlkJ21r9mp+lhmImk3s+iheTXcaJFmEJ0GOV5Wd5LbY3EnSrxGiwRq9jXzagQjrNiwJoP/ub3lOMueU/7NkL+Wx41OVN4kUVoFdHFYPg0cIz/mhTS+odJZ6zxpHeyZVLGUft6JGKvE838l8BydtxgzL02V0hhw7DgLQaMMkO4nFpKQiAyNlApqD7hgDAH+dXqwZCN8UI5tCQlqm6xrntWC71NN/F3gJo+Ar8PbC4ROOmW5YfaXnhapFFpluCmlMg9aAHlQugIbcHG7P46+k2WDCtEmGfeeGrU6m0YXOaNLHO5Z5HwGlzg6w8L7OXGYJ4Aqnenbja6qB3qBnkDnEg8ObaVYDXS9xBQrK8H86C3HYsLyixul630a0V6GBlxN6PeCpUCvRaKH+Fd/wqlLwNj8b5EdPK6h46nO95P4 EukL3dh8 PnPi7u+FO5hFftyynx1Gjl9Jp5h9vrOXTiT0oH6EFU93M7bcM6dodCfFBcjanuuQIztDvOnm6/Qfe8hAoc27R/bKJRspsS/ZHVX/6enMj3YBB5F0T7kiokTwxSojBxujc1Ceuhz4FQvXaxCjc3yrNcyQYwpGrbMrdAR75rpeGgtRxh38ty4I5pajKiYLCorl9ZMiorNj6NFue8zD70pIkzGuxdqg87abmePUaSETq8Ywkg/KfHHC1c5WHS0SU+fj2trIBu+UvKYTZYjM= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 10/9/23 22:53, Lorenzo Stoakes wrote: > mprotect() and other functions which change VMA parameters over a range > each employ a pattern of:- > > 1. Attempt to merge the range with adjacent VMAs. > 2. If this fails, and the range spans a subset of the VMA, split it > accordingly. > > This is open-coded and duplicated in each case. Also in each case most of > the parameters passed to vma_merge() remain the same. > > Create a new function, vma_modify(), which abstracts this operation, > accepting only those parameters which can be changed. > > To avoid the mess of invoking each function call with unnecessary > parameters, create inline wrapper functions for each of the modify > operations, parameterised only by what is required to perform the action. > > Note that the userfaultfd_release() case works even though it does not > split VMAs - since start is set to vma->vm_start and end is set to > vma->vm_end, the split logic does not trigger. > > In addition, since we calculate pgoff to be equal to vma->vm_pgoff + (start > - vma->vm_start) >> PAGE_SHIFT, and start - vma->vm_start will be 0 in this > instance, this invocation will remain unchanged. > > Signed-off-by: Lorenzo Stoakes Reviewed-by: Vlastimil Babka some nits below: > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -2437,6 +2437,51 @@ int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, > return __split_vma(vmi, vma, addr, new_below); > } > > +/* > + * We are about to modify one or multiple of a VMA's flags, policy, userfaultfd > + * context and anonymous VMA name within the range [start, end). > + * > + * As a result, we might be able to merge the newly modified VMA range with an > + * adjacent VMA with identical properties. > + * > + * If no merge is possible and the range does not span the entirety of the VMA, > + * we then need to split the VMA to accommodate the change. > + */ This could describe the return value too? It's not entirely trivial. But I also wonder if we could just return 'vma' for the split_vma() cases and the callers could simply stop distinguishing whether there was a merge or split, and their code would become even simpler? It seems to me most callers don't care, except mprotect, see below... > +struct vm_area_struct *vma_modify(struct vma_iterator *vmi, > + struct vm_area_struct *prev, > + struct vm_area_struct *vma, > + unsigned long start, unsigned long end, > + unsigned long vm_flags, > + struct mempolicy *policy, > + struct vm_userfaultfd_ctx uffd_ctx, > + struct anon_vma_name *anon_name) > +{ > + pgoff_t pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); > + struct vm_area_struct *merged; > + > + merged = vma_merge(vmi, vma->vm_mm, prev, start, end, vm_flags, > + vma->anon_vma, vma->vm_file, pgoff, policy, > + uffd_ctx, anon_name); > + if (merged) > + return merged; > + > + if (vma->vm_start < start) { > + int err = split_vma(vmi, vma, start, 1); > + > + if (err) > + return ERR_PTR(err); > + } > + > + if (vma->vm_end > end) { > + int err = split_vma(vmi, vma, end, 0); > + > + if (err) > + return ERR_PTR(err); > + } > + > + return NULL; > +} > + > /* > * do_vmi_align_munmap() - munmap the aligned region from @start to @end. > * @vmi: The vma iterator > diff --git a/mm/mprotect.c b/mm/mprotect.c > index b94fbb45d5c7..6f85d99682ab 100644 > --- a/mm/mprotect.c > +++ b/mm/mprotect.c > @@ -581,7 +581,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb, > long nrpages = (end - start) >> PAGE_SHIFT; > unsigned int mm_cp_flags = 0; > unsigned long charged = 0; > - pgoff_t pgoff; > + struct vm_area_struct *merged; > int error; > > if (newflags == oldflags) { > @@ -625,34 +625,19 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb, > } > } > > - /* > - * First try to merge with previous and/or next vma. > - */ > - pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); > - *pprev = vma_merge(vmi, mm, *pprev, start, end, newflags, > - vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), > - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); > - if (*pprev) { > - vma = *pprev; > - VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); > - goto success; > + merged = vma_modify_flags(vmi, *pprev, vma, start, end, newflags); > + if (IS_ERR(merged)) { > + error = PTR_ERR(merged); > + goto fail; > } > > - *pprev = vma; > - > - if (start != vma->vm_start) { > - error = split_vma(vmi, vma, start, 1); > - if (error) > - goto fail; > - } > - > - if (end != vma->vm_end) { > - error = split_vma(vmi, vma, end, 0); > - if (error) > - goto fail; > + if (merged) { > + vma = *pprev = merged; > + VM_WARN_ON((vma->vm_flags ^ newflags) & ~VM_SOFTDIRTY); This VM_WARN_ON() is AFAICS the only piece of code that cares about merged vs split. Would it be ok to call it for the split vma cases as well, or maybe remove it? > + } else { > + *pprev = vma; > } > > -success: > /* > * vm_flags and vm_page_prot are protected by the mmap_lock > * held in write mode.