From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 6 Jun 2024 17:14:40 -0700
Subject: Re: [RFC PATCH 2/5] mm/mmap: Split do_vmi_align_munmap() into a gather and complete operation
To: Liam R. Howlett
Cc: Andrii Nakryiko, Vlastimil Babka, sidhartha.kumar@oracle.com, Matthew Wilcox, Lorenzo Stoakes, linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20240531163217.1584450-3-Liam.Howlett@oracle.com>
References: <20240531163217.1584450-1-Liam.Howlett@oracle.com> <20240531163217.1584450-3-Liam.Howlett@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Fri, May 31, 2024 at 9:33 AM Liam R. Howlett wrote:
>
> Split the munmap function into a gathering of vmas and a cleanup of the
> gathered vmas.  This is necessary for the later patches in the series.
>
> Signed-off-by: Liam R. Howlett

The refactoring looks correct but it's quite painful to verify all the
pieces. Not sure if it could have been refactored in more gradual steps...

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

> ---
>  mm/mmap.c | 143 ++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 101 insertions(+), 42 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 31d464e6a656..fad40d604c64 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2340,6 +2340,7 @@ static inline void remove_mt(struct mm_struct *mm, struct ma_state *mas)
>
>  		if (vma->vm_flags & VM_ACCOUNT)
>  			nr_accounted += nrpages;
> +

nit: here and below a couple of unnecessary empty lines.

>  		vm_stat_account(mm, vma->vm_flags, -nrpages);
>  		remove_vma(vma, false);
>  	}
> @@ -2545,33 +2546,45 @@ struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi,
>  			   vma->vm_userfaultfd_ctx, anon_vma_name(vma));
>  }
>
> +
> +static inline void abort_munmap_vmas(struct ma_state *mas_detach)
> +{
> +	struct vm_area_struct *vma;
> +	int limit;
> +
> +	limit = mas_detach->index;
> +	mas_set(mas_detach, 0);
> +	/* Re-attach any detached VMAs */
> +	mas_for_each(mas_detach, vma, limit)
> +		vma_mark_detached(vma, false);
> +
> +	__mt_destroy(mas_detach->tree);
> +}
> +
>  /*
> - * do_vmi_align_munmap() - munmap the aligned region from @start to @end.
> + * vmi_gather_munmap_vmas() - Put all VMAs within a range into a maple tree
> + * for removal at a later date. Handles splitting first and last if necessary
> + * and marking the vmas as isolated.
> + *
>  * @vmi: The vma iterator
>  * @vma: The starting vm_area_struct
>  * @mm: The mm_struct
>  * @start: The aligned start address to munmap.
>  * @end: The aligned end address to munmap.
>  * @uf: The userfaultfd list_head
> - * @unlock: Set to true to drop the mmap_lock.  unlocking only happens on
> - * success.
> + * @mas_detach: The maple state tracking the detached tree
>  *
> - * Return: 0 on success and drops the lock if so directed, error and leaves the
> - * lock held otherwise.
> + * Return: 0 on success
>  */
>  static int
> -do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  		    struct mm_struct *mm, unsigned long start,
> -		    unsigned long end, struct list_head *uf, bool unlock)
> +		    unsigned long end, struct list_head *uf,
> +		    struct ma_state *mas_detach, unsigned long *locked_vm)
>  {
> -	struct vm_area_struct *prev, *next = NULL;
> -	struct maple_tree mt_detach;
> -	int count = 0;
> +	struct vm_area_struct *next = NULL;
>  	int error = -ENOMEM;
> -	unsigned long locked_vm = 0;
> -	MA_STATE(mas_detach, &mt_detach, 0, 0);
> -	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
> -	mt_on_stack(mt_detach);
> +	int count = 0;
>
>  	/*
>  	 * If we need to split any vma, do it now to save pain later.
> @@ -2610,15 +2623,14 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  			goto end_split_failed;
>  		}
>  		vma_start_write(next);
> -		mas_set(&mas_detach, count);
> -		error = mas_store_gfp(&mas_detach, next, GFP_KERNEL);
> +		mas_set(mas_detach, count++);
> +		if (next->vm_flags & VM_LOCKED)
> +			*locked_vm += vma_pages(next);
> +
> +		error = mas_store_gfp(mas_detach, next, GFP_KERNEL);
>  		if (error)
>  			goto munmap_gather_failed;
>  		vma_mark_detached(next, true);
> -		if (next->vm_flags & VM_LOCKED)
> -			locked_vm += vma_pages(next);
> -
> -		count++;
>  		if (unlikely(uf)) {
>  			/*
>  			 * If userfaultfd_unmap_prep returns an error the vmas
> @@ -2643,7 +2655,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
>  	/* Make sure no VMAs are about to be lost. */
>  	{
> -		MA_STATE(test, &mt_detach, 0, 0);
> +		MA_STATE(test, mas_detach->tree, 0, 0);
>  		struct vm_area_struct *vma_mas, *vma_test;
>  		int test_count = 0;
>
> @@ -2663,13 +2675,29 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	while (vma_iter_addr(vmi) > start)
>  		vma_iter_prev_range(vmi);
>
> -	error = vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL);
> -	if (error)
> -		goto clear_tree_failed;
> +	return 0;
>
> -	/* Point of no return */
> -	mm->locked_vm -= locked_vm;
> +userfaultfd_error:
> +munmap_gather_failed:
> +end_split_failed:
> +	abort_munmap_vmas(mas_detach);
> +start_split_failed:
> +map_count_exceeded:
> +	return error;
> +}
> +
> +static void
> +vmi_complete_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +		struct mm_struct *mm, unsigned long start,
> +		unsigned long end, bool unlock, struct ma_state *mas_detach,
> +		unsigned long locked_vm)
> +{
> +	struct vm_area_struct *prev, *next;
> +	int count;
> +
> +	count = mas_detach->index + 1;
>  	mm->map_count -= count;
> +	mm->locked_vm -= locked_vm;
>  	if (unlock)
>  		mmap_write_downgrade(mm);
>
> @@ -2682,30 +2710,61 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	 * We can free page tables without write-locking mmap_lock because VMAs
>  	 * were isolated before we downgraded mmap_lock.
>  	 */
> -	mas_set(&mas_detach, 1);
> -	unmap_region(mm, &mas_detach, vma, prev, next, start, end, count,
> +	mas_set(mas_detach, 1);
> +	unmap_region(mm, mas_detach, vma, prev, next, start, end, count,
>  		     !unlock);
>  	/* Statistics and freeing VMAs */
> -	mas_set(&mas_detach, 0);
> -	remove_mt(mm, &mas_detach);
> +	mas_set(mas_detach, 0);
> +	remove_mt(mm, mas_detach);
>  	validate_mm(mm);
>  	if (unlock)
>  		mmap_read_unlock(mm);
>
> -	__mt_destroy(&mt_detach);
> -	return 0;
> +	__mt_destroy(mas_detach->tree);
> +}
>
> -clear_tree_failed:
> -userfaultfd_error:
> -munmap_gather_failed:
> -end_split_failed:
> -	mas_set(&mas_detach, 0);
> -	mas_for_each(&mas_detach, next, end)
> -		vma_mark_detached(next, false);
> +/*
> + * do_vmi_align_munmap() - munmap the aligned region from @start to @end.
> + * @vmi: The vma iterator
> + * @vma: The starting vm_area_struct
> + * @mm: The mm_struct
> + * @start: The aligned start address to munmap.
> + * @end: The aligned end address to munmap.
> + * @uf: The userfaultfd list_head
> + * @unlock: Set to true to drop the mmap_lock.  unlocking only happens on
> + * success.
> + *
> + * Return: 0 on success and drops the lock if so directed, error and leaves the
> + * lock held otherwise.
> + */
> +static int
> +do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +		    struct mm_struct *mm, unsigned long start,
> +		    unsigned long end, struct list_head *uf, bool unlock)
> +{
> +	struct maple_tree mt_detach;
> +	MA_STATE(mas_detach, &mt_detach, 0, 0);
> +	mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
> +	mt_on_stack(mt_detach);
> +	int error;
> +	unsigned long locked_vm = 0;
>
> -	__mt_destroy(&mt_detach);
> -start_split_failed:
> -map_count_exceeded:
> +	error = vmi_gather_munmap_vmas(vmi, vma, mm, start, end, uf,
> +				       &mas_detach, &locked_vm);
> +	if (error)
> +		goto gather_failed;
> +
> +	error = vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL);
> +	if (error)
> +		goto clear_area_failed;
> +
> +	vmi_complete_munmap_vmas(vmi, vma, mm, start, end, unlock, &mas_detach,
> +				 locked_vm);
> +	return 0;
> +
> +clear_area_failed:
> +	abort_munmap_vmas(&mas_detach);
> +gather_failed:
>  	validate_mm(mm);
>  	return error;
>  }
> --
> 2.43.0
>