References: <20240531163217.1584450-1-Liam.Howlett@oracle.com> <20240531163217.1584450-4-Liam.Howlett@oracle.com>
In-Reply-To: <20240531163217.1584450-4-Liam.Howlett@oracle.com>
From: Suren Baghdasaryan
Date: Fri, 7 Jun 2024 07:38:45 -0700
Subject: Re: [RFC PATCH 3/5] mm/mmap: Introduce vma_munmap_struct for use in munmap operations
To: "Liam R. Howlett"
Cc: Andrii Nakryiko, Vlastimil Babka, sidhartha.kumar@oracle.com, Matthew Wilcox, Lorenzo Stoakes, linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
On Fri, May 31, 2024 at 9:33 AM Liam R. Howlett wrote:
>
> Use a structure to pass along all the necessary information and counters
> involved in removing vmas from the mm_struct.
>
> Signed-off-by: Liam R. Howlett

Reviewed-by: Suren Baghdasaryan

> ---
>  mm/internal.h |  16 ++++++
>  mm/mmap.c     | 133 +++++++++++++++++++++++++++++---------------------
>  2 files changed, 94 insertions(+), 55 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index b2c75b12014e..6ebf77853d68 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1428,6 +1428,22 @@ struct vma_prepare {
>         struct vm_area_struct *remove2;
>  };
>
> +/*
> + * vma munmap operation
> + */
> +struct vma_munmap_struct {
> +       struct vma_iterator *vmi;
> +       struct mm_struct *mm;
> +       struct vm_area_struct *vma;     /* The first vma to munmap */
> +       struct list_head *uf;           /* Userfaultfd list_head */
> +       unsigned long start;            /* Aligned start addr */
> +       unsigned long end;              /* Aligned end addr */
> +       int vma_count;                  /* Number of vmas that will be removed */
> +       unsigned long nr_pages;         /* Number of pages being removed */
> +       unsigned long locked_vm;        /* Number of locked pages */
> +       bool unlock;                    /* Unlock after the munmap */
> +};
> +
>  void __meminit __init_single_page(struct page *page, unsigned long pfn,
>                                  unsigned long zone, int nid);
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index fad40d604c64..57f2383245ea 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -459,6 +459,31 @@ static inline void init_vma_prep(struct vma_prepare *vp,
>         init_multi_vma_prep(vp, vma, NULL, NULL, NULL);
>  }
>
> +/*
> + * init_vma_munmap() - Initializer wrapper for vma_munmap_struct
> + * @vms: The vma munmap struct
> + * @vmi: The vma iterator
> + * @vma: The first vm_area_struct to munmap
> + * @start: The aligned start address to munmap
> + * @end: The aligned end address to munmap
> + * @uf: The userfaultfd list_head
> + * @unlock: Unlock after the operation.  Only unlocked on success
> + */
> +static inline void init_vma_munmap(struct vma_munmap_struct *vms,
> +               struct vma_iterator *vmi, struct vm_area_struct *vma,
> +               unsigned long start, unsigned long end, struct list_head *uf,
> +               bool unlock)
> +{
> +       vms->vmi = vmi;
> +       vms->vma = vma;
> +       vms->mm = vma->vm_mm;
> +       vms->start = start;
> +       vms->end = end;
> +       vms->unlock = unlock;
> +       vms->uf = uf;
> +       vms->vma_count = 0;
> +       vms->nr_pages = vms->locked_vm = 0;
> +}
>
>  /*
>   * vma_prepare() - Helper function for handling locking VMAs prior to altering
> @@ -2340,7 +2365,6 @@ static inline void remove_mt(struct mm_struct *mm, struct ma_state *mas)
>
>                 if (vma->vm_flags & VM_ACCOUNT)
>                         nr_accounted += nrpages;
> -
>                 vm_stat_account(mm, vma->vm_flags, -nrpages);
>                 remove_vma(vma, false);
>         }
> @@ -2562,29 +2586,20 @@ static inline void abort_munmap_vmas(struct ma_state *mas_detach)
>  }
>
>  /*
> - * vmi_gather_munmap_vmas() - Put all VMAs within a range into a maple tree
> + * vms_gather_munmap_vmas() - Put all VMAs within a range into a maple tree
>   * for removal at a later date.  Handles splitting first and last if necessary
>   * and marking the vmas as isolated.
>   *
> - * @vmi: The vma iterator
> - * @vma: The starting vm_area_struct
> - * @mm: The mm_struct
> - * @start: The aligned start address to munmap.
> - * @end: The aligned end address to munmap.
> - * @uf: The userfaultfd list_head
> + * @vms: The vma munmap struct
>   * @mas_detach: The maple state tracking the detached tree
>   *
>   * Return: 0 on success
>   */
> -static int
> -vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
> -                   struct mm_struct *mm, unsigned long start,
> -                   unsigned long end, struct list_head *uf,
> -                   struct ma_state *mas_detach, unsigned long *locked_vm)
> +static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
> +               struct ma_state *mas_detach)
>  {
>         struct vm_area_struct *next = NULL;
>         int error = -ENOMEM;
> -       int count = 0;
>
>         /*
>          * If we need to split any vma, do it now to save pain later.
> @@ -2595,17 +2610,18 @@ vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>          */
>
>         /* Does it split the first one? */
> -       if (start > vma->vm_start) {
> +       if (vms->start > vms->vma->vm_start) {
>
>                 /*
>                  * Make sure that map_count on return from munmap() will
>                  * not exceed its limit; but let map_count go just above
>                  * its limit temporarily, to help free resources as expected.
>                  */
> -               if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> +               if (vms->end < vms->vma->vm_end &&
> +                   vms->mm->map_count >= sysctl_max_map_count)
>                         goto map_count_exceeded;
>
> -               error = __split_vma(vmi, vma, start, 1);
> +               error = __split_vma(vms->vmi, vms->vma, vms->start, 1);
>                 if (error)
>                         goto start_split_failed;
>         }
> @@ -2614,24 +2630,24 @@ vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>          * Detach a range of VMAs from the mm. Using next as a temp variable as
>          * it is always overwritten.
>          */
> -       next = vma;
> +       next = vms->vma;
>         do {
>                 /* Does it split the end? */
> -               if (next->vm_end > end) {
> -                       error = __split_vma(vmi, next, end, 0);
> +               if (next->vm_end > vms->end) {
> +                       error = __split_vma(vms->vmi, next, vms->end, 0);
>                         if (error)
>                                 goto end_split_failed;
>                 }
>                 vma_start_write(next);
> -               mas_set(mas_detach, count++);
> +               mas_set(mas_detach, vms->vma_count++);
>                 if (next->vm_flags & VM_LOCKED)
> -                       *locked_vm += vma_pages(next);
> +                       vms->locked_vm += vma_pages(next);
>
>                 error = mas_store_gfp(mas_detach, next, GFP_KERNEL);
>                 if (error)
>                         goto munmap_gather_failed;
>                 vma_mark_detached(next, true);
> -               if (unlikely(uf)) {
> +               if (unlikely(vms->uf)) {
>                         /*
>                          * If userfaultfd_unmap_prep returns an error the vmas
>                          * will remain split, but userland will get a
> @@ -2641,16 +2657,17 @@ vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>                          * split, despite we could. This is unlikely enough
>                          * failure that it's not worth optimizing it for.
>                          */
> -                       error = userfaultfd_unmap_prep(next, start, end, uf);
> +                       error = userfaultfd_unmap_prep(next, vms->start,
> +                                                      vms->end, vms->uf);
>
>                         if (error)
>                                 goto userfaultfd_error;
>                 }
>  #ifdef CONFIG_DEBUG_VM_MAPLE_TREE
> -               BUG_ON(next->vm_start < start);
> -               BUG_ON(next->vm_start > end);
> +               BUG_ON(next->vm_start < vms->start);
> +               BUG_ON(next->vm_start > vms->end);
>  #endif
> -       } for_each_vma_range(*vmi, next, end);
> +       } for_each_vma_range(*(vms->vmi), next, vms->end);
>
>  #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
>         /* Make sure no VMAs are about to be lost. */
> @@ -2659,21 +2676,21 @@ vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>                 struct vm_area_struct *vma_mas, *vma_test;
>                 int test_count = 0;
>
> -               vma_iter_set(vmi, start);
> +               vma_iter_set(vms->vmi, vms->start);
>                 rcu_read_lock();
> -               vma_test = mas_find(&test, count - 1);
> -               for_each_vma_range(*vmi, vma_mas, end) {
> +               vma_test = mas_find(&test, vms->vma_count - 1);
> +               for_each_vma_range(*(vms->vmi), vma_mas, vms->end) {
>                         BUG_ON(vma_mas != vma_test);
>                         test_count++;
> -                       vma_test = mas_next(&test, count - 1);
> +                       vma_test = mas_next(&test, vms->vma_count - 1);
>                 }
>                 rcu_read_unlock();
> -               BUG_ON(count != test_count);
> +               BUG_ON(vms->vma_count != test_count);
>         }
>  #endif
>
> -       while (vma_iter_addr(vmi) > start)
> -               vma_iter_prev_range(vmi);
> +       while (vma_iter_addr(vms->vmi) > vms->start)
> +               vma_iter_prev_range(vms->vmi);
>
>         return 0;
>
> @@ -2686,38 +2703,44 @@ vmi_gather_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
>         return error;
>  }
>
> -static void
> -vmi_complete_munmap_vmas(struct vma_iterator *vmi, struct vm_area_struct *vma,
> -               struct mm_struct *mm, unsigned long start,
> -               unsigned long end, bool unlock, struct ma_state *mas_detach,
> -               unsigned long locked_vm)
> +/*
> + * vmi_complete_munmap_vmas() - Update mm counters, unlock if directed, and free
> + * all VMA resources.
> + *
> + * do_vmi_align_munmap() - munmap the aligned region from @start to @end.
> + * @vms: The vma munmap struct
> + * @mas_detach: The maple state of the detached vmas
> + *
> + */
> +static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
> +               struct ma_state *mas_detach)
>  {
>         struct vm_area_struct *prev, *next;
> -       int count;
> +       struct mm_struct *mm;
>
> -       count = mas_detach->index + 1;
> -       mm->map_count -= count;
> -       mm->locked_vm -= locked_vm;
> -       if (unlock)
> +       mm = vms->mm;
> +       mm->map_count -= vms->vma_count;
> +       mm->locked_vm -= vms->locked_vm;
> +       if (vms->unlock)
>                 mmap_write_downgrade(mm);
>
> -       prev = vma_iter_prev_range(vmi);
> -       next = vma_next(vmi);
> +       prev = vma_iter_prev_range(vms->vmi);
> +       next = vma_next(vms->vmi);
>         if (next)
> -               vma_iter_prev_range(vmi);
> +               vma_iter_prev_range(vms->vmi);
>
>         /*
>          * We can free page tables without write-locking mmap_lock because VMAs
>          * were isolated before we downgraded mmap_lock.
>          */
>         mas_set(mas_detach, 1);
> -       unmap_region(mm, mas_detach, vma, prev, next, start, end, count,
> -                    !unlock);
> +       unmap_region(mm, mas_detach, vms->vma, prev, next, vms->start, vms->end,
> +                    vms->vma_count, !vms->unlock);
>         /* Statistics and freeing VMAs */
>         mas_set(mas_detach, 0);
>         remove_mt(mm, mas_detach);
>         validate_mm(mm);
> -       if (unlock)
> +       if (vms->unlock)
>                 mmap_read_unlock(mm);
>
>         __mt_destroy(mas_detach->tree);
> @@ -2746,11 +2769,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>         MA_STATE(mas_detach, &mt_detach, 0, 0);
>         mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
>         mt_on_stack(mt_detach);
> +       struct vma_munmap_struct vms;
>         int error;
> -       unsigned long locked_vm = 0;
>
> -       error = vmi_gather_munmap_vmas(vmi, vma, mm, start, end, uf,
> -                                      &mas_detach, &locked_vm);
> +       init_vma_munmap(&vms, vmi, vma, start, end, uf, unlock);
> +
> +       error = vms_gather_munmap_vmas(&vms, &mas_detach);
>         if (error)
>                 goto gather_failed;
>
> @@ -2758,8 +2782,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>         if (error)
>                 goto clear_area_failed;
>
> -       vmi_complete_munmap_vmas(vmi, vma, mm, start, end, unlock, &mas_detach,
> -                                locked_vm);
> +       vms_complete_munmap_vmas(&vms, &mas_detach);
>         return 0;
>
>  clear_area_failed:
> --
> 2.43.0
>
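
For readers following the thread, the shape of the refactoring in the quoted patch (bundle the inputs and running counters into one struct, initialize it once, then hand it to a gather step and a complete step) can be sketched in plain userspace C. This is only an illustration under stated assumptions and is not kernel code: struct range, struct munmap_ctx, struct totals, init_munmap_ctx(), ctx_gather() and ctx_complete() are all hypothetical names, and an array of address ranges stands in for the VMA tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA: a half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
	bool locked;
};

/* Hypothetical stand-in for the mm-wide counters the complete step updates. */
struct totals {
	int map_count;
	unsigned long locked;
};

/* Hypothetical analogue of vma_munmap_struct: inputs plus running counters. */
struct munmap_ctx {
	const struct range *ranges;	/* all known ranges */
	int nr_ranges;
	unsigned long start;		/* start of the region to remove */
	unsigned long end;		/* end of the region to remove */
	int count;			/* ranges gathered so far */
	unsigned long locked;		/* total size of gathered locked ranges */
};

/* Fill in the inputs and zero the counters, as init_vma_munmap() does. */
static void init_munmap_ctx(struct munmap_ctx *ctx, const struct range *ranges,
			    int nr_ranges, unsigned long start,
			    unsigned long end)
{
	assert(start <= end);
	ctx->ranges = ranges;
	ctx->nr_ranges = nr_ranges;
	ctx->start = start;
	ctx->end = end;
	ctx->count = 0;
	ctx->locked = 0;
}

/* Gather step: count overlapping ranges and accumulate the locked size. */
static void ctx_gather(struct munmap_ctx *ctx)
{
	for (int i = 0; i < ctx->nr_ranges; i++) {
		const struct range *r = &ctx->ranges[i];

		if (r->end <= ctx->start || r->start >= ctx->end)
			continue;	/* no overlap with the region */
		ctx->count++;
		if (r->locked)
			ctx->locked += r->end - r->start;
	}
}

/* Complete step: apply the gathered counters to the global totals. */
static void ctx_complete(const struct munmap_ctx *ctx, struct totals *t)
{
	t->map_count -= ctx->count;
	t->locked -= ctx->locked;
}
```

A caller would declare one struct munmap_ctx on the stack, call init_munmap_ctx() once, and pass the same pointer to ctx_gather() and then ctx_complete(). The counters travel inside the struct, so neither step needs out-parameters such as the unsigned long *locked_vm argument that the patch removes.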