From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36E33C27C54 for ; Thu, 6 Jun 2024 17:14:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BF0556B00A0; Thu, 6 Jun 2024 13:14:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BABA26B00A3; Thu, 6 Jun 2024 13:14:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A40E46B00A4; Thu, 6 Jun 2024 13:14:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 885D56B00A0 for ; Thu, 6 Jun 2024 13:14:08 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 3B5DE1A174A for ; Thu, 6 Jun 2024 17:14:08 +0000 (UTC) X-FDA: 82201111776.07.7211E70 Received: from mail-yb1-f178.google.com (mail-yb1-f178.google.com [209.85.219.178]) by imf30.hostedemail.com (Postfix) with ESMTP id 578E28001B for ; Thu, 6 Jun 2024 17:14:06 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=gAHfmrop; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of surenb@google.com designates 209.85.219.178 as permitted sender) smtp.mailfrom=surenb@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1717694046; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=hRK/mnhg9puUTTzwI/wwQsIKXT3wkOFAvZtn/HKANeU=; b=fEmNVza5PM0tc/QIaje0EF5eccbaaLR4rDvCt1166ldCeOhCFAzZ41Sm1ca0ry2bKIOeVZ k3oq2s3H9lAyu1nNAnxewFkM5SECx5c68tQLcWQyztbTekF45f/ldl8kYRZo/QC/BoJTpc oZEJ7mlcR9xkCGErC53xUAmFSFcdfHk= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1717694046; a=rsa-sha256; cv=none; b=mSKpMGIz64qpAO7VhqTHfIBZSc/X8sbxxciFu28OLAozRwW3TaTJ5h3RqegFyG4ObVI7lu vTDQitFDzhfQvKetk4ohF+7sgif5DFCpdvcsPoQ948IOluPtEPACVLsAeQ08qS19W1XP64 +GC0F9zm4YXMwIiwlmVWlcGu57gNMFQ= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=gAHfmrop; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf30.hostedemail.com: domain of surenb@google.com designates 209.85.219.178 as permitted sender) smtp.mailfrom=surenb@google.com Received: by mail-yb1-f178.google.com with SMTP id 3f1490d57ef6-dfaf39fc890so192570276.3 for ; Thu, 06 Jun 2024 10:14:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1717694045; x=1718298845; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=hRK/mnhg9puUTTzwI/wwQsIKXT3wkOFAvZtn/HKANeU=; b=gAHfmrop9KCkK2EGFIvT2KeKoZnDhIk7qCKIbxw6dUrEh0ThXeSRLiVZe/wVh49vUa fStcBmBOtPlTt8abIVCPB03K+IJdZaKXWmbb4KXKc/i/k/++/HwhQ+KUMqnHZ/ZhxnBI IN7W52ncb+v6oyV8ioQBy0QJs1ZauQIh4gD2POBworz7ReYcSfpQ4JzkTzTqqvy7UaX2 PXt5zGt3E6Hz9c1/cI3RDxmJ8nWp5lwXsqGGXUAvetar1eQ7XMZoOpKTvJrTfFhllyAi Oa89I0J43JuEJP1iuLx8n2qzlth3HUmxJbiGlQxaqpjVE6RPB0oqa8OCTqJ5a5Z2gCBZ TokQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717694045; x=1718298845; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hRK/mnhg9puUTTzwI/wwQsIKXT3wkOFAvZtn/HKANeU=; b=tbc3h6nj3IpZLs2/K3pG4G2JHDGSfCVZmTqfKg+C1kX1xQOysHo0WY2TxfZU3X7LkY 6VRB9vlKD5IZvLty64g0HtgNcsKVujE2pCPYYtQAP1utqnXemftDJ1v5u9o9cdKN/Lj0 q3hrU/EpnARy1zWMbtwv/crVM+qi4Jxk0Da6ffwrKFlTHLeSajO/wV0nYllHFXnR63yb BDRl2zjiJLw5E4UAzsfrIWgSO2jjkhdDxKhITZ71J6d/JL1WquGKMB+dmgDFZWW9lzgt GRe87C4M0jNdZwIQ+L7MUeS9U9cgzNXBgbZzpq5CJXXMHuHM+RRdyl4x6UiUtIfvSdIt /CFQ== X-Forwarded-Encrypted: i=1; AJvYcCXLZDTwMyiSENHpfCQCPvlMNddeogXN4IVfDV5ldwCD3zeA7QW0TDLzW1syJ/jpu+jGmzDeom6wm3k6tcUX3jKLk3c= X-Gm-Message-State: AOJu0Yw0t1Ld/Y2o2DPtFBpy2KZ68Y4IcNQ4DHuZQHsiArh/tMnUSrst VPs0AyNNtoCPGyq9IFEMzebzMuz8HLbyHNWXpaZa2jPIoMPF7FS5P1UpPxCYeQtPOE3WR1xeV8N WwnJNiyiiXk4zYrMANhP3cFMoDs5gG4/9k71n X-Google-Smtp-Source: AGHT+IE6g1Qt1a0uWEMeEkhuX3E8E0Aujra9U26A4am4FT81CMUGmJmJOs73aDziKCtCgxpH2azFXeJj7iJZZjlkoqQ= X-Received: by 2002:a0d:d709:0:b0:615:1a0:78ea with SMTP id 00721157ae682-62cbff3b0eamr45661767b3.34.1717694044790; Thu, 06 Jun 2024 10:14:04 -0700 (PDT) MIME-Version: 1.0 References: <20240605002459.4091285-1-andrii@kernel.org> <20240605002459.4091285-2-andrii@kernel.org> <5fmylram4hhrrdl7vf6odyvuxcrvhipsx2ij5z4dsfciuzf4on@qwk7qzze6gbt> In-Reply-To: From: Suren Baghdasaryan Date: Thu, 6 Jun 2024 10:13:52 -0700 Message-ID: Subject: Re: [PATCH v3 1/9] mm: add find_vma()-like API but RCU protected and taking VMA lock To: Andrii Nakryiko Cc: "Liam R. Howlett" , Matthew Wilcox , Andrii Nakryiko , linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, gregkh@linuxfoundation.org, linux-mm@kvack.org, rppt@kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: kjdocdmi6n3npwqbj4jwe5roosgbtd7m X-Rspamd-Queue-Id: 578E28001B X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1717694046-954251 X-HE-Meta: U2FsdGVkX18sLTaQRKqXw2yZMUyk+QXCFRfAEbljV1X0PvdiNikbt3DYsKkYA7NyC4BE/2cK9YQufGQazibX/64QidqBRK+GicGNpgU/fse2lhPdObQ7wmgp32xstoopqYhNnP9deZqvRNMDFI7HPVw6G5DyXLDfpBfnqtqQkrIPli324dCzbVaxEg4K7vv/4r1vuNbuGa8kFawngZyxPoO7SUah2cwmPJx0V1YukiQjRU3Hl/gC4f0t2T2y+yCRuWqIELp1Qz9hin1HYi9ENGG09IPBYiyMZKBsF3+MTlG+hLYwHsahwZCynTd5+y3owWmzvmFFw+O/eg79Xvfb0DcAXxk5AS7TOJudjUPSCI2yZmZZSILPpUt/VndpBmJZvfFmmY/9dcQLcyXBiJhYa0NSwtfEY6L+ZoGgMzDP77/3RzMSwt38hKaOOeIrjEYqzGby0AFq34dB5FPDXcqJpNnjYA94DJ0LDOWGtFV10TWIygH604kbPOgIcHH5x67xGYhcyZlb7VlspXQsbRXaX5fXvm57fnBqRQ5AbrDuhEfNUJyH+EZnzsoLbFgzNNC7f40Ekmyx2KI3ckmbDzM32nqA+jj6ikR7vf20cRbKV5vTJyzE8aMLl5f77pGH8DJNh5VUQ+ADF9HAvSZ3iEEJViYYGi2KZyX9VUEGcIRvqKzuvJVIOeESEsH6pxx22mrIshkduuA43JAaoDFBpM+Px6c62ozEVCi27DlmgamvixoJ1muiiFgu10z6XIIp4EijB7EmuAN2nahnbyKkOBADJY7HqZiN/6mOLzQaQwsWZaZNtwJKAhAQ6EbC6jko+4sC/eiTf7gdjUnxxop00J1yXBO1so/dYzC7uRBk8Bp9cvgzx68KLjm16VjvjM8uNtLg44TGQ+qi7CoEt0AvFJtInSVHMjv1poZjNuKVPno1cC3TiylsJSe7qi5Tp/kQBSvDpNQXCI0xR9XmFRKO1j0 KBrRIc/R R9DyNLjrd+luXdc0+pPOh8/cV4Dh0YhUctH46I/j1Uo01R3NHR0k3UehGTvWbe3hh9b1PhkTXhzKUMLn+kh37fKrjSGvhYkoCsKV2pjnF6WcmRFCgLSMWEZ6yhBqbz8ZNEiu7N+TLsIwQW5lRMqYYmNu5LUJ7iDRKwfPHHfEoBIRuAIWtcnuNlxSzH4R+FxAnx1VVAGhvtDm/hrJRN7J0kGc/fz2OWOiJ/lVNXjI/0pXN7L7MDMCAP9bi3nr7IBNhO5ky32XtW+4QXwSDKyGhIzfVni2oPLplu03O2StHvbOadpCzMWrwDzQyiOt/x3/xaRvUbR2IEky9Tvx43yEN+AF78hHlHRXodLcwZG0qBWuXymjV7RjzDq8CJ17Ygz8ShcbU0VeX7Hma0Dbl0tIVfBqvnEO0gCl7tsnwXpPRiDQrpDTkyBuCUNzdgoRHNtcxczujzjDdwSSSQMLM2BzcuuLO+vmNilT75+NZ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Jun 6, 2024 at 9:52=E2=80=AFAM Andrii Nakryiko wrote: > > On Wed, Jun 5, 2024 at 4:22=E2=80=AFPM Suren Baghdasaryan wrote: > > > > On Wed, Jun 5, 2024 at 10:03=E2=80=AFAM Liam R. Howlett wrote: > > > > > > * Andrii Nakryiko [240605 12:27]: > > > > On Wed, Jun 5, 2024 at 9:24=E2=80=AFAM Andrii Nakryiko > > > > wrote: > > > > > > > > > > On Wed, Jun 5, 2024 at 9:13=E2=80=AFAM Andrii Nakryiko > > > > > wrote: > > > > > > > > > > > > On Wed, Jun 5, 2024 at 6:33=E2=80=AFAM Liam R. Howlett wrote: > > > > > > > > > > > > > > * Matthew Wilcox [240604 20:57]: > > > > > > > > On Tue, Jun 04, 2024 at 05:24:46PM -0700, Andrii Nakryiko w= rote: > > > > > > > > > +/* > > > > > > > > > + * find_and_lock_vma_rcu() - Find and lock the VMA for a= given address, or the > > > > > > > > > + * next VMA. Search is done under RCU protection, withou= t taking or assuming > > > > > > > > > + * mmap_lock. Returned VMA is guaranteed to be stable an= d not isolated. > > > > > > > > > > > > > > > > You know this is supposed to be the _short_ description, ri= ght? > > > > > > > > Three lines is way too long. The full description goes bet= ween the > > > > > > > > arguments and the Return: line. > > > > > > > > > > > > Sure, I'll adjust. > > > > > > > > > > > > > > > > > > > > > > > + * @mm: The mm_struct to check > > > > > > > > > + * @addr: The address > > > > > > > > > + * > > > > > > > > > + * Returns: The VMA associated with addr, or the next VM= A. > > > > > > > > > + * May return %NULL in the case of no VMA at addr or abo= ve. > > > > > > > > > + * If the VMA is being modified and can't be locked, -EB= USY is returned. > > > > > > > > > + */ > > > > > > > > > +struct vm_area_struct *find_and_lock_vma_rcu(struct mm_s= truct *mm, > > > > > > > > > + unsigned long ad= dress) > > > > > > > > > +{ > > > > > > > > > + MA_STATE(mas, &mm->mm_mt, address, address); > > > > > > > > > + struct vm_area_struct *vma; > > > > > > > > > + int err; > > > > > > > > > + > > > > > > > > > + rcu_read_lock(); > > > > > > > > > +retry: > > > > > > > > > + vma =3D mas_find(&mas, ULONG_MAX); > > > > > > > > > + if (!vma) { > > > > > > > > > + err =3D 0; /* no VMA, return NULL */ > > > > > > > > > + goto inval; > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > + if (!vma_start_read(vma)) { > > > > > > > > > + err =3D -EBUSY; > > > > > > > > > + goto inval; > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > + /* > > > > > > > > > + * Check since vm_start/vm_end might change before we= lock the VMA. > > > > > > > > > + * Note, unlike lock_vma_under_rcu() we are searching= for VMA covering > > > > > > > > > + * address or the next one, so we only make sure VMA = wasn't updated to > > > > > > > > > + * end before the address. > > > > > > > > > + */ > > > > > > > > > + if (unlikely(vma->vm_end <=3D address)) { > > > > > > > > > + err =3D -EBUSY; > > > > > > > > > + goto inval_end_read; > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > + /* Check if the VMA got isolated after we found it */ > > > > > > > > > + if (vma->detached) { > > > > > > > > > + vma_end_read(vma); > > > > > > > > > + count_vm_vma_lock_event(VMA_LOCK_MISS); > > > > > > > > > + /* The area was replaced with another one */ > > > > > > > > > > > > > > > > Surely you need to mas_reset() before you goto retry? > > > > > > > > > > > > > > Probably more than that. We've found and may have adjusted t= he > > > > > > > index/last; we should reconfigure the maple state. You shoul= d probably > > > > > > > use mas_set(), which will reset the maple state and set the i= ndex and > > > > > > > long to address. > > > > > > > > > > > > Yep, makes sense, thanks. As for the `unlikely(vma->vm_end <=3D > > > > > > address)` case, I presume we want to do the same, right? Basica= lly, on > > > > > > each retry start from the `address` unconditionally, no matter = what's > > > > > > the reason for retry. > > > > > > > > > > ah, never mind, we don't retry in that situation, I'll just put > > > > > `mas_set(&mas, address);` right before `goto retry;`. Unless we s= hould > > > > > actually retry in the case when VMA got moved before the requeste= d > > > > > address, not sure, let me know what you think. Presumably retryin= g > > > > > will allow us to get the correct VMA without the need to fall bac= k to > > > > > mmap_lock? > > > > > > > > sorry, one more question as I look some more around this (unfamilia= r > > > > to me) piece of code. I see that lock_vma_under_rcu counts > > > > VMA_LOCK_MISS on retry, but I see that there is actually a > > > > VMA_LOCK_RETRY stat as well. Any reason it's a MISS instead of RETR= Y? > > > > Should I use MISS as well, or actually count a RETRY? > > > > > > > > > > VMA_LOCK_MISS is used here because we missed the VMA due to a write > > > happening to move the vma (rather rare). The VMA_LOCK missed the vma= . > > > > > > VMA_LOCK_RETRY is used to indicate we need to retry under the mmap lo= ck. > > > A retry is needed after the VMA_LOCK did not work under rcu locking. > > > > Originally lock_vma_under_rcu() was used only inside page fault path, > > so these counters helped us quantify how effective VMA locking is when > > handling page faults. With more users of that function these counters > > will be affected by other paths as well. I'm not sure but I think it > > makes sense to use them only inside page fault path, IOW we should > > probably move count_vm_vma_lock_event() calls outside of > > lock_vma_under_rcu() and add them only when handling page faults. > > Alright, seems like I should then just drop count_vm_vma_lock_event() > from the API I'm adding. That would be my preference but as I said, I'm not 100% sure about this direction. > > > > > > > > > Thanks, > > > Liam