From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 26 Dec 2024 09:12:17 -0800
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
To: "Liam R. Howlett", Peter Zijlstra, Suren Baghdasaryan, akpm@linux-foundation.org,
    willy@infradead.org, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com,
    dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com,
    hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel-team@android.com

On Sun, Dec 22, 2024 at 7:03 PM Suren Baghdasaryan wrote:
>
> On Thu, Dec 19, 2024 at 10:55 AM Liam R. Howlett wrote:
> >
> > * Peter Zijlstra [241219 13:47]:
> > > On Thu, Dec 19, 2024 at 01:18:23PM -0500, Liam R. Howlett wrote:
> > >
> > > > > For RCU lookups only the mas tree matters -- and it's left
> > > > > present there.
> > > > >
> > > > > If you really want to block RCU readers, I would suggest punching
> > > > > a hole in the mm_mt. All the traditional code won't notice anyway,
> > > > > this is all with mmap_lock held for writing.
> > > >
> > > > We don't want to block all rcu readers, we want to block the rcu
> > > > readers that would see the problem - that is, anyone trying to read
> > > > a particular area.
> > > >
> > > > Right now we can page fault in unpopulated vmas while writing other
> > > > vmas to the tree. We are also moving more users to rcu reading to
> > > > use the vmas they need without waiting on writes to finish.
> > > >
> > > > Maybe I don't understand your suggestion, but I would think
> > > > punching a hole would lose this advantage?
> > >
> > > My suggestion was to remove the range stuck in mas_detach from mm_mt.
> > > That is exactly the affected range, no?
> >
> > Yes.
> >
> > But then looping over the vmas will show a gap where there should not
> > be a gap.
> >
> > If we stop rcu readers entirely we lose the advantage.
> >
> > This is exactly the issue that the locking dance was working around :)
>
> IOW we write-lock the entire range before removing any part of it, so
> that the whole transaction is atomic, correct?
>
> Peter, you suggested the following pattern for ensuring the vma is
> detached with no possible readers:
>
> vma_iter_store()
> vma_start_write()
> vma_mark_detached()
>
> What do you think about this alternative?
>
> vma_start_write()
> ...
> vma_iter_store()
> vma_mark_detached()
>         vma_assert_write_locked(vma)
>         if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt)))
>                 vma_start_write()
>
> The second vma_start_write() is unlikely to be executed because the
> vma is locked: vm_refcnt can be increased only temporarily by readers
> before they realize the vma is locked, and that is a very narrow
> window. I think performance should not visibly suffer?
> OTOH this would let us keep the current locking patterns and would
> guarantee that vma_mark_detached() always exits with a detached and
> unused vma (fewer possibilities for someone not following the exact
> pattern and ending up with a detached but still used vma).

I posted v7 of this patchset at
https://lore.kernel.org/all/20241226170710.1159679-1-surenb@google.com/

From the things we discussed, I didn't include the following:
- Changing vma locking patterns
- Changing do_vmi_align_munmap() to avoid reattach_vmas()

It seems we need more discussion for the first one, and the second one
can be done completely independently of this patchset. I feel this
patchset is already quite large, so I'm trying to keep its size
manageable.

Thanks,
Suren.
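
For reference, the alternative vma_mark_detached() from the pattern quoted
above would look roughly like the sketch below. This is an illustration
only: it assumes the vm_refcnt scheme from this patchset (an attached vma
holds one reference and readers take short-lived references while they
check the lock), and the helper names follow the patchset, so the final
code may differ.

#include <linux/mm.h>        /* vm_area_struct, per-vma lock helpers */
#include <linux/refcount.h>

/*
 * Sketch only: vm_refcnt is the field added by this patchset; an attached
 * vma holds one reference and readers take short-lived references.
 */
static void vma_mark_detached(struct vm_area_struct *vma)
{
	/* Detaching is only legal while the vma is write-locked. */
	vma_assert_write_locked(vma);

	/*
	 * Drop the "attached" reference. If a reader raced with us and
	 * still holds a temporary reference, fall back to vma_start_write(),
	 * which waits for those readers to go away, so we always return
	 * with a detached and unused vma.
	 */
	if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt)))
		vma_start_write(vma);
}

The point of the fallback is that callers can keep today's
vma_start_write() -> vma_iter_store() ordering, while vma_mark_detached()
itself guarantees that no readers remain once it returns.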