From: Suren Baghdasaryan
Date: Sun, 22 Dec 2024 19:03:14 -0800
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
To: "Liam R. Howlett", Peter Zijlstra, Suren Baghdasaryan,
 akpm@linux-foundation.org, willy@infradead.org, lorenzo.stoakes@oracle.com,
 mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
 oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
 peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
 brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
 lokeshgidra@google.com, minchan@google.com, jannh@google.com,
 shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com,
 klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20241219091334.GC26551@noisy.programming.kicks-ass.net>
 <20241219112011.GA34942@noisy.programming.kicks-ass.net>
 <20241219174235.GD26279@noisy.programming.kicks-ass.net>
 <20241219184642.GF26279@noisy.programming.kicks-ass.net>
 <6nck2rfwcytqdinsavmewytgcca43mldlczmao3zztrpr5v2ci@4xn6nwp6tcih>
In-Reply-To: <6nck2rfwcytqdinsavmewytgcca43mldlczmao3zztrpr5v2ci@4xn6nwp6tcih>

On Thu, Dec 19, 2024 at 10:55 AM Liam R. Howlett wrote:
>
> * Peter Zijlstra [241219 13:47]:
> > On Thu, Dec 19, 2024 at 01:18:23PM -0500, Liam R. Howlett wrote:
> >
> > > > For RCU lookups only the mas tree matters -- and its left present there.
> > > >
> > > > If you really want to block RCU readers, I would suggest punching a hole
> > > > in the mm_mt. All the traditional code won't notice anyway, this is all
> > > > with mmap_lock held for writing.
> > >
> > > We don't want to block all rcu readers, we want to block the rcu readers
> > > that would see the problem - that is, anyone trying to read a particular
> > > area.
> > >
> > > Right now we can page fault in unpopulated vmas while writing other vmas
> > > to the tree. We are also moving more users to rcu reading to use the
> > > vmas they need without waiting on writes to finish.
> > >
> > > Maybe I don't understand your suggestion, but I would think punching a
> > > hole would lose this advantage?
> >
> > My suggestion was to remove the range stuck in mas_detach from mm_mt.
> > That is exactly the affected range, no?
>
> Yes.
>
> But then looping over the vmas will show a gap where there should not be
> a gap.
>
> If we stop rcu readers entirely we lose the advantage.
>
> This is exactly the issue that the locking dance was working around :)

IOW we write-lock the entire range before removing any part of it for
the whole transaction to be atomic, correct?

Peter, you suggested the following pattern for ensuring the vma is
detached with no possible readers:

vma_iter_store()
vma_start_write()
vma_mark_detached()

What do you think about this alternative?

vma_start_write()
...
vma_iter_store()
vma_mark_detached()
    vma_assert_write_locked(vma)
    if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt)))
        vma_start_write()

The second vma_start_write() is unlikely to be executed because the
vma is already locked: vm_refcnt might be increased only temporarily by
readers before they realize the vma is locked, and that's a very narrow
window. I think performance should not visibly suffer?
OTOH this would let us keep the current locking patterns and would
guarantee that vma_mark_detached() always exits with a detached and
unused vma (fewer opportunities for someone not following the exact
pattern to end up with a detached but still used vma).
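
To make the ordering in the alternative easier to poke at, here is a rough
userspace model of it. This is only a sketch under assumptions, not kernel
code: struct toy_vma, its fields and every toy_* helper are made up for
illustration, a plain bool stands in for the write lock, and the "wait for
readers" step is simulated by running the reader's back-off inline instead
of sleeping. It only tries to show that the slow path is skipped when no
reader is in flight and that detach always exits with a zero refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a VMA; the fields are illustrative, not the kernel layout. */
struct toy_vma {
    atomic_int refcnt;   /* 0: detached, 1: attached, >1: attached plus readers */
    bool write_locked;   /* set by the writer before it modifies the tree */
    int slow_path_hits;  /* how many times the detach slow path had to loop */
};

static void toy_vma_init_attached(struct toy_vma *vma)
{
    atomic_init(&vma->refcnt, 1);
    vma->write_locked = false;
    vma->slow_path_hits = 0;
}

/* Writer side: models the first vma_start_write() in the sequence. */
static void toy_vma_start_write(struct toy_vma *vma)
{
    vma->write_locked = true;
}

/* Reader, step 1: optimistically take a reference. */
static void toy_reader_enter(struct toy_vma *vma)
{
    atomic_fetch_add(&vma->refcnt, 1);
}

/* Reader, step 2: keep the reference only if no writer holds the vma. */
static bool toy_reader_confirm(struct toy_vma *vma)
{
    if (vma->write_locked) {
        atomic_fetch_sub(&vma->refcnt, 1);  /* back off */
        return false;
    }
    return true;
}

/* Detach: drop the "attached" reference, then drain any straggling readers. */
static void toy_vma_mark_detached(struct toy_vma *vma)
{
    assert(vma->write_locked);  /* models vma_assert_write_locked() */

    /* atomic_fetch_sub() returns the old value; 1 means we hit zero. */
    if (atomic_fetch_sub(&vma->refcnt, 1) != 1) {
        /*
         * Unlikely path: a reader is inside its narrow window. The real
         * code would wait; the toy runs the reader's back-off step itself.
         */
        while (atomic_load(&vma->refcnt) != 0) {
            vma->slow_path_hits++;
            (void)toy_reader_confirm(vma);
        }
    }
}
```

With no reader in flight, toy_vma_mark_detached() takes only the fast path.
Calling toy_reader_enter() before toy_vma_start_write() models a reader
caught in the window, and the slow path then runs until the count drops to
zero.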