Date: Wed, 11 Dec 2024 09:25:41 +0100
From: Peter Zijlstra
To: Suren Baghdasaryan
Cc: Matthew Wilcox, akpm@linux-foundation.org, liam.howlett@oracle.com,
	lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
	mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
	oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
	hughd@google.com, minchan@google.com, jannh@google.com,
	shakeel.butt@linux.dev, souravpanda@google.com,
	pasha.tatashin@soleen.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 3/4] mm: replace rw_semaphore with atomic_t in vma_lock
Message-ID: <20241211082541.GQ21636@noisy.programming.kicks-ass.net>
References: <20241111205506.3404479-1-surenb@google.com>
	<20241111205506.3404479-4-surenb@google.com>
	<20241210223850.GA2484@noisy.programming.kicks-ass.net>

On Tue, Dec 10, 2024 at 03:37:50PM -0800, Suren Baghdasaryan wrote:

> > Replace vm_lock with vm_refcnt. Replace vm_detached with vm_refcnt == 0
> > -- that is, attach sets refcount to 1 to indicate it is part of the mas,
> > detached is the final 'put'.
>
> I need to double-check if we ever write-lock a detached vma. I don't
> think we do but better be safe. If we do then that wait-until() should
> accept 0x8000'0001 as well.

vma_start_write()
  __is_vma_write_locked()
    mmap_assert_write_locked(vma->vm_mm);

So this really should hold afaict.

> > RCU lookup does the inc_not_zero thing, when increment succeeds, compare
> > mm/addr to validate.
> >
> > vma_start_write() already relies on mmap_lock being held for writing,
> > and thus does not have to worry about writer-vs-writer contention, that
> > is fully resolved by mmap_sem. This means we only need to wait for
> > readers to drop out.
> >
> > vma_start_write()
> >   add(0x8000'0001); // could fetch_add and double check the high
> >                     // bit wasn't already set.
> >   wait-until(refcnt == 0x8000'0002); // mas + writer ref
> >   WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
> >   sub(0x8000'0000);
> >
> > vma_end_write()
> >   put();
>
> We don't really have vma_end_write(). Instead it's vma_end_write_all()
> which increments mm_lock_seq unlocking all write-locked VMAs.
> Therefore in vma_start_write() I think we can sub(0x8000'0001) at the
> end.

Right, I know you don't, but you should :-), I've suggested adding this
before.

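For illustration, the write side quoted above can be modelled in
userspace with C11 atomics. This is only a sketch, not kernel code:
struct vma_model, VMA_WRITER_BIT and vma_attach() are hypothetical
names, the busy-wait loop stands in for wait-until(), and an atomic
store stands in for WRITE_ONCE(). It assumes a per-VMA vma_end_write()
exists, which the current code does not have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define VMA_WRITER_BIT 0x80000000u /* hypothetical name for the msb */

/* Userspace stand-in for the relevant vm_area_struct fields. */
struct vma_model {
	_Atomic uint32_t refcnt;   /* 0: detached, 1: attached, +1 per reader */
	_Atomic uint32_t lock_seq; /* stands in for vma->vm_lock_seq */
};

/* Attach: the tree holds one reference. */
static void vma_attach(struct vma_model *vma)
{
	atomic_init(&vma->lock_seq, 0);
	atomic_init(&vma->refcnt, 1);
}

/* Drop one reference; a real version would also wake a waiting writer. */
static void vma_put(struct vma_model *vma)
{
	uint32_t old = atomic_fetch_sub(&vma->refcnt, 1u);

	assert(old != 0); /* dropping a reference nobody holds is a bug */
}

/* Caller is assumed to hold mmap_lock for writing, so there is one writer. */
static void vma_start_write(struct vma_model *vma, uint32_t mm_lock_seq)
{
	/* Set the writer bit and take the writer's own reference. */
	uint32_t old = atomic_fetch_add(&vma->refcnt, VMA_WRITER_BIT | 1u);

	assert(!(old & VMA_WRITER_BIT)); /* high bit must not already be set */

	/* Wait until only the tree reference and the writer reference remain. */
	while (atomic_load(&vma->refcnt) != (VMA_WRITER_BIT | 2u))
		; /* busy-wait placeholder for wait-until() */

	atomic_store(&vma->lock_seq, mm_lock_seq);

	/* Clear the writer bit but keep the writer's reference. */
	atomic_fetch_sub(&vma->refcnt, VMA_WRITER_BIT);
}

/* Hypothetical per-VMA unlock: drop the writer's reference. */
static void vma_end_write(struct vma_model *vma)
{
	vma_put(vma);
}
```

With no readers, an attached VMA goes 1 -> 0x8000'0002 -> 2 across
vma_start_write() and back to 1 after vma_end_write(). Without a
per-VMA vma_end_write(), the final subtraction in vma_start_write()
would be 0x8000'0001 instead, as suggested in the quoted reply.
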
> > vma_start_read() then becomes something like:
> >
> >   if (vm_lock_seq == mm_lock_seq)
> >     return false;
> >
> >   cnt = fetch_inc(1);
> >   if (cnt & msb || vm_lock_seq == mm_lock_seq) {
> >     put();
> >     return false;
> >   }
> >
> >   return true;
> >
> > vma_end_read() then becomes:
> >   put();
> >
> >
> > and the down_read() from uffffffd requires mmap_read_lock() and thus
> > does not have to worry about writers, it can simply be inc() and put(),
> > no?
>
> I think your proposal should work. Let me try to code it and see if
> something breaks.

Btw, for the wait-until() and put() you can use rcuwait; that is the
simplest wait form we have. It's suitable because we only ever have the
one waiter.
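
For illustration, the quoted read side can be modelled the same way in
userspace with C11 atomics. Again a sketch under stated assumptions, not
kernel code: struct vma_model, struct mm_model and VMA_WRITER_BIT are
hypothetical names, the detached check done by inc_not_zero in the RCU
lookup is left out, and put() does not wake a waiting writer here (the
real put() would do so, for example through rcuwait).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VMA_WRITER_BIT 0x80000000u /* hypothetical name for the msb */

/* Userspace stand-ins for the relevant mm_struct / vm_area_struct fields. */
struct mm_model {
	_Atomic uint32_t lock_seq; /* stands in for mm->mm_lock_seq */
};

struct vma_model {
	_Atomic uint32_t refcnt;   /* 1: attached, +1 per reader */
	_Atomic uint32_t lock_seq; /* stands in for vma->vm_lock_seq */
};

/* Drop one reference. */
static void vma_put(struct vma_model *vma)
{
	uint32_t old = atomic_fetch_sub(&vma->refcnt, 1u);

	assert(old != 0); /* dropping a reference nobody holds is a bug */
}

/* Returns true if a read reference was taken, false if the VMA is busy. */
static bool vma_start_read(struct mm_model *mm, struct vma_model *vma)
{
	/* Cheap early exit: the VMA is already write-locked. */
	if (atomic_load(&vma->lock_seq) == atomic_load(&mm->lock_seq))
		return false;

	uint32_t cnt = atomic_fetch_add(&vma->refcnt, 1u);

	/* Back out if a writer is waiting or the VMA became write-locked. */
	if ((cnt & VMA_WRITER_BIT) ||
	    atomic_load(&vma->lock_seq) == atomic_load(&mm->lock_seq)) {
		vma_put(vma);
		return false;
	}

	return true;
}

static void vma_end_read(struct vma_model *vma)
{
	vma_put(vma);
}
```

A failed vma_start_read() leaves the count unchanged, so a writer
waiting for 0x8000'0002 only ever sees transient reader increments.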