From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 078C3E77183 for ; Mon, 16 Dec 2024 21:15:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6DEEA6B00CE; Mon, 16 Dec 2024 16:15:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 68E6E6B00D3; Mon, 16 Dec 2024 16:15:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 508AA6B00D5; Mon, 16 Dec 2024 16:15:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 2E7636B00CE for ; Mon, 16 Dec 2024 16:15:54 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id C05D6C019F for ; Mon, 16 Dec 2024 21:15:53 +0000 (UTC) X-FDA: 82902078840.04.1EBB8AA Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) by imf07.hostedemail.com (Postfix) with ESMTP id 1235F40017 for ; Mon, 16 Dec 2024 21:15:09 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=infradead.org header.s=desiato.20200630 header.b=NfAQT7RC; spf=none (imf07.hostedemail.com: domain of peterz@infradead.org has no SPF policy when checking 90.155.92.199) smtp.mailfrom=peterz@infradead.org; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1734383738; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wqNYOv2fOBvrraLi79AJjU9ppSyLY79/galhJe9yCkU=; b=FX9M/BBcofQeBIoGJEhzPiac3CeAqjNeDxTxi8INyNhFpXwzzOuwgGL8jfoOrgP/TrfABO RzoOd9JGKDFQICcyrIzuNUYLgVhizPm2kXuDBfSJ7uKXIgPvOJuqx3xII3mHcRx/S6VRA6 K2hNn3aXEg4nrDNzg4lKqrsRTZmI8Vg= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=infradead.org header.s=desiato.20200630 header.b=NfAQT7RC; spf=none (imf07.hostedemail.com: domain of peterz@infradead.org has no SPF policy when checking 90.155.92.199) smtp.mailfrom=peterz@infradead.org; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1734383738; a=rsa-sha256; cv=none; b=vdzYUFH/EHp2Jvv0XmJbpC9GlyTNF4JGmWhX8ubnbFIkq5nrCKUSZI9lR0IqN+MFV3NyNX un3AEpvOXSuxlb/b78ryN3Af9PSCbfWq3D445jbwHMsoYNbjoyQEmVQYgVwkG0D2GyFRRB okp/XQJ9Br/lNA3dpGixEkyB7HgXGvI= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=wqNYOv2fOBvrraLi79AJjU9ppSyLY79/galhJe9yCkU=; b=NfAQT7RC312eJHlSA9cGJJbM6k YMsJHVMYQwLfEH/fJ+uQdUatWWB2pmwhcnB+eAE3B3tKIZ5MgKTXOKwNo14ZOfnjRLxRh6HeW3SVX r9oQxU9hjAXWeElN2EwabOa6s+UP3IDHJaZUJNAJLYbk5BxeheDKWBDoJ61Nkp20o0/7d8uX1syvQ tQTSIbhKwrkEq8cpVtuR2nBFUwrZRrME7nIEa0eSNPImDgyJ8H43+Nq7QFe/jHNALSwcOn5hyMF9t MY1MG2pe+AIFxrl2ya7eVRongwJYDCEuLudwQ2fhpowZ4Hn8w5l5wTTY0xO0sPjIuF1pkzZPCEHpc 9/G2u0Tw==; Received: from 77-249-17-89.cable.dynamic.v4.ziggo.nl ([77.249.17.89] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.98 #2 (Red Hat Linux)) id 1tNIQz-00000004wfB-1VYk; Mon, 16 Dec 2024 21:15:25 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 1000) id A1DCF30031E; Mon, 16 Dec 2024 22:15:20 +0100 (CET) Date: Mon, 16 Dec 2024 22:15:20 +0100 From: Peter Zijlstra To: Suren Baghdasaryan Cc: akpm@linux-foundation.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count Message-ID: <20241216211520.GB9803@noisy.programming.kicks-ass.net> References: <20241216192419.2970941-1-surenb@google.com> <20241216192419.2970941-11-surenb@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20241216192419.2970941-11-surenb@google.com> X-Rspamd-Queue-Id: 1235F40017 X-Stat-Signature: 9qyack1krk59efuat9isdehaueu8wuok X-Rspam-User: X-Rspamd-Server: rspam11 X-HE-Tag: 1734383709-750021 X-HE-Meta: U2FsdGVkX18FurTRWcBZtkXudWfvC8MS4xPE9IdrKJBiqswOYH4olmoKVu1IRTXxwDcRFyILYtCKS01TxeHeyHCru/54E6NG0RiXAKsaxCWtvZVak/7dNkI2LFUt8dey6n1t+Z+aLgQrlEqO4sVAoeDKQO0CsdBBsho8eDwhXxFQ1+0vYfa0jtuMuuACKFOHUgC43V06ncPnPMg2dtkWh9ZdhwblCADUfZxE2x9s4fjf7sRaIgxQ+SLLhYETf0LY1gm9H5coJe/4ZEEHMg3+bMOjzRk9qIOEoXeH8iRWoGIbT0D/9TcEquHX5K3sheyDkvhZjQAvI4PRzwl3X6RN9azIByQxIpz3QhL5ndFMrkUPf7on6klRAlpaZs7EsGxtZs+CljokcajW7ebgjNPXq6z2yJMc4KZVN+73A4OV/8p4Wvh7w8ecArM+jbUofRXKxjh+RZlPQTkPOvIY27dVeOztgFVU8dW0V+Lw9N3q/qYFuwYB5pyGOs+RDIfbdkp1KEwkZ2Sh7213h9z/X7dqQPsCTnt5eYlHcLXyZTgh2qP0COS5nr2YCkTsgEcxC1M5Xjw7OuErGlVuHpwAf/CFa644rQK/qJgA2oY4UDNojuXyz7nB05hVWKhXhySz9lQwtfi/mj5bhH7H7gFDncqbeDKbAIY+WfBIdDXzuSLFnltzMSB9VhIf2faoue2ntWX1ex8aGejO4fMNAVv7mR2lA1bgXQKRfXWcM9WVGb+uLEo05anRFEaNqlEWcmALeOcQdI/Ex4ab6QKTYpXZoDKbnDp6O4Wc6WAGdyMMedKqArpaoKRH9AdIOimvJBbgk1BYxiKvcbgmUoFLXdLvUNts0/+LhpKo4DVGOzPYvRejjHuweY1eW+gksUsH07Bi57K0lXXgyAGEDEahFptxWXwPlpCGfUPpWiayIWZYBJGGtS1SgGP/i+qm++0XZ/lF2sfCFffy6lTmU3uJm/e8k8c mgHRCUh3 xpSHbWGgKgVba4PfuEEIX6IRP5msvbwfG8HfGbFu4sIVfNjQJZDPwOHQhrDEf9sXwvu0DqpYf4SHHyHFm/29EdUvMZ/xtIX/06RY95wTypSEP8vdL26NTFieICNj58yRgc0OV5Ra3QRZJOfhH2swGAXvLrnepbs3GpeyCx+kLOIXFD8nCOL/xr+PuJ26Cec8jut66udaE3JBlyb6ZJ3wu7y0Xq+rtuICUT83xZmckjDWGI7ADpNWs4W4YUCchZBKqhCemJ8/+gxay7S6UuoMZoje0HkqAZRgK4hPh7SjBV89PXVmGN91hriI2QL/KJ1RfxibxEu2+pAmiWCTLKYZVUBD8aI4i3xyIMMbuM63zAoCuSvc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Dec 16, 2024 at 11:24:13AM -0800, Suren Baghdasaryan wrote: FWIW, I find the whole VMA_STATE_{A,DE}TATCHED thing awkward. And perhaps s/VMA_STATE_LOCKED/VMA_LOCK_OFFSET/ ? Also, perhaps: #define VMA_REF_LIMIT (VMA_LOCK_OFFSET - 2) > @@ -699,10 +700,27 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {} > #ifdef CONFIG_PER_VMA_LOCK > static inline void vma_lock_init(struct vm_area_struct *vma) > { > - init_rwsem(&vma->vm_lock.lock); > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + static struct lock_class_key lockdep_key; > + > + lockdep_init_map(&vma->vmlock_dep_map, "vm_lock", &lockdep_key, 0); > +#endif > + refcount_set(&vma->vm_refcnt, VMA_STATE_DETACHED); > vma->vm_lock_seq = UINT_MAX; Depending on how you do the actual allocation (GFP_ZERO) you might want to avoid that vm_refcount store entirely. Perhaps instead write: VM_WARN_ON(refcount_read(&vma->vm_refcnt)); > @@ -813,25 +849,42 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma) > > static inline void vma_assert_locked(struct vm_area_struct *vma) > { > - if (!rwsem_is_locked(&vma->vm_lock.lock)) > + if (refcount_read(&vma->vm_refcnt) <= VMA_STATE_ATTACHED) if (is_vma_detached(vma)) > vma_assert_write_locked(vma); > } > > -static inline void vma_mark_attached(struct vm_area_struct *vma) > +/* > + * WARNING: to avoid racing with vma_mark_attached(), should be called either > + * under mmap_write_lock or when the object has been isolated under > + * mmap_write_lock, ensuring no competing writers. > + */ > +static inline bool is_vma_detached(struct vm_area_struct *vma) > { > - vma->detached = false; > + return refcount_read(&vma->vm_refcnt) == VMA_STATE_DETACHED; return !refcount_read(&vma->vm_refcnt); > } > > -static inline void vma_mark_detached(struct vm_area_struct *vma) > +static inline void vma_mark_attached(struct vm_area_struct *vma) > { > - /* When detaching vma should be write-locked */ > vma_assert_write_locked(vma); > - vma->detached = true; > + > + if (is_vma_detached(vma)) > + refcount_set(&vma->vm_refcnt, VMA_STATE_ATTACHED); Urgh, so it would be really good to not call this at all them not 0. I've not tried to untangle the mess, but this is really awkward. Surely you don't add it to the mas multiple times either. Also: refcount_set(&vma->vm_refcnt, 1); is so much clearer. That is, should this not live in vma_iter_store*(), right before mas_store_gfp() ? Also, ISTR having to set vm_lock_seq right before it? > } > > -static inline bool is_vma_detached(struct vm_area_struct *vma) > +static inline void vma_mark_detached(struct vm_area_struct *vma) > { > - return vma->detached; > + vma_assert_write_locked(vma); > + > + if (is_vma_detached(vma)) > + return; Again, this just reads like confusion :/ Surely you don't have the same with mas_detach? > + > + /* We are the only writer, so no need to use vma_refcount_put(). */ > + if (!refcount_dec_and_test(&vma->vm_refcnt)) { > + /* > + * Reader must have temporarily raised vm_refcnt but it will > + * drop it without using the vma since vma is write-locked. > + */ > + } > }