From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 16 Dec 2024 13:44:45 -0800
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
To: Peter Zijlstra
Cc: akpm@linux-foundation.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <20241216213753.GD9803@noisy.programming.kicks-ass.net>
References: <20241216192419.2970941-1-surenb@google.com>
    <20241216192419.2970941-11-surenb@google.com>
    <20241216213753.GD9803@noisy.programming.kicks-ass.net>

On Mon, Dec 16, 2024 at 1:38 PM Peter Zijlstra wrote:
>
> On Mon, Dec 16, 2024 at 11:24:13AM -0800, Suren Baghdasaryan wrote:
> > +static inline void vma_refcount_put(struct vm_area_struct *vma)
> > +{
> > +        int refcnt;
> > +
> > +        if (!__refcount_dec_and_test(&vma->vm_refcnt, &refcnt)) {
> > +                rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> > +
> > +                if (refcnt & VMA_STATE_LOCKED)
> > +                        rcuwait_wake_up(&vma->vm_mm->vma_writer_wait);
> > +        }
> > +}
> > +
> >  /*
> >   * Try to read-lock a vma. The function is allowed to occasionally yield false
> >   * locked result to avoid performance overhead, in which case we fall back to
> > @@ -710,6 +728,8 @@ static inline void vma_lock_init(struct vm_area_struct *vma)
> >   */
> >  static inline bool vma_start_read(struct vm_area_struct *vma)
> >  {
> > +        int oldcnt;
> > +
> >          /*
> >           * Check before locking. A race might cause false locked result.
> >           * We can use READ_ONCE() for the mm_lock_seq here, and don't need
> > @@ -720,13 +740,20 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> >          if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
> >                  return false;
> >
> > +
> > +        rwsem_acquire_read(&vma->vmlock_dep_map, 0, 0, _RET_IP_);
> > +        /* Limit at VMA_STATE_LOCKED - 2 to leave one count for a writer */
> > +        if (unlikely(!__refcount_inc_not_zero_limited(&vma->vm_refcnt, &oldcnt,
> > +                                                      VMA_STATE_LOCKED - 2))) {
> > +                rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> >                  return false;
> > +        }
> > +        lock_acquired(&vma->vmlock_dep_map, _RET_IP_);
> >
> >          /*
> > +         * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> >           * False unlocked result is impossible because we modify and check
> > +         * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq
> >           * modification invalidates all existing locks.
> >           *
> >           * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
> > @@ -734,10 +761,12 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> >           * after it has been unlocked.
> >           * This pairs with RELEASE semantics in vma_end_write_all().
> >           */
> > +        if (oldcnt & VMA_STATE_LOCKED ||
> > +            unlikely(vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
> > +                vma_refcount_put(vma);
>
> Suppose we have detach race with a concurrent RCU lookup like:
>
>         vma = mas_lookup();
>
>                                         vma_start_write();
>                                         mas_detach();
>         vma_start_read()
>         rwsem_acquire_read()
>         inc // success
>                                         vma_mark_detach();
>                                         dec_and_test // assumes 1->0
>                                         // is actually 2->1
>
>         if (vm_lock_seq == vma->vm_mm_mm_lock_seq) // true
>           vma_refcount_put
>             dec_and_test() // 1->0
>             *NO* rwsem_release()
>

Yes, this is possible. I think it's not a problem until we start reusing
the vmas, and I deal with this race later in this patchset. What you
describe here looks like the same race I mention in the description of
this patch:
https://lore.kernel.org/all/20241216192419.2970941-14-surenb@google.com/
That patch introduces vma_ensure_detached() to handle this case and to
ensure that vmas are detached before they are returned to the slab cache
for reuse. Does that make sense?

>
> >                  return false;
> >          }
> > +
> >          return true;
> >  }
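To make the counter transitions in the race above easier to follow, here is a
minimal userspace sketch. It is not kernel code: the names (fake_start_read(),
fake_mark_detached(), fake_refcount_put()) are illustrative stand-ins, vm_refcnt
is modeled as a plain C11 atomic, and the lockdep annotations, the
VMA_STATE_LOCKED writer-wait bit, and the RCU lookup are all left out. The
"race" is replayed single-threaded, step by step, in the interleaving shown
above rather than with real concurrent threads.

        /* Model: 1 == attached and unlocked, >1 == attached with readers, 0 == detached. */
        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdio.h>

        static atomic_int vm_refcnt;

        /* Reader side: take a reference; fails once the count has reached 0. */
        static bool fake_start_read(void)
        {
                int old = atomic_load(&vm_refcnt);

                while (old > 0) {
                        /* On failure, 'old' is reloaded and the loop retries. */
                        if (atomic_compare_exchange_weak(&vm_refcnt, &old, old + 1))
                                return true;
                }
                return false;
        }

        /* Reader side: drop the reference; returns true if this was the final put. */
        static bool fake_refcount_put(void)
        {
                return atomic_fetch_sub(&vm_refcnt, 1) == 1;
        }

        /* Writer side: detach drops the "attached" reference the writer owns. */
        static bool fake_mark_detached(void)
        {
                /* The writer expects 1 -> 0, but with a racing reader it is 2 -> 1. */
                return atomic_fetch_sub(&vm_refcnt, 1) == 1;
        }

        int main(void)
        {
                atomic_store(&vm_refcnt, 1);                /* VMA attached, unlocked */

                bool reader_in = fake_start_read();         /* reader: 1 -> 2 */
                bool writer_final = fake_mark_detached();   /* writer: 2 -> 1, not final */
                bool reader_final = reader_in && fake_refcount_put(); /* reader: 1 -> 0 */

                printf("writer did final put: %d, reader did final put: %d, refcnt: %d\n",
                       writer_final, reader_final, atomic_load(&vm_refcnt));
                return 0;
        }

The only point of the sketch is that once detach races with a reader, the final
put lands on the reader side, which is why VMA reuse has to wait until the
count actually reaches the detached state, as the later patch referenced above
enforces.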