From: Hillf Danton <hdanton@sina.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
 peterz@infradead.org, hughd@google.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Subject: Re: [PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock
Date: Tue, 17 Jan 2023 11:16:32 +0800
Message-Id: <20230117031632.2321-1-hdanton@sina.com>
References: <20230109205336.3665937-42-surenb@google.com>
 <20230116140649.2012-1-hdanton@sina.com>

On Mon, 16 Jan 2023 15:08:48 -0800 Suren Baghdasaryan wrote:
> On Mon, Jan 16, 2023 at 6:07 AM Hillf Danton wrote:
> > On Mon, 9 Jan 2023 12:53:36 -0800 Suren Baghdasaryan wrote:
> > > static inline bool vma_read_trylock(struct vm_area_struct *vma)
> > > {
> > > 	/* Check before locking. A race might cause false locked result. */
> > > -	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > +	if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > 		return false;
> >
> > Add mb to pair with the above wmb like
>
> The wmb above is to ensure the ordering between updates of lock_seq
> and vm_lock->count (lock_seq is updated first and vm_lock->count only
> after that). The first access to vm_lock->count in this function is
> atomic_inc_unless_negative() and it's an atomic RMW operation with a
> return value. According to documentation such functions are fully
> ordered, therefore I think we already have an implicit full memory
> barrier between reads of lock_seq and vm_lock->count here. Am I wrong?

No, you are not. Revisit this only if the full mb turns out not to be
ensured.

#ifndef arch_atomic_inc_unless_negative
static __always_inline bool
arch_atomic_inc_unless_negative(atomic_t *v)
{
	int c = arch_atomic_read(v);

	do {
		if (unlikely(c < 0))
			return false;
	} while (!arch_atomic_try_cmpxchg(v, &c, c + 1));

	return true;
}
#define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
#endif

> > 	int count = atomic_read(&vma->vm_lock->count);
> > 	for (;;) {
> > 		int new = count + 1;
> >
> > 		if (count < 0 || new < 0)
> > 			return false;
> >
> > 		new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> > 		if (new == count)
> > 			break;
> > 		count = new;
> > 		cpu_relax();
> > 	}
> >
> > 	(wake up waiting readers after taking the lock;
> > 	but the write lock owner before this read trylock should be
> > 	responsible for waking waiters up.)
> >
> > 	lock acquire for read.
>
> This schema might cause readers to wait, which is not an exact

Could you say a bit more about that wait?

> replacement for down_read_trylock(). The requirement to wake up
> waiting readers also complicates things

If the write lock owner is preempted by a reader while releasing the
lock,

	set count to zero
	<-- preempt
	wake up waiters

then the lock is owned by the reader but read waiters are still asleep.
That is buggy if write-waiter starvation is allowed in this patchset.
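To make that interleaving concrete, below is a rough userspace sketch,
for illustration only: C11 atomics, and names like vma_write_unlock()
are invented here, not the patchset's code. It only models the two-step
unlock and the lockless reader.

/*
 * Userspace model of the race above. count == -1 means write-locked,
 * count >= 0 is the number of readers holding the lock.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int count = -1;		/* start write-locked */

static void wake_up_waiters(void)
{
	puts("waiters woken");		/* stand-in for wake_up() on a waitqueue */
}

static void vma_write_unlock(void)
{
	atomic_store(&count, 0);	/* step 1: release the lock */
	/*
	 * <-- a reader preempting the writer here takes the lock via
	 *     the trylock below while earlier waiters are still asleep
	 */
	wake_up_waiters();		/* step 2: wake the waiters */
}

static bool vma_read_trylock(void)
{
	int c = atomic_load(&count);

	do {
		if (c < 0)
			return false;	/* write-locked */
	} while (!atomic_compare_exchange_weak(&count, &c, c + 1));

	return true;	/* reader owns the lock; waiters may still sleep */
}

int main(void)
{
	vma_write_unlock();
	printf("reader got lock: %d\n", vma_read_trylock());
	return 0;
}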
> and since we can always fall
> back to mmap_lock, that complication is unnecessary IMHO. I could use
> part of your suggestion like this:
>
> 	int new = count + 1;
>
> 	if (count < 0 || new < 0)
> 		return false;
>
> 	new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> 	if (new == count)
> 		return false;
>
> Compared to doing atomic_inc_unless_negative() first, like I did
> originally, this schema opens a bit wider window for the writer to get
> in the middle and cause the reader to fail locking but I don't think
> it would result in any visible regression.

It is definitely fine for a writer to acquire the lock while a reader
is doing trylock, no?
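FWIW, a minimal userspace sketch of such a one-shot trylock, again C11
atomics with invented names, and reading the snippet above as
succeeding iff the cmpxchg succeeds; on failure the caller would fall
back to mmap_lock instead of retrying.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int count;	/* < 0: write-locked, >= 0: reader count */

static bool read_trylock_once(void)
{
	int c = atomic_load(&count);
	int new = c + 1;

	if (c < 0 || new < 0)
		return false;	/* write-locked, or the count would overflow */

	/*
	 * One shot only: if a writer (or another reader) changed the
	 * count meanwhile, fail rather than loop, and let the caller
	 * take the mmap_lock fallback path.
	 */
	return atomic_compare_exchange_strong(&count, &c, new);
}

int main(void)
{
	printf("1st trylock: %d\n", read_trylock_once());	/* 1 */
	printf("2nd trylock: %d\n", read_trylock_once());	/* 1 */
	return 0;
}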