From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 16 Jan 2023 20:52:45 -0800
Subject: Re: [PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock
To: Hillf Danton
Cc: vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
	peterz@infradead.org, hughd@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
In-Reply-To: <20230117031632.2321-1-hdanton@sina.com>
References: <20230109205336.3665937-42-surenb@google.com>
	<20230116140649.2012-1-hdanton@sina.com>
	<20230117031632.2321-1-hdanton@sina.com>

On Mon, Jan 16, 2023 at 7:16 PM Hillf Danton wrote:
>
> On Mon, 16 Jan 2023 15:08:48 -0800 Suren Baghdasaryan
> > On Mon, Jan 16, 2023 at 6:07 AM Hillf Danton wrote:
> > > On Mon, 9 Jan 2023 12:53:36 -0800 Suren Baghdasaryan
> > > > static inline bool vma_read_trylock(struct vm_area_struct *vma)
> > > > {
> > > > 	/* Check before locking. A race might cause false locked result. */
> > > > -	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > +	if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> > > > 		return false;
> > >
> > > Add mb to pair with the above wmb like
> >
> > The wmb above is to ensure the ordering between updates of lock_seq
> > and vm_lock->count (lock_seq is updated first and vm_lock->count only
> > after that). The first access to vm_lock->count in this function is
> > atomic_inc_unless_negative() and it's an atomic RMW operation with a
> > return value. According to documentation such functions are fully
> > ordered, therefore I think we already have an implicit full memory
> > barrier between reads of lock_seq and vm_lock->count here. Am I wrong?
>
> No you are not.

I'm not wrong, or the other way around? Please expand a bit.

> Revisit it in case of full mb not ensured.
>
> #ifndef arch_atomic_inc_unless_negative
> static __always_inline bool
> arch_atomic_inc_unless_negative(atomic_t *v)
> {
> 	int c = arch_atomic_read(v);
>
> 	do {
> 		if (unlikely(c < 0))
> 			return false;
> 	} while (!arch_atomic_try_cmpxchg(v, &c, c + 1));
>
> 	return true;
> }
> #define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
> #endif

I think your point is that the full mb is not ensured here,
specifically if c < 0?

> > > int count = atomic_read(&vma->vm_lock->count);
> > > for (;;) {
> > > 	int new = count + 1;
> > >
> > > 	if (count < 0 || new < 0)
> > > 		return false;
> > >
> > > 	new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> > > 	if (new == count)
> > > 		break;
> > > 	count = new;
> > > 	cpu_relax();
> > > }
> > >
> > > (wake up waiting readers after taking the lock;
> > > but the write lock owner before this read trylock should be
> > > responsible for waking waiters up.)
> > >
> > > lock acquire for read.
> >
> > This schema might cause readers to wait, which is not an exact
>
> Could you specify a bit more on wait?

Oh, I misunderstood your intent with that for() loop. Indeed, if a
writer took the lock, the count will be negative and the loop
terminates with a failure. Yeah, I think that would work.

> > replacement for down_read_trylock(). The requirement to wake up
> > waiting readers also complicates things
>
> If the writer lock owner is preempted by a reader while releasing the
> lock,
>
> 	set count to zero
> 	<-- preempt
> 	wake up waiters
>
> then the lock is owned by the reader, but with read waiters.
>
> That is buggy if write waiter starvation is allowed in this patchset.

I don't quite understand your point here. Readers don't wait, so
there can't be "read waiters". Could you please expand with a race
diagram maybe?

> > and since we can always fall
> > back to mmap_lock, that complication is unnecessary IMHO. I could use
> > part of your suggestion like this:
> >
> > 	int new = count + 1;
> >
> > 	if (count < 0 || new < 0)
> > 		return false;
> >
> > 	new = atomic_cmpxchg(&vma->vm_lock->count, count, new);
> > 	if (new == count)
> > 		return false;
> >
> > Compared to doing atomic_inc_unless_negative() first, like I did
> > originally, this schema opens a bit wider window for the writer to
> > get in the middle and cause the reader to fail locking, but I don't
> > think it would result in any visible regression.
>
> It is definitely fine for the writer to acquire the lock while the
> reader is doing trylock, no?

Yes.
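
To make the ordering concern above concrete, here is a minimal,
illustrative sketch of the read-trylock being discussed. It is a
sketch, not the code from this series; it assumes the count
convention of the patchset (count == 0 unlocked, count > 0 held by
readers, count < 0 held by a writer) and the lock_seq/mm_lock_seq
fields introduced earlier in the series, and the helper name
vma_read_trylock_sketch is made up for illustration. Per
Documentation/atomic_t.txt, a conditional RMW such as
atomic_inc_unless_negative() is fully ordered only when it succeeds;
on failure (the c < 0 path quoted above) it degenerates to a plain
read with no implied barrier.

	/*
	 * Illustrative sketch only, not the patch. Assumes:
	 * count == 0 unlocked, > 0 readers, < 0 writer.
	 */
	static inline bool vma_read_trylock_sketch(struct vm_area_struct *vma)
	{
		/* Check before locking. A race may yield a false locked result. */
		if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
			return false;

		/*
		 * Fully ordered RMW when it succeeds; when it fails (a
		 * writer holds the lock) only a plain atomic read was
		 * done, so no barrier is implied on that path.
		 */
		if (!atomic_inc_unless_negative(&vma->vm_lock->count))
			return false;

		/*
		 * Re-check after the increment: a writer bumping
		 * mm_lock_seq invalidates this lock, so back out and let
		 * the caller fall back to mmap_lock.
		 */
		if (vma->vm_lock->lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) {
			atomic_dec(&vma->vm_lock->count);
			return false;
		}
		return true;
	}

The failure paths return false without any ordering guarantee, which
is acceptable for a trylock whose caller falls back to mmap_lock, but
they are exactly the spots where an explicit smp_mb() would be needed
if the unordered failure mattered.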